Your log data is a treasure-trove of information about your application, but it can be overwhelming. This post will dig into several strategies for extracting metrics and other helpful information from your logs. We’ll start with the basics of the heroku logs command, then we’ll dig into the real fun using a tool called Angle Grinder.
How to view your Heroku logs
heroku logs on its own just prints the most recent logs from your app and then exits. Generally that’s not very useful. I almost always want the -t (or --tail) option to continually tail my logs. Additionally I usually want it scoped to a specific dyno process, so I’ll include -d router, -d web, or -d worker so I’m only seeing relevant logs.
This uses a “pipe” to send the output of heroku logs as input to the grep utility. Grep searches plain text data using regular expressions. We’re not using any regexp syntax in this example, but we certainly could.
Chaining heroku logs -t with grep is how I’m most often interacting with my Heroku logs, but sometimes I need more power.
Extracting specific values
It’s time to meet Angle Grinder, a real-time log analysis power tool. It can parse many formats, extract values, and even perform aggregations. Let’s just dive right in with an example.
Heroku’s router logs are formatted the same for every Heroku app, so let’s use those. Here’s the same router log output we saw above:
You’ll notice that the metadata is all formatted as [key]=[value]. This format is called logfmt. It’s embraced heavily by Heroku, and it’s one of the many formats that Angle Grinder understands.
Now we’re getting somewhere! Let’s expand the formatting string to display even more data, along with some defined widths so we have control over the alignment. We’ll also split the command into multiple lines to make it more readable.
heroku logs -t -d router | agrind \
'* | logfmt' \
--format '{dyno:<6} {connect:6} {service:7} {status:4} {method:5} {path}'
web.3 0ms 52ms 204 POST /api/REDACTED/v2/reports?dyno=web.1&pid=37
web.23 1ms 17ms 204 POST /api/REDACTED/v2/reports?dyno=web.2&pid=4
web.4 1ms 40ms 204 POST /api/REDACTED/v2/reports?dyno=web.3&pid=186
web.1 1ms 17ms 204 POST /api/REDACTED/v2/reports?dyno=web.1&pid=86
web.13 1ms 10ms 204 POST /api/REDACTED/v2/reports?dyno=web.21&pid=65
Beautiful! We’ve broken the wall of text into a nicely-aligned, human-readable stream of output. You can use this to watch your web traffic in real-time. From this we could quickly see if a particular dyno is struggling, or if a particular path is being requested more frequently than we expect.
How does Angle Grinder work?
Okay, let’s peel apart that last example.
First we tail Heroku logs using the Heroku CLI, just like before:
heroku logs -t -d router
Then we pipe the output to agrind (just like we were piping to grep before). agrind takes a single argument—called the “query"—which tells Angle Grinder how to process the incoming text. Queries can get complex as we’ll soon see, but this example keeps it simple:
heroku logs -t -d router | agrind '* | logfmt'
Here, the query is * | logfmt, and we enclose it in single quotes to separate it from the rest of the command. It might look a bit odd seeing the "pipe” character within the query string, so take care to not confuse it with a command line pipe.
The Angle Grinder query syntax is a composition of an initial filter with a series of operators. Here’s how this query breaks down:
* – This is the initial filter, and it tells Angle Grinder to operate on every input line.
logfmt – This is an operator, and it extracts key/value fields using the logfmt syntax.
The first operator is generally a parser. Next we’ll see how subsequent operators would then operate on the extracted fields.
Aggregating your log data in real-time
With a steady volume of requests, what we’ve seen so far can still be an overwhelming amount of information. Depending on what we’re looking for, we can use Angle Grinder’s aggregate operators to summarize log data in real-time.
For example, here’s a snapshot of 50th and 95th percentile service times by dyno:
Now our Angle Grinder query has gotten a bit more complex, with several operations chained together:
* – Operate on every input line.
logfmt – Extract fields using the logfmt parser.
p50(service), p95(service), count by dyno – Calculate three aggregates grouped by the “dyno” field. This assumes a “dyno” field was extracted via logfmt, which it was. The output of this operation is three fields: “p50”, “p95”, and “_count”.
sort by p50 desc – Sort the aggregate results by the “p50” field.
We could change the order of these fields, rename them using “as”, or query multiple fields:
heroku logs -t -d router | agrind \
'* | logfmt | count as count, p95(connect) as connect, p95(service) as service by dyno | sort by p50 desc'
dyno count connect service
---------------------------------------------------------
web.3 618 1 84
web.1 693 1 93
web.6 623 2 62
web.4 624 1 20
web.5 674 2 561
web.2 673 2 573
web.7 621 4 191
The query is getting hard to read, so I’d split it to many lines:
heroku logs -t -d router | agrind \
'*
| logfmt
| count as count,
p95(connect) as connect,
p95(service) as service
by dyno
| sort by p50 desc'
Parsing more than just logfmt
One query I find very handy is grouping by the second. This let’s me see a live second-by-second summary of how my app is performing. I’ll use this when there’s a production bottleneck so I can see exactly when the issue is resolved.
heroku logs -t -d router | agrind \
'*
| parse "T*." as time
| logfmt
| count as count,
p95(connect) as connect
p95(service) as service
by time
| sort by time'
time count connect service
-----------------------------------------------------------
19:20:36 354 2 1452
19:20:37 492 2 825
19:20:38 519 1 740
19:20:39 513 3 270
19:20:40 435 2 74
19:20:41 451 4 76
19:20:42 468 4 82
This query introduces a new parser called parse. For this aggregation, I needed the time string truncated to the second, and that’s not available in the logfmt data. parse gives us the power of arbitrary string matching to pull out values from anywhere in the log line.
Notice that the timestamp precedes the logfmt data, and it is not formatted as a logfmt key/value pair.
parse "T*." as time extracts a single field, “time”, using the pattern T*.. The asterisk (*) is a wildcard, and anything it consumes is extracted as a field. So a single asterisk means a single field is extracted. The text between T and . is 16:23:07, so that becomes the value for “time”.
From there, it’s all what we’ve seen before: logfmt to parse the logfmt data, then aggregate operations, then sorting the output.
Conditional aggregates
One last example to show how we can use conditional logic in our queries.
heroku logs -t -d router | agrind \
'*
| parse "T*." as time
| logfmt
| count(status == 200) as successful,
count(status != 200) as failure,
p95(service) as service
p95(connect) as connect
by time
| sort by time desc'
time successful failure service connect
-------------------------------------------------------------------------------
19:41:58 189 244 1803 1
19:41:57 182 120 2076 1
19:41:56 191 163 2479 1
19:41:55 174 122 2300 1
19:41:54 176 97 2196 1
This example is taken from a different app, and it’s not doing so well. Over half of the requests are failing. 😱
Still, it’s pretty cool that we can use conditionals in our count aggregate using extracted logfmt fields like “status”.
Your turn
Hopefully by now you’ve installed Angle Grinder, and you’re having some fun slicing and dicing your own log data.
Our Heroku log stream can be completely overwhelming, but tools like heroku logs, grep, and especially agrind give us everything we need to get useful and actionable information from the firehose.