Graphite Tip - Group by Node

2013-04-14 18:23:29 by jdixon

In the process of setting up some graphs for Status Board, I thought it would be nice to render my GitHub activity (in terms of commits). As I demonstrated in a post last year, you can fire off a metric to Graphite using GitHub's post-commit webhook feature. Rendered with drawAsInfinite, this is nice for getting a rough visualization of your commit activity, but doesn't provide total counts. Alternatively, you could use group with summarize to get totals per interval, but you wouldn't be able to view per-repository numbers.

Enter the groupByNode function. It has two really nice features built in: first, it lets you specify which node (segment of the target path) to perform aggregation on; second, it lets you choose the callback function you prefer (e.g. sumSeries). This is exactly what we want for this situation; we have a ton of unique metrics under a handful of groupings, and we want to understand what the aggregate looks like. Here is the URI I'm using to visualize my public GitHub commits, per repository. I've indented it manually so it's a little clearer how the functions are applied in tandem.

               groupByNode(github.*.*.*.*.*.*, 1, "sumSeries"),

Starting with the innermost line, we apply groupByNode to our entire github branch of metrics (seven layers deep). The node argument (1) tells it to group on the second node (nodes are 0-indexed), which in this case is the repository field. Each group is then passed to the callback function named in the last argument (sumSeries).
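To make the grouping step concrete, here is a rough Python sketch of what groupByNode with a sumSeries callback does conceptually. This is not Graphite's code; the function name `group_by_node` and the metric names are made up for illustration, and the handling of missing datapoints (None) reflects my understanding of how sumSeries treats gaps.

```python
from collections import defaultdict


def group_by_node(series, node_num):
    """Sum series whose dotted names share the same segment at node_num (0-indexed).

    `series` maps a metric name to a list of datapoints; None marks a gap.
    """
    groups = defaultdict(list)
    for name, values in series.items():
        key = name.split(".")[node_num]
        groups[key].append(values)

    result = {}
    for key, rows in groups.items():
        summed = []
        # Walk the grouped series one time step at a time.
        for column in zip(*rows):
            present = [v for v in column if v is not None]
            # Assumption: a step with no data in any series stays None.
            summed.append(sum(present) if present else None)
        result[key] = summed
    return result


# Hypothetical metric names, seven nodes deep, with the repository at node 1.
sample = {
    "github.alpha.a.b.c.d.e": [1, None, 2],
    "github.alpha.f.g.h.i.j": [3, None, 4],
    "github.beta.a.b.c.d.e": [5, 6, None],
}

print(group_by_node(sample, 1))
```

The result is keyed only by the chosen node, which is why the grouped series come back named after the repository alone.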

The rest is straightforward. Each series is passed to summarize, where its datapoints are grouped into 1-day intervals, and then to aliasByNode, where we apply an alias based on the repository name (the only node returned from groupByNode). Here is the actual graph going back three months:
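For anyone curious what the 1-day rollup amounts to, here is a small Python sketch that sums datapoints into daily buckets. It is a simplification under my own assumptions: the name `summarize_daily` is hypothetical, buckets are aligned to multiples of 86400 seconds since the epoch, and empty buckets are simply omitted rather than emitted as gaps.

```python
SECONDS_PER_DAY = 86400


def summarize_daily(datapoints):
    """Sum (timestamp, value) pairs into 1-day buckets.

    Returns a sorted list of (bucket_start, total) pairs. Datapoints whose
    value is None are skipped, so a day with no data produces no bucket.
    """
    buckets = {}
    for timestamp, value in datapoints:
        if value is None:
            continue
        # Align each datapoint to the start of its day.
        start = timestamp - (timestamp % SECONDS_PER_DAY)
        buckets[start] = buckets.get(start, 0) + value
    return sorted(buckets.items())


# Made-up commit counts: two on the first day, one on the second.
commits = [(0, 1), (3600, 2), (86400, 5), (90000, None)]
print(summarize_daily(commits))
```

Summing per day is what turns a stream of individual commit events into the totals that drawAsInfinite alone can't show.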

Sadly, I forgot to enable the webhook for a number of my repositories, so I've missed out on some useful metrics from my past activity. Going forward I should have all of this data available so perhaps I'll revisit this in the months to come.
