The Story Behind Tasseo

2012-05-07 10:19:32 by jdixon

A little over a week ago I released the Tasseo dashboard. The response I got back was nothing short of astonishing. Tasseo is a Graphite dashboard, one of many to have been released in recent months. That fact alone led me to believe it would fly quietly under the radar. I couldn't have been more wrong; Tasseo (pronounced like Casio) tallied over 200 GitHub watchers in the first weekend, and should pass 300 today.

Tasseo was originally developed as a from-the-ground-up reimplementation of the Pulse dashboard we use at Heroku. Pulse has been a tremendously valuable tool for us; unfortunately, it has some drawbacks that make it a challenge to maintain.

First, Pulse's configuration is tightly coupled into the codebase. Second, the codebase is in Clojure; this is not a problem per se, but it means that anyone wanting to add new graphs has to ramp up on Clojure first. Third, it uses a custom aggregation mechanism powered by our event log stream; this in itself is not a bad thing, but it means we have to do more work than should be necessary just to render charts. Fourth, it only provides a limited window into our real-time metrics; it has no means to retrieve or render archived data. There were other issues driving our decision, but I think this is already a reasonable justification for alternatives.

We knew that we wanted something that could take advantage of our Graphite server. There are other dashboards out there that use Graphite, but none of them provide the real-time feedback that we get from Pulse. Configuration should be simple, neither highly dependent on configuration management tools (e.g. Chef) nor woven into the source code. With those basic design criteria in mind, I started work on Tasseo.

Almost from the beginning I settled on JSON as the configuration format. Each dashboard view exists as a .js file in the project's public/d directory. Although this is not strictly JSON (variables are used to define the metrics as well as metric and dashboard-level attributes), these files are completely separate from the codebase; a misconfigured or missing file will not cause the app to break. The format is simple and predictable.

// sample dashboard config
var metrics =
[
  {
    "alias": "pulse-events-per-second",
    "target": "pulse.pulse-events-per-second",
    "warning": 100,
    "critical": 500
  }
];
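To illustrate how thresholds like these could be interpreted, here is a minimal sketch. The `classify` helper is hypothetical and not part of Tasseo; it assumes that higher values are worse and that "warning" is no greater than "critical".

```javascript
// Hypothetical helper, not Tasseo's own code: classify a value against
// the "warning" and "critical" thresholds of one metric entry.
// Assumes higher values are worse and warning <= critical.
function classify(metric, value) {
  if (value === null || value === undefined) {
    return "unknown"; // no datapoint to compare against
  }
  if (metric.critical !== undefined && value >= metric.critical) {
    return "critical";
  }
  if (metric.warning !== undefined && value >= metric.warning) {
    return "warning";
  }
  return "ok";
}

// Same entry as the sample dashboard config above.
var sampleMetric = {
  "alias": "pulse-events-per-second",
  "target": "pulse.pulse-events-per-second",
  "warning": 100,
  "critical": 500
};

console.log(classify(sampleMetric, 42));   // below both thresholds
console.log(classify(sampleMetric, 250));  // between warning and critical
console.log(classify(sampleMetric, 900));  // above critical
```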

A very basic Sinatra application exists mainly to serve up the JavaScript libraries. It reads the list of available dashboards and presents links to each of them on its index view.

I decided to use the Rickshaw toolkit for constructing charts. Rickshaw is a pleasant abstraction layer on top of d3.js. Each chart is a JavaScript object, which makes it far less painful to render a pageful of graphs. Objects can easily be constructed, modified, passed around, and (when necessary) destroyed and recreated again. Rickshaw's code is easy to read and the documentation and examples are plentiful.
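As a rough sketch of the glue between the two, Graphite's JSON output lists each datapoint as a [value, timestamp] pair, while a Rickshaw series takes an array of {x, y} points. The `toSeriesData` function and the sample response below are illustrative assumptions, not Tasseo's actual code, and dropping null values is just one possible way to handle gaps.

```javascript
// Hypothetical converter: one Graphite JSON series -> Rickshaw series data.
// Graphite datapoints are [value, timestamp]; Rickshaw wants {x, y}.
function toSeriesData(graphiteSeries) {
  return graphiteSeries.datapoints
    .filter(function (point) { return point[0] !== null; }) // skip gaps
    .map(function (point) { return { x: point[1], y: point[0] }; });
}

// Made-up response in the shape Graphite returns for format=json.
var sampleResponse = [
  {
    target: "pulse.pulse-events-per-second",
    datapoints: [[12.0, 1336385970], [null, 1336385980], [15.5, 1336385990]]
  }
];

var seriesData = toSeriesData(sampleResponse[0]);
console.log(JSON.stringify(seriesData));

// In a browser with Rickshaw loaded, the result could feed a chart:
//   new Rickshaw.Graph({
//     element: document.querySelector("#chart"),
//     series: [{ data: seriesData, color: "steelblue" }]
//   }).render();
```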

The first "complete" version was finished in about two days. A day later Tasseo gained "night mode", suitable for large on-screen dashboards. The next couple of days added some dashboard-level attributes (e.g. refresh), bug fixes for null data and general user experience improvements. And because Tasseo uses a completely fluid layout (no tables or forced breaks), a new feature to "pad cells" was added to aid in providing some structure to layouts.

Over the past weekend I worked on a time panel that lets you pause the real-time feed and display an older snapshot of data instead. This is enormously useful for viewing a metric in a context beyond the last 5 minutes (the default real-time feed), such as performance over the last hour, day or week. When you're done, the live feed can be resumed.
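For a sense of what switching windows involves, Graphite's render API accepts a relative `from` parameter such as -5min, -1h, -1d or -1w. The sketch below builds such a request URL; the `renderUrl` helper and the graphite.example.com host are assumptions for illustration, not Tasseo's implementation.

```javascript
// Hypothetical helper: build a Graphite render URL for one target and
// a relative time window (e.g. "-5min", "-1h", "-1d", "-1w").
function renderUrl(graphiteUrl, target, from) {
  return graphiteUrl + "/render" +
    "?target=" + encodeURIComponent(target) +
    "&from=" + encodeURIComponent(from) +
    "&format=json";
}

var graphiteUrl = "https://graphite.example.com"; // placeholder host
var target = "pulse.pulse-events-per-second";

console.log(renderUrl(graphiteUrl, target, "-5min")); // live-feed window
console.log(renderUrl(graphiteUrl, target, "-1h"));   // last hour
console.log(renderUrl(graphiteUrl, target, "-1w"));   // last week
```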

Hopefully this offers some insight into Tasseo's development and philosophy (even if you weren't asking for it). Although I haven't been able to find any obvious JavaScript memory leaks, I'm sure there are plenty of opportunities to improve its performance. Fortunately, initial feedback leads me to believe that the d3 rendering is already a leaps-and-bounds improvement over the jQuery sparklines we used in Pulse. I would like to keep the project simple and focus on performance, and I'm grateful for pull requests that help us toward those goals.

Comments

at 2012-05-08 19:02:05, Chandan wrote in to say...

Hi, I was deploying your tool. But during setup I found the error

http://fpaste.org/BIIr/.

Appreciate if you could help. I am not a developer. In order to get around the problem I did bundler install. But could not get my example page open.

at 2012-05-08 19:35:35, Jason Dixon wrote in to say...

@Chandan - Your error tells you that your Ruby is missing the readline libs. http://niwos.com/2011/04/15/fixing-ruby-readline-errors-on-centos/ is for CentOS, but you should be able to adapt their instructions for your own install.

at 2012-06-13 15:08:12, Jin Chen wrote in to say...

Hi ,

Is it possible to deploy tasseo on my local apache2/passenger instead of Heroku?

Thanks & regards

Jin

at 2012-06-13 15:19:24, Jason Dixon wrote in to say...

@Jin - Yes you should be able to, but I have no experience doing so myself.

at 2013-06-06 15:46:57, Eric wrote in to say...

when will the time panel be available?

at 2013-06-06 15:50:56, Jason Dixon wrote in to say...

@Eric - Not sure I follow. That feature's been public for a loooong time.

at 2013-10-21 02:11:34, javsha wrote in to say...

hi,

i have ganglia 3.5.x and it comes with a live dashboard i.e. tasseo.

so can ganglia metrics be used directly by tasseo or do i need to install graphite.

any help in this setup is welcome.

thanks.

at 2013-11-02 22:17:31, Jason Dixon wrote in to say...

@javsha - Ganglia has an embedded version of Tasseo built-in. As I understand it you should be able to use any metrics recognized within your Ganglia system with that Tasseo.

at 2014-03-24 11:24:51, Werner Otto wrote in to say...

I am interested to know if you can have a high water mark and low watermark alert configured on the same target. At the moment you will need a duplicate of the target with its own warning and critical values, one being an inverse of the other to achieve this?
