My Impressions of InfluxDB

2013-11-11 12:15:38 by jdixon

I mentioned last week that I was planning to look closer at InfluxDB this past weekend, and some folks asked me to do a writeup on my findings.

InfluxDB is a time-series metrics and events database based on the LevelDB key-value store. LevelDB was written and open sourced by Google, and is an optional backend for Riak. InfluxDB (or "Influx", for short) inherits many of LevelDB's default characteristics, which means it's optimized for writes and uses compression by default, but it can be slow for reads and deletes.

Influx was only recently released as an open source project under the MIT license by the team over at Errplane. Its authors are transparent about their goals and roadmap, publishing them on the project website.

Getting started with Influx is easy enough. They provide a free online sandbox where you can create a user account and database, then immediately begin writing metrics via their administration UI. You can also start sending metrics remotely, which is simple enough with curl and a bit of JSON.

$ for i in `seq 100`; do \
>   curl -X POST "http://sandbox.influxdb.org:9061/db/foobar/series?u=foo&p=bar" \
>     -d "[{\"name\":\"data\",\"columns\":[\"foo\",\"bar\"], \
>     \"points\":[[$((RANDOM%200+100)),$((RANDOM%300+50))]]}]"; \
>   sleep 1; \
> done

From there you can run queries and visualize your metrics within the same interface.

I don't care for their choice of accepting credentials via URL parameters, but it looks like they agree and intend to support HTTP Basic Auth soon. Otherwise, their HTTP API is largely param-driven, making it straightforward to use, and it's capable of sending chunked or non-chunked responses.
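For example, you can reuse the credentials and database from the write example above to run a query over HTTP. This is just a sketch against their sandbox (the series name "data" matches the earlier write; endpoint and params may change as the API evolves):

```shell
$ curl -G "http://sandbox.influxdb.org:9061/db/foobar/series?u=foo&p=bar" \
>     --data-urlencode "q=select * from data limit 10"
```

curl's -G flag appends the urlencoded q parameter to the GET query string, which keeps the query readable without manual escaping.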

Influx accepts queries via an SQL-like query language, so if you're comfortable with traditional relational databases, you should feel right at home. It already supports filtering with where clauses, grouping with group by, and combining series with merge and join. There are also a handful of aggregate functions (min, max, mean, mode, median, and percentile), making it suitable for routine analytics tasks.
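To illustrate, here are a couple of queries in that style. The series and column names are made up for the example; the syntax follows their docs as of this writing:

```sql
-- filter a series with a where clause
select * from response_times where value > 500 limit 10

-- aggregate the last hour into five-minute buckets
select percentile(value, 95) from response_times
  where time > now() - 1h group by time(5m)
```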

They plan to add a feature called continuous queries, which will allow users to "precompute expensive queries into another time series in real-time". At face value this sounds very much like Graphite's carbon-aggregator. However, because you would have the entire query language at your disposal, it has the potential to be much more powerful.
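The syntax for continuous queries hasn't been published yet, so the following is purely speculative, but based on their description you could imagine a rollup expressed as an ordinary select with a destination series (the "into" clause here is my invention, not a documented feature):

```sql
-- hypothetical: continuously downsample raw points into a 5-minute rollup series
select mean(value) from response_times group by time(5m) into response_times.5m
```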

There remain some questions around its potential to scale, as well as a lack of general benchmarks. Paul Dix, the founder of Errplane and apparent Influx project lead, says that both clustering and benchmarks are expected to land in December. A work-in-progress GitHub pull request is open for tracking the ongoing clustering work, and Paul has shared some anecdotal numbers of "20k-70k points per second" on the project mailing list.

Language bindings already exist for JavaScript (front-end), Ruby, Python and Node.js. Personally, I found interacting directly with the HTTP API simple enough, and managed to add backend support for InfluxDB to Tasseo within a couple of hours. It would be nice to see a more efficient binary wire protocol for submitting metrics, and it seems the authors agree, but only time will tell if that happens.

As someone who loves metrics, there's a lot to like about InfluxDB. It's easy to get started with, there are no external service/component dependencies, and submitting and querying metrics are a breeze. Because Errplane already uses Influx in production (or something based on it), I think it's safe to assume that innovation will continue at a healthy pace.

However, there remain some very important gaps in functionality, particularly in the areas of scaling and high availability, complex and/or user-defined functions, and a more robust graphing / discovery UI. And yet, I think many of us want Influx to succeed, if only because scaling tools such as Graphite can be such a challenge. The next couple of months should reveal whether InfluxDB is the scalable time-series project that many of us believe it can be, or whether it falls by the wayside as an orphaned fork of the Errplane engine.

Comments

at 2013-11-11 13:15:09, Paul Dix wrote in to say...

Thanks for the writeup and feedback Jason! Some thoughts/comments.

We picked LevelDB because it was quick and easy to get something built that was reasonably fast and would support hundreds of thousands or millions of different time series. However, there are definitely more tests we need to run to make sure it'll be the right choice given our design goals, which is one of the reasons we haven't yet blessed any of the builds for production use. Of particular concern are large range deletions and compaction delays, as mentioned in the Groups thread you linked. We structured the code so that we'll be able to swap out storage engines later, which is something we'll probably test after we have a good benchmark suite set up.

Good to know that you also wanted Basic HTTP Auth. I didn't know that one was an issue until last night. Should be added within the next few days. The binary wire protocol is definitely something we'll be adding soon. First we want to get to clustering and benchmarking different scenarios. No sense in optimizing a protocol until we have a good benchmark suite and numbers to compare it against.

The gaps in scaling and high availability are our highest priority right now. We should have an initial version of that in December. Very soon after that we'll target custom user-defined functions. I think those, paired with continuous queries, will make Influx super useful for all sorts of real-time computing/analytics/metrics tasks.

For graphing and discovery we have an idea for that. We've separated the core database out from the administrative UI. The admin UI source can now be found here: https://github.com/influxdb/influxdb-admin. Our thought is that the core administration parts (cluster admin, db admin) will be standard and we'll have a directory structure to split out different exploration/graphing interfaces.

The goal is that anyone can easily fork this repo and play around with building their own interface. We'll happily take pull requests and merge those different interfaces into the main admin UI so that you'll be able to select which explorer you want. So the admin interface releases will be decoupled from the database releases, since the front end will probably have more churn and innovation in the long term. We want to make it easy for people to modify and deploy without having to touch the core database. We'll be figuring out the directory structure and writing instructions for contributing explorer/graphing interfaces later this week.

Thanks again for the feedback, keep it coming!

Paul

at 2013-11-12 09:56:07, Baron Schwartz wrote in to say...

I do not want a time-series database project to focus on graphing/discovery/visualization. This tendency to mush data and visualization tools together into one is a HUGE detriment of most of the existing tools. I am a big fan of doing one thing and doing it well -- and time-series data is the part that usually suffers in these Graphite/OpenTSDB/whatever tools in my opinion. I don't give a hoot how sexy the graphs look if the underlying data storage system can't do much of anything right.

at 2013-11-12 13:32:23, Jason Dixon wrote in to say...

@Baron - I largely agree. However, I think existing TSDB tools fall down not because they've focused on adding visualization/discovery (I'd like to hear more from you on why you feel that's a detriment), but because (particularly in the case of OpenTSDB) the retrieval experience (read: API) is so bad.

at 2013-11-12 20:21:01, Yann wrote in to say...

Hi folks, a quick question regarding data durability: is it possible to set data to expire? I wouldn't want to keep all my measurements forever :)

Thanks!

at 2013-11-13 14:07:30, Baron Schwartz wrote in to say...

IMO it's pretty simple: any effort spent on a UI for a database is effort that should have been spent on the database itself. Both are difficult projects that only a small portion of developers can do really well, but people who can develop a database are far rarer and more specialized in my experience.

at 2013-11-15 13:17:19, Paul Dix wrote in to say...

@Yann: sorry for the late response. Definitely, you can have periodic deletes that run. We're also thinking of making it so that you can set a TTL on data in a time series, so you can just say that whatever goes in there should only stick around for a given period of time.

@Baron: we're definitely focused on building the core of the database. However, we want to create a common place where people can make their own dashboards, UIs, and custom visualizations that they can share with others. Kind of like having a package management system (like Ruby Gems or Pypi) for sharing libraries in a programming language.

at 2013-11-22 05:19:32, andrew ting md wrote in to say...

Great article.

at 2014-07-02 18:04:27, Alan Blount wrote in to say...

Thanks for the writeup.

We've been using a basic ELK stack and now we are looking into expanding that to include statsd and graphite...

We are looking at our options and InfluxDB has come up as an alternative with fewer dependencies than Graphite (django), and Grafana looks like a great graphing front-end which works on top of either InfluxDB or Graphite.

http://grafana.org/

So - if you were "starting fresh" which would you choose now?

Statsd --> Graphite

or

Statsd --> InfluxDB + Grafana

at 2015-01-31 08:06:08, Robert Mckeown wrote in to say...

Hi all,

Thx for this writeup too. I found it v. useful.

I completely agree on the observation about mushing database and visualization together - as in 'don't do it'. The burning need is really for the db (at least for selfish me!)

Are there updated performance figures available ?

Rob
