Now that we know how our data is sent from StatsD, let's take a look at how it is stored and processed in Graphite.
The Graphite documentation includes an overview that sums up Graphite in two simple points.
Graphite stores numeric time-series data.
Graphite renders graphs of this data on demand.
Graphite consists of three parts.
carbon - a daemon that listens for time-series data.
whisper - a simple database library for storing time-series data.
webapp - a (Django) webapp that renders graphs on demand.
The format for time-series data in Graphite looks like this:
<key> <numeric value> <timestamp>
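As a rough illustration of this format, the sketch below builds one such line in Python and shows how it could be written to carbon. The function names `format_metric` and `send_metric` are made up for this example, and the host and port (`localhost`, 2003) are assumptions about a typical carbon plaintext listener, so check them against your own carbon configuration.

```python
import socket
import time


def format_metric(key, value, timestamp=None):
    """Build one line in the '<key> <numeric value> <timestamp>' format."""
    if timestamp is None:
        timestamp = int(time.time())
    # Each metric is a single newline-terminated line.
    return f"{key} {value} {int(timestamp)}\n"


def send_metric(line, host="localhost", port=2003):
    # Assumption: carbon's plaintext listener is reachable on host:port.
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


line = format_metric(
    "stats.timers.http_request.elapsed_time.sum", 53.449951171875, 1343038130
)
print(line, end="")
# send_metric(line)  # uncomment only when a carbon daemon is actually running
```

The network call is left commented out so the snippet runs without a Graphite installation; only the formatting step is exercised.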
Storage schemas
Graphite uses configurable storage schemas to define retention rates for stored data. Each schema matches data paths against a pattern and specifies at what frequency, and for how long, data for those paths is kept.
The following configuration example is taken from the StatsD documentation.
[stats]
pattern = ^stats\..*
retentions = 10:2160,60:10080,600:262974
These retentions apply to every entry whose key matches the pattern. The retention format is frequency:history, where frequency is the number of seconds per data point and history is the number of data points to keep. So this configuration stores 10-second data for 6 hours, 1-minute data for 1 week, and 10-minute data for roughly 5 years.
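To check the arithmetic behind those durations, here is a small Python sketch that multiplies each frequency by its history. `parse_retentions` is a helper invented for this example, not part of Graphite, and it only handles the purely numeric frequency:history form used in the configuration above.

```python
def parse_retentions(spec):
    """Parse 'frequency:history' pairs into (seconds_per_point, points, total_seconds)."""
    archives = []
    for part in spec.split(","):
        frequency, history = (int(x) for x in part.strip().split(":"))
        # Total time covered = seconds per point * number of points kept.
        archives.append((frequency, history, frequency * history))
    return archives


archives = parse_retentions("10:2160,60:10080,600:262974")
for frequency, history, total in archives:
    print(f"{frequency}s x {history} points = {total} s ({total / 86400:.2f} days)")
```

The totals come out to 21600 seconds (6 hours), 604800 seconds (7 days), and 157784400 seconds (a little over 1826 days, or about 5 years).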
Visualizing a timer in Graphite
Knowing all this, we can now take a look at my simple Ruby script that collects timings for HTTP requests.
Let's take a look at how Graphite renders this data. The data is from the last 2 minutes.
Image visualization
Render URL
Render URL used for the image below.
/render/?width=586&height=308&from=-2minutes&target=stats.timers.http_request.elapsed_time.sum
Rendered image from Graphite
Rendered image from Graphite, a simple graph visualizing elapsed_time for http requests over time.
JSON-data
Render URL
Render URL used for the JSON-data below.
/render/?width=586&height=308&from=-2minutes&target=stats.timers.http_request.elapsed_time.sum&format=json
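The two render URLs differ only in the `format` parameter. As a sketch of how such a URL could be assembled programmatically, the Python snippet below builds the same query string. `render_url` is a hypothetical helper, and it produces only the path and query; the host name of the Graphite webapp is deliberately left out because it depends on your installation.

```python
from urllib.parse import urlencode


def render_url(target, from_="-2minutes", width=586, height=308, fmt=None):
    """Build a /render/ path with the same parameters as the URLs above."""
    params = [
        ("width", width),
        ("height", height),
        ("from", from_),
        ("target", target),
    ]
    # Without a format parameter the webapp returns an image;
    # format=json asks for the raw datapoints instead.
    if fmt is not None:
        params.append(("format", fmt))
    return "/render/?" + urlencode(params)


image_url = render_url("stats.timers.http_request.elapsed_time.sum")
json_url = render_url("stats.timers.http_request.elapsed_time.sum", fmt="json")
print(image_url)
print(json_url)
```

Prefixing the result with the address of your own Graphite webapp gives a URL that can be opened in a browser or fetched with any HTTP client.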
JSON-output from Graphite
The results below show the raw data from Graphite. There are 12 data points, which corresponds to 2 minutes at the StatsD 10-second flush interval. It really is this simple: Graphite just visualizes its data.
The JSON-data is beautified with JSONLint for viewing purposes.
[
    {
        "target": "stats.timers.http_request.elapsed_time.sum",
        "datapoints": [
            [
                53.449951171875,
                1343038130
            ],
            [
                50.3916015625,
                1343038140
            ],
            [
                50.1357421875,
                1343038150
            ],
            [
                39.601806640625,
                1343038160
            ],
            [
                41.5263671875,
                1343038170
            ],
            [
                34.3974609375,
                1343038180
            ],
            [
                36.3818359375,
                1343038190
            ],
            [
                35.009033203125,
                1343038200
            ],
            [
                37.0087890625,
                1343038210
            ],
            [
                38.486572265625,
                1343038220
            ],
            [
                45.66064453125,
                1343038230
            ],
            [
                null,
                1343038240
            ]
        ]
    }
]
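Each datapoint in the response above is a [value, timestamp] pair, and a value of null marks an interval for which Graphite has no value. The Python sketch below shows one way to read such a response and separate real values from gaps. The payload is a shortened copy of the output above (three of the twelve datapoints), and `summarize` is a helper name made up for this example.

```python
import json

# Shortened copy of the response above: two real values and the trailing null.
payload = """
[{"target": "stats.timers.http_request.elapsed_time.sum",
  "datapoints": [[53.449951171875, 1343038130],
                 [50.3916015625, 1343038140],
                 [null, 1343038240]]}]
"""


def summarize(series):
    """Split one series into its present values and the timestamps of gaps."""
    # JSON null becomes None in Python, so gaps are easy to filter out.
    values = [v for v, _ in series["datapoints"] if v is not None]
    gaps = [ts for v, ts in series["datapoints"] if v is None]
    return values, gaps


series = json.loads(payload)[0]
values, gaps = summarize(series)
print(series["target"])
print(len(values), "values, gaps at", gaps)
```

Filtering out the null entries before computing anything (an average, a maximum) avoids type errors and keeps empty intervals from being mistaken for zeros.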