Graphite

Now that we know how our data is sent from StatsD, let's take a look at how it is stored and processed in Graphite.

Overview

The overview in the Graphite documentation sums up Graphite in two simple points:

Graphite stores numeric time-series data.

Graphite renders graphs of this data on demand.

Graphite consists of three parts.

carbon - a daemon that listens for time-series data.

whisper - a simple database library for storing time-series data.

webapp - a (Django) webapp that renders graphs on demand.

The format for time-series data in Graphite looks like this:

<key> <numeric value> <timestamp>
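As a minimal sketch of that format, the Ruby snippet below builds one such line. The helper name `carbon_line` and the sample key are made up for illustration. The send step is left commented out and assumes carbon's plaintext listener is reachable on localhost at its default port 2003.

```ruby
require 'socket'

# Build one line in the "<key> <numeric value> <timestamp>" format.
# carbon_line is a hypothetical helper name, not part of Graphite.
def carbon_line(key, value, timestamp = Time.now.to_i)
  "#{key} #{value} #{timestamp}\n"
end

line = carbon_line('stats.example.requests', 42, 1343038130)
puts line

# Assumption: carbon accepts plaintext data on localhost:2003.
# TCPSocket.open('localhost', 2003) { |sock| sock.write(line) }
```

The timestamp is a Unix epoch in seconds, the same unit that shows up in Graphite's JSON output.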

Storage schemas

Graphite uses configurable storage schemas to define retention rates for stored data. Each schema matches data paths against a pattern and specifies at what frequency, and for how long, the matching data is stored.

The following configuration example is taken from the StatsD documentation.

[stats]
pattern = ^stats\..*
retentions = 10:2160,60:10080,600:262974

These retentions are used for every entry whose key matches the pattern. The retention format is frequency:history, where frequency is the number of seconds per data point and history is the number of data points to keep. So this configuration stores 10-second data for 6 hours (2160 points at 10 seconds each), 1-minute data for 1 week, and 10-minute data for roughly 5 years.
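To check the arithmetic, here is a small Ruby sketch that parses the retentions string and multiplies each frequency by its history. `parse_retentions` and `STATS_PATTERN` are illustrative names, not something Graphite provides.

```ruby
# The pattern from the schema above, as a Ruby regular expression.
STATS_PATTERN = /^stats\..*/

# Parse "10:2160,60:10080,..." into [seconds_per_point, points] pairs.
# parse_retentions is a hypothetical helper for this example only.
def parse_retentions(spec)
  spec.split(',').map { |part| part.split(':').map { |n| Integer(n) } }
end

key = 'stats.timers.http_request.elapsed_time.sum'
puts "#{key} matches: #{STATS_PATTERN.match?(key)}"

parse_retentions('10:2160,60:10080,600:262974').each do |seconds_per_point, points|
  total = seconds_per_point * points
  puts "#{seconds_per_point} s per point, #{points} points, #{total} s in total"
end
```

The three totals are 21,600 seconds (6 hours), 604,800 seconds (7 days) and 157,784,400 seconds (a little over 1,826 days, or about 5 years).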

Visualizing a timer in Graphite

Knowing all this, we can now take a look at my simple Ruby script that collects timings for HTTP requests.
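As a rough sketch of what such a script could look like (this is not the original script), the Ruby code below times one HTTP request and sends the result to StatsD as a timer. It assumes StatsD listens for UDP on localhost:8125, its default port. The method names and the example URL are placeholders. StatsD would then publish the timer under stats.timers.http_request.elapsed_time, which is where the key in the render URLs further down comes from.

```ruby
require 'socket'
require 'net/http'
require 'uri'

# Format a StatsD timer metric: "<name>:<milliseconds>|ms".
def timer_metric(name, elapsed_ms)
  "#{name}:#{elapsed_ms.round}|ms"
end

# Run the given block and return how long it took, in milliseconds.
def elapsed_ms
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
end

# Time one GET request and send the timing to StatsD over UDP.
# Assumption: StatsD is reachable on localhost:8125.
def report_http_timing(url, host: 'localhost', port: 8125)
  ms = elapsed_ms { Net::HTTP.get_response(URI(url)) }
  socket = UDPSocket.new
  socket.send(timer_metric('http_request.elapsed_time', ms), 0, host, port)
ensure
  socket&.close
end

# Example call (placeholder URL, not executed here):
# report_http_timing('http://example.com/')
```

Calling `report_http_timing` in a loop would give StatsD a steady stream of timings to aggregate at each flush.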

 

Let's take a look at how Graphite renders this data. The data is from the last 2 minutes.

Image visualization

Render URL

Render URL used for the image below.

/render/?width=586&height=308&from=-2minutes&target=stats.timers.http_request.elapsed_time.sum

Rendered image from Graphite

Rendered image from Graphite: a simple graph visualizing elapsed_time for HTTP requests over time.


JSON-data

Render URL

Render URL used for the JSON-data below.

/render/?width=586&height=308&from=-2minutes&target=stats.timers.http_request.elapsed_time.sum&format=json
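Both render URLs differ only in the format parameter, which a short Ruby sketch can make explicit. `render_url` and `BASE_PARAMS` are illustrative names; the parameters themselves are the ones used in the URLs above.

```ruby
require 'uri'

# Query parameters shared by the image and the JSON render URL.
BASE_PARAMS = {
  width: 586,
  height: 308,
  from: '-2minutes',
  target: 'stats.timers.http_request.elapsed_time.sum'
}.freeze

# Build a render URL path from a hash of query parameters.
# render_url is a hypothetical helper name.
def render_url(params)
  "/render/?#{URI.encode_www_form(params)}"
end

puts render_url(BASE_PARAMS)                       # image
puts render_url(BASE_PARAMS.merge(format: 'json')) # raw data as JSON
```

Prefix the path with the address of your own Graphite webapp to get a full URL.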

JSON-output from Graphite

In the results below, we can see the raw data from Graphite. It contains 12 data points, which corresponds to 2 minutes at the StatsD 10-second flush interval. The last value is null, presumably because nothing had been flushed for that interval yet. It really is this simple: Graphite just visualizes its data.

The JSON data is beautified with JSONLint for viewing purposes.

[
    {
        "target": "stats.timers.http_request.elapsed_time.sum",
        "datapoints": [
            [
                53.449951171875,
                1343038130
            ],
            [
                50.3916015625,
                1343038140
            ],
            [
                50.1357421875,
                1343038150
            ],
            [
                39.601806640625,
                1343038160
            ],
            [
                41.5263671875,
                1343038170
            ],
            [
                34.3974609375,
                1343038180
            ],
            [
                36.3818359375,
                1343038190
            ],
            [
                35.009033203125,
                1343038200
            ],
            [
                37.0087890625,
                1343038210
            ],
            [
                38.486572265625,
                1343038220
            ],
            [
                45.66064453125,
                1343038230
            ],
            [
                null,
                1343038240
            ]
        ]
    }
]
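To work with this output from a script, a sketch like the following Ruby code could parse it and summarize each series. `summarize` is a hypothetical helper, and `SAMPLE` is a shortened stand-in with the same shape as the output above (a value and a timestamp per data point, where the value may be null).

```ruby
require 'json'

# Summarize one series from the JSON render output.
# Each datapoint is a [value, timestamp] pair; a null value becomes nil.
def summarize(series)
  values = series['datapoints'].map(&:first).compact
  {
    target: series['target'],
    points: series['datapoints'].length,
    with_data: values.length,
    average: values.empty? ? nil : values.sum / values.length
  }
end

# Shortened sample in the same shape as the real output.
SAMPLE = <<~EOS
  [{"target": "stats.timers.http_request.elapsed_time.sum",
    "datapoints": [[53.449951171875, 1343038130],
                   [50.3916015625, 1343038140],
                   [null, 1343038240]]}]
EOS

JSON.parse(SAMPLE).each { |series| puts summarize(series) }
```

Run against the full output above, it would report 12 points, 11 of them with data.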