StatsD

  • buckets: Each stat is in its own "bucket". Buckets are not predefined anywhere. A bucket can have any name that translates to Graphite (periods create folders, etc.).

  • values: Each stat has a value. How it is interpreted depends on modifiers. In general, values should be integers.

  • flush: After the flush interval elapses (defined by config.flushInterval, default 10 seconds), stats are aggregated and sent to an upstream backend service.
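To make buckets, values and modifiers concrete, here is a minimal sketch of how a client might build the text lines it sends to StatsD. It assumes the commonly documented bucket:value|type line format, with c for counters, ms for timers and g for gauges. The helper name formatMetric and the bucket names are made up for illustration.

```javascript
// Hypothetical helper: builds one line in the "bucket:value|type" format.
// The modifier after the pipe tells StatsD how to interpret the value.
const MODIFIERS = new Map([
  ['counter', 'c'],
  ['timer', 'ms'],
  ['gauge', 'g'],
]);

function formatMetric(bucket, value, kind) {
  const modifier = MODIFIERS.get(kind);
  if (modifier === undefined) {
    throw new Error('unknown metric kind: ' + kind);
  }
  return bucket + ':' + value + '|' + modifier;
}

// Periods in the bucket name become folders in Graphite.
const lines = [
  formatMetric('shop.checkout.orders', 1, 'counter'),
  formatMetric('shop.checkout.render_time', 320, 'timer'),
  formatMetric('shop.cart.items', 3, 'gauge'),
];
console.log(lines.join('\n'));
```

Each line would normally be sent to StatsD as a UDP datagram; the sending part is left out to keep the sketch short.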

Metric types

 

Counters

 

Counters are simple. A counter adds a value to a bucket, and the total stays in memory until the next flush.

At flush time, StatsD first iterates over any counters it has received. For each one it starts by assigning two variables: one holds the counter value and the other holds the per-second value. It then adds both values to the statString and increments the numStats variable.

If you keep the default flush interval of 10 seconds and send StatsD 7 increments on a counter within one flush interval, the counter value is 7 and the per-second value is 0.7. No magic.
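The counter arithmetic can be sketched in a few lines. This is not the StatsD source. The function name flushCounters, the bucket name and the timestamp are hypothetical, and the stats. and stats_counts. prefixes are an assumption about the output namespace.

```javascript
// Sketch of the counter part of a flush, under the assumptions stated above.
function flushCounters(counters, flushIntervalMs, ts) {
  let statString = '';
  let numStats = 0;
  for (const key of Object.keys(counters)) {
    const value = counters[key];                             // total for this interval
    const valuePerSecond = value / (flushIntervalMs / 1000); // rate over the interval
    statString += 'stats.' + key + ' ' + valuePerSecond + ' ' + ts + '\n';
    statString += 'stats_counts.' + key + ' ' + value + ' ' + ts + '\n';
    numStats += 1;
  }
  return { statString, numStats };
}

// Seven increments on one counter during a 10 second flush interval.
const counters = {};
for (let i = 0; i < 7; i++) {
  counters.logins = (counters.logins || 0) + 1;
}
const counterFlush = flushCounters(counters, 10000, 1700000000);
console.log(counterFlush.statString);
```

With these inputs the counter value is 7 and the per-second value is 7 / 10 = 0.7, matching the example above.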

 

Timers

 

Timers collect numbers. They do not necessarily need to contain a measure of time: you can collect bytes read, the number of objects in some storage, or anything else that is a number. A good thing about timers is that you get the mean, the sum, the count, and the upper and lower values for free. Feed StatsD a timer and these are calculated automatically before being flushed to Graphite. Oh, I almost forgot to mention that you also get the 90th percentile calculated for the mean, sum and upper values. You can also configure StatsD with an array of percentiles, which means you can get the 50th, 90th and 95th percentiles calculated for you if you want.

The source code for timer stats is a bit more advanced than the code for the counters.

StatsD iterates over each timer and processes it only if it has received at least one value. It then sorts the array of values, counts them, and picks out the minimum and maximum. Next it builds an array of cumulative sums and assigns a few variables, then iterates over the percentile thresholds array to calculate the percentiles and create the messages that are appended to the statString variable. When the percentile calculation is done, the final sum is assigned and the final statString is created.

Suppose you send the following timer values to StatsD during the default flush interval:

450

120

553

994

334

844

675

496

StatsD will calculate the following values:

mean_90 496

upper_90 844

sum_90 3472

upper 994

lower 120

count 8

sum 4466

mean 558.25
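The numbers above can be reproduced with a short sketch of the timer calculation. It is an approximation of the logic described earlier, not the StatsD source. The function name summarizeTimer is hypothetical, and the rounding rule used to decide how many values fall inside a percentile is an assumption chosen because it matches the results listed above.

```javascript
// Sketch of the timer calculation; pctThreshold is a list of percentiles.
function summarizeTimer(samples, pctThreshold) {
  if (samples.length === 0) return null;               // nothing to report
  const values = samples.slice().sort((a, b) => a - b); // numeric sort, ascending
  const count = values.length;
  const result = { count, lower: values[0], upper: values[count - 1] };

  // cumulativeValues[i] is the sum of the i + 1 smallest values.
  const cumulativeValues = [values[0]];
  for (let i = 1; i < count; i++) {
    cumulativeValues.push(cumulativeValues[i - 1] + values[i]);
  }

  for (const pct of pctThreshold) {
    // Assumed rule: drop round((100 - pct)% of count) of the largest values.
    const numInThreshold = count - Math.round(((100 - pct) / 100) * count);
    if (numInThreshold < 1) continue;
    const sumPct = cumulativeValues[numInThreshold - 1];
    result['mean_' + pct] = sumPct / numInThreshold;
    result['upper_' + pct] = values[numInThreshold - 1];
    result['sum_' + pct] = sumPct;
  }

  result.sum = cumulativeValues[count - 1];
  result.mean = result.sum / count;
  return result;
}

const timerStats = summarizeTimer([450, 120, 553, 994, 334, 844, 675, 496], [90]);
console.log(timerStats);
```

With 8 values and a 90 percent threshold, one value (the largest, 994) is dropped, so the percentile figures are computed over the 7 smallest values: sum_90 is 3472, upper_90 is 844 and mean_90 is 3472 / 7 = 496. Passing [50, 90, 95] instead of [90] would add the corresponding keys for each percentile.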

Gauges

A gauge simply indicates an arbitrary value at a point in time and is the simplest type in StatsD. It just takes any number and ships it to the backend.

The source code for gauge stats is just four lines.

for (key in gauges) {
  statString += 'stats.gauges.' + key + ' ' + gauges[key] + ' ' + ts + "\n";
  numStats += 1;
}

Feed StatsD a number and it sends it unprocessed to the backend. One thing to note is that only the last value a gauge receives during a flush interval is flushed to the backend. Suppose you send the following gauge values to StatsD during one flush interval:

643

754

583

The only value that gets flushed to the backend is 583. StatsD keeps this value in memory and sends it to the backend at the end of every flush interval until a new value replaces it.
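The last-value-wins behaviour can be illustrated with a small sketch built around the same loop as the four lines of source shown earlier. The names recordGauge, flushGauges and queue_depth, as well as the timestamps, are hypothetical.

```javascript
// Sketch of gauge handling: each new value overwrites the previous one, and
// the stored value is reported again on every flush until it is replaced.
const gauges = {};

function recordGauge(key, value) {
  gauges[key] = value; // last write wins
}

function flushGauges(ts) {
  let statString = '';
  let numStats = 0;
  for (const key in gauges) {
    statString += 'stats.gauges.' + key + ' ' + gauges[key] + ' ' + ts + '\n';
    numStats += 1;
  }
  return { statString, numStats };
}

// Three values arrive within one flush interval; only 583 survives.
for (const value of [643, 754, 583]) {
  recordGauge('queue_depth', value);
}
const firstFlush = flushGauges(1700000000);
// No new values arrive before the next flush, so 583 is sent again.
const secondFlush = flushGauges(1700000010);
console.log(firstFlush.statString + secondFlush.statString);
```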