StatsD And Anomalies

Anomaly Detection

I had been looking for a tool to detect anomalies in data. I stumbled across two libraries from Twitter:



These are R libraries for analysis of data. I have written a quick script to take data exported from StatsD and plot a graph with the interesting parts highlighted.
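As a rough sketch of what such a script can look like: the example below assumes the library in question is Twitter's AnomalyDetection package and its `AnomalyDetectionTs` function, which is what the `res$anoms` output later in this post suggests. The series here is synthetic and made up purely for illustration, and the 5 minute interval, the `stats` data frame and the parameter values are assumptions rather than the settings used for the graph in this post.

```r
# Sketch only: synthetic data stands in for a real StatsD export.
set.seed(42)

# One reading every 5 minutes for 28 days (assumed interval and length).
n <- 28 * 24 * 12
timestamps <- seq(
  from = as.POSIXct("2016-08-01 00:00:00", tz = "UTC"),
  by = "5 min",
  length.out = n
)

# Daily cycle plus noise, with one injected spike for the detector to find.
values <- 100 + 20 * sin(2 * pi * seq_len(n) / (24 * 12)) + rnorm(n, sd = 3)
values[5000] <- values[5000] + 300

# Two columns: timestamp first, numeric value second.
stats <- data.frame(timestamp = timestamps, count = values)

# AnomalyDetection is installed from GitHub (twitter/AnomalyDetection),
# so only call it when it is actually available.
if (requireNamespace("AnomalyDetection", quietly = TRUE)) {
  res <- AnomalyDetection::AnomalyDetectionTs(
    stats,
    max_anoms = 0.02,    # flag at most 2% of the points
    direction = "both",  # look for spikes and dips
    plot = TRUE
  )
  print(res$anoms)       # data frame of flagged points
  print(res$plot)        # plot with the anomalies highlighted
}
```

The package may need tweaks on newer versions of R and ggplot2, so treat this as a starting point rather than a guaranteed recipe.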


I was writing R code in a text editor until someone suggested RStudio, which I would highly recommend.


Below is the graph I was able to generate with 28 days of data.


Spot the issue.



The circled areas are available in code as well:

> print(res$anoms)
            timestamp    anoms
1 2016-08-24 22:25:00 419.4919
2 2016-08-24 22:55:00 546.4654
3 2016-08-24 23:00:00 276.6360
4 2016-08-26 16:15:00 106.3696
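Because the anomalies come back as an ordinary data frame, they can be summarised with base R. The sketch below rebuilds the four rows printed above by hand so that it runs on its own; in a real session `res$anoms` would be used directly. The UTC timezone is an assumption, since the output does not say which timezone the timestamps are in.

```r
# The four anomalies printed above, rebuilt by hand for a standalone example.
anoms <- data.frame(
  timestamp = as.POSIXct(
    c("2016-08-24 22:25:00", "2016-08-24 22:55:00",
      "2016-08-24 23:00:00", "2016-08-26 16:15:00"),
    tz = "UTC"  # assumed timezone
  ),
  anoms = c(419.4919, 546.4654, 276.6360, 106.3696)
)

# Count anomalies per calendar day to see where they cluster.
per_day <- table(format(anoms$timestamp, "%Y-%m-%d"))
print(per_day)

# Pick out the single largest anomaly.
worst <- anoms[which.max(anoms$anoms), ]
print(worst)
```

Here three of the four anomalies fall on 2016-08-24, which is the kind of cluster worth investigating first.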


The code that produces this converts your raw data into a format the Twitter library can read.
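A minimal sketch of that conversion step, under assumptions: the export is taken to be a CSV with hypothetical `epoch` (Unix seconds) and `value` columns, and `to_anomaly_input` is an illustrative helper name, not part of any library. The sample rows are invented. The target shape is a two-column data frame with a timestamp column followed by a numeric column, which is what `AnomalyDetectionTs` takes as input.

```r
# Hypothetical export: one "epoch,value" row per reading (made-up sample rows).
raw_lines <- c(
  "epoch,value",
  "1472077800,13.1",
  "1472077500,12.5",
  "1472078100,"       # a gap where no value was recorded
)
raw <- read.csv(text = raw_lines)

# Convert the raw export into a (timestamp, count) data frame.
to_anomaly_input <- function(raw) {
  out <- data.frame(
    timestamp = as.POSIXct(raw$epoch, origin = "1970-01-01", tz = "UTC"),
    count = as.numeric(raw$value)
  )
  out <- out[!is.na(out$count), ]  # drop readings with no value
  out[order(out$timestamp), ]      # make sure rows are in time order
}

stats <- to_anomaly_input(raw)
print(stats)
```

Dropping empty readings and sorting by time keeps the series clean before it is handed to the detector; whether gaps should instead be filled in depends on your data.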



This is a basic set of scripts for doing some analysis of StatsD data. I would recommend learning some R if you want to do some serious data analysis.