Anomaly Detection
I had been looking for a tool to detect anomalies in data. I stumbled across two libraries from Twitter:
These are R libraries for analysis of data. I have written a quick script to take data exported from StatsD and plot a graph with the interesting parts highlighted.
I started out writing R code in a text editor, but then someone suggested RStudio, which I would highly recommend.
Below is the graph I was able to generate with 28 days of data.
The circled areas are available in code as well:
> print(res$anoms)
            timestamp    anoms
1 2016-08-24 22:25:00 419.4919
2 2016-08-24 22:55:00 546.4654
3 2016-08-24 23:00:00 276.6360
4 2016-08-26 16:15:00 106.3696
The code that produces this is in StatsDAnomalyDetection. convert_json.py converts your raw data into a format the Twitter library can read.
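To give a rough idea of what that conversion step involves, here is a minimal sketch. It is not the actual convert_json.py. It assumes the raw export is Graphite-style JSON (a list of series, each with a "datapoints" list of [value, epoch_seconds] pairs) and that the R side wants a two-column CSV of timestamp and count. The function names and the column names are made up for illustration.

```python
import csv
import io
import json
from datetime import datetime, timezone


def graphite_json_to_rows(raw_json):
    """Flatten Graphite-style JSON into sorted (timestamp, value) rows.

    Assumed input shape (not confirmed by the original scripts):
    [{"target": "...", "datapoints": [[value, epoch_seconds], ...]}]
    Datapoints whose value is null are skipped.
    """
    rows = []
    for series in json.loads(raw_json):
        for value, epoch in series["datapoints"]:
            if value is None:
                continue  # gaps in the metric carry no reading
            stamp = datetime.fromtimestamp(epoch, tz=timezone.utc)
            rows.append((stamp.strftime("%Y-%m-%d %H:%M:%S"), float(value)))
    rows.sort()  # chronological order, since the timestamp format sorts lexically
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text with a hypothetical timestamp,count header."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["timestamp", "count"])
    writer.writerows(rows)
    return out.getvalue()
```

Calling rows_to_csv(graphite_json_to_rows(raw_text)) on the exported JSON text gives CSV that R can load with read.csv. Timestamps are rendered in UTC here, so adjust that if your data should be plotted in local time.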
Conclusion
This is a basic set of scripts for doing some analysis of StatsD data. I would recommend learning some R if you want to do some serious data analysis.