Category Archives: Coding

StatsD And Anomalies

Anomaly Detection

I had been looking for a tool to detect anomalies in data. I stumbled across two libraries from Twitter:

 

 

These are R libraries for analysis of data. I have written a quick script to take data exported from StatsD and plot a graph with the interesting parts highlighted.

 

I was writing R code in a text editor but then someone suggested RStudio which I would highly recommend.

 

Below is the graph I was able to generate with 28 days of data.

 

Spot the issue.

 

The circled areas are available in code as well:

> print(res$anoms)
            timestamp anoms
1 2016-08-24 22:25:00 419.4919
2 2016-08-24 22:55:00 546.4654
3 2016-08-24 23:00:00 276.6360
4 2016-08-26 16:15:00 106.3696

 

The code to make this is in StatsDAnomalyDetection. The convert_json.py script will convert your raw data and output it in a format the Twitter library can read.
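As a rough illustration of that conversion step, here is a minimal Python sketch. It is not the original convert_json.py. It assumes the export is Graphite-style render JSON (a list of series, each with "datapoints" as [value, epoch] pairs) and that the R side wants a two-column CSV; the function name and the "timestamp"/"count" column names are my own placeholders.

```python
import csv
import io
import json
from datetime import datetime, timezone


def graphite_json_to_csv(raw_json):
    """Convert Graphite-style render JSON into 'timestamp,count' CSV text.

    Only the first series in the export is used (an assumption).
    """
    series = json.loads(raw_json)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["timestamp", "count"])
    for value, epoch in series[0]["datapoints"]:
        if value is None:  # empty buckets are exported as null; skip them
            continue
        stamp = datetime.fromtimestamp(epoch, tz=timezone.utc)
        writer.writerow([stamp.strftime("%Y-%m-%d %H:%M:%S"), value])
    return out.getvalue()


# Made-up sample export with one real value and one empty bucket.
sample = (
    '[{"target": "stats.example", '
    '"datapoints": [[419.5, 1472077500], [null, 1472077800]]}]'
)
print(graphite_json_to_csv(sample))
```

Timestamps are written in UTC here; if your data was recorded in local time you may want to adjust that before plotting.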

 

Conclusion

This is a basic set of scripts for doing some analysis of StatsD data. I would recommend learning some R if you want to do some serious data analysis.

The Internet Gong

The Idea

A while back I wanted something that made a noise to notify me of an event. The original plan was to replicate Andy Rubin’s gong doorbell. The large gong and mallet were a little bit out of my price range, but a good idea is a good idea. I ordered a gong from Amazon and, combined with a servo and an Arduino Nano clone, set about building the same idea.

 

Gong

 

With a little bit of tape it was finished. The design is basic but functional. The beater is attached to a 9g servo which acts as the human arm. An Arduino Nano clone is used to move the servo.

 

In action:

https://www.youtube.com/watch?v=-vFhrOvHxIs

 

Triggering the Gong

Communication to the gong is done over serial. Sending “gong\n” over serial triggers the beater/mallet to strike the gong and then move out of the way to avoid a second strike. The code for the Arduino is below:

Conclusion

For a mini project this is perfect for getting servos and Arduinos to work together. Mine is currently connected to my Google calendar to alert me five minutes before a meeting.

 

The door bell will have to wait.

Graphing and Predicting Blood Pressure

According to CDC statistics as many as 1 in 3 Americans suffer from high blood pressure. High blood pressure can contribute to a large range of conditions including a higher risk for heart disease and stroke.

A friend of mine was diagnosed with high blood pressure and with the help of medication was determined to lower it using data.

What gets measured, gets managed

First up they established a routine of taking their blood pressure and logging it. There are a few blood pressure devices available to do this. They range from the standalone basic model to ones that talk to your phone. In my friend’s case, they opted for the cheaper basic standalone option.

The monitor shows the systolic, diastolic and pulse results on a simple LCD screen with no option to extract the readings to a laptop. The quickest solution was to use an app to collect the data by manually typing it in. After some experimentation we settled on the Withings app. Having an API to access the data made it a great choice. It also provided off site backup, just in case.

My friend logged their blood pressure morning and night in the Withings app and used the dashboard provided by Withings to track trends in the readings.

Over the last 5 months they have logged systolic and diastolic pressures along with heart rate from the standalone machine.

Below is a snapshot from the dashboard offered by Withings.

Withings blood pressure graph

Graph generated by Withings as part of their dashboard setup.

The graphs are OK but we both found them a little confusing. We also wanted to be able to predict future readings based on past readings.

Python, APIs and Graphing

I’m a Python programmer (mostly) and thought I could do better than the standard graphs offered by Withings.

With access via the API, I set up a basic Flask app to fetch the data from Withings and graph it locally. To access the data I used a Withings Python lib (available on PyPI). For graphing I chose Plot.ly: with just a few lines in the HTML you can access a very powerful graphing tool.

The first task was just to extract the raw data from Withings. Using the Python lib made this pretty simple. Where it got a little tricky was converting the fetched data into something Plotly could graph. I went for a simple approach: build a string of text to render in a template using Jinja2 as part of Flask.

As with most little projects Bootstrap made the perfect tool for rendering the HTML with the graphs embedded into the normal row layout.

Blood pressure graph

Added a simple moving average to attempt to smooth the graph.

Blood pressure can have some large and unpredictable peaks here and there, so trends can be hard to spot. I have been rendering time series data with R but hadn’t really found anything in Python that was ready to use. This is where Algorithmia comes in.

Predicting the Future

As this was a pretty quick and simple project I needed to make sense of the data as easily as possible. I explored a few online services that did machine learning and data analysis. I slowly found out most were targeted at text classification.

Then I found Algorithmia which allows a large range of algorithms to be run on supplied data. You upload your data, they run your chosen algorithm over the data and supply you the results in realtime. They even have a Python library.

I chose two algorithms for this project:

Simple Moving Average

Simple moving average was my attempt to smooth the data and make it a bit simpler to spot trends. The raw data is very “peaky” and using the average does help but the extremes do remain.
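To show what the smoothing does, here is a minimal plain-Python sketch of a simple moving average. This is not the Algorithmia implementation; the function name, the window size and the readings are made up for illustration.

```python
from collections import deque


def simple_moving_average(values, window):
    """Return the mean of each run of `window` consecutive values."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # holds only the last `window` readings
    averages = []
    for value in values:
        recent.append(value)
        if len(recent) == window:
            averages.append(sum(recent) / window)
    return averages


readings = [138, 142, 150, 136, 134]  # made-up systolic readings
print(simple_moving_average(readings, window=3))
```

A larger window gives a smoother line but reacts more slowly to a genuine change, which is why the extremes never fully disappear.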

Forecast

This is where things get really fun. I had about 5 months of data to work with. Generally speaking the more data you have the better!

The entire dataset started after a course of Perindopril had been prescribed so there is no data of the high readings that triggered the course of treatment. Perindopril kicked in literally the next day reducing regular readings of > 190/85 to the much better < 140/85.

Let’s see if Perindopril was going to be a viable long term treatment. Taking the existing readings I fed them into the Forecast algorithm from Algorithmia and graphed the results below.

Complete graph of blood pressure.

Below is the output from the Forecast algorithm for the next 5 months, this time using the simple moving average data instead of the raw data.

Forecast results using moving average data.

Blood pressure data is harder to work from as it can be quite erratic at times. There are a number of algorithms for smoothing and normalising the data, which I intend to use to improve the predictions.

Conclusions

The main takeaway is that it appears my friend’s blood pressure isn’t going to get worse. It should stay within an acceptable range for the next few months.

Also my friend now has a set of nice graphs they can take to their doctor to discuss long term treatment. Blood pressure is something that can be influenced by a range of factors so regular reviews are important for long term management.

This was my first attempt at making forecasts and understanding the many, many ideas behind this kind of data processing. I have barely scratched the surface of what can be done with the data.

Sites used:

I have made the code available in this GitHub repo.

Maplin Arm

Decided to power up my Maplin arm and was having issues getting it going on OS X El Capitan.

I kept seeing the error “usb.core.NoBackendError: No backend available”. It looks like at some point while upgrading I lost the correct USB libraries. Installing libusb-compat via brew fixed the issue.

Improving the Server Density Service Map

In my first attempt at Physical Website Monitoring I used some 8 bit shift registers and tri-colour LEDs. This, while fun, was hard work. The green, red, blue and earth legs of each LED had to be soldered and wired up. That’s OK for a small project but I wanted to expand to 10+ monitoring locations.

So plan B was to use Neopixels. As usual, Adafruit has a great guide for them. I had some no-brand ones lying around from a while back. You can buy the “branded” ones from any of the major online retailers or you can go cheap on eBay.

Neopixel close up

Close up of Neopixel

Wiring them up is nice and simple. Just solder them together in series, following the arrows on their backs. They only have 3 pins: 5V, earth and data.

I roughly measured the distance between the locations I am monitoring from to work out the length of cable needed, and then soldered the Neopixels together. I ran out of black wire pretty quickly, which is why there is more red; sorry for any confusion. The picture below shows them attached to the back of the map.

Rear view of the Neopixels

The code for controlling the lights is listed below:

I borrowed most of it from the example in Adafruit’s library for the Neopixel. I can’t stress enough how excellent Adafruit is as a resource for learning how to do stuff!

You can test everything is working from the serial port of the Arduino. Send “sydney down\n” and Sydney will change to red, “slow” will change it to orange instead. “up” resets it to green.

The script below is some basic Python to fetch the last response time from Server Density and decide what, if anything, the colour of a location should be changed to. It’s all a bit rough around the edges but you get the idea.
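The decision step could be sketched roughly as follows. This is only an illustration: both thresholds and the function names are assumptions of mine, and the Server Density API call is left out entirely, so the response times are passed in as a plain dict.

```python
SLOW_THRESHOLD = 0.4  # seconds; assumed cut-off for "slow"
DOWN_THRESHOLD = 2.0  # seconds; assumed cut-off for "down"


def status_for(response_time):
    """Map a response time in seconds (None means no response) to a status."""
    if response_time is None or response_time >= DOWN_THRESHOLD:
        return "down"
    if response_time >= SLOW_THRESHOLD:
        return "slow"
    return "up"


def commands_for_changes(previous, response_times):
    """Return serial commands only for locations whose status changed.

    `previous` maps location -> last status sent and is updated in place.
    """
    commands = []
    for location, response_time in sorted(response_times.items()):
        status = status_for(response_time)
        if previous.get(location) != status:
            commands.append("{} {}\n".format(location, status))
            previous[location] = status
    return commands


sent = {"sydney": "up", "london": "up"}
latest = {"sydney": 0.9, "london": 0.12}  # made-up response times
print(commands_for_changes(sent, latest))
```

Remembering the last status per location means the Arduino only gets a message when a colour actually needs to change.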

Below is it in action from the serial port:

Physical Website Monitor based on Server Density’s Monitoring

I wanted to make something to make monitoring more tangible. So I made a board to display the current status of this website chrishannam.co.uk as monitored from a number of remote “actors” provided by Server Density. Below is a snapshot of the monitoring setup from Server Density’s service page.

website monitoring setup

Remote monitoring actors.

The build was pretty basic and luckily I had the parts lying around from previous projects. Rather than explain the setup I’ll give you the link I used as it covers everything better than I could explain. Adafruit Shift Register is an excellent guide on wiring and programming 8 bit shift registers. The only difference is I used tri colour LEDs. The LEDs I used were almost identical to these Tri Colour LEDs from eBay. They do red, green and blue light. I just removed the blue leg as I didn’t need it.

The LEDs are mounted in a 6mm thick panel of MDF. The map was just a simple one printed off Wikipedia.

I used an Arduino Uno but any basic Arduino is up to the job. The code to control the board is listed below. It’s based on the Adafruit example from their excellent guide.

View from behind.

Wired up LEDs

Wiring for Arduino

Here it is in action.

Next I needed to talk to Server Density’s API, which luckily is pretty simple. I get the last response time from each actor and test to see if it’s below 0.4 seconds. A threshold of 0.4 ensures at least one server, usually Sydney, will be down, so it makes for a better display. The Arduino code flips the colour of an LED when it receives a message over serial, which makes each update a binary change.

One interesting thing about the code is the number of sleeps.

WHEN OPENING A SERIAL CONNECTION TO AN ARDUINO IT RESTARTS!

Bear that in mind. You need to allow time for the Arduino to set itself up after the serial connection is initiated. This also applies to sending data backwards and forwards. Adding the sleeps ensures everything runs smoothly and nothing gets lost.
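The pattern of sleeps might look something like this minimal sketch. The delay values and the function name are assumptions, not measurements. The port can be any object with a write method, so the example uses an in-memory buffer and a fake sleep in place of real hardware.

```python
import io
import time

RESET_DELAY = 2.0    # seconds for the board to reboot; an assumed value
MESSAGE_DELAY = 0.1  # pause after each message; also an assumed value


def send_messages(port, messages, sleep=time.sleep):
    """Write newline-terminated messages to an already opened port."""
    sleep(RESET_DELAY)  # opening the port reset the board; let it boot
    for message in messages:
        port.write((message + "\n").encode("ascii"))
        sleep(MESSAGE_DELAY)  # give the Arduino time to read and act


# Stand-ins for a real port and real sleeping, so this runs anywhere.
fake_port = io.BytesIO()
pauses = []
send_messages(fake_port, ["sydney down", "sydney up"], sleep=pauses.append)
print(fake_port.getvalue(), pauses)
```

With pyserial installed, the port could instead be something like serial.Serial("/dev/ttyUSB0", 9600), where the device path and baud rate are placeholders for your own setup.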

This was a basic prototype, I am hoping to expand it to an A3 sized map with more locations.

How to Monitor Hard Drive SMART Status on Linux

Following on from monitoring hard drive temperatures, I thought I would add a check for the current SMART assessment of the status of the drive. The pySMART lib makes this trivial. The code below will output the status.

As a failure could seriously impact the life expectancy of the drive, I have added a call to the Yo service to “Yo me” if a failure is detected.
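For comparison, the same health check can be sketched without pySMART by reading the output of smartctl -H directly. This is a swapped-in approach, not the script described above, and it leaves out the Yo notification. It assumes an ATA drive, where smartctl prints a “SMART overall-health self-assessment test result” line; the function names are my own.

```python
import re
import subprocess

HEALTH_LINE = re.compile(
    r"SMART overall-health self-assessment test result:\s*(\S+)"
)


def parse_health(smartctl_output):
    """Return the assessment word (e.g. PASSED), or None if not found."""
    match = HEALTH_LINE.search(smartctl_output)
    return match.group(1) if match else None


def drive_health(device):
    """Run `smartctl -H` on a device such as /dev/sda (needs root)."""
    result = subprocess.run(
        ["smartctl", "-H", device], capture_output=True, text=True
    )
    return parse_health(result.stdout)


# Made-up excerpt of smartctl output, so the parser can be tried offline.
sample = (
    "=== START OF READ SMART DATA SECTION ===\n"
    "SMART overall-health self-assessment test result: PASSED\n"
)
print(parse_health(sample))
```

Anything other than PASSED, including None, is worth treating as a reason to send an alert.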

How to Monitor Hard Drive Temperatures

Monitoring hard drives is pretty similar to the work I touched on for onboard sensors. First we need the right tool for the job. In this case it’s smartctl.

smartctl is available in the smartmontools package for Ubuntu. To install:

Once installed you will need to use sudo smartctl to test it.

By default the command outputs a lot of very useful information. Check out a disk, for example, using the following:

Lots of info is output; search for “Temperature_Celsius” to find the temperature in Celsius.
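If you want to pull that value out by hand, a minimal Python sketch might look like this. It assumes the usual ten-column attribute table printed by smartctl -A, where the raw value is the tenth field; the sample row is made up and the function name is my own.

```python
def parse_temperature(smartctl_output):
    """Return the raw Temperature_Celsius value as an int, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Columns: ID, name, flag, value, worst, thresh, type, updated,
        # when_failed, raw_value (raw_value may have trailing extras).
        if len(fields) >= 10 and fields[1] == "Temperature_Celsius":
            return int(fields[9])
    return None


# Made-up attribute row in the layout smartctl -A uses for ATA drives.
sample = (
    "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE"
    "      UPDATED  WHEN_FAILED RAW_VALUE\n"
    "194 Temperature_Celsius     0x0022   114   098   000    Old_age"
    "   Always       -       36 (Min/Max 20/45)\n"
)
print(parse_temperature(sample))
```

Not every drive reports this attribute under the same name, so a result of None does not necessarily mean something is wrong.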

To make life simpler for collecting this information I use pySMART. This Python library makes processing the output from the command so much simpler. The script below extracts all temperatures for the disks stored in the dict DISKS. The pattern is device name (just the disk not the full path) and whatever you want to know the disk as.

I run the script from root’s cron as the smartctl command requires root privileges. Do the following to add it to root’s crontab. The example below will run the script every minute.

How to Monitor Linux Server Sensors

Following on from my last post on monitoring fan speed, I found PySensors. This library provides a simple method for extracting data from the sensors command. Below shows the basic usage on my server:

Extending the script from the last article, it’s now simple to record all the sensor data shown here on Grafana. I have updated the script and listed it below.

Temperature Graph from Main Board

Grafana Graphs

How to Monitor Fan Speeds

My biggest concern relocating my server to the garage was dust clogging up the fans. The server has two fans, one CPU and one mounted at the rear of the case.

The best guide for reading sensor data I have found for Ubuntu is Sensors How To. This guides you through setting up the command sensors to access the hardware sensors located on the motherboard.

I am using an ASUS main board and by default sensors won’t pick up the fans attached to the board. A bit of Googling found the solution: installing the correct kernel module.

$ sudo modprobe nct6775

Make sure you add nct6775 to /etc/modules to ensure it’s loaded at boot time.

The sensors command now gives the following output:

$ sensors
acpitz-virtual-0
Adapter: Virtual device
temp1:        +27.8°C  (crit = +105.0°C)
temp2:        +29.8°C  (crit = +105.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +16.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:         +15.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:         +16.0°C  (high = +80.0°C, crit = +100.0°C)

nct6791-isa-0290
Adapter: ISA adapter
in0:                    +0.88 V  (min =  +0.00 V, max =  +1.74 V)
in1:                    +1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in2:                    +3.31 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in3:                    +3.30 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in4:                    +1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in5:                    +2.02 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in6:                    +0.64 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in7:                    +3.44 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in8:                    +3.25 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in9:                    +1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in10:                   +0.21 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in11:                   +0.16 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in12:                   +1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in13:                   +1.01 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
in14:                   +0.20 V  (min =  +0.00 V, max =  +0.00 V)  ALARM
fan1:                     0 RPM  (min =    0 RPM)
fan2:                  1008 RPM  (min =    0 RPM)
fan3:                     0 RPM  (min =    0 RPM)
fan4:                     0 RPM  (min =    0 RPM)
fan5:                     0 RPM  (min =    0 RPM)
SYSTIN:                  +9.0°C  (high =  +0.0°C, hyst =  +0.0°C)  ALARM  sensor = thermistor
CPUTIN:                 +11.5°C  (high = +80.0°C, hyst = +75.0°C)  sensor = thermistor
AUXTIN0:                +47.0°C    sensor = thermistor
AUXTIN1:               +111.0°C    sensor = thermistor
AUXTIN2:               +109.0°C    sensor = thermistor
AUXTIN3:               +110.0°C    sensor = thermistor
PECI Agent 0:           +15.5°C
PCH_CHIP_CPU_MAX_TEMP:   +0.0°C
PCH_CHIP_TEMP:           +0.0°C
PCH_CPU_TEMP:            +0.0°C
intrusion0:            ALARM
intrusion1:            ALARM
beep_enable:           disabled

Only the case fan is reporting; it’s fan2 on the list. I’m not sure why only one fan is reported, but it’s still better than none…

Using the script at the bottom I add the data into an InfluxDB database and use Grafana to view it. Below is a sample graph:

Fan speed

This is some basic graphing that allows me to track any changes that might indicate a problem with the fan.

Below is a script to output the fan data from the above output:
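As a stand-in illustration of that step, here is a minimal Python sketch that picks the fan lines out of text in the format shown earlier and turns them into InfluxDB line protocol strings. It is not the original script: the function names and the "fan_speed" measurement name are assumptions, and actually writing to InfluxDB is left out.

```python
import re

# Matches lines such as "fan2:                  1008 RPM  (min =    0 RPM)".
FAN_LINE = re.compile(r"^(fan\d+):\s+(\d+)\s+RPM", re.MULTILINE)


def parse_fan_speeds(sensors_output):
    """Return a dict of fan name -> RPM from `sensors` output text."""
    return {name: int(rpm) for name, rpm in FAN_LINE.findall(sensors_output)}


def to_line_protocol(speeds, measurement="fan_speed"):
    """Format readings as InfluxDB line protocol, one string per fan."""
    return [
        "{},fan={} rpm={}i".format(measurement, name, rpm)
        for name, rpm in sorted(speeds.items())
    ]


# Two lines copied from the sensors output above.
sample = (
    "fan1:                     0 RPM  (min =    0 RPM)\n"
    "fan2:                  1008 RPM  (min =    0 RPM)\n"
)
speeds = parse_fan_speeds(sample)
print(to_line_protocol(speeds))
```

In practice the sample text would be replaced by the captured output of the sensors command, and the resulting lines posted to your InfluxDB instance.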