Remote Development on Raspberry Pi
We all know that ISPs have a habit of overselling their connections, and this sometimes leads our connections to not be as good as we’d like them to be. Also, many of us have Raspberry Pi’s laying around waiting for cool projects. So let’s put that Pi to use to check on our internet connection!
One of the key metrics of an internet connection is its ping time to other hosts on the internet. So let’s write a program that regularly pings another host and records the results.
For those of you wanting to execute code remotely on other remote computers, like an AWS instance or a DigitalOcean droplet, the process is exactly the same.
This tutorial has been created on Windows, and the code was executed on the RPi, which is a Linux computer. On the other supported operating systems, some of the paths and workflows might be slightly different.
Before you start
For performing this tutorial, make sure that you meet the following prerequisites:
- You have the Professional Edition of PyCharm.
- The code will run on any Linux machine. You can also use a local VM as well.
You'll also need some additional software:
PostgreSQL, because it will be used to create a database.
Libpq-dev, needed for
Python-dev, needed to compile
To install all of these at once, run the following command:
sudo apt-get update && sudo apt-get install -y postgresql libpq-dev python-dev
The application we will build consists of two parts:
- First, we actually do the measurements. To measure the results, we can just call the ping command-line tool that every Linux machine (including the Raspberry Pi) ships with. We will then store the results in a PostgreSQL database.
- Second, we'll visualize the results using a Flask app, which uses Matplotlib to draw a graph of recent results.
Preparing your Raspberry Pi
As we will want to be able to view the webpage with the results later, it’s important to give our Pi a fixed IP within our network. To do so, edit
/etc/network/interfaces. See this tutorial for additional details.
Creating a project
Create a pure Python project: to do that, choose . See Creating Empty Project.
Next, in PyCharm, press Ctrl+Shift+A, begin typing ‘Start SSH session’:
Then pick your Raspberry Pi from the list of hosts, and you should be connected.
Next, install additional software.
Setting up permissions and creating a database in PostgreSQL
We now need to set up the permissions in PostgreSQL. The easiest way to do this is to go back to our SSH terminal, and run the following command to get an SQL prompt as the Postgres user:
sudo -u postgres psql
Now let's create a user (a ‘role’ in Postgres terminology) with the same name as the user that we run the process with:
CREATE ROLE pi WITH LOGIN PASSWORD ‘hunter2’;
Important note! Make sure that the role in PostgreSQL has the same name as your Linux username. You might also want to substitute a better password. It is important to end your SQL statements with a semicolon (;) in psql, because it will assume you’re writing a multi-line statement until you terminate with a semicolon. We’re granting the pi user login rights, which just means that the user can log in. Roles without login rights are used to create groups.
Next, create a database:
CREATE DATABASE pi WITH OWNER pi;
To get information on the quality of the internet connection, let’s ping a server using the system’s
ping utility, and then read the result with a regex. So let’s take a look at the output of
PING jetbrains.com (184.108.40.206) 56(84) bytes of data. 64 bytes from ec2-54-217-236-18.eu-west-1.compute.amazonaws.com (220.127.116.11): icmp_seq=1 ttl=47 time=32.9 ms 64 bytes from ec2-54-217-236-18.eu-west-1.compute.amazonaws.com (18.104.22.168): icmp_seq=2 ttl=47 time=32.9 ms 64 bytes from ec2-54-217-236-18.eu-west-1.compute.amazonaws.com (22.214.171.124): icmp_seq=3 ttl=47 time=32.9 ms 64 bytes from ec2-54-217-236-18.eu-west-1.compute.amazonaws.com (126.96.36.199): icmp_seq=4 ttl=47 time=32.9 ms --- jetbrains.com ping statistics --- 4 packets transmitted, 4 received, 0% packet loss, time 3003ms rtt min/avg/max/mdev = 32.909/32.951/32.992/0.131 ms
We can first get the output of ping, and then iterate over the lines, picking the ones that start with a number followed by the words ‘bytes from’:
host = 'jetbrains.com' ping_output = subprocess32.check_output(["ping", host, "-c 5"]) for line in ping_output.split('\n'): if re.match("\d+ bytes from", line): print(line)
At this point if you run the code (Ctrl+Shift+F10), you should see this code running remotely on the Raspberry Pi:
Storing the pings
We wanted to store our pings in
PostgreSQL, so let’s create a table for them. First, we need to create a PostgreSQL database:
When PostgreSQL is selected, the datasource page opens:
As our database is only exposed to localhost, we will need to use an SSH tunnel:
Next, after you’ve connected, create the table by executing the setup_db.sql script. To do this, copy-paste the script from GitHub into the SQL console that opened up right after connecting, and then use the green play button to execute.
Now that we’ve got this working, let’s expand our script to record the pings into the database. To connect to the database from Python we’ll need to install
psycopg2. To do it, point to (for Windows and Linux) or (for macOS users), then click the page Project Interpreter, and then use to install the package. If you’d like to see the full script, you can have a look on GitHub.
To make sure that we actually regularly record the pings, we need to schedule this script to be run. For this, we will use
cron. As we’re using peer authentication to the database, we need to make sure that the script is run as the
pi user. So let’s open an SSH session (making sure we’re logged in as
pi), and then run
crontab -e to edit our user crontab. Then at the bottom of the file add the following line:
*/5 * * * * /home/pi/raspberryping/ping.py jetbrains.com >> /var/log/raspberryping.log 2>&1
*/5 means that the script will be run every 5 minutes. If you’d like a different frequency, you can learn more about crontabs. Now we also need to create the log file and make sure that the script can write to it:
sudo touch /var/log/raspberryping.log sudo chown pi:pi /var/log/raspberryping.log
At this point take your time, and when you come back later, some ping times should have been logged. To verify, let’s check with PyCharm’s database tools. Open the Database tool window (on the right of the screen) and double click the table. You should see that it contains values:
Creating a Flask application
It’s no good to just record pings, if we can’t see some statistics about them. So let’s write a small Flask application (to install Flask, use
pip install in the Terminal), and use matplotlib to draw a graph of recent ping times.
In our Flask application, we’ll create two routes:
/, we’ll list the destinations that we’ve pinged in the last hour with basic stats (min, average, max time in the last hour).
/graphs/<destination>we’ll draw a graph of the pings in the last 3 hours.
/ route is simple: we just execute a query to get the data we’re interested in and pass that to the template. See the full code on GitHub.
To make sure that everything works right, let's put a breakpoint on the call to
Then start the debugging session () and see the result in your browser.
/graphs/<destination> route is a lot more complex. First, we have to get the ping averages for the past three hours in reasonably sized bins (let’s say, 10 minutes). Second, we have to draw the graph.
Querying the data
So the data we’re looking for is:
- For each 10 minute period in the last 3 hours
- Get the minimum, average, and maximum ping time to a specified destination
The first part makes this a fairly complex query. Even though PostgreSQL has support for intervals, date ranges, and a way to generate a series of dates, there is no way to generate a series of ranges. One solution to this problem is a common table expression (CTE), this is a way to execute a subquery which you can later refer to as if it were a real table.
To get a series of timestamps for the last three hours at 10 minute intervals is easy:
select begin_time from generate_series(now() - interval '3 hours', now(), interval '10 minutes') begin_time;
The generate_series function takes three arguments:
step. The function works with numbers and with timestamps, so that makes it easy. If we wanted pings at exactly these times, we’d be done now. However, we need times between the two timestamps. So we can use another bit of SQL magic: window functions which allow us to do things with rows before or after the row that we’re currently on. So let’s add
end_time to our query:
select begin_time, LEAD(begin_time) OVER (ORDER BY begin_time ASC) as end_time from generate_series(now() - interval '3 hours', now(), interval '10 minutes') begin_time;
LEAD takes the value of the next row in the results, as ordered in the way specified in the
OVER clause. You can use
LAG to get the previous row in a similar way. So now we can wrap this query with
WITH intervals as ( … query goes here … ) to make it a CTE. Then we can join our pings table and get the results we’re looking for:
WITH intervals AS ( SELECT begin_time, LEAD(begin_time) OVER ( ORDER BY begin_time ) AS end_time FROM generate_series( now() - INTERVAL '3 hours', now(), INTERVAL '10 minutes' ) begin_time ) SELECT i.begin_time AT TIME ZONE 'Europe/Berlin' AS begin_time, i.end_time AT TIME ZONE 'Europe/Berlin' AS end_time, p.destination, count(p.pingtime), round(avg(p.pingtime),2) AS avg, max(p.pingtime), min(p.pingtime) FROM intervals i LEFT JOIN pings p ON p.recorded_at >= i.begin_time AND p.recorded_at < i.end_time WHERE i.end_time IS NOT NULL AND destination = %s GROUP BY i.begin_time, i.end_time, p.destination ORDER BY i.begin_time ASC;
By the way, you can make the query work 30 times faster. To achieve this significant speedup, add this index to your query:
CREATE INDEX pings_recorded_at ON pings(recorded_at);
Graphing the data
After getting the data, matplotlib is used to generate a line graph with lines for the minimum, average, and maximum ping time per bin. Matplotlib makes it easy to plot time-based data using the plot_date function.
When the plot is ready, it’s ‘saved’ as a
.png file to a StringIO object, which is then used to create an HTTP response. By setting the
content_type header to
image/png, everything is arranged.
So the final result is:
If you want to see the full code, check out analyze.py on GitHub.
Congrats for passing this tutorial! Let's repeat what has been done with the help of PyCharm:
- Created a project
- In the Terminal, set permissions up and created a user.
- Created a database.
- Captured pings.
- Stored pings in a special database.
- Created a Flask application
- Queried the data
- Plotted the data