Processing, storing, and visualizing time series data is becoming increasingly important. From IoT data and system logs to process statistics, the need to handle all this information keeps growing.
Systems like time series databases and interactive dashboards help make all this data manageable and explorable. Two prime examples that you see used all the time are InfluxDB and Grafana: InfluxDB for storing the data and Grafana for visualizing it.
In this project, you learn how to process time series data from CSV files containing air quality data. You also use an external weather API to bring in live weather data. You will learn how to write data into InfluxDB 2.0 and read it back from InfluxDB using Python and the Flux query language.
For visualization, you learn how to use Grafana dashboards: you set up the server, configure dashboards, and set up user management.
First of all, I give you a quick overview of what you are going to build and which data set you are using. In this context, you get to know what the InfluxDB user interface, with its graphs and queries, looks like. I also show you how your platform is structured by going through the platform blueprint. This way, you learn how the different components and tools work together. We also go through the attributes of your data, such as temperature, pressure, and wind direction.
Before we get into schema design, you learn about the features of relational databases and time series databases and the differences between them, so you understand when to use which.
Then we go through the steps of how you actually arrive at your schema. For this, it is important to look again at the data you are using and at the access patterns, so you can decide how you want to access your data. By looking into the InfluxDB key concepts and basics, you learn how your data is created and stored in InfluxDB. To fully understand the upsides and downsides, you also learn how storing your data in a relational database compares with storing it in a time series database.
Next, I show you how to set up and run InfluxDB and Grafana as your development environment using Docker. You also learn how to install the Python client library in WSL2, as you are going to use it to communicate with InfluxDB. Furthermore, you set up VS Code and create an InfluxDB access token for Python so you can actually work with your data.
Working with Test Data
In the first hands-on processing part, you write CSV files and test data from the weather data pool to InfluxDB with Python. After that, we explore a problem you can run into when writing your data into InfluxDB, and I show you a way to solve it.
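To give you an idea of what writing CSV data to InfluxDB involves, here is a minimal sketch that turns CSV rows into InfluxDB line protocol, the text format InfluxDB accepts for writes. The column names, the station name, and the `weather` measurement are made up for illustration and are not the layout of the actual course files. The timestamps are assumed to be in UTC. The sketch only builds the strings and does not send anything to InfluxDB.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical sample rows; the real course files may use other columns.
SAMPLE_CSV = """time,station,temperature,pressure
2017-01-01T00:00:00,Aotizhongxin,-1.5,1023.0
2017-01-01T01:00:00,Aotizhongxin,-2.0,1023.4
"""


def escape_tag(value: str) -> str:
    """Escape commas, equals signs, and spaces, which are special in tag values."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def row_to_line(row: dict, measurement: str = "weather") -> str:
    """Build one line-protocol record: measurement,tags fields timestamp."""
    # Assumption: the CSV timestamps are naive and meant as UTC.
    ts = datetime.fromisoformat(row["time"]).replace(tzinfo=timezone.utc)
    # Line protocol timestamps default to nanoseconds since the Unix epoch.
    ns = int(ts.timestamp()) * 1_000_000_000
    tags = f"station={escape_tag(row['station'])}"
    fields = (
        f"temperature={float(row['temperature'])},"
        f"pressure={float(row['pressure'])}"
    )
    return f"{measurement},{tags} {fields} {ns}"


lines = [row_to_line(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for line in lines:
    print(line)
```

In this sketch the station is stored as a tag and the measured values as fields, which is one possible choice and not necessarily the schema used in the course.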
Working with Air Quality Data
Then, you visualize air quality data with your Grafana dashboard. For this, you first write air quality data into InfluxDB and query it from InfluxDB with Python. To connect InfluxDB with Grafana, you set up a Grafana data source and then create and configure your Grafana dashboard for InfluxDB.
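As a rough illustration of what querying with Flux can look like, the following sketch assembles a simple Flux query as a Python string. The bucket name `air-quality`, the measurement `air_quality`, and the field `PM2.5` are hypothetical placeholders, not the names used in the course, and the helper does no escaping of its inputs.

```python
def build_flux_query(bucket: str, measurement: str, field: str,
                     start: str = "-30d") -> str:
    """Assemble a Flux query that reads one field of one measurement."""
    return (
        f'from(bucket: "{bucket}")\n'
        f'  |> range(start: {start})\n'
        f'  |> filter(fn: (r) => r._measurement == "{measurement}")\n'
        f'  |> filter(fn: (r) => r._field == "{field}")'
    )


# Hypothetical names, for illustration only.
query = build_flux_query("air-quality", "air_quality", "PM2.5")
print(query)
```

A string like this can be run from Python with the InfluxDB client library's query API, or pasted into the query editor of the InfluxDB user interface or a Grafana panel.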
Working with External Weather API
After a short introduction to the weather API, where you learn about the API key and the interactive API explorer, among other things, we have a quick look at how to manage time zones. Then you do the second hands-on part of the project, where you integrate an external API for weather data, write the data into InfluxDB, and visualize it on the dashboard.
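Time zones matter here because InfluxDB stores timestamps in UTC, while a weather API may report local times. The sketch below shows one way to normalize timestamps to UTC before writing them. The input formats (an ISO 8601 string with a UTC offset, or Unix epoch seconds) are assumptions; the actual API used in the course may deliver its timestamps differently.

```python
from datetime import datetime, timezone


def to_utc(local_iso: str) -> datetime:
    """Convert an ISO 8601 timestamp with a UTC offset to UTC."""
    dt = datetime.fromisoformat(local_iso)
    if dt.tzinfo is None:
        # Without an offset the time zone would have to be guessed.
        raise ValueError("timestamp has no UTC offset")
    return dt.astimezone(timezone.utc)


def epoch_to_utc(epoch_seconds: int) -> datetime:
    """Interpret Unix epoch seconds as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


# Hypothetical reading taken at 14:00 in a UTC+8 time zone.
utc_time = to_utc("2023-06-01T14:00:00+08:00")
print(utc_time.isoformat())
```

Keeping every timestamp timezone-aware and converting to UTC at the point of ingestion leaves the conversion back to local time to the dashboard.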
Finally, I show you why Grafana is so useful and what you can actually do with it. As you can build a multi-tenant system with Grafana, we have a look at the structure and at the Grafana user and rights management that are part of such a system. Then you get into the last hands-on part of this project. Here you create two organizations within your Grafana administration, adding the bucket for the Beijing data and the bucket for the weather API, and apply what you learned beforehand.
- Data platform design basics (included in the Academy)
- Relational database basics (in the Fundamentals training)
- Choosing Data Stores (included in our Data Engineering Academy)