Analyze data using TimescaleDB continuous aggregates and hyperfunctions
This tutorial is a step-by-step guide on how to use TimescaleDB for analyzing time-series data. We will show you how to utilize TimescaleDB’s continuous aggregates and hyperfunctions for faster and more efficient queries. We will also take advantage of a unique capability of TimescaleDB: the ability to join time-series data with relational data.
The dataset that we’re using is provided by the National Football League (NFL) and contains player and tracking data for all the passing plays of the 2018 NFL season. We’re going to ingest this dataset with Python into TimescaleDB and start exploring it to uncover insights about players and teams.
If you play NFL fantasy football, running some of this analysis on past data could help you select the most effective players for the upcoming year. And, as the NFL releases new data throughout the season, you can ingest that data to help you make better decisions from week to week.
Even if you aren’t an NFL fan, this tutorial is a useful example of how to ingest time-series data into TimescaleDB (even when it doesn’t look like time-series data), how to use plain SQL and TimescaleDB hyperfunctions for powerful data analysis, and how to visualize the results with Python.
This tutorial has a few sections to help you on your journey:
- Download the data, create tables in TimescaleDB, and run your first query on NFL tracking data.
- Analyze data using continuous aggregates and hyperfunctions: Examine the data at a deeper level with more advanced queries, using features of TimescaleDB to make queries faster and more efficient. You’ll also see examples of some visualizations you can create using the data.
- Join time-series data with relational data: Gain further insight into your time-series data by joining it with relational data.
- Visualize time-series play-by-play data: For a little extra fun, create images that plot the movement of every player on the field for any play using Python and Matplotlib.
Prerequisites
- Python 3
- TimescaleDB (see installation options)
- psql or any other PostgreSQL client (e.g., DBeaver)
Download the dataset
- The NFL dataset is available for download on Kaggle.
- An additional stadium and scores dataset (.zip) is also available (source: Wikipedia).
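Once the files are downloaded, ingesting them with Python could look roughly like the sketch below. It is only an illustration under stated assumptions, not the tutorial's own ingest script: the CSV column names (`time`, `x`, `y`, `nflId`, `displayName`, `gameId`, `playId`), the target table `tracking`, and its column names are all hypothetical, so check them against the header of the file you downloaded and the tables you create. The database step uses the third-party `psycopg2` driver, which is imported only when that function is called.

```python
import csv
from datetime import datetime
from typing import Iterable, Iterator, List, Optional, TextIO, Tuple

# Hypothetical target table and columns; adjust to match your own schema.
TABLE = "tracking"
DB_COLUMNS = ("time", "x", "y", "player_id", "display_name", "game_id", "play_id")

Row = Tuple[datetime, float, float, Optional[int], str, int, int]


def parse_tracking_rows(handle: TextIO) -> Iterator[Row]:
    """Yield one typed tuple per CSV record (CSV column names are assumed)."""
    for record in csv.DictReader(handle):
        # Assumes ISO 8601 timestamps ending in "Z"; make the UTC offset explicit.
        ts = datetime.fromisoformat(record["time"].replace("Z", "+00:00"))
        # Some rows may have no player id (for example, rows tracking the ball).
        raw_id = record["nflId"]
        player_id = int(float(raw_id)) if raw_id not in ("", "NA") else None
        yield (
            ts,
            float(record["x"]),
            float(record["y"]),
            player_id,
            record["displayName"],
            int(record["gameId"]),
            int(record["playId"]),
        )


def batched(rows: Iterable, size: int) -> Iterator[List]:
    """Group rows into lists of at most `size` items."""
    batch: List = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def load_tracking_file(path: str, dsn: str, batch_size: int = 5000) -> None:
    """Insert a tracking CSV into the (assumed) table in batches."""
    import psycopg2  # third-party driver, assumed to be installed
    from psycopg2.extras import execute_values

    insert_sql = f"INSERT INTO {TABLE} ({', '.join(DB_COLUMNS)}) VALUES %s"
    # Leaving the connection's `with` block commits the transaction.
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        with open(path, newline="") as handle:
            for batch in batched(parse_tracking_rows(handle), batch_size):
                execute_values(cur, insert_sql, batch)
```

Calling `load_tracking_file` with the path to one downloaded CSV file and your own connection string would load that file. Inserting in batches rather than row by row keeps the number of round trips to the database low, which matters for a dataset of this size.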