Introduction


Description

dbt is a SQL-first transformation workflow that makes it easy to transform, test, and document your data. Another big plus: your team can work directly within the warehouse to create trusted datasets for reporting, machine learning modeling, and operational workflows. That makes dbt a tool every data engineer should look into, and this training is the perfect starting point for doing so!


Introduction to dbt

Before diving into the main part of this training, we talk about the current challenges and opportunities of ELT tools and processes and go through some ETL and ELT basics. Then we introduce you to dbt Core and dbt Cloud and talk about the benefits and special features of dbt.


Set up Snowflake, dbt Core & GitHub

For our hands-on part, we need to do some preparation. You are going to create a GitHub repository, a Snowflake warehouse, and an account on dbt Cloud. Furthermore, you will do a basic dbt setup and choose your data platform and a dbt model. A model can be a single .sql or .py file.
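To give an idea of what a model in a .py file can look like, here is a minimal sketch. dbt expects a function named model that receives a dbt object and a session and returns a DataFrame. The upstream model name stg_orders is a hypothetical placeholder and not part of the course material.

```python
# A minimal sketch of a dbt Python model, e.g. saved as models/orders_py.py.
# Assumption: "stg_orders" is a hypothetical upstream model in the same project.

def model(dbt, session):
    # Python models are built as tables (or incremental models), not views.
    dbt.config(materialized="table")

    # dbt.ref() returns the upstream model as a DataFrame of the data platform,
    # for example a Snowpark DataFrame when running on Snowflake.
    orders = dbt.ref("stg_orders")

    # Transformations would go here; this sketch passes the data through as-is.
    return orders
```

dbt calls this function during a run, so the file needs no main block of its own. A .sql model plays the same role with a single SELECT statement instead of a function.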


Creating data pipelines with dbt

Throughout the course, you learn how to create a series of pipelines (models) using e-commerce data, dbt Core, dbt Cloud, and Snowflake.


dbt materializations

After creating your pipelines, the next step is to store the transformed data in the target. For this, you configure a dbt materialization, which can be a table, a view, an incremental model, or an ephemeral model.

In the hands-on part of this section, you work with the materializations and create your first .sql and .py models. Additionally, you will learn about and work with external and internal sources in dbt and their dependencies.
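The difference between the four materializations can be illustrated without dbt at all. The following sketch uses Python's built-in sqlite3 module as a stand-in for the warehouse; the table names and the transformation are made up for illustration, and the statements only approximate what dbt generates for a real platform.

```python
import sqlite3

# In-memory database standing in for the warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", [(1, 10.0), (2, 25.0)])

# The SELECT a model file would contain (hypothetical transformation).
MODEL_SQL = "SELECT id, amount * 2 AS doubled FROM raw_orders"

# view: only the query is stored; results are computed on every read.
conn.execute(f"CREATE VIEW orders_view AS {MODEL_SQL}")
# table: the query result is stored; it stays fixed until the next rebuild.
conn.execute(f"CREATE TABLE orders_table AS {MODEL_SQL}")

# A new source row arrives after both objects were built.
conn.execute("INSERT INTO raw_orders VALUES (3, 40.0)")


def count(relation):
    """Return the number of rows in a table or view."""
    return conn.execute(f"SELECT COUNT(*) FROM {relation}").fetchone()[0]


view_rows = count("orders_view")    # the view sees the new row immediately
table_rows = count("orders_table")  # the table still holds the old result

# incremental: append only rows that are newer than what the target holds.
conn.execute(
    f"INSERT INTO orders_table SELECT * FROM ({MODEL_SQL}) "
    "WHERE id > (SELECT MAX(id) FROM orders_table)"
)
incremental_rows = count("orders_table")

# ephemeral: nothing is created; the query is inlined as a CTE downstream.
ephemeral_total = conn.execute(
    f"WITH orders_ephemeral AS ({MODEL_SQL}) "
    "SELECT SUM(doubled) FROM orders_ephemeral"
).fetchone()[0]

print(view_rows, table_rows, incremental_rows, ephemeral_total)
```

The trade-off is the usual one: views are always fresh but recomputed on each read, tables are fast to read but need rebuilding, incremental models keep rebuilds cheap for large data, and ephemeral models leave no object in the warehouse.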


Testing dbt models

The next part of the training is about testing your dbt models. Tests are assertions you make about your models and other resources in your project. There are two types of tests available: generic tests (formerly called schema tests) and singular tests (formerly called data tests). After learning about the different tests that can be run with dbt, you will write and run tests for your own dbt models.
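A useful way to think about such assertions: a test is a query that selects the rows violating the assertion, and it passes when that query returns no rows. The sketch below mimics this idea with Python's built-in sqlite3 module instead of dbt and Snowflake. It runs two reusable checks (unique, not null) and one one-off query written for this table only; the customers table and its contents are invented for illustration.

```python
import sqlite3

# In-memory database standing in for the warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (2, "c@example.com")],
)


def failing_rows(sql):
    """Run a test query and return the rows that violate the assertion."""
    return conn.execute(sql).fetchall()


# Reusable check in the style of a "unique" test: ids that occur more than once.
unique_failures = failing_rows(
    "SELECT customer_id FROM customers "
    "GROUP BY customer_id HAVING COUNT(*) > 1"
)

# Reusable check in the style of a "not_null" test: rows with a missing email.
not_null_failures = failing_rows(
    "SELECT customer_id FROM customers WHERE email IS NULL"
)

# One-off check written for this table only: emails without an "@" sign.
# NULL emails do not match NOT LIKE, so they are left to the not-null check.
one_off_failures = failing_rows(
    "SELECT customer_id FROM customers WHERE email NOT LIKE '%@%'"
)

for name, rows in [
    ("unique", unique_failures),
    ("not_null", not_null_failures),
    ("one_off", one_off_failures),
]:
    print(name, "PASS" if not rows else f"FAIL {rows}")
```

With this sample data the first two checks fail for customer_id 2 and the one-off check passes. In dbt, the reusable checks are attached to columns in a YAML file, while a one-off check lives in its own .sql file.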


Deploying and scheduling dbt models

Now that you have dbt models running on your local machine, you learn how to make them accessible to your team members, how to run them repeatedly, and how to keep them updated. To do that, you learn about some common ways to deploy and schedule your dbt models in dbt Cloud.


Advanced dbt features

In the last part of this training, you get to know some of the advanced features of dbt. You learn how continuous integration and deployment (CI/CD) works by implementing CI/CD pipelines hands-on in dbt Cloud. You also learn about dbt documentation, how it works, and what you can do with it, and you finish by generating documentation for your own project.


Provided material

  • GitHub repository with all the source code
  • E-commerce dataset for this course
  • Hands-on explainer videos
  • Curated list of links for further reading in each lesson


Requirements

You should have completed our Snowflake for Data Engineers course or a similar course before starting this one.