Introduction
Description
Azure is rapidly becoming the go-to platform for companies leveraging their existing Microsoft 365 ecosystem. If you want to level up your Data Engineering skills, mastering Azure and Infrastructure as Code (IaC) with Terraform is essential. That’s why we created this "Azure ETL with Terraform" course for you. This hands-on training guides you in building a real-world, end-to-end Data Engineering solution on Azure, combining the power of Terraform, Azure Data Factory, Synapse Analytics, and Power BI.
You’ll create a fully automated data pipeline, extracting data from an external API, processing it with Azure’s powerful tools, and preparing it for visualization. Along the way, you’ll implement Lakehouse and Medallion architectures (Bronze, Silver, Gold layers), ensuring your pipeline is efficient and scalable.
By the end of the course, you’ll not only understand how to build and automate modern data pipelines, but you’ll also have a complete, hands-on project to showcase your skills.
Introduction to Azure and Terraform
Kick off your journey by understanding Azure: its role in the modern data ecosystem and the key services it offers for data engineering, including Data Factory, Data Lake, and Synapse Analytics. You’ll also see how Terraform’s Infrastructure as Code (IaC) capabilities make managing these resources scalable and efficient. This foundational knowledge sets the stage for building a complete data pipeline.
Hands-On Setup
Start building your project by setting up the essential tools and environments. You’ll install Terraform and configure it to work seamlessly with Azure, then create a Service Principal and set up authentication for secure, automated resource provisioning. Along the way, you’ll build the knowledge needed to manage Azure resources effectively.
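To give you a feel for this setup, here’s a minimal sketch of a Terraform configuration authenticating to Azure with a Service Principal. The environment-variable approach shown in the comments is one common pattern; the course’s exact setup may differ:

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
}

# Service Principal credentials are typically supplied through
# environment variables, so no secrets end up in source control:
#   export ARM_CLIENT_ID="<service-principal-app-id>"
#   export ARM_CLIENT_SECRET="<service-principal-secret>"
#   export ARM_TENANT_ID="<tenant-id>"
#   export ARM_SUBSCRIPTION_ID="<subscription-id>"
provider "azurerm" {
  features {}  # required block, even when empty
}
```

The Service Principal itself only needs to be created once, for example with the Azure CLI (`az ad sp create-for-rbac`); after that, Terraform can provision resources fully non-interactively.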
Terraform Basics
Dive deeper into Terraform by mastering its project structure, essential commands, and modularization techniques. In this module, you’ll begin deploying Azure resources for the project:
- Provision Azure Data Factory to orchestrate your data pipeline.
- Set up Azure Data Lake Storage to store incoming data in the Bronze layer.
- Deploy Synapse Analytics for advanced data processing.
Learn to write reusable and maintainable Terraform code, ensuring your infrastructure is scalable and ready for future modules.
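As a taste of what you’ll write in this module, here’s a minimal sketch of these resources in Terraform. All names are illustrative, and the course builds them out (including the Synapse workspace) in far more detail:

```hcl
resource "azurerm_resource_group" "etl" {
  name     = "rg-azure-etl"  # illustrative name
  location = "westeurope"
}

# Data Lake Storage Gen2 account: is_hns_enabled switches on the
# hierarchical namespace that the Lakehouse layers rely on.
resource "azurerm_storage_account" "lake" {
  name                     = "stazureetllake"  # must be globally unique
  resource_group_name      = azurerm_resource_group.etl.name
  location                 = azurerm_resource_group.etl.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  is_hns_enabled           = true
}

# Data Factory instance that will orchestrate the pipeline.
resource "azurerm_data_factory" "adf" {
  name                = "adf-azure-etl"
  resource_group_name = azurerm_resource_group.etl.name
  location            = azurerm_resource_group.etl.location
}
```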
Real-World Deployment
Put your Terraform skills into action by deploying the first components of your pipeline. You’ll provision an Azure Data Factory instance and configure it to connect to an external Football API, enabling data ingestion. Set up the Azure Data Lake to store raw data (Bronze layer) and prepare it for further processing.
This module mimics real-world deployment workflows, combining manual and automated approaches to give you a comprehensive understanding of infrastructure deployment in professional data engineering projects.
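One small illustration of that workflow: the Medallion layers can be provisioned as ADLS Gen2 filesystems on the storage account sketched earlier. Again, this is a simplified sketch rather than the course’s exact code:

```hcl
# One ADLS Gen2 filesystem per Medallion layer (Bronze holds the
# raw API data; Silver and Gold hold refined versions of it).
resource "azurerm_storage_data_lake_gen2_filesystem" "layer" {
  for_each           = toset(["bronze", "silver", "gold"])
  name               = each.key
  storage_account_id = azurerm_storage_account.lake.id
}
```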
CI/CD Concepts
Learn how CI/CD principles apply to infrastructure automation using Terraform and Azure DevOps. We’ll cover Continuous Integration (CI), where code changes are automatically built, tested, and validated before merging, and Continuous Deployment (CD), which automates infrastructure provisioning and application updates. You’ll see how Terraform integrates with CI/CD pipelines to ensure consistent and repeatable deployments, reducing human errors and increasing development speed. By the end, you’ll have a fully automated workflow for managing Azure resources efficiently.
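A key ingredient of that workflow is Terraform remote state: a pipeline agent is ephemeral, so state must live somewhere shared. Here’s a minimal sketch using an Azure Storage backend (all names are illustrative):

```hcl
terraform {
  # Store state in an Azure Storage container so every pipeline run
  # (and every teammate) operates on the same state file.
  backend "azurerm" {
    resource_group_name  = "rg-terraform-state"  # illustrative
    storage_account_name = "stterraformstate"    # illustrative
    container_name       = "tfstate"
    key                  = "azure-etl.terraform.tfstate"
  }
}
```

With the backend in place, the CI stage can run `terraform plan` against the shared state, and the CD stage can apply the approved plan.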
Azure Data Factory
Start with a short introduction to working with external APIs, using the Football API as a practical example. This gives you a basic understanding of how to bring API data into your Azure pipeline. After that, you'll dive deeper into Azure Data Factory: you'll learn the key components (pipelines, datasets, linked services, and triggers) and build your own batch pipelines step by step, automating data ingestion and processing.
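To connect Data Factory to such an API from Terraform, a linked service of type Web is one option. A minimal sketch follows; the URL is a placeholder, not the actual Football API endpoint used in the course:

```hcl
# Linked service that points Data Factory at an external HTTP API.
resource "azurerm_data_factory_linked_service_web" "football_api" {
  name                = "ls-football-api"
  data_factory_id     = azurerm_data_factory.adf.id
  authentication_type = "Anonymous"                         # public endpoint
  url                 = "https://api.example-football.com"  # placeholder URL
}
```

Datasets, pipelines, and triggers then build on this linked service to pull data into the Bronze layer on a schedule.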
Databricks Infrastructure & End-to-End Pipeline
Deploy a complete Databricks infrastructure with Terraform and explore the Databricks user interface. Then you'll tie everything together by executing an end-to-end pipeline that retrieves API data, transforms it, and prepares it for analysis. This complete pipeline combines API integration, Azure Data Factory, and Databricks into a realistic Data Engineering workflow.
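Provisioning the workspace itself is a single resource in Terraform. A minimal sketch, reusing the resource group from the earlier examples (the name and SKU are illustrative):

```hcl
# Databricks workspace for the transformation steps of the pipeline.
resource "azurerm_databricks_workspace" "dbw" {
  name                = "dbw-azure-etl"  # illustrative name
  resource_group_name = azurerm_resource_group.etl.name
  location            = azurerm_resource_group.etl.location
  sku                 = "standard"       # "premium" unlocks extra features
}
```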
Course Curriculum
- Why CI/CD (5:17)
- CI/CD Process Basics (4:54)
- CI/CD Steps (5:27)
- CI/CD Workflow Example (5:23)
- CI/CD Basics Summary (1:22)
- Azure CI/CD Pipelines Terminology (10:21)
- Single YAML Pipeline Approach (7:30)
- Azure DevOps & Azure Cloud Setup (8:26)
- CI/CD Pipeline Implementation (11:57)
- Pipeline Source Code Explained & Job Analysis (14:07)
- Executing the CI/CD Pipeline (2:19)
Pricing
Azure pipelines with Terraform and Databricks is included in our Data Engineering Academy.