Introduction
Description
Azure is rapidly becoming the go-to platform for companies leveraging their existing Microsoft 365 ecosystem. If you want to level up your Data Engineering skills, mastering Azure and Infrastructure as Code (IaC) with Terraform is essential. That’s why we created this "Azure ETL with Terraform" course for you. This hands-on training guides you in building a real-world, end-to-end Data Engineering solution on Azure, combining the power of Terraform, Azure Data Factory, Synapse Analytics, and Power BI.
You’ll create a fully automated data pipeline, extracting data from an external API, processing it with Azure’s powerful tools, and preparing it for visualization. Along the way, you’ll implement Lakehouse and Medallion architectures (Bronze, Silver, Gold layers), ensuring your pipeline is efficient and scalable.
By the end of the course, you’ll not only understand how to build and automate modern data pipelines, but you’ll also have a complete, hands-on project to showcase your skills.
Introduction to Azure and Terraform
Kick off your journey by understanding Azure: its role in the modern data ecosystem and its key services for data engineering, including Data Factory, Data Lake, and Synapse Analytics. You’ll also see how Terraform’s Infrastructure as Code (IaC) capabilities make managing these resources scalable and efficient. This foundational knowledge will set the stage for building a complete data pipeline.
Hands-On Setup
Start building your project by setting up the essential tools and environments. You’ll install Terraform and configure it to work seamlessly with Azure, create a Service Principal, and set up authentication for secure, automated resource provisioning. Along the way, you’ll build the foundational knowledge needed to manage Azure resources effectively.
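To give a concrete feel for this step, here is a minimal sketch of authenticating Terraform against Azure with a Service Principal. The principal name, role, and provider version constraint below are illustrative assumptions, not the exact values used in the course.

```hcl
# Create a Service Principal (illustrative name and scope):
#   az ad sp create-for-rbac --name "sp-azure-etl" --role Contributor \
#     --scopes /subscriptions/<subscription-id>
#
# Terraform then reads the credentials from environment variables:
#   ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID, ARM_SUBSCRIPTION_ID

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
}

provider "azurerm" {
  features {}
}
```

Keeping the credentials in environment variables rather than hard-coding them in configuration files is what makes the provisioning both secure and automatable.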
Terraform Basics
Dive deeper into Terraform by mastering its project structure, essential commands, and modularization techniques. In this module, you’ll begin deploying Azure resources for the project:
- Provision Azure Data Factory to orchestrate your data pipeline.
- Set up Azure Data Lake Storage to store incoming data in the Bronze layer.
- Deploy Synapse Analytics for advanced data processing.
Learn to write reusable and maintainable Terraform code, ensuring your infrastructure is scalable and ready for future modules.
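As a rough sketch of what these deployments look like in Terraform, the HCL below provisions a resource group, a Data Lake Storage Gen2 account with a Bronze filesystem, a Data Factory instance, and a Synapse workspace. All resource names, the region, and the password variable are placeholders; in the course you would break this configuration into reusable modules.

```hcl
variable "synapse_sql_password" {
  description = "SQL admin password for the Synapse workspace (placeholder)"
  type        = string
  sensitive   = true
}

resource "azurerm_resource_group" "etl" {
  name     = "rg-azure-etl"
  location = "westeurope"
}

# Data Lake Storage Gen2 account holding the Bronze layer
resource "azurerm_storage_account" "lake" {
  name                     = "stazureetlbronze"
  resource_group_name      = azurerm_resource_group.etl.name
  location                 = azurerm_resource_group.etl.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
  is_hns_enabled           = true
}

resource "azurerm_storage_data_lake_gen2_filesystem" "bronze" {
  name               = "bronze"
  storage_account_id = azurerm_storage_account.lake.id
}

# Data Factory instance that will orchestrate the pipeline
resource "azurerm_data_factory" "adf" {
  name                = "adf-azure-etl"
  location            = azurerm_resource_group.etl.location
  resource_group_name = azurerm_resource_group.etl.name
}

# Synapse workspace for downstream processing
resource "azurerm_synapse_workspace" "synapse" {
  name                                 = "syn-azure-etl"
  resource_group_name                  = azurerm_resource_group.etl.name
  location                             = azurerm_resource_group.etl.location
  storage_data_lake_gen2_filesystem_id = azurerm_storage_data_lake_gen2_filesystem.bronze.id
  sql_administrator_login              = "sqladminuser"
  sql_administrator_login_password     = var.synapse_sql_password

  identity {
    type = "SystemAssigned"
  }
}
```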
Real-World Deployment
Put your Terraform skills into action by deploying the first components of your pipeline. You’ll provision an Azure Data Factory instance and configure it to connect to an external Soccer API, enabling data ingestion. You’ll also set up the Azure Data Lake to store raw data (Bronze layer) and prepare it for further processing.
This module mimics real-world deployment workflows, combining manual and automated approaches to give you a comprehensive understanding of infrastructure deployment in professional data engineering projects.
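Building on the sketch above (and reusing its hypothetical azurerm_data_factory.adf resource), connecting Data Factory to an external API could look roughly like the linked service below. The linked service name, authentication type, and URL are placeholders, not the actual Soccer API details used in the course.

```hcl
# Linked service pointing Data Factory at an external HTTP API
# (placeholder URL; the real Soccer API may require key-based auth)
resource "azurerm_data_factory_linked_service_web" "soccer_api" {
  name                = "ls-soccer-api"
  data_factory_id     = azurerm_data_factory.adf.id
  authentication_type = "Anonymous"
  url                 = "https://api.example-soccer.com/v1"
}
```

From there, Data Factory datasets and copy activities can land the API responses as raw files in the Bronze filesystem defined earlier.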
What’s Coming Next?
As this course progresses, you can look forward to more exciting modules that will take your skills to the next level:
Part 2 Highlights
- Building CI/CD Pipelines: Learn to integrate Terraform with GitHub for Continuous Integration/Continuous Deployment. Automate pipeline updates seamlessly across development, test, and production environments.
- API Integration: Gain a deeper understanding of working with APIs, using the Soccer API as a practical example.
- Advanced Azure Data Factory Features: Dive deeper into batch data integration and orchestration capabilities.
Part 3 Highlights
- Advanced Synapse Processing: Explore more complex transformations and optimizations using Synapse Spark.
- Scaling the Lakehouse: Learn how to optimize the Medallion architecture for large-scale workloads and multi-team collaboration.
- Automated Deployment Across Environments: Fully automate deployment pipelines for consistent infrastructure replication in production scenarios.
Course Curriculum
Pricing
Azure ETL with Terraform is included in our Data Engineering Academy