Description
Building pipelines in Databricks often starts in notebooks. But as soon as transformations grow, dependencies multiply, and stakeholders rely on your outputs, structure becomes critical.
In this course, you’ll learn how to build Spark Declarative Pipelines on Databricks using a clean Bronze → Silver → Gold (Medallion) architecture, and how to orchestrate and operate them with Lakeflow Designer.
This is a fully hands-on course. You’ll start with a raw e-commerce dataset, ingest it into Delta Lake, progressively refine it into analytics-ready tables, and then run and manage the entire pipeline through Lakeflow’s visual orchestration interface.
Along the way, you’ll understand how Delta Lake ensures reliability, how Unity Catalog provides governance and lineage, and where the Databricks Assistant naturally supports pipeline development.
By the end of this course, you’ll be able to design structured, production-oriented data pipelines on Databricks: not just transformations inside notebooks, but end-to-end declarative workflows that can be scheduled, monitored, and extended.
What You’ll Build in This Course
The Declarative Pipeline Mental Model (Medallion Architecture Done Right)
You’ll start by building a clear mental model for Spark Declarative Pipelines. You’ll understand how the Bronze, Silver, and Gold layers work together, how dependencies are resolved automatically, and how declarative definitions differ from imperative notebook workflows.
You’ll also see where the Databricks Assistant fits into the workflow — validating schemas, drafting transformation logic, and helping you iterate safely.
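To build intuition for that mental model before touching the real platform, here is a toy sketch of the declarative idea in plain Python: you declare what each table is and which tables it depends on, and the framework resolves the run order for you. This is not the Databricks API, and the table names are made up; it only illustrates how declarative definitions differ from an imperative notebook that hard-codes execution order.

```python
# Toy model of a declarative pipeline (NOT the real Databricks API):
# each function declares WHAT a table is plus its upstream inputs,
# and run order is derived automatically from those dependencies.
_registry = {}  # table name -> (builder function, declared inputs)

def table(*, inputs=()):
    """Register a table definition together with its upstream tables."""
    def decorator(fn):
        _registry[fn.__name__] = (fn, tuple(inputs))
        return fn
    return decorator

@table()
def bronze_orders():
    return [{"order_id": 1, "amount": "19.99"}]  # raw, as ingested

@table(inputs=("bronze_orders",))
def silver_orders():
    # Clean step: cast the string amount to a float
    return [{**r, "amount": float(r["amount"])} for r in bronze_orders()]

@table(inputs=("silver_orders",))
def gold_daily_revenue():
    return sum(r["amount"] for r in silver_orders())

def run_order():
    """Resolve dependencies with a simple topological sort."""
    resolved, seen = [], set()
    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in _registry[name][1]:
            visit(dep)
        resolved.append(name)
    for name in _registry:
        visit(name)
    return resolved

print(run_order())  # Bronze resolves before Silver, Silver before Gold
```

In the imperative notebook style, you would have to call these steps in the right order yourself; in the declarative style, the dependency graph does that work, which is exactly what makes pipelines easier to extend safely.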
Delta Lake + Unity Catalog Foundations
Before building the pipeline, you’ll establish the technical foundation:
You’ll learn how Delta Lake provides ACID guarantees, schema enforcement, versioning, and time travel, and why those features matter for reproducible pipelines.
You’ll also work with Unity Catalog to organize catalogs, schemas, and tables properly, and understand how governance, security, and lineage integrate into your pipeline design from the beginning.
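As a rough intuition for why Delta Lake can offer versioning and time travel at all: every write commits a new version to an append-only transaction log, so any earlier snapshot can be read back. The sketch below is a deliberately simplified toy, not Delta's actual implementation (real Delta stores Parquet data files plus a `_delta_log` of commits); it only models the commit-and-read-back idea.

```python
# Toy sketch of Delta Lake's core idea: every write appends a new
# version to a log, so historical versions remain readable
# ("time travel"). This is intuition only, not real Delta internals.
class ToyDeltaTable:
    def __init__(self):
        self._log = []  # append-only list of committed snapshots

    def write(self, rows):
        self._log.append(list(rows))  # commit a new version

    def read(self, version=None):
        if not self._log:
            return []
        if version is None:
            version = len(self._log) - 1  # default: latest version
        return self._log[version]         # akin to "VERSION AS OF"

t = ToyDeltaTable()
t.write([{"id": 1, "qty": 5}])   # version 0
t.write([{"id": 1, "qty": 7}])   # an update commits version 1
print(t.read())                  # latest: qty == 7
print(t.read(version=0))         # time travel: qty == 5
```

Because old versions stay addressable, a pipeline run is reproducible: you can rerun downstream logic against the exact table state a previous run saw.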
Building the Bronze, Silver, and Gold Layers
You’ll then build a complete medallion pipeline using a realistic e-commerce dataset.
You’ll ingest the raw data into a Bronze Delta table, preserving its original structure.
Next, you’ll transform it into a clean Silver table by standardizing data types, fixing timestamps, improving column naming, handling null values, and applying lightweight data-quality improvements.
Finally, you’ll create Gold-level business aggregations such as daily sales metrics, revenue by product, and customer activity summaries, producing analytics-ready outputs that can be queried and visualized directly.
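The Silver and Gold steps above can be sketched in plain Python so the transformation logic is easy to see. In the course you would express the same steps with Spark DataFrames on Delta tables; the records and column names below are hypothetical stand-ins for the e-commerce dataset.

```python
# Plain-Python sketch of the Silver cleaning and Gold aggregation
# steps (hypothetical columns; the course uses Spark DataFrames).
from datetime import datetime
from collections import defaultdict

raw_bronze = [  # as-landed records: string types, nulls included
    {"OrderID": "1001", "ts": "2024-05-01 10:15:00", "price": "19.99", "product": "mug"},
    {"OrderID": "1002", "ts": "2024-05-01 14:02:00", "price": None,    "product": "mug"},
    {"OrderID": "1003", "ts": "2024-05-02 09:30:00", "price": "34.50", "product": "tee"},
]

def to_silver(rows):
    """Standardize types, parse timestamps, rename columns, drop nulls."""
    out = []
    for r in rows:
        if r["price"] is None:  # lightweight data-quality rule
            continue
        out.append({
            "order_id": int(r["OrderID"]),  # snake_case naming
            "order_ts": datetime.strptime(r["ts"], "%Y-%m-%d %H:%M:%S"),
            "price": float(r["price"]),
            "product": r["product"],
        })
    return out

def to_gold_daily_sales(silver):
    """Gold aggregation: revenue per day, ready for direct querying."""
    daily = defaultdict(float)
    for r in silver:
        daily[r["order_ts"].date().isoformat()] += r["price"]
    return dict(daily)

silver = to_silver(raw_bronze)
print(to_gold_daily_sales(silver))  # {'2024-05-01': 19.99, '2024-05-02': 34.5}
```

The structure mirrors the medallion layers: Bronze preserves the raw shape, Silver fixes types and quality, and Gold reduces clean rows to business metrics.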
Orchestrating with Lakeflow Designer
Once your declarative pipeline is built, you’ll move into Lakeflow Designer.
You’ll import your pipeline into the visual interface, configure scheduling, and monitor runs through the Lakeflow UI.
You’ll also modify the pipeline directly inside Lakeflow Designer: you’ll add a transformation step or enhance the Silver/Gold logic, then validate the changes end-to-end.
This is where development transitions into operational workflow management.
Bonus: Streaming Declarative Pipelines
To extend the model beyond batch processing, you’ll explore streaming modes in declarative pipelines.
You’ll connect Databricks to a simple AWS messaging service (such as Kinesis or SQS), configure authentication and ingestion settings, and build a minimal streaming declarative pipeline.
Using Lakeflow Designer, you’ll ingest streaming messages into a Bronze table and optionally process them into Silver, understanding how triggered and continuous modes integrate with declarative pipeline logic.
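The triggered-versus-continuous distinction can be previewed with a toy in-memory queue standing in for Kinesis or SQS. Real streaming pipelines run on Spark Structured Streaming rather than anything like this; the sketch only contrasts the two execution modes conceptually.

```python
# Toy contrast between triggered and continuous ingestion, using a
# plain in-memory queue in place of Kinesis/SQS. Conceptual only.
from queue import Queue, Empty

source = Queue()
for msg in ("order-1", "order-2", "order-3"):
    source.put(msg)

bronze = []  # stand-in for the Bronze table

def triggered_run(q, sink):
    """Triggered mode: drain whatever is available now, then stop."""
    while True:
        try:
            sink.append(q.get_nowait())
        except Empty:
            return  # nothing left; the run ends until the next trigger

# Continuous mode would instead loop indefinitely, blocking on
# q.get() and appending each message the moment it arrives.

triggered_run(source, bronze)
print(bronze)  # ['order-1', 'order-2', 'order-3']
```

Triggered runs suit scheduled, cost-conscious ingestion; continuous runs suit low-latency requirements. The declarative table definitions stay the same either way.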
Who This Course Is For
This course is for Data Engineers and Analytics Engineers who want to:
- build structured Bronze/Silver/Gold pipelines on Databricks
- understand Delta Lake fundamentals in a practical context
- use Unity Catalog correctly for governance and organization
- move from notebook-based transformations to declarative workflows
- orchestrate pipelines visually with Lakeflow Designer
- explore how streaming integrates into declarative pipeline design
No prior experience with Spark Declarative Pipelines or Lakeflow Designer is required. You’ll learn everything step by step while building a complete end-to-end pipeline.
Course Curriculum
Start Now:
Spark Declarative Pipelines & Lakeflow Designer on Databricks will be included in our Data Engineering Academy and will also be available as a Free Lab.