This is a full end-to-end capstone project. It uses e-commerce data consisting of customer invoices and the items on those invoices.
Our goal is to ingest the invoice data as it is generated and visualize it in a user interface.
The technologies we use are FastAPI, Apache Kafka, Apache Spark, MongoDB and Streamlit, tools you have already learned individually in the academy. I recommend you take a look at those courses first.
What will I learn?
- You learn, step by step, how to set up this streaming pipeline
- You transform the base CSV file into JSON documents and send them to your API
- You build an API with FastAPI that writes data into Apache Kafka, and deploy it as a Docker container
- You deploy and configure Apache Kafka as a Docker container
- You use an Apache Spark Docker container with Jupyter notebooks to process the streaming data from Kafka and write it into MongoDB
- You set up and use MongoDB and the Mongo-Express UI with Docker as the storage backend of your pipeline
- You create an interactive user interface with Streamlit to view invoices for customers and the items on these invoices
- 2 hours 45 minutes of videos
- Source code
- Test cases to test your pipeline
- Supporting web links
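
To give a feel for the CSV-to-JSON step mentioned in the list above, here is a minimal sketch using only the Python standard library. The column names (InvoiceNo, StockCode, Description, Quantity, UnitPrice, CustomerID) and the two sample rows are made up for illustration; the actual course dataset and code may look different.

```python
import csv
import io
import json

# Made-up sample rows; the real CSV file and its column names may differ.
SAMPLE_CSV = """InvoiceNo,StockCode,Description,Quantity,UnitPrice,CustomerID
536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2.55,17850
536365,71053,WHITE METAL LANTERN,6,3.39,17850
"""


def csv_to_json_docs(csv_text):
    """Turn each CSV row into one JSON document (returned as a string)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    docs = []
    for row in reader:
        # Cast numeric fields so the JSON carries numbers instead of strings.
        row["Quantity"] = int(row["Quantity"])
        row["UnitPrice"] = float(row["UnitPrice"])
        docs.append(json.dumps(row))
    return docs


docs = csv_to_json_docs(SAMPLE_CSV)
for doc in docs:
    print(doc)
```

Each resulting string is one invoice line that could then be sent to the API as a request body, one request per document.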