In this hands-on project, you’ll build a private, fully local Retrieval-Augmented Generation (RAG) chatbot from scratch: no APIs, no cloud dependencies, just local compute and a real-world stack. Along the way, you’ll:
- Chunk and embed documents using llama-index (sketched after this list)
- Store and query vector data in Elasticsearch
- Run a fully local LLM with Ollama (e.g., Mistral or Phi-3)
- Build a chatbot UI with Streamlit
- Implement Retrieval-Augmented Generation (RAG) end-to-end
- Debug real-world issues like timeouts and embedding mismatches
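
To make these steps concrete, here are minimal sketches of each one. First, chunking and embedding with llama-index. The document folder (`./docs`) and embedding model (`nomic-embed-text`) are placeholder assumptions, not choices the project mandates; swap in your own.

```python
# Chunk documents and embed them locally via Ollama.
# Assumes `pip install llama-index llama-index-embeddings-ollama` and an
# Ollama server on localhost:11434 with an embedding model already pulled
# (e.g. `ollama pull nomic-embed-text` -- a placeholder choice).
from llama_index.core import SimpleDirectoryReader
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.ollama import OllamaEmbedding

documents = SimpleDirectoryReader("./docs").load_data()  # hypothetical folder

# Split into overlapping chunks so retrieval returns focused passages.
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=64)
nodes = splitter.get_nodes_from_documents(documents)

embed_model = OllamaEmbedding(
    model_name="nomic-embed-text",
    base_url="http://localhost:11434",
)

# Embed one chunk to sanity-check the pipeline and note the vector dimension.
vector = embed_model.get_text_embedding(nodes[0].get_content())
print(f"{len(nodes)} chunks, embedding dimension {len(vector)}")
```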
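Storing the vectors in Elasticsearch via llama-index’s `ElasticsearchStore` might look like the sketch below (`pip install llama-index-vector-stores-elasticsearch`). The index name and URL are assumptions for a default local install, and `nodes`/`embed_model` come from the chunking sketch above.

```python
# Index the chunks into a local Elasticsearch instance.
from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.vector_stores.elasticsearch import ElasticsearchStore

vector_store = ElasticsearchStore(
    index_name="rag-chatbot",        # hypothetical index name
    es_url="http://localhost:9200",  # default local Elasticsearch URL
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Embeds every chunk and writes the vectors into Elasticsearch.
index = VectorStoreIndex(
    nodes,
    storage_context=storage_context,
    embed_model=embed_model,
)
```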
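Wiring a local Ollama model into a query engine is what closes the RAG loop: retrieve the most relevant chunks, then hand them to the LLM as context. The model name, timeout, and question below are illustrative.

```python
# Answer questions with retrieval-augmented generation, fully locally.
from llama_index.llms.ollama import Ollama

# A generous timeout helps on CPU-only machines (see the debugging note below).
llm = Ollama(model="mistral", request_timeout=120.0)

# Retrieve the top-k most similar chunks and feed them to the LLM as context.
query_engine = index.as_query_engine(llm=llm, similarity_top_k=3)
response = query_engine.query("What does the handbook say about onboarding?")
print(response)
```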
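For the chat UI, a minimal Streamlit sketch under the same assumptions is below; save it as `app.py` (a hypothetical filename) and launch it with `streamlit run app.py`.

```python
# app.py -- minimal chat front end for the local RAG pipeline.
import streamlit as st
from llama_index.core import VectorStoreIndex
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.elasticsearch import ElasticsearchStore

@st.cache_resource
def get_query_engine():
    # Rebuild the query engine from the already-populated Elasticsearch index,
    # cached so it isn't reconstructed on every Streamlit rerun.
    vector_store = ElasticsearchStore(
        index_name="rag-chatbot", es_url="http://localhost:9200"
    )
    index = VectorStoreIndex.from_vector_store(
        vector_store,
        embed_model=OllamaEmbedding(model_name="nomic-embed-text"),
    )
    return index.as_query_engine(llm=Ollama(model="mistral", request_timeout=120.0))

st.title("Local RAG Chatbot")

# Keep chat history across Streamlit reruns.
if "messages" not in st.session_state:
    st.session_state.messages = []

for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])

if prompt := st.chat_input("Ask your documents anything"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.write(prompt)
    answer = str(get_query_engine().query(prompt))
    st.session_state.messages.append({"role": "assistant", "content": answer})
    with st.chat_message("assistant"):
        st.write(answer)
```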
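Finally, on debugging: two failure modes you’re likely to hit are Ollama request timeouts on slow hardware (often fixed by raising `request_timeout`, as above) and embedding mismatches, where the index was built with one embedding model but queried with another of a different dimension. A quick sanity check, assuming the names used in the earlier sketches (the `embedding` field name is llama-index’s default at the time of writing; adjust if yours differs):

```python
# Compare the query-time embedding dimension against the stored index's dims.
# If they differ, Elasticsearch kNN queries fail: re-index with the same
# embedding model you query with.
from elasticsearch import Elasticsearch
from llama_index.embeddings.ollama import OllamaEmbedding

es = Elasticsearch("http://localhost:9200")
mapping = es.indices.get_mapping(index="rag-chatbot")
stored_dims = mapping["rag-chatbot"]["mappings"]["properties"]["embedding"]["dims"]

embed_model = OllamaEmbedding(model_name="nomic-embed-text")
query_dims = len(embed_model.get_text_embedding("ping"))
assert stored_dims == query_dims, (
    f"index stores {stored_dims}-dim vectors, model emits {query_dims}-dim"
)
```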