Production RAG Demo
Ask Warren Buffett
Chat with 47 years of Warren Buffett's shareholder letters using RAG
Ask Your Question
💰 Free • Powered by Gemini 2.0 Flash • RAG with 47 years of letters
Warren's Answer
Retrieved Sources
Performance Metrics
Retrieval:
LLM:
Total Chunks:
Model: Gemini 2.0 Flash
How RAG Works
Pipeline
1. Embed query with Gemini
2. Vector search across 673 chunks
3. Retrieve top 5 similar passages
4. Augment LLM context
5. Generate answer with citations
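The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the demo's actual code: `embed` and `generate` are hypothetical callables standing in for the Gemini embedding and generation calls, and the chunk fields (`year`, `text`, `embedding`) are assumed names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vec, chunks, k=5):
    """Steps 2-3: score every chunk against the query and keep the k best."""
    scored = [(cosine_similarity(query_vec, c["embedding"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, retrieved):
    """Step 4: place the retrieved passages, labeled by year, before the question."""
    context = "\n\n".join(
        f"[{chunk['year']} letter] {chunk['text']}" for _, chunk in retrieved
    )
    return (
        "Answer using only the excerpts below and cite the letter year "
        "for each claim.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


def answer(question, chunks, embed, generate, k=5):
    """Run all five steps. `embed` and `generate` are injected (hypothetical)."""
    query_vec = embed(question)                      # step 1
    retrieved = retrieve_top_k(query_vec, chunks, k)  # steps 2-3
    prompt = build_prompt(question, retrieved)       # step 4
    return {
        "answer": generate(prompt),                  # step 5
        "sources": [chunk["year"] for _, chunk in retrieved],
    }
```

With only 673 chunks, a brute-force scan like this is cheap, which is presumably why a linear search over bundled vectors is workable here.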
Why This Matters
- Production-ready RAG architecture
- Semantic search and vector retrieval
- Cost-optimized: embeddings are bundled with the Lambda deployment
- Source attribution makes answers verifiable and helps reduce hallucinations
Production RAG Stack
Architecture
- RAG System: Semantic search + LLM generation
- Vector DB: 47 years of letters embedded with Gemini
- Source Attribution: Every answer cites specific letter years
- Backend: AWS Lambda + API Gateway + bundled embeddings
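As a rough sketch of the backend piece, a Lambda function behind an API Gateway proxy integration receives the request body as a JSON string and returns a dict with `statusCode`, `headers`, and `body`. The handler below assumes a `question` field in the request and uses a hypothetical `run_rag` placeholder in place of the real retrieval and generation code; neither is confirmed by the page.

```python
import json


def run_rag(question):
    """Hypothetical placeholder for the retrieval + generation pipeline."""
    return {"answer": f"(answer to: {question})", "sources": []}


def _response(status_code, payload):
    """Build a response in the API Gateway proxy integration shape."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def lambda_handler(event, context):
    # With proxy integration the body arrives as a string (or None).
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "Request body must be valid JSON"})

    question = payload.get("question") if isinstance(payload, dict) else None
    if not isinstance(question, str) or not question.strip():
        return _response(400, {"error": "Missing 'question'"})

    return _response(200, run_rag(question.strip()))
```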
Technical Details
- Embeddings: Gemini text-embedding-004 (768 dims)
- LLM: Gemini 2.0 Flash (free tier)
- Storage: Bundled embeddings (~5MB) in Lambda
- Corpus: 47 years of letters (1977-2023), 673 chunks
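A quick back-of-the-envelope check on the storage figure: 673 chunks of 768 dimensions, stored as 32-bit floats, come to about 2 MB of raw vector data. The ~5MB bundle presumably also carries the chunk text and metadata, or uses a less compact encoding such as JSON; that part is an assumption, as is the float32 layout used in the helpers below.

```python
import struct

NUM_CHUNKS = 673        # from the corpus description
DIMS = 768              # text-embedding-004 output size
BYTES_PER_FLOAT32 = 4   # assumed storage precision

raw_bytes = NUM_CHUNKS * DIMS * BYTES_PER_FLOAT32
raw_mb = raw_bytes / 1_000_000  # roughly 2.07 MB of raw vectors


def pack_vector(vec):
    """Serialize a vector as little-endian float32 values."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob):
    """Inverse of pack_vector: bytes back to a list of floats."""
    count = len(blob) // BYTES_PER_FLOAT32
    return list(struct.unpack(f"<{count}f", blob))
```

At this size the whole index fits comfortably in a Lambda deployment package and can be loaded into memory once per cold start.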