Production RAG Demo

Ask Warren Buffett

Chat with 47 years of Warren Buffett's shareholder letters using retrieval-augmented generation (RAG)

💰 Free • Powered by Gemini 2.0 Flash • RAG with 47 years of letters

How RAG Works

Pipeline

1. Embed the query with Gemini text-embedding-004
2. Run a vector search across all 673 chunks
3. Retrieve the 5 most similar passages
4. Add those passages to the LLM's context
5. Generate an answer that cites the source letters
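
The retrieval and augmentation steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the demo's actual code: the function names, the prompt wording, and the toy 3-dimensional vectors are all assumptions, and the embedding and generation calls to Gemini are left out so the example runs offline.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunks, k=5):
    """Return the k chunks most similar to the query, best match first."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Place the retrieved passages, tagged by letter year, ahead of the question."""
    context = "\n".join(f"[{p['year']}] {p['text']}" for p in passages)
    return (
        "Answer using only the excerpts below and cite the letter year "
        "in brackets for each claim.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

# Toy placeholder data: short vectors stand in for 768-dimensional embeddings,
# and the passage texts are dummies, not real excerpts from the letters.
TOY_CHUNKS = [
    {"year": 1986, "text": "placeholder passage A", "vector": [1.0, 0.0, 0.0]},
    {"year": 2001, "text": "placeholder passage B", "vector": [0.8, 0.6, 0.0]},
    {"year": 2015, "text": "placeholder passage C", "vector": [0.0, 1.0, 0.0]},
]

if __name__ == "__main__":
    query_vec = [1.0, 0.0, 0.0]  # in the real pipeline this comes from the embedding model
    hits = top_k(query_vec, TOY_CHUNKS, k=2)
    print(build_prompt("What does Buffett say about risk?", hits))
```

With only 673 chunks, a brute-force scan like this is cheap, which is one plausible reason a dedicated vector database may not be needed.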

Why This Matters

  • Production-ready RAG architecture
  • Semantic search & vector retrieval
  • Cost-optimized (bundled embeddings)
  • Source attribution makes answers checkable and helps limit hallucinations

Production RAG Stack

Architecture

  • RAG System: Semantic search + LLM generation
  • Vector store: 47 years of letters embedded with Gemini
  • Source Attribution: Every answer cites specific letter years
  • Backend: AWS Lambda + API Gateway + bundled embeddings
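
As a rough sketch of how the backend pieces might fit together, the handler below accepts an API Gateway proxy event, loads the bundled index once per container, and returns a JSON response with the cited years. The `load_index` and `answer_question` helpers, the `question` field name, and the response shape are hypothetical; the real embed, search, and generate calls are stubbed out.

```python
import json

_INDEX = None  # cached at module level so warm invocations skip the reload

def load_index():
    """Load the bundled embeddings once per container.

    Hypothetical stand-in: a real build would read a file shipped with the function.
    """
    global _INDEX
    if _INDEX is None:
        _INDEX = [{"year": 1977, "text": "placeholder", "vector": [1.0, 0.0]}]
    return _INDEX

def answer_question(question, index):
    """Stub for the embed -> search -> generate steps; returns answer plus cited years."""
    return {"answer": f"(stub) {question}", "sources": [c["year"] for c in index]}

def _response(status, payload):
    """Build the response dict expected by an API Gateway proxy integration."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }

def handler(event, context):
    """Lambda entry point for an API Gateway proxy event."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "request body must be JSON"})
    if not isinstance(body, dict):
        return _response(400, {"error": "request body must be a JSON object"})
    question = str(body.get("question") or "").strip()
    if not question:
        return _response(400, {"error": "missing 'question'"})
    return _response(200, answer_question(question, load_index()))
```

Keeping the index in a module-level variable means the load cost is paid only on a cold start, which suits a small bundled file.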

Technical Details

  • Embeddings: Gemini text-embedding-004 (768 dims)
  • LLM: Gemini 2.0 Flash (free tier)
  • Storage: Bundled embeddings (~5 MB) in Lambda
  • Corpus: 47 years of letters (1977-2023), 673 chunks
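
A quick back-of-the-envelope check shows why the embeddings fit comfortably inside a Lambda bundle. The chunk count and dimension come from the figures above; the bytes-per-value options are assumptions, since the page does not say how the vectors are serialized.

```python
CHUNKS = 673   # from the corpus figure above
DIMS = 768     # text-embedding-004 output size listed above

values = CHUNKS * DIMS        # total numbers stored
float32_bytes = values * 4    # assumption: 4-byte floats
float64_bytes = values * 8    # assumption: 8-byte floats

def mb(n_bytes):
    """Convert bytes to decimal megabytes."""
    return n_bytes / 1_000_000

print(f"values:  {values:,}")                       # values:  516,864
print(f"float32: {mb(float32_bytes):.2f} MB")       # float32: 2.07 MB
print(f"float64: {mb(float64_bytes):.2f} MB")       # float64: 4.13 MB
```

The raw vectors alone come in under the quoted ~5 MB, so the remainder is presumably chunk text, metadata, or serialization overhead.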