Top Lessons

lesson

How RAG Finetuning and RLHF Fit in Production

- End-to-End LLM Finetuning & Orchestration using RL
  - Prepare instruction-tuning datasets (synthetic + human)
  - Finetune a small LLM on your RAG tasks
  - Use RL to finetune the same dataset and compare results across all approaches
  - Select the appropriate finetuning approach and build RAG
  - Implement orchestration patterns (pipelines, agents)
  - Set up continuous monitoring integration using Braintrust
- RL Frameworks in Practice
  - Use DSPy, OpenAI API, LangChain's RLChain, OpenPipe ART, and PufferLib for RLHF tasks
- Rubric-Based Reward Systems
  - Design interpretable rubrics to score reasoning, structure, and correctness
- Real-World Applications of RLHF
  - Explore applications in summarization, email tuning, and web agent fine-tuning
- RL and RLHF for RAG
  - Apply RL techniques to optimize retrieval and generation in RAG pipelines
  - Use RLHF to improve response quality based on user feedback and preferences
- Exercises: End-to-End RAG with Finetuning & RLHF
  - Finetune a small LLM (Llama 3.2 3B or Qwen 2.5 3B) on ELI5 dataset using LoRA/QLoRA
  - Apply RLHF with rubric-based rewards to optimize responses
  - Build production RAG with DSPy orchestration, logging, and monitoring
  - Compare base → finetuned → RLHF-optimized models
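
The rubric-based reward idea listed above (scoring reasoning, structure, and correctness) can be illustrated with a minimal sketch. This is not the lesson's implementation: the `Criterion` class, the three check functions, the weights, and `rubric_reward` are all hypothetical names chosen for illustration, and the string heuristics are crude stand-ins for whatever graders (human or model-based) a real pipeline would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Criterion:
    """One rubric line: a name, a weight, and a scorer returning a value in [0, 1]."""
    name: str
    weight: float
    check: Callable[[str, str], float]  # (response, reference) -> score


def has_reasoning(response: str, reference: str) -> float:
    # Assumed proxy for reasoning: the response uses an explanatory connective.
    markers = ("because", "therefore", "so ")
    return 1.0 if any(m in response.lower() for m in markers) else 0.0


def is_structured(response: str, reference: str) -> float:
    # Assumed proxy for structure: the response has at least two sentences.
    sentences = [s for s in response.split(".") if s.strip()]
    return 1.0 if len(sentences) >= 2 else 0.0


def is_correct(response: str, reference: str) -> float:
    # Assumed proxy for correctness: the reference string appears in the response.
    return 1.0 if reference.lower() in response.lower() else 0.0


# Hypothetical weights; an interpretable rubric keeps these visible and editable.
RUBRIC: List[Criterion] = [
    Criterion("reasoning", 0.3, has_reasoning),
    Criterion("structure", 0.2, is_structured),
    Criterion("correctness", 0.5, is_correct),
]


def rubric_reward(
    response: str, reference: str, rubric: List[Criterion] = RUBRIC
) -> Tuple[float, Dict[str, float]]:
    """Return the weighted-average reward plus the per-criterion scores."""
    scores = {c.name: c.check(response, reference) for c in rubric}
    total_weight = sum(c.weight for c in rubric)
    reward = sum(c.weight * scores[c.name] for c in rubric) / total_weight
    return reward, scores
```

Returning the per-criterion scores alongside the scalar reward keeps the signal interpretable: when a response is penalized, the breakdown shows which rubric line caused it.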

lesson

Advanced RAG with Multi-Media RAG

- Advanced RAG Reranker Training & Triplet Fundamentals
  - Learn contrastive loss vs triplet loss approaches for training retrievers
  - Understand bi-encoder vs cross-encoder performance trade-offs
  - Master triplet-loss fundamentals and semi-hard negative mining strategies
  - Fine-tune rerankers using Cohere Rerank API & SBERT (sbert.net, Hugging Face)
- Multimodal & Metadata RAG
  - Index and query images, tables, and structured JSON using ColQwen-Omni (ColPali-based late interaction for audio, video, and visual documents)
  - Implement metadata filtering, short- vs long-term indices, and query routing logic
- Cartridges RAG Technique
  - Learn how Cartridges compress large corpora into small, trainable KV-cache structures for efficient retrieval (~39x less memory, ~26x faster)
  - Master the Self-Study training approach using synthetic Q&A and context distillation for generalized question answering
- Cartridge-Based Retrieval
  - Learn modular retrieval systems with topic-specific "cartridges" for precision memory routing
- Late Interaction Methods
  - Study architectures like ColQwen-Omni that combine multimodal (text, audio, image) retrieval using late interaction fusion
- Multi-Vector vs Single-Vector Retrieval
  - Compare ColBERT/Turbopuffer vs FAISS, and understand trade-offs in granularity, accuracy, and inference cost
- Query Routing & Hybrid Memory Systems
  - Explore dynamic routing between lexical, dense, and multimodal indexes
- Loss Functions for Retriever Training
  - Compare contrastive loss vs triplet loss, and learn about semi-hard negative mining
- Reranker Tuning with SBERT or APIs
  - Fine-tune rerankers (SBERT, Cohere API), evaluate with MRR/nDCG, and integrate into retrieval loops
- Exercises: Advanced RAG Techniques
  - Implement triplet loss vs contrastive loss for reranker training with semi-hard negative mining
  - Build multimodal RAG systems with images, tables, and query routing
  - Compare single-vector (FAISS) vs multi-vector (ColBERT) retrieval
  - Create cartridge-based RAG with topic-specific memory routing
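
The contrast between triplet loss and contrastive loss, and the notion of a semi-hard negative, can be made concrete with a small sketch. It assumes Euclidean distance on plain Python tuples and a margin of 0.5; the function names are illustrative and do not come from SBERT or any other library named in the lesson. A semi-hard negative is taken here in the usual sense: farther from the anchor than the positive, but still inside the margin.

```python
import math
from typing import List, Optional, Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor: Vector, positive: Vector, negative: Vector,
                 margin: float = 0.5) -> float:
    # Penalize unless the negative is at least `margin` farther than the positive.
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def contrastive_loss(a: Vector, b: Vector, is_match: bool,
                     margin: float = 0.5) -> float:
    # Pairwise form: pull matching pairs together, push non-matching pairs
    # apart until they are at least `margin` away.
    d = euclidean(a, b)
    return d ** 2 if is_match else max(0.0, margin - d) ** 2


def semi_hard_negative(anchor: Vector, positive: Vector,
                       candidates: List[Vector],
                       margin: float = 0.5) -> Optional[Vector]:
    """Pick the closest candidate with d(a,p) < d(a,n) < d(a,p) + margin."""
    d_pos = euclidean(anchor, positive)
    semi_hard = [n for n in candidates
                 if d_pos < euclidean(anchor, n) < d_pos + margin]
    if not semi_hard:
        return None  # no candidate falls inside the margin band
    return min(semi_hard, key=lambda n: euclidean(anchor, n))
```

With an anchor at the origin and a positive at distance 1, a candidate at distance 0.5 is a hard negative (closer than the positive), one at distance 3 is an easy negative (zero triplet loss), and only a candidate between 1 and 1.5 is selected. Training on that band tends to give a useful gradient without the instability that the hardest negatives can cause.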

lesson

Additional Discussion on RAG: Emphasis on a Systematic Way of Approaching and Experimenting with RAG Applications Power AI course

zaoyang

lesson

Step 11 - The Storage Power AI course

zaoyang

lesson

Advanced RAG Power AI course

- Intro to RAG and Why LLMs Need External Knowledge
- LLM Limitations and How Retrieval Fixes Hallucinations
- How RAG Combines Search + Generation Into One System
- Fresh Data Retrieval to Overcome Frozen Training Cutoffs
- Context Engineering for Giving LLMs the Right Evidence
- Multi-Agent RAG and Routing Queries to the Right Tools
- Retrieval Indexes: Vector DBs, APIs, SQL, and Web Search
- Query Routing With Prompts and Model-Driven Decision Logic
- API Calls vs RAG: When You Need Data vs Full Answers
- Tool Calling for Weather, Stocks, Databases, and More
- Chunking Long Documents Into Searchable Units
- Chunk Size Trade-offs for Precision vs Broad Context
- Metadata Extraction to Link Related Chunks Together
- Semantic Search Using Embeddings for Nearest-Neighbor Retrieval
- Image and Multimodal Handling for RAG Pipelines
- Text-Based Image Descriptions vs True Image Embeddings
- Query Rewriting for Broad, Vague, or Ambiguous Questions
- Hybrid Retrieval Using Metadata + Embeddings Together
- Rerankers to Push the Correct Chunk to the Top
- Vector Databases and How They Index Embeddings at Scale
- Term-Based vs Embedding-Based vs Hybrid Search
- Multi-Vector RAG and When to Use Multiple Embedding Models
- Retrieval Indexes Beyond Vector DBs: APIs, SQL, Search Engines
- Generation Stage: Stitching Evidence Into Final Answers
- Tool Calling With Multiple Retrieval Sources for Complex Tasks
- Synthetic Data for Stress-Testing Retrieval Quality Early
- RAG vs Fine-Tuning: When to Retrieve and When to Update the Model
- Prompt Patterns for Retrieval-Driven Generation
- Evaluating Retrieval: Recall, Relevance, and Chunk Quality
- Building End-to-End RAG Systems for Real Applications
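
Two of the topics above, chunking long documents and nearest-neighbor semantic search, fit in a short sketch. Everything here is an assumption for illustration: `chunk_words`, `embed`, `cosine`, and `retrieve` are hypothetical names, and `embed` is a toy bag-of-words counter standing in for a real embedding model, so the ranking reflects word overlap only, not true semantic similarity.

```python
import math
from collections import Counter
from typing import List, Tuple


def chunk_words(text: str, size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into word windows of `size`, sharing `overlap` words between neighbors."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase bag-of-words counts.
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]
```

The `size` and `overlap` parameters expose the chunk-size trade-off named in the list: smaller windows give more precise matches, while larger windows and more overlap preserve surrounding context at the cost of more and longer candidates.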

zaoyang

lesson

RAG

lesson

Step 11 - The Storage 30-Minute Fullstack Masterplan