Lessons

lesson

Mini-lab - Compare decoding methods on a complex prompt (AI bootcamp 2)

- Run the same input prompt using Top-k, Top-p, and Beam search decoding (see the sketch below)
- Measure differences in diversity, accuracy, repetition, and latency across the methods
- Discuss which strategy works best for each context and explain why
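A minimal sketch of the comparison this mini-lab runs, using Hugging Face transformers; the model name, prompt, and generation settings are illustrative placeholders, not the lesson's exact setup.

```python
import time
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Explain why the sky is blue:"
inputs = tokenizer(prompt, return_tensors="pt")

# One generation config per decoding strategy.
configs = {
    "top-k": dict(do_sample=True, top_k=50),
    "top-p": dict(do_sample=True, top_p=0.9, top_k=0),
    "beam":  dict(do_sample=False, num_beams=5),
}

for name, cfg in configs.items():
    start = time.perf_counter()
    out = model.generate(**inputs, max_new_tokens=60, **cfg)
    latency = time.perf_counter() - start
    text = tokenizer.decode(out[0], skip_special_tokens=True)
    # Distinct-bigram ratio as a crude diversity/repetition measure.
    words = text.split()
    bigrams = list(zip(words, words[1:]))
    diversity = len(set(bigrams)) / max(len(bigrams), 1)
    print(f"{name}: {latency:.2f}s, distinct-bigram ratio {diversity:.2f}")
    print(text, "\n")
```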


lesson

Tokenization deep dive - Byte-level language modeling vs traditional tokenization (AI bootcamp 2)

- Learn how byte-level models process raw UTF-8 bytes directly, with a vocabulary size of 256
- Understand how this approach removes the need for subword tokenizers like BPE or SentencePiece
- Compare byte-level models to tokenized models with larger vocabularies (e.g., 30k–50k tokens), as in the sketch below
- Analyze the trade-offs between the two approaches in terms of simplicity
- Evaluate how each approach handles multilingual text
- Assess the impact on model size
- Examine differences in performance
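A small before/after illustration of the two pipelines (not from the lesson): raw UTF-8 bytes need no trained tokenizer at all, while a subword tokenizer compresses the same text into fewer IDs drawn from a ~50k-entry vocabulary.

```python
from transformers import AutoTokenizer

text = "héllo wörld"

# Byte-level: every UTF-8 byte is already a token ID in [0, 255].
byte_ids = list(text.encode("utf-8"))
print(len(byte_ids), byte_ids)   # 13 IDs; accented chars take 2 bytes each

# Subword (BPE): fewer, longer tokens, but a ~50,257-entry vocabulary.
tok = AutoTokenizer.from_pretrained("gpt2")
sub_ids = tok.encode(text)
print(len(sub_ids), tok.convert_ids_to_tokens(sub_ids))
```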


lesson

Hard-negative mining strategies (AI bootcamp 2)

- Implement pipelines that automatically surface confusing negatives (see the mining sketch below)
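One plausible shape for such a pipeline, sketched with sentence-transformers; the data and encoder name are invented for illustration. The idea: for each query, the highest-scoring passages that are not the labeled positive are exactly the confusing negatives worth training on.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder encoder

queries   = ["how do I reset my password"]
positives = ["To reset your password, open Settings > Security."]
corpus = [
    "To reset your password, open Settings > Security.",
    "Password strength rules require at least 12 characters.",  # confusing
    "Resetting the router restores factory settings.",          # lexical trap
    "Our office is closed on public holidays.",                 # easy negative
]

q_emb = model.encode(queries, normalize_embeddings=True)
c_emb = model.encode(corpus, normalize_embeddings=True)
scores = q_emb @ c_emb.T   # cosine similarity, since embeddings are normalized

for qi, query in enumerate(queries):
    ranked = np.argsort(-scores[qi])
    # Hard negatives: top-ranked passages that are not the labeled positive.
    hard_negs = [corpus[i] for i in ranked if corpus[i] != positives[qi]][:2]
    print(query, "->", hard_negs)
```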


lesson

Cohere Rerank API & SBERT fine-tuning (sbert.net, Hugging Face) (AI bootcamp 2)

- Learn to use off-the-shelf rerankers like Cohere’s Rerank API, or to fine-tune SBERT models, to optimize document ranking post-retrieval (see the sketch below)
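A minimal post-retrieval reranking sketch using a pretrained SBERT cross-encoder from sentence-transformers; the hosted Cohere Rerank API follows the same pattern (send a query plus candidate documents, get back relevance scores). The query and documents are invented.

```python
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "What causes inflation?"
candidates = [  # e.g. the top results from a first-stage retriever
    "Inflation rises when demand outpaces the supply of goods and services.",
    "The inflation of balloons requires helium or hot air.",
    "Central banks raise interest rates to curb inflation.",
]

# The cross-encoder reads each (query, document) pair jointly and
# returns one relevance score per pair.
scores = reranker.predict([(query, doc) for doc in candidates])
for score, doc in sorted(zip(scores, candidates), reverse=True):
    print(f"{score:+.3f}  {doc}")
```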


lesson

Triplet-loss fundamentals and semi-hard negative mining (AI bootcamp 2)

- Dive into triplet formation strategies, focusing on how to find semi-hard negatives (similar but incorrect results that challenge the model); the selection rule is sketched below
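The semi-hard selection rule itself, as a sketch over precomputed embeddings (dimensions, margin, and data are arbitrary): a negative is semi-hard when it is farther from the anchor than the positive, but still inside the margin, so it yields a non-zero yet non-trivial triplet loss.

```python
import numpy as np

def semi_hard_negatives(anchor, positive, negatives, margin):
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(negatives - anchor, axis=1)
    # Semi-hard: d(a, p) < d(a, n) < d(a, p) + margin.
    return negatives[(d_an > d_ap) & (d_an < d_ap + margin)]

rng = np.random.default_rng(0)
anchor    = rng.normal(size=8)
positive  = anchor + 0.05 * rng.normal(size=8)             # close to anchor
negatives = anchor + rng.normal(scale=0.2, size=(100, 8))  # clustered nearby
picked = semi_hard_negatives(anchor, positive, negatives, margin=0.5)
print(f"{len(picked)} of 100 negatives are semi-hard")
```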


lesson

Tri-encoder vs cross-encoder performance trade-offs (AI bootcamp 2)

- Explore the architectural trade-offs between bi/tri-encoders and cross-encoders
- Learn when to use hybrid systems (e.g., bi-encoder retrieval + cross-encoder reranking, sketched below)
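A compact sketch of the hybrid pattern with illustrative sentence-transformers checkpoints: the bi-encoder scores documents independently (cheap, and document embeddings are precomputable), while the cross-encoder reranks only the shortlist (expensive joint attention over each pair).

```python
import numpy as np
from sentence_transformers import CrossEncoder, SentenceTransformer

bi = SentenceTransformer("all-MiniLM-L6-v2")
ce = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

corpus = [
    "Python's GIL serializes bytecode execution across threads.",
    "Global interpreter locks simplify memory management.",
    "Pythons are non-venomous constrictor snakes.",
    "Threading in CPython is limited by the GIL for CPU-bound work.",
]
query = "why doesn't multithreading speed up CPU-bound Python code?"

# Stage 1: bi-encoder retrieval over the whole corpus.
d_emb = bi.encode(corpus, normalize_embeddings=True)
q_emb = bi.encode([query], normalize_embeddings=True)
shortlist = np.argsort(-(q_emb @ d_emb.T)[0])[:3]

# Stage 2: cross-encoder reranking of the shortlist only.
scores = ce.predict([(query, corpus[i]) for i in shortlist])
best = shortlist[int(np.argmax(scores))]
print(corpus[best])
```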


lesson

Contrastive loss vs triplet loss (AI bootcamp 2)

- Compare the two core objectives used for fine-tuning retrievers (both are written out in the sketch below)
- Understand how each behaves in hard-negative-rich domains like code or finance
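The two objectives side by side in PyTorch (a sketch; Euclidean distance and a margin of 1.0 are illustrative choices): contrastive loss scores labeled pairs, triplet loss scores anchor/positive/negative triples.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(x1, x2, label, margin=1.0):
    # label = 1 for a positive pair, 0 for a negative pair.
    d = F.pairwise_distance(x1, x2)
    return (label * d.pow(2) + (1 - label) * F.relu(margin - d).pow(2)).mean()

def triplet_loss(anchor, positive, negative, margin=1.0):
    d_ap = F.pairwise_distance(anchor, positive)
    d_an = F.pairwise_distance(anchor, negative)
    # Zero loss once the negative is at least `margin` farther than the positive.
    return F.relu(d_ap - d_an + margin).mean()

a, p, n = (torch.randn(4, 16) for _ in range(3))
print(contrastive_loss(a, p, torch.ones(4)))
print(triplet_loss(a, p, n))
```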


lesson

Query routing logic and memory-index hybrids (AI bootcamp 2)

- Implement index routing systems where queries are conditionally routed (a toy router follows this list):
  - short factual query → lexical index
  - long reasoning query → dense retriever
  - visual question → image embedding index
- Learn how to fuse local memory with global vector stores for agentic long-term retrieval
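A toy router to make the pattern concrete; the heuristics (word count, an image flag) are deliberately crude stand-ins for whatever rule-based or learned classifier the lesson builds.

```python
def route(query: str, has_image: bool = False) -> str:
    """Pick an index for the query; the rules here are illustrative only."""
    if has_image:
        return "image_embedding_index"
    if len(query.split()) <= 6:        # short factual query
        return "lexical_index"         # e.g. BM25
    return "dense_retriever"           # long reasoning query

print(route("capital of France"))                                  # lexical_index
print(route("walk me through how the 2008 credit crisis spread"))  # dense_retriever
print(route("what is shown here?", has_image=True))                # image_embedding_index
```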


lesson

Multi-vector DB vs standard DB (AI bootcamp 2)

- Understand how multi-vector databases (e.g., ColBERT, Turbopuffer) store multiple vectors per document to support fine-grained relevance
- Contrast this with standard single-vector-per-doc retrieval (e.g., FAISS), and learn when multi-vector setups are worth the extra complexity (see the scoring sketch below)
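A sketch of the scoring difference on random embeddings: single-vector retrieval compares one pooled vector per side with one dot product, while multi-vector (ColBERT-style) keeps a vector per token and sums each query token's best match (MaxSim). Shapes are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
q_tokens = rng.normal(size=(4, 64))    # 4 query-token embeddings
d_tokens = rng.normal(size=(20, 64))   # 20 document-token embeddings

# Standard: pool each side to one vector, then a single dot product.
single_score = q_tokens.mean(axis=0) @ d_tokens.mean(axis=0)

# Multi-vector: token-level similarity matrix, max over doc tokens,
# summed over query tokens. Costs 20x the storage for this document.
maxsim_score = (q_tokens @ d_tokens.T).max(axis=1).sum()
print(single_score, maxsim_score)
```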


lesson

Late interaction methods (ColQwen-Omni, audio+image chunks) (AI bootcamp 2)

- Study late interaction architectures (like ColQwen-Omni) that separate dense retrieval from deep semantic fusion
- Explore how these models support chunking and retrieval over image, audio, and video-text combinations using attention-based fusion at scoring time (a toy version follows)
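A toy version of the late-interaction idea, under loose assumptions: each chunk keeps its own matrix of embeddings regardless of modality, and the query interacts with them only at scoring time via MaxSim. The shapes and the shared 128-dim space are invented, not ColQwen-Omni's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(1)
query = rng.normal(size=(5, 128))                 # 5 query-token embeddings

chunks = {
    "text_chunk":  rng.normal(size=(40, 128)),    # 40 subword embeddings
    "image_chunk": rng.normal(size=(196, 128)),   # 14x14 patch embeddings
    "audio_chunk": rng.normal(size=(60, 128)),    # 60 frame embeddings
}

def maxsim(q, c):
    # Late interaction: query and chunk are fused only at scoring time.
    return (q @ c.T).max(axis=1).sum()

ranked = sorted(chunks, key=lambda k: maxsim(query, chunks[k]), reverse=True)
print(ranked)
```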


lesson

Cartridge-based retrieval (self-study distillation) (AI bootcamp 2)

- Learn how to modularize retrieval into topic- or task-specific “cartridges”
- Understand that cartridges are pre-distilled context sets for self-querying agents
- Study how this approach is inspired by OpenAI’s retrieval plugin and LangChain’s retriever routers
- See how cartridges improve retrieval precision by narrowing memory to high-relevance windows (a toy example follows)
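A toy sketch of the cartridge idea (the topic names, snippets, and encoder are all invented): pick a topic-scoped cartridge first, then search only inside that narrow, pre-distilled context set.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

cartridges = {  # topic -> pre-distilled context snippets
    "billing":  ["Refunds are issued within 5 business days.",
                 "Invoices are emailed on the 1st of each month."],
    "security": ["Enable 2FA under Settings > Security.",
                 "API keys rotate automatically every 90 days."],
}

query = "how long do refunds take?"
q = model.encode([query], normalize_embeddings=True)

# Step 1: route the query to the closest cartridge by topic similarity.
topics = list(cartridges)
t_emb = model.encode(topics, normalize_embeddings=True)
topic = topics[int(np.argmax(q @ t_emb.T))]

# Step 2: search only within that high-relevance window.
docs = cartridges[topic]
d_emb = model.encode(docs, normalize_embeddings=True)
print(topic, "->", docs[int(np.argmax(q @ d_emb.T))])
```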


lesson

RL in decoding, CoT prompting, and feedback loops (AI bootcamp 2)

- Understand how RL ideas are used without training by introducing dynamic feedback at inference time
- Apply reward scoring or confidence thresholds to adjust CoT (Chain-of-Thought) reasoning steps
- Use external tools (e.g., validators or search APIs) as part of a feedback loop that rewards correct or complete answers
- Understand how RL concepts power speculative decoding verification, scratchpad agents, and dynamic rerouting during generation (a minimal loop is sketched below)
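A minimal sketch of such a loop; `generate` and `validate` are hypothetical stand-ins for an LLM call and an external checker (a validator, unit test, or search-API lookup), and the reward threshold simply gates acceptance.

```python
import random

def generate(prompt: str, temperature: float) -> str:
    # Stand-in for an LLM call that returns a CoT answer candidate.
    return random.choice(["4", "5", "22"])

def validate(answer: str) -> float:
    # Stand-in reward: 1.0 if the external check passes, else 0.0.
    return 1.0 if answer == "4" else 0.0

def answer_with_feedback(prompt, threshold=1.0, max_tries=5):
    for _ in range(max_tries):
        candidate = generate(prompt, temperature=0.8)
        if validate(candidate) >= threshold:   # confidence gate
            return candidate
    # Dynamic rerouting: no candidate passed, hand off to another path.
    return "escalate: no candidate passed validation"

print(answer_with_feedback("What is 2 + 2? Think step by step."))
```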