Top Lessons - Lesson Page 12

lesson

Introduction to Building an LLM

Power AI course

- Intuition for decoder-only LLMs
- Tokens, embeddings, transformer pipeline
- Autoregressive next-token generation
- Generative AI modalities overview
- Diffusion vs transformer model families
- Inference flow and prompt processing
- Build a real LLM inference API
- Architecture: attention, context, decoding
- Training phases: pretraining to RLHF
- Vertical vs generic LLM design
- Distillation, quantization, efficient scaling
- Reasoning models: chain of thought and test-time compute
- Hands-on exercises

zaoyang

lesson

Technical Orientation (Python, NumPy, Probability, Statistics, Tensors)

Power AI course

- How AI Thinks in Numbers: Dot Products and Matrix Logic
- NumPy Power-Tools: The Math Engine Behind Modern AI
- Introduction to Machine Learning Libraries
- Two- and Three-Dimensional Arrays
- Data as Fuel: Cleaning, Structuring, and Transforming with Pandas
- Normalization in Data Processing: Teaching Models to Compare Apples to Apples
- Probability Foundations: How Models Reason About the Unknown
- The Bell Curve in AI: Detecting Outliers and Anomalies
- Evaluating Models Like a Scientist: Bootstrapping, T-Tests, Confidence Intervals
- Transformers: The Architecture That Gave AI Its Brain
- Diffusion Models: How AI Creates Images, Video, and Sound
- Activation Functions: Teaching Models to Make Decisions
- Vectors and Tensors: The Language of Deep Learning
- GPUs, Cloud, and APIs: How AI Runs in the Real World

zaoyang

lesson

Attention Layer

Power AI course

- Why context is fundamental in LLMs
- Limits of n-grams, RNNs, embeddings
- Self-attention solves long-range context
- QKV: query–key–value mechanics
- Dynamic contextual embeddings per token
- Attention weights determine word relevance
- Multi-head attention = parallel perspectives
- GQA reduces attention compute cost
- Mixture-of-experts for specialized attention
- Editing and modifying transformer layers
- Decoder-only vs encoder–decoder framing
- Building context-aware prediction systems

zaoyang

lesson

Multimodal Embeddings

Power AI course

- Foundations of multimodal representation learning
- Text, image, audio, video embeddings
- Contrastive learning for cross-modal alignment
- Shared latent spaces across modalities
- Vision encoders and patch tokenization
- Transformer encoders for text meaning
- Audio preprocessing and spectral features
- Time-series tokenization via SAX or VQ
- Fusion modules for modality alignment
- Cross-attention for integrated reasoning
- Zero-shot retrieval and multimodal search
- Real-world multimodal applications overview

zaoyang

lesson

Tokens and Embeddings

Power AI course

- Tokenization as dictionary for model input
- Tokens → IDs → contextual embeddings
- Semantic meaning emerges only in embeddings
- Transformer layers reshape embeddings by context
- Pretrained embeddings accelerate domain understanding
- Good tokenization reduces loss, improves learning
- Tokenizer choice impacts RAG chunking
- Compression tradeoffs differ by domain needs
- Tokenization affects inference cost and speed
- Compare BPE, SentencePiece, custom tokenizers
- Emerging trend: byte-level latent transformers
- Generations of embeddings add deeper semantics
- Similarity measured via dot products, distance
- Embeddings enable search, clustering, retrieval systems

zaoyang