LLM Production Chain (Inference, Deployment, CI/CD)
- Map the end-to-end LLM production chain: data, serving, latency, and monitoring
- Explore multi-tenant LLM APIs, vector databases, caching, and rate limiting (see the first sketch after this list)
- Understand the tradeoffs between self-hosting models and calling hosted APIs, and the basics of inference tuning
- Plan a scalable serving stack (e.g., LLM + vector DB + API + orchestrator; see the second sketch after this list)
- Learn about LLMOps roles, workflows, and production-grade tooling
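
To make the caching and rate-limiting bullets concrete, here is a minimal sketch of a per-tenant token-bucket limiter and an LRU response cache sitting in front of a model call. The `call_llm` stub, the tenant IDs, and the rate/capacity numbers are illustrative assumptions, not any specific production API.

```python
import hashlib
import time
from collections import OrderedDict

class TokenBucket:
    """Per-tenant token bucket: refills at `rate` tokens/sec up to `capacity`."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

class ResponseCache:
    """LRU cache keyed on a hash of the prompt; a hit skips inference entirely."""
    def __init__(self, max_items: int = 1024):
        self.store: OrderedDict[str, str] = OrderedDict()
        self.max_items = max_items

    @staticmethod
    def _key(prompt: str) -> str:
        return hashlib.sha256(prompt.encode()).hexdigest()

    def get(self, prompt: str) -> str | None:
        key = self._key(prompt)
        if key in self.store:
            self.store.move_to_end(key)  # mark as recently used
            return self.store[key]
        return None

    def put(self, prompt: str, response: str) -> None:
        self.store[self._key(prompt)] = response
        if len(self.store) > self.max_items:
            self.store.popitem(last=False)  # evict least recently used

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model endpoint (hosted API or local server).
    return f"echo: {prompt}"

buckets: dict[str, TokenBucket] = {}
cache = ResponseCache()

def handle_request(tenant_id: str, prompt: str) -> str:
    # Rate-limit per tenant, then consult the cache before paying for inference.
    bucket = buckets.setdefault(tenant_id, TokenBucket(rate=2.0, capacity=5))
    if not bucket.allow():
        return "429: rate limit exceeded"
    if (cached := cache.get(prompt)) is not None:
        return cached
    response = call_llm(prompt)
    cache.put(prompt, response)
    return response

if __name__ == "__main__":
    for _ in range(7):  # bursts past capacity start returning 429s
        print(handle_request("tenant-a", "hello"))
```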
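
The serving-stack bullet can also be sketched end to end. In the sketch below, a three-entry in-memory list stands in for the vector DB, `embed` and `call_llm` are hypothetical stubs, and `orchestrate` plays the orchestrator role: retrieve context, assemble the prompt, call the model. A real stack would swap each stub for a managed service.

```python
import math

# Tiny in-memory "vector store": (title, embedding) pairs standing in for a vector DB.
DOCUMENTS = [
    ("GPU autoscaling notes", [0.9, 0.1, 0.0]),
    ("Prompt caching guide",  [0.1, 0.9, 0.1]),
    ("Rate limiting policy",  [0.0, 0.2, 0.9]),
]

def embed(text: str) -> list[float]:
    # Hypothetical embedding function; a real stack would call an embedding model.
    h = abs(hash(text))
    return [((h >> s) % 100) / 100 for s in (0, 8, 16)]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / (norm + 1e-9)

def retrieve(query: str, k: int = 2) -> list[str]:
    # Vector DB role: rank stored documents by similarity to the query embedding.
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda doc: cosine(q, doc[1]), reverse=True)
    return [title for title, _ in ranked[:k]]

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for the model server or hosted API.
    return f"echo: {prompt[:60]}..."

def orchestrate(query: str) -> str:
    # Orchestrator role: retrieve context, build the prompt, call the model.
    context = "\n".join(retrieve(query))
    prompt = f"Context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)

if __name__ == "__main__":
    print(orchestrate("How should I cache prompts?"))
```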