1) AGI vs ADI, domain-expert large models 00:00
2) Agent 05:35
2.1 RAG 05:39
2.2 Deep Research 06:38
2.3 Self-memory 09:03
2.4 Multi-agent RL 10:57
3) Large models and large-scale compute 12:11
3.1 Transformer architecture and compute cost 13:38
3.2 GPU memory and disk 14:57
3.3 FlashAttention 17:02
4) Adapter 19:24
4.1 Low-Rank Adaptation (LoRA) 19:51
4.2 GaLore 21:33
4.3 K-adapter 23:14
5) Mixture of Experts 24:14
5.1 Mistral 24:29
5.2 DeepSeek-V3 26:12
6) RL for reasoning 28:00
6.1 RLHF & PPO 28:22
6.2 GRPO 29:13
6.3 Conversational scripts for multi-turn dialogue 30:56
7) Speaking like an expert 33:56
7.1 Direct Preference Optimization (DPO) 34:32
7.2 Kahneman-Tversky Optimization (KTO) 36:44
8) Data and annotation 37:51
8.1 Data distillation 38:29
8.2 Monte Carlo Tree Search for reasoning annotation 40:28
9) GRPO + LoRA in practice 43:02
9.1 Code and data 43:17
9.2 Testing 44:50