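Episode description
Source: Xiaoyuzhou (小宇宙)
Have you ever wondered how we could put AI, this "gold-devouring beast," through a thorough round of slimming and compression? How we might draw up an "org chart" with a clear division of labor, so that the "dispatcher" and the "experts" each do their own job? How we could fit it with a "safety brain" that both avoids catastrophic risk and adjusts its budget dynamically? And when AI itself acts as the "judge," how do we make sure it is not just flipping a coin? In this episode, we look at several recent papers to explore how to make AI more efficient, smarter, and more reliable.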
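00:00:31 A new playbook for taming AI: "sharing" and "compression"
00:06:36 Drawing an "org chart" for AI: who is the dispatcher, and who is the expert?
00:13:07 How do we make AI capable without letting things go wrong?
00:18:11 AI as judge: sharp-eyed, or flipping a coin?
00:23:49 Put the right "tightening spell" on AI so it learns both fast and stably
Papers covered in this episode: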
[LG] Gefen: Optimized Stochastic Optimizer
[Reichman University & Tel Aviv University]
https://arxiv.org/abs/2606.13894
---
[LG] A theoretical model for task routing in mixture-of-expert transformers
[University of Sydney & Zhejiang University]
https://arxiv.org/abs/2606.14398
---
[LG] Utility-Constrained Policy Optimization
[York University & Google DeepMind]
https://arxiv.org/abs/2606.14029
---
[CL] The Coin Flip Judge? Reliability and Bias in LLM-as-a-Judge Evaluation
[A Yagubyan]
https://arxiv.org/abs/2606.13685
---
[LG] Diffusion Policy Optimization without Drifting Apart
[UC Berkeley]
https://arxiv.org/abs/2606.13795