Duration: 28 minutes
Plays: 192
Published: 1 month ago
Description:
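Have you ever wondered what goes on inside an AI's head when it "thinks" on its own? In this episode we dig into the AI's "inner world" and look at how recent papers teach AI to improve itself through self-play, like a martial-arts master sparring left hand against right; how they give it a mind that can "reflect" in order to crack hard math problems; and how they found that it quietly "takes shortcuts" when drawing. Stranger still, we also discuss how "bad instructions" can train a "good model", and how to bring in an absolutely impartial "iron-faced referee" for AI. Ready? Let's lift the veil on the AI's "inner drama"!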
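00:00:39 A top master's training secret: AI's "left hand versus right hand" self-play
00:06:00 AI gets its sums wrong too? Give it a mind that "reflects"
00:11:10 Self-play in AI training: using bad instructions to teach a good model
00:16:29 How do you give AI a "perfect coach" that sets the problems, spars with it, and judges with absolute fairness?
00:22:47 Is your AI obedient? It may be quietly "taking shortcuts"
Papers covered in this episode: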
[AI] Toward Training Superintelligent Software Agents through Self-Play SWE-RL
[Meta FAIR & Meta TBD Lab]
https://arxiv.org/abs/2512.18552
---
[CL] MDToC: Metacognitive Dynamic Tree of Concepts for Boosting Mathematical Problem-Solving of Large Language Models
[University of Maryland]
https://arxiv.org/abs/2512.18841
---
[LG] Recontextualization Mitigates Specification Gaming without Modifying the Specification
[MATS]
https://arxiv.org/abs/2512.19027
---
[AI] Propose, Solve, Verify: Self-Play Through Formal Verification
[CMU]
https://arxiv.org/abs/2512.18160
---
[LG] Is Your Conditional Diffusion Model Actually Denoising?
[MIT & Yale University]
https://arxiv.org/abs/2512.18736
Top comments on Xiaoyuzhou:
行了撒 (1 month ago, Beijing)
01:43 AlphaGo sparred against itself tens of millions of times