Duration: 26 minutes
Plays: 117
Published: 4 days ago
Description...
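Today we take on a particularly interesting topic: how does AI learn and think? We are no longer satisfied with knowing what AI can do; we want to know how it can do it better. In this episode, we work through several recent papers to see how AI can get its own "personal coaching system" and co-evolve with it, how "suffering in training" buys "one-step results" at use time, how it can "learn from a master" when information is incomplete, and how it can reason over the whole board like an expert when it thinks. Ready? Let's dive deep into the AI brain.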
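00:00:34 How do you build the perfect "AI personal coach" system?
00:06:13 Why do the fastest AIs all "suffer" during training?
00:11:23 How do you become an expert without a "god's-eye view"?
00:15:52 Want to make robots smarter? Don't just teach them to "do the job"
00:21:13 The secret of AI thinking: why are some models better at solving puzzles?
Papers covered in this episode: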
[LG] RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System
[Princeton University]
https://arxiv.org/abs/2602.02488
---
[LG] Generative Modeling via Drifting
[MIT]
https://arxiv.org/abs/2602.04770
---
[LG] Privileged Information Distillation for Language Models
[ServiceNow]
https://arxiv.org/abs/2602.04942
---
[RO] A Systematic Study of Data Modalities and Strategies for Co-training Large Behavior Models for Robot Manipulation
[Toyota Research Institute]
https://arxiv.org/abs/2602.01067
---
[LG] Reasoning with Latent Tokens in Diffusion Language Models
[CMU]
https://arxiv.org/abs/2602.03769