[For Everyone] From Scratch Work and Map Reading to Hearing Out the "Minority Report"

AI可可AI生活

Have you ever wondered whether AI cracks hard problems by "grinding practice questions" and guessing its way to the right answer, or by genuinely understanding the process? In this episode, we look at how the latest papers teach AI the habit of "working things out on scratch paper", and how, when there is no standard answer, it learns to listen to the precious "minority voice". Join us as we explore how AI evolves from a "machine that can talk" into a genuine "thinker".

00:00:29 How does AI learn to think? A quiet revolution in the reward mechanism
00:05:15 When experts spar, how do they avoid going down rabbit holes?
00:09:45 AI's collective wisdom: when the minority report matters more than the majority vote
00:15:11 AI sees the world from a new angle: after the chemists throw away the "instruction manual"
00:21:15 A good model is judged not only by its results, but by its process

Papers covered in this episode:

[LG] RLP: Reinforcement as a Pretraining Objective [NVIDIA & CMU] https://arxiv.org/abs/2510.01265
[LG] RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems [CMU & Stanford University] https://arxiv.org/abs/2510.02263
[CL] RESTRAIN: From Spurious Votes to Signals -- Self-Driven RL with Self-Penalization [Iowa State University & Meta & UW–Madison] https://arxiv.org/abs/2510.02172
[LG] Transformers Discover Molecular Structure Without Graph Priors [UC Berkeley] https://arxiv.org/abs/2510.02259
[LG] Step-Aware Policy Optimization for Reasoning in Diffusion Large Language Models [CMU] https://arxiv.org/abs/2510.01544

28 min
99+
1 month ago

[For Everyone] From Parallel Thinking and Structured Learning to Decoding Cognition

AI可可AI生活

Want to know how AI solves hard problems by convening a "council of minds", yet still can't manage grade-school multiplication? In this episode, several new papers give us a peek inside the AI brain: how it stages a brainstorm of "parallel selves", and how our own "biases" make it boring. More importantly, I'll share a simple "incantation" that unlocks its hidden creativity, and explain why, when training AI, you can't just look at the "average score".

00:00:29 The secret to smarter AI: not thinking longer, but thinking smarter
00:06:28 Parallel selves: how a brainstorm unfolds inside an AI's brain
00:11:13 Why can't a clever AI learn grade-school multiplication?
00:18:03 Why is AI getting more boring? Ask differently and unlock its hidden skills
00:22:36 AI training demystified: do you really understand "averaging"?

Papers covered in this episode:

[LG] Rethinking Thinking Tokens: LLMs as Improvement Operators [Meta Superintelligence Labs & Anthropic] https://arxiv.org/abs/2510.01123
[LG] Thoughtbubbles: an Unsupervised Method for Parallel Thinking in Latent Space [Stanford University] https://arxiv.org/abs/2510.00219
[LG] Why Can't Transformers Learn Multiplication? Reverse-Engineering Reveals Long-Range Dependency Pitfalls [University of Chicago & MIT & University of Waterloo] https://arxiv.org/abs/2510.00184
[CL] Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity [Northeastern University & Stanford University] https://arxiv.org/abs/2510.01171
[LG] Per-example gradients: a new frontier for understanding and improving optimizers [Google Deepmind] https://arxiv.org/abs/2510.00236

29 min
99+
1 month ago

[For Everyone] From Creating Meaning and Cross-Modal Synesthesia to the Wisdom of Not Knowing

AI可可AI生活

In this episode, we dive into AI's "kitchen of ideas" to see how, like a molecular gastronomy chef, it creates exquisite taste at low cost inside an abstract "flavor space". We also uncover AI's gift for "synesthesia", exploring why an AI that only "reads books" can come to "see" the world through code and math. Going further, we witness the birth of the "detective" and the "judge" in the AI world, and watch two AIs collaborate to make reasoning airtight. Finally, we discuss a profound method for teaching AI the "wisdom of not knowing", and why admitting "I don't know" is a higher form of intelligence. Let's get started!

00:00:39 "Making meaning" beats "making sentences": a new low-cost way to teach AI "good taste"
00:05:12 AI's "universal diagnostic device": simplicity wins, replacing "look, listen, ask, and feel" with "mind reading"
00:10:22 AI's "synesthesia": why can an AI that only "reads" learn to "see" the world?
00:15:26 AI plays "detective", but who plays "judge"? A new way to make AI reasoning trustworthy
00:20:35 AI's "wisdom of not knowing": the highest intelligence is admitting "I don't know"

Papers covered in this episode:

[CL] Limited Preference Data? Learning Better Reward Model with Latent Space Synthesis [University of Wisconsin-Madison & Nanyang Technological University] https://arxiv.org/abs/2509.26074
[CL] Regression Language Models for Code [Cornell University & Google] https://arxiv.org/abs/2509.26476
[LG] Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training [Meta Superintelligence Labs] https://arxiv.org/abs/2509.26625
[LG] Towards Verified Code Reasoning by LLMs [University of Texas at Austin & Google DeepMind] https://arxiv.org/abs/2509.26546
[CL] TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning [Meta Reality Labs] https://arxiv.org/abs/2509.25760

25 min
99+
1 month ago

[For Everyone] From Training in Dreams and Real-World Interaction to the Geometry of Thought

AI可可AI生活

Today we dive into AI's "inner world" through several thought-provoking new papers. We'll see how AI masters complex skills by "training in its dreams", and how it leaves the lab to apprentice itself to real, ordinary users. We'll also get a first look at the full three-act story of how an AI "brain" develops, and learn to read an AI's distinctive personality from its "thinking process", the way a detective would. Finally, we'll reveal a startling memory paradox: why, for AI, a "memory that is too good" can be a curse.

00:00:36 AI's "training in dreams": how do you master peerless skills inside a virtual world?
00:05:50 AI leaves the "ivory tower": real users are the best teachers
00:10:55 As the writing reveals the writer, the reasoning reveals the AI: can we read an AI's personality from its "thinking process"?
00:16:34 The three-act development of the AI "brain": from babbling to mastery
00:21:42 AI's memory paradox: why does "remembering too well" make you "forget faster"?

Papers covered in this episode:

[LG] Training Agents Inside of Scalable World Models [Google DeepMind] https://arxiv.org/abs/2509.24527
[LG] The Era of Real-World Human Interaction: RL from User Conversations [FAIR at Meta] https://arxiv.org/abs/2509.25137
[CL] Your thoughts tell who you are: Characterize the reasoning patterns of LRMs [Meta Superintelligence Labs & Harvard University] https://arxiv.org/abs/2509.24147
[LG] Tracing the Representation Geometry of Language Models from Pretraining to Post-training [McGill University & UC Berkeley] https://arxiv.org/abs/2509.23024
[LG] Short window attention enables long-term memorization [Ecole Normale Supérieure Paris Saclay & Johannes Kepler University Linz & Meta FAIR] https://arxiv.org/abs/2509.24552

27 min
99+
1 month ago

[For Everyone] Simplifying the Complex, Avoiding Harm, and Uniting Knowledge with Action

AI可可AI生活

In this episode, we open an "AI wisdom toolbox" and see how several new papers reveal the underlying secrets of how AI thinks. We'll discuss how AI uses an ultimate ruler called "Kolmogorov complexity" to find the simplest answer, and why, in the real world, learning to "scout the path" matters far more than "memorizing the map". We'll also see how scientists draw on real emergency-room cases to design a "driving test" for robots entering the physical world. Finally, we'll unpack the brain-boosting recipe hidden inside the "power pill" that is code, and reveal how "dual-core brain" training gives AI both superpowers at once: deep thinking and fast reactions.

00:00:44 AI's "Occam's razor": how do you find the answer that is both simplest and deepest?
00:06:25 AI learns to "plan": memorizing the map or scouting the path yourself, which is smarter?
00:11:52 AI enters the physical world: how do we teach robots to "seek benefit and avoid harm"?
00:17:20 AI's "power pill": what brain-boosting recipe is hidden inside code?
00:22:06 Having it both ways: giving AI both "deep thinking" and "fast reactions"

Papers covered in this episode:

[LG] Bridging Kolmogorov Complexity and Deep Learning: Asymptotically Optimal Description Length Objectives for Transformers [Google DeepMind & Google Research] https://arxiv.org/abs/2509.22445
[LG] Benefits and Pitfalls of Reinforcement Learning for Language Model Planning: A Theoretical Perspective [Microsoft Research Asia & Peking University] https://arxiv.org/abs/2509.22613
[LG] Can AI Perceive Physical Danger and Intervene? [Google DeepMind Robotics] https://arxiv.org/abs/2509.21651
[CL] On Code-Induced Reasoning in LLMs [Carnegie Mellon University (CMU)] https://arxiv.org/abs/2509.21499
[CL] Dual-Head Reasoning Distillation: Improving Classifier Accuracy with Train-Time-Only Reasoning [Google] https://arxiv.org/abs/2509.21487

27 min
99+
1 month ago

[For Everyone] Giving AI a Brain CT, Building It a Memory Palace, and Teaching It to Teach Itself

AI可可AI生活

Today we go deep into AI's "inner world" to see how it becomes smarter. Using methods from the latest papers, we give AI a "brain CT scan" to see the price of each capability upgrade, and teach a "tone-deaf" model to "imagine" the texture of sound. Next, we reveal how AI says goodbye to forgetting and bloat by building a "memory palace" and running an "information compressor". Finally, we watch AI leave its human teachers behind and "teach itself" by predicting the author's train of thought!

00:00:31 Giving AI a "brain CT scan": beyond the leaderboard, how do we see a model's real abilities?
00:05:50 AI's "memory palace": however long the conversation, how can it remember what matters?
00:11:19 Fitting AI with an "information compressor": however much information, it grasps the point in seconds
00:16:27 AI learns to "imagine" sound: with eyes closed, how do you understand the whole world by ear?
00:21:44 Self-taught AI: without a teacher, how do you master peerless skills?

Papers covered in this episode:

[CL] Beyond the Leaderboard: Understanding Performance Disparities in Large Language Models via Model Diffing [HBKU] https://arxiv.org/abs/2509.18792
[CL] EpiCache: Episodic KV Cache Management for Long Conversational Question Answering [Apple] https://arxiv.org/abs/2509.17396
[CL] CompLLM: Compression for Long Context Q&A [Amazon] https://arxiv.org/abs/2509.19228
[CL] AuditoryBench++: Can Language Models Understand Auditory Knowledge without Hearing? [Pohang University of Science and Technology & HJ AILAB] https://arxiv.org/abs/2509.17641
[CL] Reinforcement Learning on Pre-Training Data [Tencent] https://arxiv.org/abs/2509.19249

27 min
99+
1 month ago

[For Everyone] From Lego Blueprints and Visual Thinking to a Decision-Making Dream Team

AI可可AI生活

Have you ever wondered how AI's kind of "smart" really differs from ours? In this episode, we explore how AI builds a software empire from Lego-like blueprints, how to see through the "high scores, low competence" trap of models that only know how to take exams, how AI throws away the expert's map and lets its "eyes" learn to think, and how an "elite squad" strategy leads to smarter decisions. Ready? Starting from five new papers, let's probe the boundaries of AI intelligence.

00:00:31 The software world's "Lego" manual: from one sentence to an empire
00:05:50 The AI doctor's "high scores, low competence" trap: don't be fooled by the leaderboard
00:10:51 Throw away the "expert map", and AI can still blaze a new trail
00:15:51 AI's next revolution: when the "eyes" start to think like a "brain"
00:21:18 From "human-wave tactics" to an "elite squad": making every AI computation count

Papers covered in this episode:

[CL] RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation [Microsoft] https://arxiv.org/abs/2509.16198
[LG] The Illusion of Readiness: Stress Testing Large Frontier Models on Multimodal Medical Benchmarks [Microsoft Research] https://arxiv.org/abs/2509.18234
[LG] SimpleFold: Folding Proteins is Simpler than You Think [Apple] https://arxiv.org/abs/2509.18480
[LG] Video models are zero-shot learners and reasoners [Google DeepMind] https://arxiv.org/abs/2509.20328
[LG] Best-of-∞ -- Asymptotic Performance of Test-Time Compute [New York University & Institute of Science Tokyo & NEC Corporation] https://arxiv.org/abs/2509.21091

26 min
99+
1 month ago