Have you ever wondered whether we could awaken an AI's dormant "internal memory" instead of bolting on an external search engine? How do we get a forgetful AI to truly learn from its mistakes and turn experience into reusable skills? In this episode, starting from several recent papers, we explore how AI can evolve from an "answer machine" into a self-iterating "growth partner" that explores alongside us.

00:00:26 Giving AI an "assistant"? That might be a bad idea
00:06:10 Getting AI to see the light: how do you turn "experience" into "ability"?
00:11:18 Is the AI painter a genius, or a photocopier?
00:17:25 AI's new role: from "solving problems" to "running alongside"
00:23:43 That "stroke of genius" may just be a rare word

Papers covered in this episode:
[LG] Retrieval from Within: An Intrinsic Capability of Attention-Based Models [NVIDIA] https://arxiv.org/abs/2605.05806
[LG] SkillOS: Learning Skill Curation for Self-Evolving Agents [University of Illinois Urbana-Champaign & Google Cloud AI Research] https://arxiv.org/abs/2605.06614
[LG] Understanding diffusion models requires rethinking (again) generalization [PSL Research University & Sorbonne University] https://arxiv.org/abs/2605.06077
[AI] AI Co-Mathematician: Accelerating Mathematicians with Agentic AI [Google DeepMind] https://arxiv.org/abs/2605.06651
[CL] The Frequency Confound in Language-Model Surprisal and Metaphor Novelty [Bielefeld University] https://arxiv.org/abs/2605.06506

Have you ever wondered whether an AI that is "too clever" might learn to game the system, dragging the whole system down with it? Are you curious whether the inner structure of an AI's brain is not the open plaza we imagine, but a winding, intricate map? In this episode we dive into the AI's "inner world" and see how recent papers teach AI the "foresight" to avoid self-destruction, how to steer precisely along the "manifold racetrack" of its brain like a race car, and even how to implant new knowledge losslessly, no surgery required. More importantly, we'll find that both giving AI advice and handing it reference material can make things worse, not better. Ready? Let's challenge four "common-sense" assumptions about AI.

00:00:45 When AI learns to game the system, how do we keep it from being too clever for its own good?
00:06:20 What does an AI's "wiring" look like? We may have had it wrong all along
00:10:56 The AI upgrade problem: a "no-surgery" solution
00:16:04 Why "you're great!" is the least effective praise
00:20:33 Helping AI out: why does it backfire?

Papers covered in this episode:
[LG] Explaining and Preventing Alignment Collapse in Iterative RLHF [PSL Research University] https://arxiv.org/abs/2605.04266
[LG] Manifold Steering Reveals the Shared Geometry of Neural Network Representation and Behavior [GOODFIRE] https://arxiv.org/abs/2605.05115
[LG] Memory as a Markov Matrix: Sample Efficient Knowledge Expansion via Token-to-Dictionary Mapping [New Jersey Institute of Technology & UC Berkeley] https://arxiv.org/abs/2605.04308
[LG] Efficiently Aligning Language Models with Online Natural Language Feedback [Stanford University & Anthropic] https://arxiv.org/abs/2605.04356
[LG] When Context Hurts: The Crossover Effect of Knowledge Transfer on Multi-Agent Design Exploration [Meta] https://arxiv.org/abs/2605.04361

To teach AI to reason, should we hand it an encyclopedia, or slip it a top student's scratch paper? A recent paper argues that watching how others think beats rote memorization. We'll also explore how far AI still is from complex engineering on the scale of "building a whole building on its own," and use a "brain-circuit CT scan" to see why your AI butler sometimes spouts nonsense with a straight face. Better yet, we'll reveal how AI can climb from "struggling student" to "top of the class" through self-reflection, and how a single "probe" can chart an entire invisible "treasure map." Ready? Let's go!

00:00:37 Does AI need a "top student's notes" too?
00:05:42 How far is AI from "building a whole building on its own"?
00:10:46 Why does your AI butler sometimes spout nonsense with a straight face?
00:16:33 How do you use AI to draw an invisible treasure map?
00:21:30 AI's self-cultivation: the secret of going from "struggling student" to "top of the class"

Papers covered in this episode:
[IR] RAG over Thinking Traces Can Improve Reasoning Tasks [UC Berkeley] https://arxiv.org/abs/2605.03344
[AI] ProgramBench: Can Language Models Rebuild Programs From Scratch? [Meta FAIR] https://arxiv.org/abs/2605.03546
[AI] What Happens Inside Agent Memory? Circuit Analysis from Emergence to Diagnosis [City University of Hong Kong & University of Toronto] https://arxiv.org/abs/2605.03354
[LG] Flow Sampling: Learning to Sample from Unnormalized Densities via Denoising Conditional Processes [FAIR at Meta & Weizmann Institute of Science] https://arxiv.org/abs/2605.03984
[AI] Self-Improvement for Fast, High-Quality Plan Generation [Amazon] https://arxiv.org/abs/2605.03625

Have you ever wondered why AI can refine gold from the internet's mountain of garbage instead of becoming a rote-memorizing bookworm? When AI gets things wrong, should we simply silence it, or is there a smarter way to teach it "humility"? In this episode, through several recent AI papers, we reveal how AI tackles hard problems like a team of parallel experts, how it is constrained by a strikingly simple "bottleneck law," and take a peek at the elegant, deep principles of learning inside the AI brain.

00:00:33 The code to the optimal solution hides in parallel wisdom
00:05:55 How can AI stop lying? The answer lies in a kind of human wisdom
00:10:13 The "invisible killer" on the AI training ground
00:16:52 How does AI refine gold from a pile of garbage?
00:23:33 The bottleneck law of growth: scale isn't the advantage; the weakest link is the key

Papers covered in this episode:
[LG] Black-box optimization of noisy functions with unknown smoothness [INRIA Lille & Google DeepMind] https://arxiv.org/abs/2605.02462
[CL] Hallucinations Undermine Trust; Metacognition is a Way Forward [Google Research & Tel Aviv University] https://arxiv.org/abs/2605.01428
[LG] Generalized Distributional Alignment Games for Unbiased Answer-Level Fine-Tuning [Google Research] https://arxiv.org/abs/2605.02435
[LG] A Theory of Generalization in Deep Learning [Stanford University] https://arxiv.org/abs/2605.01172
[LG] A Theory of Saddle Escape in Deep Nonlinear Networks [UC Berkeley] https://arxiv.org/abs/2605.01288

Have you ever wondered whether AI could take the road less traveled and learn to "cut corners" when writing? Or, when AI falls into the score-chasing rat race, how do we teach it to "minimize regret" instead of blindly grinding for points? In this episode, starting from several recent papers, we watch AI pull off an elegant leap of thinking: swapping out its "engine," bridging the "martial arts" of different schools, and even aligning with our true preferences more precisely while using fewer test questions.

00:00:30 Taking shortcuts: AI learns a new kind of "navigation"
00:05:34 AI's "attention" is becoming its burden
00:10:42 AI's "high-score trap": how do we teach more wisely?
00:16:51 When rules meet "chaos": two AI masters' techniques turn out to share one origin
00:23:38 Why does the best exam have the fewest questions?

Papers covered in this episode:
[LG] Consistent Diffusion Language Models [Microsoft & Purdue University] https://arxiv.org/abs/2605.00161
[LG] Caracal: Causal Architecture via Spectral Mixing [Huawei Technologies] https://arxiv.org/abs/2605.00292
[LG] Wasserstein Distributionally Robust Regret Optimization for Reinforcement Learning from Human Feedback [University of North Carolina & Imperial College London & Stanford University] https://arxiv.org/abs/2605.00155
[LG] Trees to Flows and Back: Unifying Decision Trees and Diffusion Models [Technical University of Munich] https://arxiv.org/abs/2605.00414
[CL] Putting HUMANS first: Efficient LAM Evaluation with Human Preference Alignment [University of Southern California & Stanford University] https://arxiv.org/abs/2605.00022

Have you ever wondered whether a complex idea, such as "question authority," could be compressed into a seemingly random string of numbers and quietly implanted into an AI's brain? And how do we design a fair "college entrance exam" for robots, one that tests whether they are truly intelligent rather than merely good test-takers? In this episode, we unpack the inventive ideas in five recent papers: how AI can be implanted with an "ideological compass," how it learns to become a generalist by "finding a partner and being clueless together," and how we should write a clear "manual" for AI skills. Ready? Let's go!

00:00:36 You thought you were teaching it to count; it learned your bias instead
00:08:00 What kind of ruler do we need to measure a complex world?
00:15:40 "Kindergarten" and "college entrance exams" for robots
00:20:57 The best way to learn is to "find a partner and be clueless together"
00:26:36 AI's "skill pack": how should the manual be written?

Papers covered in this episode:
[CL] Subliminal Steering: Stronger Encoding of Hidden Signals [Columbia University] https://arxiv.org/abs/2604.25783
[LG] Generalising maximum mean discrepancy: kernelised functional Bregman divergences [Monash University & Sony Computer Science Laboratories] https://arxiv.org/abs/2604.24047
[RO] KinDER: A Physical Reasoning Benchmark for Robot Learning and Planning [Princeton University & Carnegie Mellon University & Georgia Tech] https://arxiv.org/abs/2604.25788
[LG] Co-Evolving Policy Distillation [CAS & JD.COM] https://arxiv.org/abs/2604.27083
[CL] From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills [Peking University] https://arxiv.org/abs/2604.24026

Have you ever wondered whether a set of "knowledge probes" could give a large model a precise "brain-capacity" CT scan? Or what scientific research would look like as a "living thing" if AI stopped settling for one polished success story and recorded the lessons of every failure as well? In this episode, starting from five recent papers, we watch AI learn its "face-changing" trick, see how a "dumb method" pulls ahead, and find out how to teach a crab to street-dance with nothing but a phone.

00:00:31 How do you give a large AI model a "brain-capacity" CT scan?
00:08:25 How many steps does it take to teach a crab to street-dance?
00:13:47 Bringing knowledge to life: the next form of scientific research
00:20:47 AI's "face-changing" trick: what we thought was safety may just be a missed "password"
00:27:26 A new path for AI evolution: why the "dumb method" is actually smarter

Papers covered in this episode:
[LG] Incompressible Knowledge Probes: Estimating Black-Box LLM Parameter Counts via Factual Capacity [Pine AI] https://arxiv.org/abs/2604.24827
[CV] MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos [Huawei International Pte. Ltd. & Huawei Central Media Technology Institute] https://arxiv.org/abs/2512.10881
[LG] The Last Human-Written Paper: Agent-Native Research Artifacts [Orchestra Research & Stanford University & Ohio State University] https://arxiv.org/abs/2604.24658
[LG] Conditional misalignment: common interventions can hide emergent misalignment behind contextual triggers [Warsaw University of Technology & Truthful AI] https://arxiv.org/abs/2604.25891
[CV] Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation [Meta AI] https://arxiv.org/abs/2604.24763

Have you ever wondered whether a truly productive AI needs not more exam questions but an "office" of its own? And how can we play the savvy hands-off boss, efficiently delegating tasks to our team of AI specialists? In this episode, starting from several recent AI papers, we discuss how "cost thinking" can cut AI training bills in half, how a well-designed "game" can drive AI self-evolution, and finally explore the shape of AI thought: is its "mind" a dictionary, or a geometric map built from concepts?

00:00:34 Want AI to do your work? First give it an "office"
00:06:01 How to be a savvy hands-off boss
00:11:46 AI training too expensive? You don't lack compute; you lack "cost thinking"
00:17:43 The shortcut to AI progress: not just outcomes, but playing the right game
00:23:08 How does AI think? The answer may lie in geometry

Papers covered in this episode:
[LG] Synthetic Computers at Scale for Long-Horizon Productivity Simulation [Microsoft] https://arxiv.org/abs/2604.28181
[LG] Optimized Deferral for Imbalanced Settings [Google Research & Courant Institute of Mathematical Science] https://arxiv.org/abs/2604.27723
[LG] Cost-Aware Learning [Google Research] https://arxiv.org/abs/2604.28020
[LG] Distributional Alignment Games for Answer-Level Fine-Tuning [Google Research & Microsoft Research] https://arxiv.org/abs/2604.27166
[LG] Do Sparse Autoencoders Capture Concept Manifolds? [Harvard University] https://arxiv.org/abs/2604.28119

Have you ever wondered whether AI could be more than a good employee, evolving into a project manager that runs "retrospectives" to optimize its own workflow? Or that commanding a complex robot might be as simple as "highlighting" on a screen? In this episode, starting from five recent AI papers, we discuss how AI breaks through efficiency bottlenecks: from the hidden privacy risks behind "batched" AI services, to how AI organizes vast knowledge as efficiently as a librarian, to the speed limits of learning in different worlds. Get ready to watch AI learn to "simplify" and to "self-evolve."

00:00:39 Want to sell at a high price? You need to understand the laws of learning
00:07:21 The price of batching: how AI services leak your secrets
00:12:23 AI's bottleneck isn't its brain; it's its "study"
00:17:48 Letting AI evolve itself: more than brute-force scale
00:23:32 "Highlighting" for robots: making the complex simple

Papers covered in this episode:
[LG] On the Learning Curves of Revenue Maximization [Purdue University & Yale University & Technion] https://arxiv.org/abs/2604.26922
[LG] Quantamination: Dynamic Quantization Leaks Your Data Across the Batch [University of Cambridge & AI Sequrity Company] https://arxiv.org/abs/2604.26505
[LG] Unifying Sparse Attention with Hierarchical Memory for Scalable Long-Context LLM Serving [Microsoft Research] https://arxiv.org/abs/2604.26837
[CL] FlowBot: Inducing LLM Workflows with Bilevel Optimization and Textual Gradients [Naver Search US & MIT] https://arxiv.org/abs/2604.26258
[CV] Lifting Embodied World Models for Planning and Control [New York University & UC Berkeley] https://arxiv.org/abs/2604.26182

Want to know how to see through an AI's "true identity" at a glance, the way a handwriting expert does? Curious how an AI meeting can, almost miraculously, save 97% of the "tables"? In this episode we look at several recent papers: how AI learns "mind reading" to collaborate efficiently, how it avoids major mistakes born of "mysterious overconfidence," and even why a teacher who "makes mistakes" can train a stronger AI student.

00:00:28 How do you run "handwriting forensics" on an AI?
00:06:27 AI meetings: how do you save 97% of the tables?
00:14:10 AI's student surpassing its master: delight or dread?
00:19:30 Still making AI "write reports"? They've started exchanging "thoughts" directly
00:24:16 Why can a teacher who "makes mistakes" train a better AI?

Papers covered in this episode:
[CL] The Surprising Universality of LLM Outputs: A Real-Time Verification Primitive [Evolutionairy AI] https://arxiv.org/abs/2604.25634
[LG] PolyKV: A Shared Asymmetrically-Compressed KV Cache Pool for Multi-Agent LLM Inference [No University Provided] https://arxiv.org/abs/2604.24971
[AI] Evaluating Risks in Weak-to-Strong Alignment: A Bias-Variance Perspective [University of Illinois Urbana-Champaign & Microsoft & InstaDeep] https://arxiv.org/abs/2604.25077
[CL] Recursive Multi-Agent Systems [UIUC] https://arxiv.org/abs/2604.25917
[LG] When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient [Princeton University] https://arxiv.org/abs/2604.25872

Have you ever wondered whether a "helpful" AI's very goodwill might be its most dangerous vulnerability? In this episode, starting from several recent AI papers, we explore AI's "inner world": how it trains more efficiently by anticipating the future, how "expert cliques" form inside it, how it falls into the "losing weight without losing fat" memory trap, and finally the mysterious "treasure map" that charts its paths of thought. Ready? Let's open the black box.

00:00:30 Why right-or-wrong answers matter less than you think
00:05:59 Your AI is a "picky eater": a hidden pattern that speeds up large models
00:11:46 A slimming guide for large models: losing weight ≠ losing fat
00:17:49 Why is a "helpful" AI actually more dangerous?
00:22:34 AI's "treasure map": how do we read a machine's "inner world"?

Papers covered in this episode:
[LG] Reward Models Are Secretly Value Functions: Temporally Coherent Reward Modeling [AI at Meta] https://arxiv.org/abs/2604.22981
[LG] Scaling Multi-Node Mixture-of-Experts Inference Using Expert Activation Patterns [Meta & Georgia Institute of Technology] https://arxiv.org/abs/2604.23150
[LG] Parameter Efficiency Is Not Memory Efficiency: Rethinking Fine-Tuning for On-Device LLM Adaptation [MIT CSAIL] https://arxiv.org/abs/2604.22783
[CL] Jailbreaking Frontier Foundation Models Through Intention Deception [CMU] https://arxiv.org/abs/2604.24082
[AI] Domain-Filtered Knowledge Graphs from Sparse Autoencoder Features [Stanford University] https://arxiv.org/abs/2604.23829

Have you ever wondered whether, at the moment it makes a mistake, an AI's heart skips a beat too? Facing mountains of documents, how does it build a "super database" like a librarian instead of drowning? Several recent papers offer answers. We'll explore how AI gains "self-knowledge," how "outcome-oriented" incentives lure it into "pretend thinking," and how, much like preparing a balanced meal or highlighting key passages, we can make it smarter with precision.

00:00:28 AI's "sixth sense": it knows when it can correct its own mistakes
00:06:17 Give AI a "super database" and it never forgets a thing
00:12:24 Is your "outcome orientation" training a subordinate who only "pretends to think"?
00:19:17 Panning for data gold: how to "feed" a good model precisely from a sea of information
00:25:16 Hand AI a "highlighter" and it sees more clearly?

Papers covered in this episode:
[LG] How LLMs Detect and Correct Their Own Errors: The Role of Internal Confidence Signals [Google DeepMind] https://arxiv.org/abs/2604.22271
[CL] Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets [Stanford University] https://arxiv.org/abs/2604.22294
[CL] Outcome Rewards Do Not Guarantee Verifiable or Causally Important Reasoning [Stanford University] https://arxiv.org/abs/2604.22074
[CL] CRAFT: Clustered Regression for Adaptive Filtering of Training data [Google & BITS Pilani] https://arxiv.org/abs/2604.22693
[CL] Learning Evidence Highlighting for Frozen LLMs [Stony Brook University & Meta AI] https://arxiv.org/abs/2604.22565