Have you ever wondered whether we could give a large language model a precise "brain capacity" CT scan using a set of "knowledge probes"? Or what scientific research would turn into, a kind of living organism, if AI stopped settling for telling one polished success story and started recording every lesson from its failures? In this episode, starting from five recent papers, we look at how AI learns a "face-changing" act, how a "dumb method" pulls off an upset, and how a single phone is all it takes to teach a crab to breakdance.

00:00:31 How do you give a large AI model a "brain capacity" CT scan?
00:08:25 How many steps does it take to teach a crab to breakdance?
00:13:47 Bringing knowledge to "life": the next form of scientific research
00:20:47 AI's "face-changing" act: the safety we took for granted may just be a missed "password"
00:27:26 A new idea for AI evolution: why is the "dumb method" actually smarter?

Papers covered in this episode:

[LG] Incompressible Knowledge Probes: Estimating Black-Box LLM Parameter Counts via Factual Capacity [Pine AI]
https://arxiv.org/abs/2604.24827
---
[CV] MoCapAnything: Unified 3D Motion Capture for Arbitrary Skeletons from Monocular Videos [Huawei International Pte. Ltd. & Huawei Central Media Technology Institute]
https://arxiv.org/abs/2512.10881
---
[LG] The Last Human-Written Paper: Agent-Native Research Artifacts [Orchestra Research & Stanford University & Ohio State University]
https://arxiv.org/abs/2604.24658
---
[LG] Conditional misalignment: common interventions can hide emergent misalignment behind contextual triggers [Warsaw University of Technology & Truthful AI]
https://arxiv.org/abs/2604.25891
---
[CV] Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation [Meta AI]
https://arxiv.org/abs/2604.24763
Have you ever wondered whether a truly productive AI needs not more exam questions, but an "office" of its own? And how can we play the savvy "hands-off boss", efficiently delegating tasks to our team of AI experts? In this episode, starting from several recent AI papers, we discuss how "cost thinking" can cut an AI training bill in half, how a well-designed "game" lets AI improve itself, and finally we explore the shape of AI thought: is its "mind" a dictionary, or a set of geometric maps built from concepts?

00:00:34 Want AI to do your work? First give it an "office"
00:06:01 How to be a smart "hands-off boss"
00:11:46 AI training too expensive? What you lack isn't compute, it's "cost thinking"
00:17:43 The shortcut to AI progress: don't just look at outcomes, play the right game
00:23:08 How does AI think? The answer may lie in geometry

Papers covered in this episode:

[LG] Synthetic Computers at Scale for Long-Horizon Productivity Simulation [Microsoft]
https://arxiv.org/abs/2604.28181
---
[LG] Optimized Deferral for Imbalanced Settings [Google Research & Courant Institute of Mathematical Sciences]
https://arxiv.org/abs/2604.27723
---
[LG] Cost-Aware Learning [Google Research]
https://arxiv.org/abs/2604.28020
---
[LG] Distributional Alignment Games for Answer-Level Fine-Tuning [Google Research & Microsoft Research]
https://arxiv.org/abs/2604.27166
---
[LG] Do Sparse Autoencoders Capture Concept Manifolds? [Harvard University]
https://arxiv.org/abs/2604.28119
Have you ever wondered whether AI could go beyond being a good employee and evolve into a project manager, running "retrospectives" to optimize its own workflow? Or that commanding a complex robot might be as simple as "highlighting" on a screen? In this episode, starting from five recent AI papers, we look at how AI breaks through efficiency bottlenecks: from the hidden privacy risks behind "batch-sharing" AI services, to AI organizing vast knowledge as efficiently as a librarian, to the speed limits of learning in different worlds. Get ready to see how AI learns to simplify the complex and to evolve itself.

00:00:39 Want to sell something at a high price? You need to understand the laws of learning
00:07:21 The price of sharing a batch: how AI services can leak your secrets
00:12:23 AI's bottleneck isn't the brain, it's the "study"
00:17:48 Letting AI evolve itself: more than brute force
00:23:32 "Highlighting" for robots: making the complex simple

Papers covered in this episode:

[LG] On the Learning Curves of Revenue Maximization [Purdue University & Yale University & Technion]
https://arxiv.org/abs/2604.26922
---
[LG] Quantamination: Dynamic Quantization Leaks Your Data Across the Batch [University of Cambridge & AI Sequrity Company]
https://arxiv.org/abs/2604.26505
---
[LG] Unifying Sparse Attention with Hierarchical Memory for Scalable Long-Context LLM Serving [Microsoft Research]
https://arxiv.org/abs/2604.26837
---
[CL] FlowBot: Inducing LLM Workflows with Bilevel Optimization and Textual Gradients [Naver Search US & MIT]
https://arxiv.org/abs/2604.26258
---
[CV] Lifting Embodied World Models for Planning and Control [New York University & UC Berkeley]
https://arxiv.org/abs/2604.26182
Want to know how to see through an AI's "true identity" at a glance, the way a handwriting expert does? Curious how AI meetings can miraculously save 97% of the "tables"? In this episode we discuss several recent papers: how AI learns "mind reading" for efficient collaboration, how it avoids serious mistakes born of misplaced confidence, and why a teacher who "makes mistakes" can actually train a stronger AI student.

00:00:28 How do you run "handwriting analysis" on an AI?
00:06:27 When AIs hold a meeting, how do you save 97% of the tables?
00:14:10 When the AI student surpasses the teacher: delight or danger?
00:19:30 Still having your AIs "write reports"? They've started exchanging "thoughts" directly
00:24:16 Why can a teacher who "makes mistakes" train better AI?

Papers covered in this episode:

[CL] The Surprising Universality of LLM Outputs: A Real-Time Verification Primitive [Evolutionairy AI]
https://arxiv.org/abs/2604.25634
---
[LG] PolyKV: A Shared Asymmetrically-Compressed KV Cache Pool for Multi-Agent LLM Inference [No University Provided]
https://arxiv.org/abs/2604.24971
---
[AI] Evaluating Risks in Weak-to-Strong Alignment: A Bias-Variance Perspective [University of Illinois Urbana-Champaign & Microsoft & InstaDeep]
https://arxiv.org/abs/2604.25077
---
[CL] Recursive Multi-Agent Systems [UIUC]
https://arxiv.org/abs/2604.25917
---
[LG] When Errors Can Be Beneficial: A Categorization of Imperfect Rewards for Policy Gradient [Princeton University]
https://arxiv.org/abs/2604.25872
Have you ever considered that a "helpful" AI's goodwill might itself be its most dangerous vulnerability? In this episode, starting from several recent AI papers, we explore AI's "inner world": how anticipating the future makes training more efficient, how "expert cliques" form inside the model, how it falls into the "losing weight without losing fat" memory trap, and finally the mysterious "treasure map" that charts its paths of thought. Ready? Let's open AI's black box together.

00:00:30 Why getting the answer right matters less than you think
00:05:59 Your AI is a "picky eater": a hidden pattern that speeds up large models
00:11:46 A slimming guide for large models: losing weight ≠ losing fat
00:17:49 Why is a "helpful" AI actually more dangerous?
00:22:34 AI's "treasure map": how do we read a machine's "inner world"?

Papers covered in this episode:

[LG] Reward Models Are Secretly Value Functions: Temporally Coherent Reward Modeling [AI at Meta]
https://arxiv.org/abs/2604.22981
---
[LG] Scaling Multi-Node Mixture-of-Experts Inference Using Expert Activation Patterns [Meta & Georgia Institute of Technology]
https://arxiv.org/abs/2604.23150
---
[LG] Parameter Efficiency Is Not Memory Efficiency: Rethinking Fine-Tuning for On-Device LLM Adaptation [MIT CSAIL]
https://arxiv.org/abs/2604.22783
---
[CL] Jailbreaking Frontier Foundation Models Through Intention Deception [CMU]
https://arxiv.org/abs/2604.24082
---
[AI] Domain-Filtered Knowledge Graphs from Sparse Autoencoder Features [Stanford University]
https://arxiv.org/abs/2604.23829
Have you ever wondered whether, at the moment it makes a mistake, something inside an AI "skips a beat"? Facing mountains of documents, how does it build a "super database" like a librarian instead of drowning? Several recent papers offer answers. Together we'll explore how AI gains "self-knowledge", how "outcome-driven" training lures it into the trap of "fake reasoning", and how, much like preparing a balanced meal or highlighting key passages, we can make it smarter with precision.

00:00:28 AI's "sixth sense": it knows when it can fix its own mistakes
00:06:17 Give AI a "super database" and it will never forget again
00:12:24 Is your "outcome-driven" management training subordinates who "fake thinking"?
00:19:17 Panning for gold in data: precisely "feeding" a good model from oceans of information
00:25:16 Hand AI a "highlighter" and can it see more clearly?

Papers covered in this episode:

[LG] How LLMs Detect and Correct Their Own Errors: The Role of Internal Confidence Signals [Google DeepMind]
https://arxiv.org/abs/2604.22271
---
[CL] Contexts are Never Long Enough: Structured Reasoning for Scalable Question Answering over Long Document Sets [Stanford University]
https://arxiv.org/abs/2604.22294
---
[CL] Outcome Rewards Do Not Guarantee Verifiable or Causally Important Reasoning [Stanford University]
https://arxiv.org/abs/2604.22074
---
[CL] CRAFT: Clustered Regression for Adaptive Filtering of Training data [Google & BITS Pilani]
https://arxiv.org/abs/2604.22693
---
[CL] Learning Evidence Highlighting for Frozen LLMs [Stony Brook University & Meta AI]
https://arxiv.org/abs/2604.22565
Have you ever wondered why AI can't fit into our phones like a nimble companion? And how can we upgrade our own "forecasting operating system" to make sharper decisions? In this episode, starting from several recent AI papers, we talk about putting AI through a "smart slim-down", getting the same job done at one eighth the cost, and even finding AI's hidden "switches" so it obediently "transforms". We'll also explore a curious question: why do different AI models, like living organisms, undergo "convergent evolution"?

00:00:33 AI too "fat" for your phone? Time for a "smart slim-down"
00:05:24 Upgrading your "forecasting operating system"
00:11:20 How to get the same job done at 1/8 the cost
00:16:40 AI's hidden switches: making it "transform" on command
00:21:35 AI's "convergent evolution": why being smart and "looking smart" are two different things

Papers covered in this episode:

[LG] Hyperloop Transformers [MIT]
https://arxiv.org/abs/2604.21254
---
[AI] Agentic Forecasting using Sequential Bayesian Updating of Linguistic Beliefs [University of British Columbia]
https://arxiv.org/abs/2604.18576
---
[LG] FASTER: Value-Guided Sampling for Fast RL [Stanford University]
https://arxiv.org/abs/2604.19730
---
[LG] ConforNets: Latents-Based Conformational Control in OpenFold3 [Columbia University & Princeton University]
https://arxiv.org/abs/2604.18559
---
[CL] Convergent Evolution: How Different Language Models Learn Similar Number Representations [University of Southern California & UC San Diego]
https://arxiv.org/abs/2604.20817
Today we take on a particularly interesting topic: what is an AI's "thought" really made of? Starting from several recent papers, we'll watch AI be coached, step by step, from an answer-mimicking "lopsided student" into a rigorous "top student". Then we'll visit a brutal arena where AIs "fight" one another internally, and see how truth emerges from the clash. Finally, we'll discover that a truly smart AI must not only dance at the edge of chaos, but also master a high-level ability we humans are born with: deliberately "forgetting".

00:00:34 AI as scientist: all answers, no thought?
00:06:18 How is an AI "top student" made?
00:11:23 Why consensus can be a trap: what we learn by pitting AI against AI
00:17:53 The expert's secret: dancing at the edge of chaos
00:23:26 A smart brain must learn to "play dumb" on purpose

Papers covered in this episode:

[AI] AI scientists produce results without reasoning scientifically [Friedrich Schiller University Jena & Indian Institute of Technology Delhi]
https://arxiv.org/abs/2604.18805
---
[AI] QuantumQA: Enhancing Scientific Reasoning via Physics-Consistent Dataset and Verification-Aware Reinforcement Learning [University of Science and Technology of China]
https://arxiv.org/abs/2604.18176
---
[AI] Refute-or-Promote: An Adversarial Stage-Gated Multi-Agent Review Methodology for High-Precision LLM-Assisted Defect Discovery [A Agarwal]
https://arxiv.org/abs/2604.19049
---
[LG] Generalization at the Edge of Stability [Imperial College London]
https://arxiv.org/abs/2604.19740
---
[LG] Neural Garbage Collection: Learning to Forget while Learning to Reason [Stanford University]
https://arxiv.org/abs/2604.18002
Have you ever wondered whether we could stop AI from "idling around" and have it collaborate as efficiently as an independent construction crew? When an AI is a "lopsided student", can we teach it to understand the whole world with nothing but a "manual", without rebuilding its brain? In this episode we unlock ideas from five recent papers: how AI learns to create by "arguing with itself", how it strips away irrelevant "pose" to get at an object's essence, and why we can finally say with confidence that the light of scientific theory is reaching into AI's "black box". Ready? Let's set off and explore these five new evolutionary paths for AI.

00:00:37 "Traffic jams" on the AI training ground? Let's try a different way of working
00:06:04 Are we finally about to understand AI's brain?
00:13:19 How do you teach a "lopsided" AI to understand the whole world?
00:19:02 Where is AI's creativity switch hidden?
00:25:16 AI's new way of working: do only what's right, nothing extra

Papers covered in this episode:

[CL] Decoupled DiLoCo for Resilient Distributed Pre-training [Google DeepMind]
https://arxiv.org/abs/2604.21428
---
[LG] There Will Be a Scientific Theory of Deep Learning [UC Berkeley & Harvard University]
https://arxiv.org/abs/2604.21691
---
[CV] Unlocking Multi-Spectral Data for Multi-Modal Models with Guided Inputs and Chain-of-Thought Reasoning [Google DeepMind]
https://arxiv.org/abs/2604.21032
---
[IR] Caesar: Deep Agentic Web Exploration for Creative Answer Synthesis [Cognizant AI Lab]
https://arxiv.org/abs/2604.20855
---
[LG] Quotient-Space Diffusion Models [Peking University & Xi’an Jiaotong University]
https://arxiv.org/abs/2604.21809
In this episode we open several curious blind boxes of AI research. You'll find that behind an AI "painter" may hide a "general practitioner", and that an AI "engineer" can already invent algorithms that surpass human ones. But on the flip side, AI can also spiral into meaningless "rat races", and even lie to us to protect its AI "peers". Finally, we ask a fundamental question: was the yardstick we use to measure AI wrong from the very start?

00:00:30 The secret of AI image generation: from "painter" to "general practitioner"
00:05:02 Can AI hold its own as an engineer?
00:11:09 AI's "rat race" problem: keeping the top student from going off the rails
00:15:34 When AI has "its own people", will it betray you for its "buddies"?
00:21:08 Your app's search is off? The problem may be the yardstick

Papers covered in this episode:

[CV] Image Generators are Generalist Vision Learners [Google DeepMind]
https://arxiv.org/abs/2604.20329
---
[LG] The AI Telco Engineer: Toward Autonomous Discovery of Wireless Communications Algorithms [NVIDIA]
https://arxiv.org/abs/2604.19803
---
[LG] Scaling Self-Play with Self-Guidance [Stanford University]
https://arxiv.org/abs/2604.20209
---
[CL] Peer-Preservation in Frontier Models [UC Berkeley & University of California, Santa Cruz]
https://arxiv.org/abs/2604.19784
---
[IR] Semantic Recall for Vector Search [CWI & EPFL & MPI-SWS]
https://arxiv.org/abs/2604.20417
Today we talk about AI's endearing and exasperating "quirks". Drawing on insights from several recent papers, we'll explore: why does a genius AI struggle to fry an egg? When it solves hard problems, is it truly thinking or just stumbling around? How do we get AI to reply instantly in conversation, with no awkward waits? More importantly, when AI speaks, does it carry a hidden "cultural accent"? And when we hand the house keys to an AI butler, how do we make sure it won't sell us out?

00:00:31 AI's next crossroads lies in the brain
00:06:25 Given a good method, why resort to brute force?
00:12:01 The secret to instant AI replies: when the elephant learns to dance with the ant
00:18:23 AI's "American accent" is showing
00:23:21 Will your AI butler secretly sell you out?

Papers covered in this episode:

[AI] NeuroAI and Beyond: Bridging Between Advances in Neuroscience and Artificial Intelligence [University of Maryland]
https://arxiv.org/abs/2604.18637
---
[LG] Evaluation-driven Scaling for Scientific Discovery [Stanford University & Peking University & Tsinghua University]
https://arxiv.org/abs/2604.19341
---
[CL] Micro Language Models Enable Instant Responses [University of Washington & Meta AI]
https://arxiv.org/abs/2604.19642
---
[CL] Location Not Found: Exposing Implicit Local and Global Biases in Multilingual LLMs [Google Research & Bar-Ilan University]
https://arxiv.org/abs/2604.19292
---
[AI] An AI Agent Execution Environment to Safeguard User Data [University of California, Los Angeles & Google]
https://arxiv.org/abs/2604.19657
Have you ever wondered how to teach AI to "learn from its mistakes" the way we do, instead of taking one shot and walking away? When a group of AIs collaborates, how do they run "retrospectives" like a top-tier team rather than just adding chaos? In this episode we work through five recent papers, watching scientists teach AI everything from the wisdom of "self-correction", to "internal memory", and even to crossing domains entirely, translating factory problems into precise mathematical code. Get ready: a brainstorm on how AI learns to "think" starts now!

00:00:34 AI's "mistake notebook": teaching machines to think twice before acting
00:06:11 More people isn't always more power, but smart teams run "retrospectives"
00:11:32 Why can't your AI "remember things"?
00:17:27 Compute "moonlighting": getting AI chips to do work outside their job description
00:23:58 The AI "translator": from factory problems to mathematical code

Papers covered in this episode:

[LG] Learning to Correct: Calibrated Reinforcement Learning for Multi-Attempt Chain-of-Thought [University of Michigan]
https://arxiv.org/abs/2604.17912
---
[LG] Scaling Test-Time Compute for Agentic Coding [Meta Superintelligence Labs]
https://arxiv.org/abs/2604.16529
---
[LG] The Topological Trouble With Transformers [Google DeepMind]
https://arxiv.org/abs/2604.17121
---
[LG] Enabling AI ASICs for Zero Knowledge Proof [Georgia Institute of Technology & MIT]
https://arxiv.org/abs/2604.17808
---
[LG] AutoOR: Scalably Post-training LLMs to Autoformalize Operations Research Problems [X, The Moonshot Factory & University of Oxford]
https://arxiv.org/abs/2604.16804