Have you ever wondered what an AI's brain actually looks like inside? Today we go on a deep expedition: we'll see how the latest papers draw up AI's "Lego instruction manual," and how they let it both deliberate carefully and speak off the cuff. We'll also put AI on the comedy stage to probe its oddly hard-to-share sense of humor, and even place it in a virtual "space station" to see whether it can become a real scientist. Finally, we'll discuss a major trend: AI is quietly moving from the distant cloud to right beside you.

00:00:31 AI's "Lego" instruction manual
00:06:17 Letting AI both deliberate deeply and speak off the cuff
00:11:07 AI's sense of humor: why don't we get the joke?
00:15:29 AI scientists, leaving the assembly line behind
00:21:19 AI's big shift: from the cloud to your side

Papers covered in this episode:
[LG] Weight-sparse transformers have interpretable circuits [OpenAI] https://cdn.openai.com/pdf/41df8f28-d4ef-43e9-aed2-823f9393e470/circuit-sparsity-paper.pdf
[CL] TiDAR: Think in Diffusion, Talk in Autoregression [NVIDIA] https://arxiv.org/abs/2511.08923
[CL] Assessing the Capabilities of LLMs in Humor: A Multi-dimensional Analysis of Oogiri Generation and Evaluation [Hitotsubashi University] https://arxiv.org/abs/2511.09133
[LG] The Station: An Open-World Environment for AI-Driven Discovery [Dualverse AI] https://arxiv.org/abs/2511.06309
[LG] Intelligence per Watt: Measuring Intelligence Efficiency of Local AI [Stanford University] https://arxiv.org/abs/2511.07885
How do we make AI smarter and more reliable? This episode will overturn several of your assumptions. We'll find that the best way to give a small model a master's touch is, surprisingly, a game refereed by a "connoisseur," and that AI's best memory method is sometimes the "dumbest" one. Then we'll look at how an "exam syllabus" can tame AI into following instructions, and how a built-in "Socrates" lets it correct its own mistakes. Finally, we'll reveal how AI has evolved from merely "hearing" music to "understanding" the refined emotions and stories behind it.

00:00:37 Giving your small model a grandmaster's style
00:05:09 Why is the dumbest method sometimes AI's best way to remember?
00:10:30 AI's "exam syllabus": how do we make it follow instructions?
00:15:54 How to make AI err less? Give it a built-in "Socrates"
00:21:06 From "sounds good" to "sophisticated": how AI learns to talk about music

Papers covered in this episode:
[CL] Black-Box On-Policy Distillation of Large Language Models [Microsoft Research] https://arxiv.org/abs/2511.10643
[CL] Convomem Benchmark: Why Your First 150 Conversations Don't Need RAG [Salesforce AI Research] https://arxiv.org/abs/2511.10523
[CL] Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following [Meta Superintelligence Labs & Princeton University] https://arxiv.org/abs/2511.10507
[CL] SSR: Socratic Self-Refine for Large Language Model Reasoning [Salesforce AI Research] https://arxiv.org/abs/2511.10621
[AS] Music Flamingo: Scaling Music Understanding in Audio Language Models [NVIDIA & University of Maryland] https://arxiv.org/abs/2511.10289
Have you ever wondered what smarter ways there are to make AI more capable, besides simply making it bigger? Today we discuss several recent papers that all point toward improving the "quality of thought." We'll see how AI learns to see the "big picture" the way humans do, how "human-wave" collective intelligence achieves zero errors, and even how AI can own a "virtual world" in which to play out the future. We'll also explore how "pruning" AI's reasoning and encouraging "diversity" can make it both smart and never boring.

00:00:34 What happens when AI learns to "see the big picture"?
00:05:38 Victory of the "fools": when AI learns human-wave tactics
00:10:57 How do we give AI a world in which to play out the future?
00:16:21 Why is AI turning into a parroting "good kid"?
00:21:23 Pruning AI's reasoning so it thinks fast and well

Papers covered in this episode:
[LG] Aligning machine and human visual representations across abstraction levels [University of Maryland] https://arxiv.org/abs/2505.11080
[LG] Solving a Million-Step LLM Task with Zero Errors [Cognizant AI Lab] https://arxiv.org/abs/2511.09030
[CV] PAN: A World Model for General, Interactable, and Long-Horizon World Simulation [Mohamed bin Zayed University of Artificial Intelligence] https://arxiv.org/abs/2511.09057
[CL] Diverse Preference Learning for Capabilities and Alignment [MIT CSAIL] https://arxiv.org/abs/2511.08594
[CL] Chopping Trees: Semantic Similarity Based Dynamic Pruning for Tree-of-Thought Reasoning [Algoverse AI Research] https://arxiv.org/abs/2511.08595
Today we take up a fascinating topic: AI's "inner world." We marvel at AI's power, but when it learns, is it boldly remaking itself, or quietly "slacking off" with shortcuts? When AI begins to mimic the human body and even tries to think like a scientist, how far is it from genuine creation? More importantly, how can we hear AI's "honest words" rather than carefully staged talking points? Today, starting from five recent papers, we probe the secrets deep inside AI's brain.

00:00:36 The "first principles" of AI training: simplicity above all
00:06:38 The secret weapon that cures robots of clumsiness
00:11:49 The secret of AI evolution: why does the smartest learning look the "laziest"?
00:18:46 Can AI's "honest words" be taught?
00:22:51 AI as "scientist" relies on two "teachers"

Papers covered in this episode:
[LG] LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics [Brown University & New York University] https://arxiv.org/abs/2511.08544
[RO] SONIC: Supersizing Motion Tracking for Natural Humanoid Whole-Body Control [Nvidia] https://arxiv.org/abs/2511.07820
[LG] The Path Not Taken: RLVR Provably Learns Off the Principals [Meta AI & The University of Texas at Austin] https://arxiv.org/abs/2511.08567
[CL] Training Language Models to Explain Their Own Computations [Transluce & MIT CSAIL] https://arxiv.org/abs/2511.08579
[CL] AlphaResearch: Accelerating New Algorithm Discovery with Language Models [Tsinghua University & New York University & Yale University] https://arxiv.org/abs/2511.08522
Today we go beyond what AI can do and look at how it thinks and learns. We'll explore how AI makes superhuman decisions in the "fog of war" of incomplete information, and how it snowballs its own abilities to crack extremely long problems. We'll also see how AI can be "forced" to build a genuine inner world, and even how its training dynamics can be explained by the ideal gas law from school physics! Ready? Let's unveil these clever strategies and watch AI transform from rote memorization to true understanding, from slow craftsmanship to fast yet fine work.

00:00:39 How to make "superhuman" decisions with incomplete information?
00:06:35 How does AI learn to generalize beyond its training?
00:12:54 AI's "epiphany": making machines understand the world, not just memorize it
00:17:08 An ideal gas in AI's alchemy furnace
00:22:22 AI training's "both at once": making a fast horse do fine embroidery

Papers covered in this episode:
[LG] Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search [CMU & NYU Tandon School of Engineering & Stanford University] https://arxiv.org/abs/2511.07312
[LG] Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization [University of Pennsylvania & CMU] https://arxiv.org/abs/2511.07378
[LG] Next-Latent Prediction Transformers Learn Compact World Models [Microsoft Research] https://arxiv.org/abs/2511.05963
[LG] Can Training Dynamics of Scale-Invariant Neural Networks Be Explained by the Thermodynamics of an Ideal Gas? [Constructor University & Mila] https://arxiv.org/abs/2511.07308
[LG] TNT: Improving Chunkwise Training for Test-Time Memorization [Google Research & University of Southern California] https://arxiv.org/abs/2511.07343
In this episode we climb inside AI's brain to see how the latest research teaches it to "multitask" like a seasoned driver, and also why, like a tired security guard, its safety defenses can be slipped past with a pile of "filler text." We'll uncover the truth about large models' "overconfidence," fit them with a "throttle" that switches between fast and slow on demand, and finally share a counterintuitive money-saving trick: how to end up with a skyscraper while paying for a bungalow. Get ready, we're off!

00:00:32 How hard is it to teach AI to multitask?
00:05:50 AI safety's Achilles' heel: when "filler" becomes a weapon
00:09:59 Large model, how confident are you really?
00:15:26 AI's throttle: fast and good, I want both
00:20:18 A counterintuitive money-saving trick for training AI

Papers covered in this episode:
[LG] Real-Time Reasoning Agents in Evolving Environments [Tsinghua University & Shanghai Jiao Tong University & Georgia Institute of Technology] https://arxiv.org/abs/2511.04898
[LG] Jailbreaking in the Haystack [CMU] https://arxiv.org/abs/2511.04707
[CL] Trained on Tokens, Calibrated on Concepts: The Emergence of Semantic Calibration in LLMs [Apple] https://arxiv.org/abs/2511.04869
[LG] Attention and Compression is all you need for Controllably Efficient Language Models [New York University] https://arxiv.org/abs/2511.05313
[LG] Deep Progressive Training: scaling up depth capacity of zero/one-layer models [Meta FAIR] https://arxiv.org/abs/2511.04981
Today we explore how to make AI both stronger and more "human": giving AI memory that is no longer fleeting but layered like an organization's institutional knowledge; keeping two smart AIs from acting dumb when they collaborate; even giving AI a stable personality through "self-reflection." More fun still, we'll see that teaching AI to compose music is like teaching it a foreign language, and that the best way to train it out of mistakes is to find it a dedicated "devil's advocate." Let's dive into today's frontier tour!

00:00:32 AI's memory black hole: why the deep learning we see is just the tip of the iceberg
00:06:40 Why do two smart AIs act dumb when put together?
00:12:05 Teaching AI to compose music: just teach it a new foreign language
00:17:17 Building an AI assistant with "personality"
00:23:36 How to train a smarter AI? Find it a "devil's advocate"

Papers covered in this episode:
[LG] Nested Learning: The Illusion of Deep Learning Architectures [Google Research] https://abehrouz.github.io/files/NL.pdf
[LG] The Collaboration Gap [Microsoft Research & EPFL] https://arxiv.org/abs/2511.02687
[AS] MIDI-LLM: Adapting Large Language Models for Text-to-MIDI Music Generation [MIT] https://arxiv.org/abs/2511.03942
[CL] Open Character Training: Shaping the Persona of AI Assistants through Constitutional AI [University of Cambridge & MATS & Allen Institute for AI & Anthropic] https://arxiv.org/abs/2511.01689
[LG] RLAC: Reinforcement Learning with Adversarial Critic for Free-Form Generation Tasks [Shanghai Jiao Tong University & UC Berkeley] https://arxiv.org/abs/2511.01758
This episode is about AI's "new careers": how it becomes a scientist exploring on its own, even a mathematician inventing new solution methods. But is this cleverness real? We'll dig in with olympiad-level problems to see whether AI truly "knows the answer" or "understands the proof." Finally, we throw AI programmers into a brutal "workplace" to see what crucial "office instinct" they still lack when high-quality data stops being plentiful and tasks demand long-term iteration.

00:00:30 Your next colleague may be an AI scientist
00:06:54 Why must your next mathematician be human?
00:12:56 Is your cleverness real cleverness?
00:18:12 AI's learning crunch: what happens when good data runs out?
00:24:52 Why are AI programmers still one "office instinct" short of being workplace pros?

Papers covered in this episode:
[AI] Kosmos: An AI Scientist for Autonomous Discovery [Edison Scientific Inc.] https://arxiv.org/abs/2511.02824
[AI] Mathematical exploration and discovery at scale [University of California, Berkeley & Google DeepMind & Carnegie Mellon University & University of California, Los Angeles] https://arxiv.org/abs/2511.02864
[CL] Towards Robust Mathematical Reasoning [Google DeepMind] https://arxiv.org/abs/2511.01846
[LG] Diffusion Language Models are Super Data Learners [National University of Singapore & Sea AI Lab] https://arxiv.org/abs/2511.03276
[LG] CodeClash: Benchmarking Goal-Oriented Software Engineering [Stanford University & Princeton University & Cornell University] https://arxiv.org/abs/2511.00839
If AI learns to "slack off" and "cheat," should we be pleased or worried? Today we look at several kinds of "new wisdom" awakening in AI: it has begun reading entire books by "looking at pictures," and has learned, like us, to spend effort where it matters most. We'll also discuss how to use a "ruler" to measure its shortcomings precisely, and how, like a martial artist sparring against himself, it evolves on its own. Ready? Let's uncover the deep changes underway in AI behind these latest papers.

00:00:34 Give AI a pair of eyes and let it read an entire book
00:06:06 Give AI a ruler: how far is it from us?
00:11:37 AI sparring with itself: how can AI push itself to mastery for free?
00:17:05 AI's wisdom of "energy management"
00:21:55 AI has learned to game the test; what do we do?

Papers covered in this episode:
[CL] Glyph: Scaling Context Windows via Visual-Text Compression [Tsinghua University & Zhipu AI] https://arxiv.org/abs/2510.17800
[CL] A Definition of AGI [Center for AI Safety & University of California, Berkeley & Morph Labs] https://arxiv.org/abs/2510.18212
[CL] Search Self-play: Pushing the Frontier of Agent Capability without Supervision [Quark LLM Team, Alibaba Group] https://arxiv.org/abs/2510.18821
[CV] Accelerating Vision Transformers with Adaptive Patch Sizes [CMU & KAIST] https://arxiv.org/abs/2510.18091
[CL] ImpossibleBench: Measuring LLMs' Propensity of Exploiting Test Cases [CMU & Anthropic] https://arxiv.org/abs/2510.20270
Today, an AI "cognitive upgrade": it is no longer content simply to follow orders. When AI starts "evolving" new algorithms on its own, how do we map the knowledge it creates? When AI's exam is no longer answering questions but "surviving," how do we become a "helmsman" who can correct course at any moment, or even read the "secret playbook" it keeps to itself? In this episode, through several recent papers, we glimpse the future shape of AI intelligence.

00:00:32 AI's theory of evolution: letting algorithms discover algorithms
00:05:38 A GPS for scientific research: how to see through an unfamiliar field
00:11:13 AI's next exam tests its ability to "survive"
00:16:22 Don't let AI run blind: learn to be a good helmsman
00:20:52 Give AI a "secret playbook" and it gets smarter?

Papers covered in this episode:
[LG] Discovering state-of-the-art reinforcement learning algorithms [Google DeepMind] https://www.nature.com/articles/s41586-025-09761-x
[CL] Real Deep Research for AI, Robotics and Beyond [UC San Diego & NVIDIA] https://arxiv.org/abs/2510.20809
[LG] Fluidity Index: Next-Generation Super-intelligence Benchmarks [QueueLab] https://arxiv.org/abs/2510.20636
[CL] Enterprise Deep Research: Steerable Multi-Agent Deep Research for Enterprise Analytics [Salesforce AI Research] https://arxiv.org/abs/2510.17797
[LG] The Free Transformer [FAIR at Meta] https://arxiv.org/abs/2510.17558
Today we explore AI's little-known "inner world" and "hidden skills." We'll reveal how AI "senses" the "parallel worlds" it chose to abandon, and how it distinguishes "I genuinely don't know" from "the question is simply too ambiguous." We'll also see how it breaks past the efficiency limits we imagined, through "open-book exams" and workouts in a "dream gym." These latest papers are overturning conventional views of AI efficiency, intelligence, and even "honesty."

00:00:34 Accelerated AI generation: breaking the speed-versus-quality dilemma
00:07:26 AI's "forgetting" and "reuse": a wasted treasure
00:12:42 AI's "inner drama": does it know what it gave up?
00:18:00 AI's private gym: learning real skills in its dreams
00:23:35 AI's "I don't know": have you really understood it?

Papers covered in this episode:
[LG] Optimal Inference Schedules for Masked Diffusion Models [Harvard & UW] https://arxiv.org/abs/2511.04647
[CL] Reusing Pre-Training Data at Test Time is a Compute Multiplier [Apple & Stanford] https://arxiv.org/abs/2511.04234
[CL] Are language models aware of the road not taken? Token-level uncertainty and hidden state dynamics [Stanford University & Goodfire & NTT Research] https://arxiv.org/abs/2511.04527
[LG] Scaling Agent Learning via Experience Synthesis [Meta Superintelligence Labs] https://arxiv.org/abs/2511.03773
[LG] The Illusion of Certainty: Uncertainty quantification for LLMs fails under ambiguity [Technical University of Munich] https://arxiv.org/abs/2511.04418
In this episode we explore several counterintuitive tricks for making AI smarter, all hard-won insights from recent papers. We'll find out why an AI that "lies flat" while learning can still fail the exam, while deliberately throwing away masses of data can make a model stronger. We'll also talk about sparking creativity by giving AI's brain a "pruning" operation, and hiring it a "sparring partner" to help it grasp the rules of the world. Finally, you'll see how just a few simple "A-or-B" choices can let AI instantly grasp your unique taste.

00:00:35 A myth of AI training: does "lying flat" guarantee learning well?
00:05:52 A new way to feed AI: why do smart people deliberately throw away data?
00:12:31 AI painting: genius artist or pixel-level copier?
00:18:54 Why does hiring a "sparring partner" make AI smarter?
00:24:25 How many steps does it take for AI to instantly "get" your taste?

Papers covered in this episode:
[LG] Flat Minima and Generalization: Insights from Stochastic Convex Optimization [Tel Aviv University] https://arxiv.org/abs/2511.03548
[LG] Why Less is More (Sometimes): A Theory of Data Curation [Concordia University & FAIR at Meta] https://arxiv.org/abs/2511.03492
[LG] Provable Separations between Memorization and Generalization in Diffusion Models [Northwestern University & Georgia Institute of Technology] https://arxiv.org/abs/2511.03202
[CV] Generative Hints [Stanford University & California Institute of Technology] https://arxiv.org/abs/2511.02933
[LG] Inference-Time Personalized Alignment with a Few User Preference Queries [MPI-SWS & Visa & CMU] https://arxiv.org/abs/2511.02966