Today we're talking about how AI learns to be "thrifty": like an orchestra conductor, it can use small models to pull off big tasks, and like a seasoned chess player, it knows when to stop and not overthink. We'll also uncover two secrets behind AI's success: one hidden in a camera's invisible motion trajectories, the other rooted deep in its "minimalist" algorithmic core. Finally, we'll re-examine AI's ten-thousand-fold efficiency leap over the past decade and ask whether this astonishing progress came from countless small improvements or a few decisive "industrial revolutions." Ready? Let's explore these counterintuitive, fascinating insights from the latest papers.

00:00:40 A smarter approach: how to get big things done with small models
00:05:26 Without watching the frames, how do you "guess" what happens in a video?
00:09:54 AI's ten-thousand-fold efficiency gain came down to just two things?
00:16:47 Why is AI so capable? One paper reveals its minimalist core
00:22:34 AI learns to "know when to stop" — why does that matter?
We all know AI is getting more powerful, but have you ever wondered how to make it run faster, more reliably, and even more "versatile"? In this episode, we explore several recent papers and see how scientists are fitting AI training with a more robust navigation system, and uncovering the science behind the "folk remedies" used by master AI painters. We'll also discuss how to raise a "generalist": teaching a single AI two hundred tasks at once. Finally, we'll witness two kinds of magic: how to efficiently "slim down" a large model with no data at all, and how to perform precise "keyhole surgery" on a black-box model.

00:00:41 How to give AI a smarter "navigation system"
00:05:19 What training secrets lie behind master AI painters?
00:11:06 Raising an AI generalist: how do you teach one machine 200 tasks?
00:17:12 Slimming down AI models: how do you make bricks without straw?
00:25:14 How many steps does keyhole surgery on an AI model take?

Papers covered in this episode:
[LG] ROOT: Robust Orthogonalized Optimizer for Neural Network Training [Huawei Noah's Ark Lab] https://arxiv.org/abs/2511.20626
[LG] Demystifying Diffusion Objectives: Reweighted Losses are Better Variational Bounds [Google DeepMind] https://arxiv.org/abs/2511.19664
[LG] Learning Massively Multitask World Models for Continuous Control [University of California San Diego] https://arxiv.org/abs/2511.19584
[LG] CafeQ: Calibration-free Quantization via Learned Transformations and Adaptive Rounding [Google] https://arxiv.org/abs/2511.19705
[LG] ModHiFi: Identifying High Fidelity predictive components for Model Modification [CSA, IISc & HP Inc. AI Lab & Google] https://arxiv.org/abs/2511.19566
A quiet revolution is under way in how AI learns and thinks. The latest papers in this episode take us deep into AI's "mind": from evaluation rubrics that evolve alongside an AI sparring partner, to a martial-arts shortcut that throws away the manual and learns only the "inner method"; from letting AI open its "mind's eye" to understand space, to turning your instructions into its "temporary brain"; and finally, curing AI's overly "literal-minded" writing habits to make it smarter and faster. Ready? Let's explore how AI is becoming more like us.

00:00:36 Making AI a research assistant that is both smart and reliable
00:05:51 To master martial arts, do you really need the secret manual?
00:10:25 Letting AI open its "mind's eye" to see the world
00:14:33 How do your instructions become AI's temporary brain?
00:19:45 AI writes too slowly? Maybe it's just too "literal-minded"

Papers covered in this episode:
[CL] DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research [University of Washington & Allen Institute for AI & MIT] https://arxiv.org/abs/2511.19399
[LG] Flow Map Distillation Without Data [MIT & NYU] https://arxiv.org/abs/2511.19428
[CV] Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens [UC Berkeley & UCLA] https://arxiv.org/abs/2511.19418
[LG] Equivalence of Context and Parameter Updates in Modern Transformer Blocks [Google Research] https://arxiv.org/abs/2511.17864
[LG] CDLM: Consistency Diffusion Language Models For Faster Sampling [Seoul National University & UC Berkeley] https://arxiv.org/abs/2511.19269
Today's topic isn't how much bigger AI models have gotten, but how they're becoming smarter from the inside. We'll see how the latest papers teach AI to evolve from a "point and it shoots" tool into an assistant that gets what you mean, and how powerful AI scientists learn to "network" and fit into the human collaborative ecosystem. We'll also explore how AI gains "budget awareness," counting every penny like a shrewd steward, and why, when AI models shrink, it's their "eyesight" rather than their "brainpower" that degrades first. Finally, we'll unpack the blunders in AI's "college entrance exam" and see how scientists recalibrate AI's "grading ruler" — all of it pointing toward a new direction for AI.

00:00:42 After teaching computers to "point and shoot," how do we teach them to "understand"?
00:06:05 Can AI be a scientist? First it has to learn to "network"
00:11:20 How does a clever AI learn to "save money"?
00:16:17 Who proofreads the questions on AI's "college entrance exam"?
00:21:14 The secret of AI getting dumber: why "eyesight" is more fragile than "brainpower"

Papers covered in this episode:
[CV] SAM 3: Segment Anything with Concepts [Meta Superintelligence Labs] https://arxiv.org/abs/2511.16719
[AI] OmniScientist: Toward a Co-evolving Ecosystem of Human and AI Scientists [Tsinghua University] https://arxiv.org/abs/2511.16931
[LG] Budget-Aware Tool-Use Enables Effective Agent Scaling [Google Cloud AI Research & Google DeepMind & UC Santa Barbara] https://arxiv.org/abs/2511.17006
[LG] Fantastic Bugs and Where to Find Them in AI Benchmarks [Stanford University] https://arxiv.org/abs/2511.16842
[CV] Downscaling Intelligence: Exploring Perception and Reasoning Bottlenecks in Small Multimodal Models [Stanford University] https://arxiv.org/abs/2511.17487
Today, instead of marveling at how amazing AI is, we're giving it a full "physical exam" to find out where its invisible "glass ceiling" really lies. Then we'll upend what you think you know about AI training: beyond "taking classes," it can "evolve" like a living organism, and behind its powerful reasoning there may be a "cognitive map" waiting for us to help it unlock. Finally, we'll find that whether you're teaching it to crack Olympiad problems or to do housework, the smartest methods may be hidden in our own experience of learning and living. Get ready to look behind AI's halo at how it really works!

00:00:38 AI's glass ceiling: why do bigger models make more "confident" mistakes?
00:08:26 Training AI: besides "taking classes," can it also "reproduce"?
00:14:31 AI's "intelligence" puzzle: why can it solve Olympiad math yet flail around aimlessly?
00:21:39 How does AI's "problem-set grind" differ from ours?
00:27:36 How can a pair of glasses become a dexterous robot's "personal trainer"?

Papers covered in this episode:
[LG] On the Fundamental Limits of LLMs at Scale [Stanford University & The University of Oklahoma] https://arxiv.org/abs/2511.12869
[LG] Evolution Strategies at the Hyperscale [FLAIR - University of Oxford & WhiRL - University of Oxford] https://arxiv.org/abs/2511.16652
[LG] Cognitive Foundations for Reasoning and Their Manifestation in LLMs [University of Illinois Urbana-Champaign & University of Washington & Princeton University] https://arxiv.org/abs/2511.16660
[LG] P1: Mastering Physics Olympiads with Reinforcement Learning [Shanghai AI Laboratory] https://arxiv.org/abs/2511.13612
[RO] Dexterity from Smart Lenses: Multi-Fingered Robot Manipulation with In-the-Wild Human Demonstrations [New York University & Meta] https://arxiv.org/abs/2511.16661
We keep marveling at how smart AI is getting, but have you wondered why it's also getting better at "talking nonsense with a straight face"? How do we teach it to return to the essence of things, even to understand how the physical world works? And when an AI becomes this powerful, why can a simple poem breach its safety defenses? Today, starting from several recent papers, let's look at the reality behind AI's halo.

00:00:29 AI: a "star student" that's smart but unreliable
00:05:23 AI painting: why is "guessing the noise" worse than "seeing the essence"?
00:10:13 Why do even smart AIs love to "talk nonsense with a straight face"?
00:14:35 AI as top student: one model to grasp how everything moves
00:19:54 Why are large AI models afraid of, of all people, "the poets"?

Papers covered in this episode:
[LG] Structural Inducements for Hallucination in Large Language Models [University of Maryland] https://www.researchgate.net/publication/397779918_Structural_Inducements_for_Hallucination_in_Large_Language_Models_An_Output-Only_Case_Study_and_the_Discovery_of_the_False-Correction_Loop_An_Output-Only_Case_Study_from_Extended_Human-AI_Dialogue_Str
[CV] Back to Basics: Let Denoising Generative Models Denoise [MIT] https://arxiv.org/abs/2511.13720
[CL] AA-Omniscience: Evaluating Cross-Domain Knowledge Reliability in Large Language Models [Artificial Analysis] https://arxiv.org/abs/2511.13029
[LG] Walrus: A Cross-Domain Foundation Model for Continuum Dynamics [Flatiron Institute & University of Cambridge] https://arxiv.org/abs/2511.15684
[CL] Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models [DEXAI – Icaro Lab] https://arxiv.org/abs/2511.15304
Today we're not just covering what AI can do, but how it does it — the behind-the-scenes mechanisms that make AI smarter and more efficient. We'll see how AI collaborates with humans like a research partner, and how a clever "flywheel" lets it imagine a full 3D world from a single photo. We'll also discover how AI breaks through its limits by setting its own exam questions, swapping in a new engine, and even acting as its own "personal trainer." Ready? Let's explore the elegant designs driving AI's leaps forward.

00:00:35 Your next colleague might not be human
00:07:57 One photo, one world: how do we "imagine" our way into 3D?
00:13:22 When AI teaches itself, how does it avoid running in place?
00:18:06 A new expressway for AI text generation: swap the engine, fix the traffic jam
00:22:49 AI's private tutoring: making smart chips smarter

Papers covered in this episode:
[CL] Early science acceleration experiments with GPT-5 [OpenAI & University of Oxford] https://arxiv.org/abs/2511.16072
[CV] SAM 3D: 3Dfy Anything in Images [Meta Superintelligence Labs] https://arxiv.org/abs/2511.16624
[LG] Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning [UNC-Chapel Hill] https://arxiv.org/abs/2511.16043
[LG] Breaking the Bottleneck with DiffuApriel: High-Throughput Diffusion LMs with Mamba Backbone [Mila – Quebec AI Institute & ServiceNow Research] https://arxiv.org/abs/2511.15927
[LG] AccelOpt: A Self-Improving LLM Agentic System for AI Accelerator Kernel Optimization [Stanford University & Amazon Web Services] https://arxiv.org/abs/2511.15915
Have you ever wondered how the most advanced AI thinks? In this episode, four recent papers reveal the secrets of AI's growth: sometimes breadth of ideas matters more than depth; sometimes machines need to evolve an ineffable "taste"; they may even need the coordination to "draw a square with one hand and a circle with the other," and to know when well-chosen "constraints" prevent the cleverest blunders. Ready? Let's dive into the deep waters of AI thinking.

00:00:34 In AI research, what wins isn't IQ — it's breadth of ideas
00:05:32 How do you give a machine "mathematical taste"?
00:10:30 Does AI thinking also need to "draw a square with one hand and a circle with the other"?
00:16:58 Why do the smartest tools make the silliest mistakes?

Papers covered in this episode:
[AI] What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity [FAIR at Meta] https://arxiv.org/abs/2511.15593
[LG] Learning Interestingness in Automated Mathematical Theory Formation [UT Austin] https://arxiv.org/abs/2511.14778
[CV] Think Visually, Reason Textually: Vision-Language Synergy in ARC [The Chinese University of Hong Kong & Shanghai AI Laboratory] https://arxiv.org/abs/2511.15703
[LG] CODE: A global approach to ODE dynamics learning [University of Stuttgart & Stanford University] https://arxiv.org/abs/2511.15619
Have you ever considered that making AI smarter might not require more compute — just a different way of looking at the problem? In this episode, we explore several recent papers on how AI cracks once-unsolvable problems by thinking like an artist, scheduling like a skilled traffic marshal, and even proceeding "step by step" like a patient student. Even more remarkably, when AI learns to infer our intentions from language combined with demonstrations, it also teaches us how to communicate more effectively. Ready? Let's begin today's tour of the frontier.

00:00:36 Change the angle, and AI can think like a human?
00:05:29 Can AI painting finally stop running "three shifts"?
00:12:14 AI's secret to getting smarter: not more force, but more patience
00:16:54 The "traffic marshal" of the AI training ground
00:22:42 The secret of robots "guessing" what's on your mind

Papers covered in this episode:
[CV] ARC Is a Vision Problem! [MIT] https://arxiv.org/abs/2511.14761
[CV] Diffusion As Self-Distillation: End-to-End Latent Diffusion In One Model [Peking University] https://arxiv.org/abs/2511.14716
[CV] Step by Step Network [Tsinghua University] https://arxiv.org/abs/2511.14329
[LG] Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning [Moonshot AI] https://arxiv.org/abs/2511.14617
[RO] Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language [MIT CSAIL] https://arxiv.org/abs/2511.14565
Have you ever wondered why an AI that can write poetry and paint sometimes acts like a stubborn child? The recent papers in this episode try to teach AI some human wisdom we take for granted but it struggles to grasp. We'll see how to cure AI's "no sense of direction" and give it spatial awareness; how to turn it from a passive image viewer into an active "detective"; and even how a clever change of approach finally lets it understand "no" and freely adjust the "granularity" at which it views things.

00:00:33 AI's "no sense of direction" problem
00:05:24 How does an AI detective give a kilometer-long bridge a "checkup"?
00:09:59 From "you guess" to "you decide": a new way to do AI image segmentation
00:14:45 A change of approach that lets AI understand "no"

Papers covered in this episode:
[CV] Scaling Spatial Intelligence with Multimodal Foundation Models [SenseTime Research] https://arxiv.org/abs/2511.13719
[CV] BridgeEQA: Virtual Embodied Agents for Real Bridge Inspections [University of Houston] https://arxiv.org/abs/2511.12676
[CV] UnSAMv2: Self-Supervised Learning Enables Segment Anything at Any Granularity [UC Berkeley] https://arxiv.org/abs/2511.13714
[CV] SpaceVLM: Sub-Space Modeling of Negation in Vision-Language Models [MIT] https://arxiv.org/abs/2511.12331
Have you ever considered that the wisdom of top-tier AI may lie not in knowing everything, but in daring to say "I don't know"? In this episode, we explore how AI learns that precious quality. We'll also reveal how to give AI a pair of "eyes" so it can chat with you at a noisy party, how an elegant formula teaches it to "speed-read" long reports, how to make a 200-page PDF speak for itself, and how to trace an AI painting's inspiration back to its "ancestors" within a second. Ready? Let's step into AI's deeper, wiser inner world.

00:00:39 Can an AI painting's inspiration be traced to its source in a second?
00:06:29 Large models read too slowly? Give them a pair of smart "speed-reading glasses"
00:12:13 Give AI a pair of eyes so it can "read the room"
00:16:37 AI's highest wisdom is admitting what it doesn't know
00:22:56 How do you make a 200-page PDF speak for itself?

Papers covered in this episode:
[CV] Fast Data Attribution for Text-to-Image Models [CMU & Adobe Research & UC Berkeley] https://arxiv.org/abs/2511.10721
[LG] Optimizing Mixture of Block Attention [MIT] https://arxiv.org/abs/2511.11571
[CL] AV-Dialog: Spoken Dialogue Models with Audio-Visual Input [University of Washington & Meta AI Research] https://arxiv.org/abs/2511.11124
[LG] Honesty over Accuracy: Trustworthy Language Models through Reinforced Hesitation [Toyota Technological Institute at Chicago & University of California, San Diego] https://arxiv.org/abs/2511.11500
[CL] Information Extraction From Fiscal Documents Using LLMs [Google Inc & XKDR Forum] https://arxiv.org/abs/2511.10659
Is the answer to smarter AI always a bigger model fed with more data? This episode covers something different: AI that stops working alone and assembles a "symphony orchestra"; that stops chasing scale and learns to "mull things over"; that even begins a course of "self-cultivation" like a martial arts master. Starting from several recent papers, we'll see how AI is taking a smarter evolutionary path — from understanding the "collective choreography" of the microscopic world to pairing its imagination with a "physics manual."

00:00:31 Does AI drug discovery also need a "symphony orchestra"?
00:05:32 AI's manual of "self-cultivation"
00:11:46 How do you predict a troupe's collective choreography?
00:17:16 AI's shortcut to getting smarter: not bigger, but deeper
00:22:03 Giving AI video a "physics manual"

Papers covered in this episode:
[LG] MADD: Multi-Agent Drug Discovery Orchestra [ITMO University] https://arxiv.org/abs/2511.08217
[LG] AgentEvolver: Towards Efficient Self-Evolving Agent System [Tongyi Lab] https://arxiv.org/abs/2511.10395
[LG] Entangled Schrödinger Bridge Matching [University of Pennsylvania & Duke-NUS Medical School] https://arxiv.org/abs/2511.07406
[CL] Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence [University of Maryland & New York University] https://arxiv.org/abs/2511.07384
[RO] Robot Learning from a Physical World Model [Google DeepMind & USC] https://arxiv.org/abs/2511.07416