HuggingFace Daily AI Paper Digest - Episode List

[Sponsor] Listen to AI每周谈 (AI Weekly Talk) on your commute. Every week it recaps the previous week's major AI news. Link: 🔗https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
[Contents] The 15 papers in this episode:
[00:32] 🩺 Baichuan-M3: Modeling Clinical Inquiry for Reliable Medical Decision-Making
[01:17] 🧭 OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions
[02:03] 📈 On the Entropy Dynamics in Reinforcement Fine-Tuning of Large Language Models
[02:47] 🎯 F-GRPO: Don't Let Your Policy Learn the Obvious and Forget the Rare
[03:48] ⚖ MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration
[04:33] 🤖 DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos
[05:14] 🧠 Self-Improving Multilingual Long Reasoning via Translation-Reasoning Integrated Training
[06:07] 🧮 Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math
[06:46] 🎯 POINTS-GUI-G: GUI-Grounding Journey
[07:45] 🧠 MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments
[08:29] 🧠 Back to Basics: Revisiting Exploration in Reinforcement Learning for LLM Reasoning via Generative Probabilities
[09:18] 🎵 AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders
[09:59] ⚡ Canzona: A Unified, Asynchronous, and Load-Balanced Framework for Distributed Matrix-based Optimizers
[11:02] 🧠 InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning
[11:49] 🧠 PlanViz: Evaluating Planning-Oriented Image Generation and Editing for Computer-Use Tasks
[Follow us] You can also find us on the following platform for more than the podcast itself. Xiaohongshu: AI速递

[Weekend Special] Hottest AI Papers, Week 2 of February | Staged unified action space; ERNIE 5.0 unifies multimodality

[Contents] The 5 papers in this episode:
[00:48] TOP1(🔥235) | 🤖 Green-VLA: Staged Vision-Language-Action Model for Generalist Robots
[02:54] TOP2(🔥235) | 🧠 ERNIE 5.0 Technical Report
[05:14] TOP3(🔥206) | 🤖 Kimi K2.5: Visual Agentic Intelligence
[07:49] TOP4(🔥147) | 🔍 Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models
[10:28] TOP5(🔥137) | 🍌 PaperBanana: Automating Academic Illustration for AI Scientists

2026.02.06 | Removing length bias in RLVR; long shots that keep their memory

[Contents] The 15 papers in this episode:
[00:29] 📊 Length-Unbiased Sequence Policy Optimization: Revealing and Controlling Response Length Variation in RLVR
[01:20] 🎬 Context Forcing: Consistent Autoregressive Video Generation with Long Context
[02:11] 🧠 RISE-Video: Can Video Generators Decode Implicit World Rules?
[02:57] 🔮 ProAct: Agentic Lookahead in Interactive Environments
[03:47] ⚡ Dr. Kernel: Reinforcement Learning Done Right for Triton Kernel Generations
[04:39] 🧭 Steering LLMs via Scalable Interactive Oversight
[05:27] 🧠 Grounding and Enhancing Informativeness and Utility in Dataset Distillation
[06:13] 🧪 Retrieval-Infused Reasoning Sandbox: A Benchmark for Decoupling Retrieval and Reasoning Capabilities
[07:07] 🔍 Semantic Search over 9 Million Mathematical Theorems
[07:57] 🕷 Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening
[08:39] 🧪 CAR-bench: Evaluating the Consistency and Limit-Awareness of LLM Agents under Real-World Uncertainty
[09:30] 🤖 InterPrior: Scaling Generative Control for Physics-Based Human-Object Interactions
[10:22] 🎬 Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning
[11:14] 🔄 SwimBird: Eliciting Switchable Reasoning Mode in Hybrid Autoregressive MLLMs
[12:20] 🔍 SAGE: Benchmarking and Improving Retrieval for Deep Research Agents

2026.02.05 | ERNIE 5.0 unifies modalities; FASA sparse attention saves memory

[Contents] The 15 papers in this episode:
[00:29] 🧠 ERNIE 5.0 Technical Report
[01:11] ⚡ FASA: Frequency-aware Sparse Attention
[02:01] 📊 Training Data Efficiency in Multimodal Process Reward Models
[02:44] 🤖 WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning
[03:28] ⚡ OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models
[04:21] ⚡ HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing
[05:02] 🤖 EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models
[06:05] 🎬 Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization
[06:59] 🤖 SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation
[07:44] 🔍 TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents
[08:21] 🧠 Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers
[09:12] 🤖 Rethinking the Trust Region in LLM Reinforcement Learning
[09:54] ♻ Residual Context Diffusion Language Models
[10:40] 🧱 HY3D-Bench: Generation of 3D Assets
[11:34] 🎨 AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations

12 minutes

2026.02.04 | Reading code from images saves tokens; ad hoc teaming cuts costs

[Contents] The 15 papers in this episode:
[00:32] 👁 CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding
[01:18] 🤖 AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration
[02:01] 🔍 No Global Plan in Chain-of-Thought: Uncover the Latent Planning Horizon of LLMs
[02:43] 🔗 daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently
[03:23] 🧠 Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks
[04:06] 🎬 3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation
[04:56] 🤖 MARS: Modular Agent with Reflective Search for Automated AI Research
[05:41] 📊 CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs
[06:25] ⚡ Diversity-Preserved Distribution Matching Distillation for Fast Visual Synthesis
[07:19] 🤖 SWE-World: Building Software Engineering Agents in Docker-Free Environments
[08:09] 🤖 SWE-Master: Unleashing the Potential of Software Engineering Agents via Post-Training
[09:14] 📊 Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation
[10:08] ⚡ Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing
[10:59] 🎯 Unified Personalized Reward Model for Vision Generation
[11:47] 🔍 WideSeek: Advancing Wide Research via Multi-Agent Scaling

12 minutes

2026.02.03 | Staged training for a unified action space; MoE + vision encoder for parallel agents

[Contents] The 15 papers in this episode:
[00:32] 🤖 Green-VLA: Staged Vision-Language-Action Model for Generalist Robots
[01:24] 🤖 Kimi K2.5: Visual Agentic Intelligence
[02:09] 🔍 Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models
[03:08] 🔍 Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models
[03:57] 🔄 Closing the Loop: Universal Repository Representation with RPG-Encoder
[04:39] 🧠 UniReason 1.0: A Unified Reasoning Framework for World Knowledge Aligned Image Generation and Editing
[05:23] 📊 WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora
[06:28] 📚 FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents
[07:23] 🚀 SWE-Universe: Scale Real-World Verifiable Environments to Millions
[08:13] 📚 Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles
[08:58] ⚖ SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization
[09:45] 🎨 PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss
[10:38] ⚙ RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System
[11:30] 🧠 Mind-Brush: Integrating Agentic Cognitive Search and Reasoning into Image Generation
[12:17] 🎬 PISCES: Annotation-free Text-to-Video Post-Training via Optimal Transport-Aligned Rewards

2026.02.02 | ASTRA synthesizes trajectories to train tool use; THINKSAFE self-aligns for safety

[Contents] The 15 papers in this episode:
[00:33] 🤖 ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas
[01:22] 🛡 THINKSAFE: Self-Generated Safety Alignment for Reasoning Models
[02:18] 🧠 TTCS: Test-Time Curriculum Synthesis for Self-Evolving
[03:09] 🍌 PaperBanana: Automating Academic Illustration for AI Scientists
[03:51] 🔬 FourierSampler: Unlocking Non-Autoregressive Potential in Diffusion Language Models via Frequency-Guided Generation
[04:40] 🧠 ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought
[05:22] 🎯 SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization
[06:02] 🎯 DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment
[07:08] 🧠 Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification
[07:55] 📄 PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing
[08:45] 🎬 DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning
[09:42] 🧠 MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
[10:24] 🦢 Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
[11:13] 📊 Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling
[12:00] ⚡ RM-RF: Reward Model for Run-Free Unit Test Evaluation

[Month-End Special] Hottest AI Papers of January | mHC stabilizes gradients; GDPO untangles multiple rewards

[Contents] The 10 papers in this episode:
[00:42] TOP1(🔥292) | 🧠 mHC: Manifold-Constrained Hyper-Connections
[03:06] TOP2(🔥212) | 📈 GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
[04:45] TOP3(🔥209) | 🔍 Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning
[06:59] TOP4(🔥193) | 👶 BabyVision: Visual Reasoning Beyond Language
[08:57] TOP5(🔥190) | 🚀 STEP3-VL-10B Technical Report
[10:39] TOP6(🔥186) | 🤖 Agentic Reasoning for Large Language Models
[12:58] TOP7(🔥181) | 🧹 Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs
[15:19] TOP8(🔥171) | 🧠 LongCat-Flash-Thinking-2601 Technical Report
[17:22] TOP9(🔥165) | 🗺 Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization
[19:17] TOP10(🔥158) | 🧠 Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives

22 minutes

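[Weekend Special] Hottest AI Papers, Week 1 of February | LLMs as housekeepers turn raw data into ready-to-cook ingredients; LongCat trains agents to run dungeons on the web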

[Contents] The 5 papers in this episode:
[00:39] TOP1(🔥181) | 🧹 Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs
[02:50] TOP2(🔥169) | 🧠 LongCat-Flash-Thinking-2601 Technical Report
[04:51] TOP3(🔥138) | 🧠 Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives
[06:40] TOP4(🔥123) | 🤖 daVinci-Dev: Agent-native Mid-training for Software Engineering
[08:51] TOP5(🔥120) | 🛡 AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

11 minutes

2026.01.30 | Spatial-intelligence benchmarks fall short; Idea2Story goes from idea to paper in one click

[Contents] The 15 papers in this episode:
[00:29] 🧭 Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models
[01:21] 🧠 Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives
[02:19] ⚡ Scaling Embeddings Outperforms Scaling Experts in Language Models
[02:58] 🔍 OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models
[03:39] 🤖 DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation
[04:33] 🧠 MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods
[05:20] 🔺 PLANING: A Loosely Coupled Triangle-Gaussian Framework for Streaming 3D Reconstruction
[06:08] 🧠 ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation
[07:01] 🧩 AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts
[07:43] 🧠 Exploring Reasoning Reward Model for Agents
[08:39] 🎤 Qwen3-ASR Technical Report
[09:27] 🚀 Language-based Trial and Error Falls Behind in the Era of Experience
[10:16] 🌐 Typhoon-S: Minimal Open Post-Training for Sovereign Large Language Models
[11:02] ⚡ Scalable Power Sampling: Unlocking Efficient, Training-Free Reasoning for LLMs via Distribution Sharpening
[11:59] 🧠 MAD: Modality-Adaptive Decoding for Mitigating Cross-Modal Hallucinations in Multimodal Large Language Models

2026.01.29 | Hard problems first to boost math reasoning; LingBot generates interactive parallel worlds

[Contents] The 13 papers in this episode:
[00:33] 🧠 Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation
[01:21] 🌍 Advancing Open-source World Models
[01:55] 🧠 DeepSeek-OCR 2: Visual Causal Flow
[02:58] 🚀 Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning
[03:49] 🔬 Innovator-VL: A Multimodal Large Language Model for Scientific Discovery
[04:34] 🔄 Linear representations in language models can change dramatically over a conversation
[05:26] 🚀 SERA: Soft-Verified Efficient Repository Agents
[06:01] 🤖 OmegaUse: Building a General-Purpose GUI Agent for Autonomous Task Execution
[06:46] 🤖 GDCNet: Generative Discrepancy Comparison Network for Multimodal Sarcasm Detection
[07:37] 🗣 SE-DiCoW: Self-Enrolled Diarization-Conditioned Whisper
[08:27] 📊 RIR-Mega-Speech: A Reverberant Speech Corpus with Comprehensive Acoustic Metadata and Reproducible Evaluation
[09:16] ✏ SketchDynamics: Exploring Free-Form Sketches for Dynamic Intent Expression in Animation Generation
[10:07] 🚀 UPLiFT: Efficient Pixel-Dense Feature Upsampling with Local Attenders

11 minutes