HuggingFace Daily AI Paper Digest - 2026.01.13 | VideoDR lets models search while they reason; BabyVision exposes visual shortcomings - EarsOnMe

Episode description

Source: Xiaoyuzhou

The 15 papers covered in this episode:

[00:20] 🔍 Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

[01:01] 👶 BabyVision: Visual Reasoning Beyond Language

[01:45] 🚀 PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

[02:24] 🧠 X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

[03:03] ⚡ MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head

[03:41] ⚡ GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts

[04:17] 🤖 OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent

[05:20] 📉 Lost in the Noise: How Reasoning Models Fail with Contextual Distractors

[06:00] 🚀 Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models

[06:30] 🧠 Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction

[07:10] 🚗 DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving

[07:58] 🤖 MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era

[08:26] 🎨 Boosting Latent Diffusion Models via Disentangled Representation Alignment

[09:08] 🤔 What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

[09:45] 🔧 ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration

[Follow us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递
