HuggingFace Daily AI Paper Digest - 2026.03.18 | Verification-Driven Research Agents Break Through; Industrial Code Model Gets It Right in One Pass - EarsOnMe

Host

拨号上网 (1 podcast)

Episode Description

Source: Xiaoyuzhou (小宇宙)

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Each week, it recaps the previous week's major AI news.

Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd

【Contents】

The 15 papers in this episode are:

[00:29] 🤖 MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification

[01:10] 🏭 InCoder-32B: Code Foundation Model for Industrial Scenarios

[02:08] 🧠 Qianfan-OCR: A Unified End-to-End Model for Document Intelligence

[02:50] 🤖 Kinema4D: Kinematic 4D World Modeling for Spatiotemporal Embodied Simulation

[03:28] 🧠 Demystifying Video Reasoning

[04:26] 🎮 WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation

[05:26] 🧠 TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

[06:12] 🤔 Thinking in Uncertainty: Mitigating Hallucinations in MLRMs with Latent Entropy-Aware Decoding

[07:02] 🔄 Online Experiential Learning for Language Models

[07:54] 📊 FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use

[08:47] 🚀 Rethinking UMM Visual Generation: Masked Modeling for Efficient Image-Only Pre-training

[09:30] 🧭 WiT: Waypoint Diffusion Transformers via Trajectory Conflict Navigation

[10:20] 🔍 AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents

[11:03] 🎨 SegviGen: Repurposing 3D Generative Model for Part Segmentation

[11:59] 🗣 SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

【Follow Us】

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu (小红书): AI速递
