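HuggingFace Daily AI Paper Digest - 2026.03.24 | Gaps in Interaction-Centric Evaluation of World Models; Ultra-Fast Generation with a Single-Stream Architecture - EarsOnMe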

Episode description

Source: Xiaoyuzhou

[Sponsor]
Listen to AI每周谈 (AI Weekly Talk) on your commute. Each week, it recaps the previous week's major AI news.
Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
[Contents]
This episode covers the following 15 papers:
[00:32] 🧪 Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models
[01:13] 🚀 Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model
[01:55] 🧠 LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning
[02:42] 🔍 VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding
[03:30] 🧠 SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning
[04:10] 🎯 F4Splat: Feed-Forward Predictive Densification for Feed-Forward 3D Gaussian Splatting
[05:03] 🎬 Manifold-Aware Exploration for Reinforcement Learning in Video Generation
[05:56] ⚖ mSFT: Addressing Dataset Mixtures Overfiting Heterogeneously in Multi-task SFT
[06:46] 🧠 Group3D: MLLM-Driven Semantic Grouping for Open-Vocabulary 3D Object Detection
[07:35] 🔄 Repurposing Geometric Foundation Models for Multi-view Diffusion
[08:21] 🤖 RoboAlign: Learning Test-Time Reasoning for Language-Action Alignment in Vision-Language-Action Models
[09:15] 🔍 OpenResearcher: A Fully Open Pipeline for Long-Horizon Deep Research Trajectory Synthesis
[10:02] 💭 BubbleRAG: Evidence-Driven Retrieval-Augmented Generation for Black-Box Knowledge Graphs
[10:54] ⚖ SEM: Sparse Embedding Modulation for Post-Hoc Debiasing of Vision-Language Models
[11:43] 🧭 On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation
[Follow us]
You can also find us on the following platforms for more than what the podcast covers:
Xiaohongshu: AI速递

