2026.03.11 | Geometry-Guided RL for 3D Editing; Masked-Diffusion Multimodal Models
HuggingFace Daily AI Paper Digest
【Sponsor】
Listen to AI每周谈 (AI Weekly Talk) on your commute. Each week it recaps the previous week's major AI news.
Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
【Contents】
The 15 papers in this episode:
[00:32] 🎨 Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing
[01:11] 🔄 Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion
[02:06] 🧠 Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
[02:55] 🚀 MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data
[03:41] 🧠 InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing
[04:34] 🏸 Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports
[05:15] 🔍 Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs
[06:01] 🗣 Fish Audio S2 Technical Report
[06:48] 🎧 Are Audio-Language Models Listening? Audio-Specialist Heads for Adaptive Audio Steering
[07:45] 📱 MiniAppBench: Evaluating the Shift from Text to Interactive HTML Responses in LLM-Powered Assistants
[08:48] 🔍 VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning?
[09:34] 🗣 Do What I Say: A Spoken Prompt Dataset for Instruction-Following
[10:20] 🎬 Streaming Autoregressive Video Generation via Diagonal Distillation
[11:08] 🧪 Test-Driven AI Agent Definition (TDAD): Compiling Tool-Using Agents from Behavioral Specifications
[11:58] ⚖ Decoupling Reasoning and Confidence: Resurrecting Calibration in Reinforcement Learning from Verifiable Rewards
【Follow Us】
You can also find us on the following platforms for more beyond the podcast:
Xiaohongshu: AI速递