HuggingFace Daily AI Paper Digest - 2025.07.14 | Efficient Reasoning Path Selection; Compressive Light-Field Token Rendering - EarsOnMe

Episode Description

Source: Xiaoyuzhou

The 14 papers in this episode are:

[00:22] 🧠 Test-Time Scaling with Reflective Generative Model

[00:59] 💡 CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering

[01:34] 💻 NeuralOS: Towards Simulating Operating Systems via Neural Generative Models

[02:19] 🧠 KV Cache Steering for Inducing Reasoning in Small Language Models

[03:03] 🧠 Neural-Driven Image Editing

[03:42] 🎬 Lumos-1: On Autoregressive Video Generation from a Unified Model Perspective

[04:27] 🧠 Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning

[05:14] 🧩 From One to More: Contextual Part Latents for 3D Generation

[05:53] 🤖 One Token to Fool LLM-as-a-Judge

[06:32] 🖼 Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Image Generation

[07:16] 🔭 What Has a Foundation Model Found? Using Inductive Bias to Probe for World Models

[08:00] 🚀 Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities

[08:48] 🚀 BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity

[09:25] 😵 Robust Multimodal Large Language Models Against Modality Conflict

【Follow Us】

You can also find us on the following platform for more information beyond the podcast itself:

Xiaohongshu: AI速递
