Duration: 10 minutes
Plays: 126
Published: 3 weeks ago
Description
The 15 papers in this episode are:
[00:20] 🔍 Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning
[01:01] 👶 BabyVision: Visual Reasoning Beyond Language
[01:45] 🚀 PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning
[02:24] 🧠 X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests
[03:03] ⚡ MHLA: Restoring Expressivity of Linear Attention via Token-Level Multi-Head
[03:41] ⚡ GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts
[04:17] 🤖 OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent
[05:20] 📉 Lost in the Noise: How Reasoning Models Fail with Contextual Distractors
[06:00] 🚀 Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models
[06:30] 🧠 Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction
[07:10] 🚗 DrivingGen: A Comprehensive Benchmark for Generative Video World Models in Autonomous Driving
[07:58] 🤖 MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era
[08:26] 🎨 Boosting Latent Diffusion Models via Disentangled Representation Alignment
[09:08] 🤔 What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models
[09:45] 🔧 ET-Agent: Incentivizing Effective Tool-Integrated Reasoning Agent via Behavior Calibration
[Follow Us]
You can also find us on the following platforms for more beyond the podcast:
Xiaohongshu: AI速递