Host
Episode Description
Source: 小宇宙 (Xiaoyuzhou)
【Sponsor】
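Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.
Link 🔗https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
【Contents】
The 15 papers in this episode are as follows: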
[00:29] 🧠 AI Can Learn Scientific Taste
[01:13] 🔍 OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data
[02:06] 🏢 EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings
[03:00] 🌆 Grounding World Simulation Models in a Real-World Metropolis
[03:53] 🤖 HSImul3R: Physics-in-the-Loop Reconstruction of Simulation-Ready Human-Scene Interactions
[04:39] 🧠 Attention Residuals
[05:38] 🧠 Mixture-of-Depths Attention
[06:44] 🧠 Effective Distillation to Hybrid xLSTM Architectures
[07:23] 🔍 Anatomy of a Lie: A Multi-Stage Diagnostic Framework for Tracing Hallucinations in Vision-Language Models
[08:14] 🎬 ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer
[08:54] 🚀 POLCA: Stochastic Generative Optimization with LLM
[10:00] 🤖 Safe and Scalable Web Agent Learning via Recreated Websites
[10:45] 🔍 Make it SING: Analyzing Semantic Invariants in Classifiers
[11:28] ⏱ TERMINATOR: Learning Optimal Exit Points for Early Stopping in Chain-of-Thought Reasoning
[12:30] 🎬 WebVR: Benchmarking Multimodal LLMs for WebPage Recreation from Videos via Human-Aligned Visual Rubrics
【Follow Us】
You can also find us on the following platforms for more beyond the podcast itself:
小红书 (Xiaohongshu): AI速递