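Episode description
Source: 小宇宙 (Xiaoyuzhou)
【Sponsor】
Listen to AI每周谈 (AI Weekly Talk) on your commute. Every week, AI每周谈 recaps the previous week's major AI news.
Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
【Contents】
The 15 papers in this episode are: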
[00:27] 🎬 CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents
[01:24] 🎬 EVA: Efficient Reinforcement Learning for End-to-End Video Agent
[02:05] 🛡 T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search
[02:50] 🤖 UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience
[03:33] 🤔 Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?
[04:20] 🎮 GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents
[05:13] 🧠 When Models Judge Themselves: Unsupervised Self-Evolution for Multimodal Reasoning
[06:11] 🤖 CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare
[07:13] 🌀 4DGS360: 360° Gaussian Reconstruction of Dynamic Objects from a Single Video
[07:54] 🎬 OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning
[08:38] 🚗 Toward Physically Consistent Driving Video World Models under Challenging Trajectories
[09:18] 📊 Can LLM Agents Be CFOs? A Benchmark for Resource Allocation in Dynamic Enterprise Environments
[10:10] 🧠 Unleashing Spatial Reasoning in Multimodal Large Language Models via Textual Representation Guided Reasoning
[10:53] 🤖 StreamingClaw Technical Report
[11:30] 🔍 LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis
【Follow us】
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu (小红书): AI速递