Host
Episode description
Source: Xiaoyuzhou (小宇宙)
【Sponsor】
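Listen to AI每周谈 (AI Weekly Talk) on your commute. Every week, AI每周谈 walks you through the previous week's major AI news.
Link: 🔗www.xiaoyuzhoufm.com
【Contents】
The 15 papers covered in this episode: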
00:33 🤖 ClawGUI: A Unified Framework for Training, Evaluating, and Deploying GUI Agents
01:21 🧠 KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance
02:16 🧠 Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
03:09 🤖 Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization
04:01 🧠 SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks
04:47 🤖 Toward Autonomous Long-Horizon Engineering for ML Research
05:33 ⚖ BERT-as-a-Judge: A Robust Alternative to Lexical Methods for Efficient Reference-Based LLM Evaluation
06:17 🔍 Towards Long-horizon Agentic Multimodal Search
06:57 🌍 Lyra 2.0: Explorable Generative 3D Worlds
07:40 ⚡ Self-Adversarial One Step Generation via Condition Shifting
08:37 🤖 Habitat-GS: A High-Fidelity Navigation Simulator with Dynamic Gaussian Splatting
09:20 ⚖ Many-Tier Instruction Hierarchy in LLM Agents
10:04 🚀 Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning
10:52 🧠 Rethinking the Diffusion Model from a Langevin Perspective
11:44 🤖 LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment
【Follow us】
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu (小红书): AI速递