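Episode description
Source: 小宇宙 (Xiaoyuzhou)
【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.
Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
【Contents】
The 15 papers covered in this episode: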
[00:31] ⚠ The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies
[01:24] 🎵 MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models
[02:28] 🧠 Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation
[03:05] 🤖 GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning (VLA: vision-language-action model)
[03:56] ⚖ LawThinker: A Deep Research Legal Agent in Dynamic Environments
[04:33] 🔍 Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning
[05:16] 🎨 Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching
[06:01] 🚀 DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing
[06:55] 🧩 Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models
[07:38] 🧠 Thinking with Drafting: Optical Decompression via Logical Reconstruction
[08:17] 🗳 dVoting: Fast Voting for dLLMs (dLLMs: diffusion large language models)
[09:09] 🤖 RISE: Self-Improving Robot Policy with Compositional World Model
[09:54] 🤖 χ₀: Resource-Aware Robust Manipulation via Taming Distributional Inconsistencies
[10:48] 🤖 EgoHumanoid: Unlocking In-the-Wild Loco-Manipulation with Robot-Free Egocentric Demonstration
[11:45] 🔍 Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation
【Follow Us】
You can also find us on the following platform for more information beyond the podcast:
小红书 (Xiaohongshu): AI速递