HuggingFace Daily AI Paper Digest - 2026.02.13 | Self-Evolving AI Struggles to Stay Safe; A Unified Tokenizer for Large Audio Models - EarsOnMe

Episode description

Source: Xiaoyuzhou (小宇宙)

【Sponsor】

Listen to AI每周谈 (AI Weekly Talk) on your commute. Each week it recaps the previous week's major AI news.

Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd

【Contents】

The 15 papers in this episode:

[00:31] ⚠ The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies

[01:24] 🎵 MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models

[02:28] 🧠 Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation

[03:05] 🤖 GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning (VLA: vision-language-action model)

[03:56] ⚖ LawThinker: A Deep Research Legal Agent in Dynamic Environments

[04:33] 🔍 Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

[05:16] 🎨 Stroke of Surprise: Progressive Semantic Illusions in Vector Sketching

[06:01] 🚀 DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing

[06:55] 🧩 Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models

[07:38] 🧠 Thinking with Drafting: Optical Decompression via Logical Reconstruction

[08:17] 🗳 dVoting: Fast Voting for dLLMs (dLLMs: diffusion large language models)

[09:09] 🤖 RISE: Self-Improving Robot Policy with Compositional World Model

[09:54] 🤖 χ₀: Resource-Aware Robust Manipulation via Taming Distributional Inconsistencies

[10:48] 🤖 EgoHumanoid: Unlocking In-the-Wild Loco-Manipulation with Robot-Free Egocentric Demonstration

[11:45] 🔍 Unveiling Implicit Advantage Symmetry: Why GRPO Struggles with Exploration and Difficulty Adaptation

【Follow us】

You can also find us on the following platform for more beyond the podcast itself:

Xiaohongshu (小红书): AI速递
