Episode Description
Source: Xiaoyuzhou (小宇宙)
【Contents】
The 11 papers in this episode:
[00:22] 🤖 GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
[01:26] 🔬 Large Language Models Explore by Latent Distilling
[02:16] 🌊 Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models
[03:02] 🦾 ClawGym: A Scalable Framework for Building Effective Claw Agents
[03:49] 🤖 RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments
[04:35] 🧩 Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion
[05:20] 🚀 Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
[06:08] 🌍 Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
[07:02] 💬 A Survey on LLM-based Conversational User Simulation
[07:55] 👗 FASH-iCNN: Making Editorial Fashion Identity Inspectable Through Multimodal CNN Probing
[08:43] 🧩 Probing Visual Planning in Image Editing Models
【Follow Us】
You can also find us on the following platform for more beyond the podcast:
Xiaohongshu (小红书): AI速递