HuggingFace Daily AI Paper Express (HuggingFace 每日AI论文速递)

Trending AI papers in a 10-minute read

拨号上网 佚名
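14,000 subscribers · 587 episodes · 6 days ago
About the podcast
In 10 minutes a day, get a quick overview of the day's trending AI papers on HuggingFace. Updated every weekday; subscriptions welcome. 📢 To find the podcast, search for 【HuggingFace 每日AI论文速递】 on 小宇宙 (Xiaoyuzhou) or Apple Podcast. 🖼 An illustrated text edition is also available: search for and follow 【AI速递】 on 小红书 (Xiaohongshu).
Episodes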

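2026.05.01 | Eywa pairs LLMs with domain models for a 30% efficiency gain; visual generation's five-level progression is still stuck at level three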

[Contents] The 15 papers in this episode:
[00:25] 🧠 Heterogeneous Scientific Foundation Model Collaboration
[01:24] 🌍 Visual Generation in the New Era: An Evolution from Atomic Mapping to Agentic World Modeling
[02:04] 🧬 Co-Evolving Policy Distillation
[02:47] 🤖 ExoActor: Exocentric Video Generation as Generalizable Interactive Humanoid Control
[03:38] 🚀 Efficient Training on Multiple Consumer GPUs with RoundPipe
[04:17] 🧠 Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows
[05:08] 🎨 Leveraging Verifier-Based Reinforcement Learning in Image Editing
[06:18] 📏 Length Value Model: Scalable Value Pretraining for Token-Level Length Modeling
[07:15] 🔬 Intern-Atlas: A Methodological Evolution Graph as Research Infrastructure for AI Scientists
[08:31] 🌐 InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation?
[09:15] 🎨 Representation Fréchet Loss for Visual Generation
[10:05] 🖥 Synthetic Computers at Scale for Long-Horizon Productivity Simulation
[10:52] 🧠 Compliance versus Sensibility: On the Reasoning Controllability in Large Language Models
[11:25] 🤖 The Last Human-Written Paper: Agent-Native Research Artifacts
[12:14] 💃 MoCapAnything V2: End-to-End Motion Capture for Arbitrary Skeletons
[Follow us] You can also find us on the following platform for more beyond the podcast:
小红书 (Xiaohongshu): AI速递

13 min
67
1 week ago

2026.04.30 | GLM-5V trains multimodal capabilities all in one pot; latent-distillation sampling saves samples

[Contents] The 11 papers in this episode:
[00:22] 🤖 GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
[01:26] 🔬 Large Language Models Explore by Latent Distilling
[02:16] 🌊 Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models
[03:02] 🦾 ClawGym: A Scalable Framework for Building Effective Claw Agents
[03:49] 🤖 RADIO-ViPE: Online Tightly Coupled Multi-Modal Fusion for Open-Vocabulary Semantic SLAM in Dynamic Environments
[04:35] 🧩 Diffusion Templates: A Unified Plugin Framework for Controllable Diffusion
[05:20] 🚀 Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
[06:08] 🌍 Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising
[07:02] 💬 A Survey on LLM-based Conversational User Simulation
[07:55] 👗 FASH-iCNN: Making Editorial Fashion Identity Inspectable Through Multimodal CNN Probing
[08:43] 🧩 Probing Visual Planning in Image Editing Models
[Follow us] You can also find us on the following platform for more beyond the podcast:
小红书 (Xiaohongshu): AI速递

9 min
52
1 week ago

2026.04.29 | Recursive multi-agent nesting speeds things up; programming with data brings Git-style self-improvement

[Contents] The 15 papers in this episode:
[00:25] 🔄 Recursive Multi-Agent Systems
[01:01] 🔧 Programming with Data: Test-Driven Data Engineering for Self-Improving LLMs from Raw Corpora
[01:55] 📊 DV-World: Benchmarking Data Visualization Agents in Real-World Scenarios
[02:36] 🔬 AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery
[03:23] 🖼 Meta-CoT: Enhancing Granularity and Generalization in Image Editing
[04:07] 🎨 Refinement via Regeneration: Enlarging Modification Space Boosts Image Refinement in Unified Multimodal Models
[05:03] 🎥 Mutual Forcing: Dual-Mode Self-Evolution for Fast Autoregressive Audio-Video Character Generation
[05:46] 🎧 Step-Audio-R1.5 Technical Report
[06:26] 🎬 Co-Director: Agentic Generative Video Storytelling
[07:13] 🖥 Toward Scalable Terminal Task Synthesis via Skill Graphs
[07:57] 🎓 TCOD: Exploring Temporal Curriculum in On-Policy Distillation for Multi-turn Autonomous Agents
[08:53] 🛡 BARRED: Synthetic Training of Custom Policy Guardrails via Asymmetric Debate
[09:36] 🎓 MAIC-UI: Making Interactive Courseware with Generative UI
[10:35] 🎨 V-GRPO: Online Reinforcement Learning for Denoising Generative Models Is Easier than You Think
[11:15] 🏃 IAM: Identity-Aware Human Motion and Shape Joint Generation
[Follow us] You can also find us on the following platform for more beyond the podcast:
小红书 (Xiaohongshu): AI速递

12 min
88
1 week ago