The 15 papers in this episode are as follows:
[00:25] 🧠 Learning to Reason under Off-Policy Guidance
[01:00] 🤖 FlowReasoner: Reinforcing Query-Level Meta-Agents
[01:40] 🦅 Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models
[02:22] 🧰 ToolRL: Reward is All Tool Learning Needs
[03:07] 🌐 SphereDiff: Tuning-free Omnidirectional Panoramic Image and Video Generation via Spherical Latent Representation
[03:39] 🎨 StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians
[04:18] 🤖 X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents
[04:57] 🤖 UFO2: The Desktop AgentOS
[05:34] 🧑 LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs
[06:18] 👀 Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs
[07:02] 🤖 InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners
[07:42] 🕹 EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models
[08:23] 📱 LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark
[09:06] 🖼 LookingGlass: Generative Anamorphoses via Laplacian Pyramid Warping
[09:50] 🎵 DRAGON: Distributional Rewards Optimize Diffusion Generative Models
[Follow Us]
You can also find us on the following platform for more information beyond the podcast content:
Xiaohongshu (小红书): AI速递
