Duration: 10 minutes
Plays: 84
Published: 3 days ago
Host...
Description...
https://xiaoyuzhoufm.com
The 15 papers in this episode are:
[00:21] 📈 GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
[01:05] ⚖ Learnable Multipliers: Freeing the Scale of Language Model Matrix Layers
[01:33] 🌙 RL-AWB: Deep Reinforcement Learning for Auto White Balance Correction in Low-Light Night-time Scenes
[02:07] 🤖 RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation
[02:56] 🤝 RelayLLM: Efficient Reasoning via Collaborative Decoding
[03:31] 🌲 AT²PO: Agentic Turn-based Policy Optimization via Tree Search
[04:24] 🤔 VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
[04:57] 🎬 VerseCrafter: Dynamic Realistic Video World Model with 4D Geometric Control
[05:34] 🔍 The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models
[06:09] 🎯 Few Tokens Matter: Entropy Guided Attacks on Vision-Language Models
[06:40] 🎥 Plenoptic Video Generation
[07:12] ⚖ Agent-as-a-Judge
[07:43] 📄 DocDancer: Towards Agentic Document-Grounded Information Seeking
[08:20] 🧠 Re-Align: Structured Reasoning-guided Alignment for In-Context Image Generation and Editing
[09:05] 🧠 DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs
[Follow us]
You can also find us on the following platforms for more than what the podcast covers:
Xiaohongshu: AI速递