Duration: 13 minutes
Plays: 56
Published: 1 day ago
【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.
Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
【Contents】
The 15 papers covered in this episode:
[00:33] 🤖 ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas
[01:22] 🛡 THINKSAFE: Self-Generated Safety Alignment for Reasoning Models
[02:18] 🧠 TTCS: Test-Time Curriculum Synthesis for Self-Evolving
[03:09] 🍌 PaperBanana: Automating Academic Illustration for AI Scientists
[03:51] 🔬 FourierSampler: Unlocking Non-Autoregressive Potential in Diffusion Language Models via Frequency-Guided Generation
[04:40] 🧠 ReGuLaR: Variational Latent Reasoning Guided by Rendered Chain-of-Thought
[05:22] 🎯 SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization
[06:02] 🎯 DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment
[07:08] 🧠 Pushing the Boundaries of Natural Reasoning: Interleaved Bonus from Formal-Logic Verification
[07:55] 📄 PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing
[08:45] 🎬 DreamActor-M2: Universal Character Image Animation via Spatiotemporal In-Context Learning
[09:42] 🧠 MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning
[10:24] 🦢 Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
[11:13] 📊 Statistical Estimation of Adversarial Risk in Large Language Models under Best-of-N Sampling
[12:00] ⚡ RM-RF: Reward Model for Run-Free Unit Test Evaluation
【Follow us】
You can also find us on the following platform for more beyond the podcast:
Xiaohongshu: AI速递