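Episode description
Source: 小宇宙 (Xiaoyuzhou)
【Sponsor】
Listen to AI每周谈 on your commute. Each week, AI每周谈 recaps the previous week's major AI news.
Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
【Contents】
The 15 papers in this episode are: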
[00:30] 🧠 FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization
[01:12] 🧩 LongCat-Next: Lexicalizing Modalities as Discrete Tokens
[01:48] 🚁 CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence
[02:31] 🧬 Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells
[03:33] 🤖 GEMS: Agent-Native Multimodal Generation with Memory and Skills
[04:12] 🎬 VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward
[05:04] 🤖 Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis
[05:45] 🔬 daVinci-LLM: Towards the Science of Pretraining
[06:19] 🎬 CutClaw: Agentic Hours-Long Video Editing via Music Synchronization
[07:10] 🔍 MonitorBench: A Comprehensive Benchmark for Chain-of-Thought Monitorability in Large Language Models
[07:58] 🧬 FlowPIE: Test-Time Scientific Idea Evolution with Flow-Guided Literature Exploration
[08:46] 🏙 Extend3D: Town-Scale 3D Generation
[09:28] 💭 Think Anywhere in Code Generation
[10:18] ⚙ OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training
[11:03] 🎨 VectorGym: A Multitask Benchmark for SVG Code Generation, Sketching, and Editing
【Follow us】
You can also find us on the following platform for more information beyond the podcast itself:
小红书 (Xiaohongshu): AI速递