Host
Episode description
Source: Xiaoyuzhou
【Contents】
The 15 papers in this episode are:
00:24 🚀 Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation
01:08 🚗 OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation
01:54 🤖 Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence
02:41 🎮 OpenGame: Open Agentic Coding for Games
03:48 🤖 MultiWorld: Scalable Multi-Agent Multi-View Video World Models
04:44 🎬 EasyVideoR1: Easier RL for Video Understanding
05:42 🧭 WebCompass: Towards Multimodal Web Coding Evaluation for Code Language Models
06:46 🧠 GFT: From Imitation to Reward Fine-Tuning with Unbiased Group Advantages and Dynamic Coefficient Rectification
07:34 🧠 SkillFlow: Benchmarking Lifelong Skill Discovery and Evolution for Autonomous Agents
08:22 🧩 Crowded in B-Space: Calibrating Shared Directions for LoRA Merging
09:13 🧠 When Can LLMs Learn to Reason with Weak Supervision?
10:04 🤖 ClawEnvKit: Automatic Environment Generation for Claw-Like Agents
10:52 🎬 OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video
11:35 🧬 Training LLM Agents for Spontaneous, Reward-Free Self-Evolution via World Knowledge Exploration
12:26 🧮 MathNet: a Global Multimodal Benchmark for Mathematical Reasoning and Retrieval
【Follow Us】
You can also find us on the following platforms for more information beyond the podcast itself:
Xiaohongshu: AI速递