Duration:
12 minutes
Plays:
110
Published:
1 week ago
【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.
Link 🔗https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
【Contents】
The 15 papers in this episode:
[00:29] 🧠 ERNIE 5.0 Technical Report
[01:11] ⚡ FASA: Frequency-aware Sparse Attention
[02:01] 📊 Training Data Efficiency in Multimodal Process Reward Models
[02:44] 🤖 WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning
[03:28] ⚡ OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models
[04:21] ⚡ HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing
[05:02] 🤖 EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models
[06:05] 🎬 Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization
[06:59] 🤖 SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation
[07:44] 🔍 TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents
[08:21] 🧠 Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers
[09:12] 🤖 Rethinking the Trust Region in LLM Reinforcement Learning
[09:54] ♻ Residual Context Diffusion Language Models
[10:40] 🧱 HY3D-Bench: Generation of 3D Assets
[11:34] 🎨 AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
【Follow Us】
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递