
Duration: 7 minutes
Plays: 116
Published: 2 weeks ago
The 15 papers in this episode are:
[00:22] 🧪 Intern-S1: A Scientific Multimodal Foundation Model
[00:46] 🤖 Mobile-Agent-v3: Foundamental Agents for GUI Automation
[01:10] ✅ Deep Think with Confidence
[01:31] 🤔 LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries
[02:01] 🎬 Waver: Wave Your Way to Lifelike Video Generation
[02:25] 🏞 SceneGen: Single-Image 3D Scene Generation in One Feedforward Pass
[02:56] 📚 A Survey on Large Language Model Benchmarks
[03:20] 🤸 ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling
[03:46] 🎨 Visual Autoregressive Modeling for Instruction-Guided Image Editing
[04:15] 🤖 aiXiv: A Next-Generation Open Access Ecosystem for Scientific Discovery Generated by AI Scientists
[04:40] 🗺 "Does the cafe entrance look accessible? Where is the door?" Towards Geospatial AI Agents for Visual Inquiries
[05:12] 🔍 When and What: Diffusion-Grounded VideoLLM with Entity Aware Segmentation for Long Video Understanding
[05:44] 💰 Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models
[06:08] ⚡ Snap-Snap: Taking Two Images to Reconstruct 3D Human Gaussians in Milliseconds
[06:37] 🫂 INTIMA: A Benchmark for Human-AI Companionship Behavior
[Follow Us]
You can also find us on the following platforms for more than what the podcast covers:
Xiaohongshu (小红书): AI速递