HuggingFace Daily AI Paper Digest - 2025.03.19 | Advantages of Dynamic Sequence Modeling, Challenges in Video Generation and Understanding - EarsOnMe


Duration: 10 minutes

Plays: 164

Published: 5 months ago

Host: 拨号上网

Description:

This episode covers the following 15 papers:

[00:21] 🦢 RWKV-7 "Goose" with Expressive Dynamic State Evolution

[00:55] 🤯 Impossible Videos

[01:38] 🎨 Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

[02:17] 🤖 DAPO: An Open-Source LLM Reinforcement Learning System at Scale

[02:58] 🧠 DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

[03:39] 🖼 CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era

[04:25] 🤖 Infinite Mobility: Scalable High-Fidelity Synthesis of Articulated Objects via Procedural Generation

[05:13] 🧠 Frac-Connections: Fractional Extension of Hyper-Connections

[05:52] 🌍 Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

[06:30] 🧐 MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification

[07:13] 🤖 Aligning Multimodal LLM with Human Preference: A Survey

[07:51] ⏱ Measuring AI Ability to Complete Long Tasks

[08:38] 🎭 Concat-ID: Towards Universal Identity-Preserving Video Synthesis

[09:13] 🖼 FlexWorld: Progressively Expanding 3D Scenes for Flexiable-View Synthesis

[09:50] 🤔 Temporal Consistency for LLM Reasoning Process Error Identification

[Follow us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递
