HuggingFace Daily AI Paper Digest - 2024.09.27 Daily AI Papers | Improved 3D awareness, reduced compute overhead. - EarsOnMe

Duration: 8 minutes

Published: 11 months ago

Description

The 12 papers in this episode:

[00:27] 🌐 LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness

[01:10] 🧩 MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

[01:49] 🎭 EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

[02:35] 🌸 Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

[03:15] ⚡ Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction

[03:58] 🖼 Pixel-Space Post-Training of Latent Diffusion Models

[04:36] 🔍 Reducing the Footprint of Multi-Vector Retrieval with Minimal Performance Impact via Token Pooling

[05:17] 🎭 Disco4D: Disentangled 4D Human Generation and Animation from a Single Image

[05:55] 🧠 Instruction Following without Instruction Tuning

[06:30] 📊 The Imperative of Conversation Analysis in the Era of LLMs: A Survey of Tasks, Techniques, and Trends

[07:07] 🤖 Robot See Robot Do: Imitating Articulated Object Manipulation with Monocular 4D Reconstruction

[07:43] ⚽ Enhancing Structured-Data Retrieval with GraphRAG: Soccer Data Case Study

【Follow Us】

You can also find us on the following platform for more information beyond the podcast content:

Xiaohongshu: AI速递
