Hugging Face Daily AI Paper Digest - 2024.07.22 Daily AI Papers | Vision-Language Models, Long-Context LLM Inference, Text-to-3D Generation - EarsOnMe

Duration: 9 minutes

Published: 1 year ago

Description

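Hello everyone, and welcome to the "Hugging Face Daily AI Paper Digest". Today is July 22, 2024, and we will take you on a quick tour of today's 15 trending AI papers, covering vision-language models, long-context LLM inference, text-to-3D generation, and other frontier areas. Let's get started!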

[00:25] 🧠 EVLM: An Efficient Vision-Language Model for Visual Understanding

[00:55] 📚 ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

[01:32] ⚡ LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference

[02:05] 🤖 The Vision of Autonomic Computing: Can LLMs Make It a Reality?

[02:35] 🔊 Stable Audio Open

[03:07] 📄 VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding

[03:39] 📄 Visual Text Generation in the Wild

[04:10] 🚀 Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders

[04:44] 🔬 SciCode: A Research Coding Benchmark Curated by Scientists

[05:16] 🚀 Fast Matrix Multiplications for Lookup Table-Quantized LLMs

[05:51] 🌐 PlacidDreamer: Advancing Harmony in Text-to-3D Generation

[06:28] 🔄 Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle

[06:59] 🎵 Efficient Audio Captioning with Encoder-Level Knowledge Distillation

[07:27] 📚 Qalam: A Multimodal LLM for Arabic Optical Character and Handwriting Recognition

[08:03] 🌐 SparseCraft: Few-Shot Neural Reconstruction through Stereopsis Guided Geometric Linearization

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递
