Hugging Face Daily AI Paper Express - 2024.09.04 Daily AI Papers | Kvasir-VQA Improves Medical Diagnosis, LongRecipe Extends Language Model Context - EarsOnMe

Duration: 11 minutes

Published: 1 year ago

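Hello everyone, and welcome to "Hugging Face Daily AI Paper Express." Today is September 4, 2024, and we will take you on a quick tour of 16 trending AI papers spanning datasets, language models, video generation, and more. Let's dive right into today's paper roundup.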

[00:22] 📊 Kvasir-VQA: A Text-Image Pair GI Tract Dataset

[00:58] 📚 LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models

[01:43] 🧠 OLMoE: Open Mixture-of-Experts Language Models

[02:23] 🎶 FLUX that Plays Music

[03:00] 📹 DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

[03:41] 🎥 VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges

[04:24] 🎥 Compositional 3D-aware Video Generation with LLM Director

[05:02] 🤖 Diffusion Policy Policy Optimization

[05:37] 🚀 LinFusion: 1 GPU, 1 Minute, 16K Image

[06:28] 🔍 ContextCite: Attributing Model Generation to Context

[07:05] 📺 OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model

[07:44] 📉 Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization

[08:21] 🎥 Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation

[08:58] 🧠 Density Adaptive Attention-based Speech Network: Enhancing Feature Understanding for Mental Health Disorders

[09:35] 📚 Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain

[10:10] 📚 The MERIT Dataset: Modelling and Efficiently Rendering Interpretable Transcripts

[Follow Us]

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递
