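Hello everyone, and welcome to "Hugging Face 每日AI论文速递" (Hugging Face Daily AI Paper Express). Today is September 4, 2024, and we will take you on a quick tour of 16 trending AI papers spanning datasets, language models, video generation, and other areas. Let's jump right into today's papers.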
[00:22] 📊 Kvasir-VQA: A Text-Image Pair GI Tract Dataset
[00:58] 📚 LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models
[01:43] 🧠 OLMoE: Open Mixture-of-Experts Language Models
[02:23] 🎶 FLUX that Plays Music
[03:00] 📹 DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
[03:41] 🎥 VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges
[04:24] 🎥 Compositional 3D-aware Video Generation with LLM Director
[05:02] 🤖 Diffusion Policy Policy Optimization
[05:37] 🚀 LinFusion: 1 GPU, 1 Minute, 16K Image
[06:28] 🔍 ContextCite: Attributing Model Generation to Context
[07:05] 📺 OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model
[07:44] 📉 Accurate Compression of Text-to-Image Diffusion Models via Vector Quantization
[08:21] 🎥 Follow-Your-Canvas: Higher-Resolution Video Outpainting with Extensive Content Generation
[08:58] 🧠 Density Adaptive Attention-based Speech Network: Enhancing Feature Understanding for Mental Health Disorders
[09:35] 📚 Know When to Fuse: Investigating Non-English Hybrid Retrieval in the Legal Domain
[10:10] 📚 The MERIT Dataset: Modelling and Efficiently Rendering Interpretable Transcripts

[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递
