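Hello everyone, and welcome to 'Hugging Face Daily AI Paper Express'. Today is August 12, 2024, and we will take you on a quick tour of today's 10 trending AI papers, spanning omni-modal large language models, multimodal understanding, visual reasoning, and other frontier areas. Let's dive right in.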
[00:24] 🌐 VITA: Towards Open-Source Interactive Omni Multimodal LLM
[00:58] 🦉 mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
[01:42] 🔍 Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2
[02:19] 🔍 UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling
[03:00] 📊 ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities
[03:53] 🔄 MulliVC: Multi-lingual Voice Conversion With Cycle Consistency
[04:36] 🔄 BRAT: Bonus oRthogonAl Token for Architecture Agnostic Textual Inversion
[05:14] 🧠 Generating novel experimental hypotheses from language models: A case study on cross-dative generalization
[05:52] 🎙 MooER: LLM-based Speech Recognition and Translation Models from Moore Threads
[06:40] 📹 Kalman-Inspired Feature Propagation for Video Face Super-Resolution

[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递
