2025.01.28 | Baichuan's multimodal model performs strongly; long-context processing costs fall.
HuggingFace Daily AI Paper Digest
The 9 papers in this episode:
[00:26] 🎙 Baichuan-Omni-1.5 Technical Report
[01:03] 📚 Qwen2.5-1M Technical Report
[01:47] 🤖 Towards General-Purpose Model-Free Reinforcement Learning
[02:25] 🗣 Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
[03:07] 🧠 ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
[03:52] 🧠 iFormer: Integrating ConvNet and Transformer for Mobile Application
[04:38] 🧠 Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
[05:19] 🧠 Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity
[06:09] 📊 Feasible Learning

[Follow us] You can also find us on the following platform for more than what the podcast covers:
Xiaohongshu: AI速递