The 14 papers in this episode:
[00:23] 🤖 OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference
[01:06] ⚡ SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference
[01:53] 🖼 KV-Edit: Training-Free Image Editing for Precise Background Preservation
[02:32] 🌈 ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation
[03:08] 🤖 SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
[03:51] 📊 Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective
[04:30] 🧠 Scale-Distribution Decoupling: Enabling Stable and Effective Training of Large Language Models
[05:11] 🔄 K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs
[05:51] 🌐 WebGames: Challenging General-Purpose Web-Browsing AI Agents
[06:29] 🧠 Introducing Visual Perception Token into Multimodal Large Language Model
[07:07] 🎰 The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
[07:47] 🧠 AAD-LLM: Neural Attention-Driven Auditory Scene Understanding
[08:26] 🔍 LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models
[09:07] 🧠 Shakti-VLMs: Scalable Vision-Language Models for Enterprise AI

[Follow us]
You can also find us on the following platform for more content beyond the podcast:
Xiaohongshu: AI速递
