HuggingFace Daily AI Papers Digest - 2025.03.21 | Distillation improves super-resolution efficiency; optimized reasoning reduces computational overhead. - EarsOnMe


Duration: 10 minutes


Host: 拨号上网

Description

This episode covers the following 15 papers:

[00:23] 🖼 One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation

[01:01] 🤔 Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

[01:38] 🚀 Unleashing Vecset Diffusion Model for Fast Shape Generation

[02:18] 🤖 Survey on Evaluation of LLM-based Agents

[02:56] 🎨 DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

[03:33] 🤖 Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning

[04:14] 🖼 Scale-wise Distillation of Diffusion Models

[04:54] 🗜 Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models

[05:36] 🧮 MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion

[06:17] 🖼 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

[06:56] 🎮 JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

[07:41] 🧠 CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners

[08:26] 🖼 Ultra-Resolution Adaptation with Ease

[09:04] 🎨 Expert Race: A Flexible Routing Strategy for Scaling Diffusion Transformer with Mixture of Experts

[09:48] 🎬 MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance