HuggingFace Daily AI Paper Digest - 2025.04.25 | Open-source models surpass closed-source ones; new evaluation metrics improve generation quality - EarsOnMe

Duration: 10 minutes

Plays: 132

Published: 4 months ago

The 15 papers in this episode:

[00:24] 🖼 Step1X-Edit: A Practical Framework for General Image Editing

[01:05] 🖼 RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation

[01:48] 🤖 Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

[02:22] 🖼 Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs

[03:02] 🧠 Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation

[03:42] ⚖ QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining

[04:19] 🖼 Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models

[04:58] 🖼 Distilling semantically aware orders for autoregressive image generation

[05:38] 🗜 DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs

[06:17] 🇪🇸 IberBench: LLM Evaluation on Iberian Languages

[07:01] 🧠 Process Reward Models That Think

[07:46] 🎨 Boosting Generative Image Modeling via Joint Image-Feature Synthesis

[08:21] 🎬 ViSMaP: Unsupervised Hour-long Video Summarisation by Meta-Prompting

[09:02] 👗 3DV-TON: Textured 3D-Guided Consistent Video Try-on via Diffusion Models

[09:44] 📹 TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos

[Follow us]

You can also find us on the following platform for more beyond the podcast itself:

Xiaohongshu: AI速递
