HuggingFace Daily AI Paper Digest - 2025.03.13 | Reducing the compute demands of video diffusion models and improving multi-view video generation quality. - EarsOnMe

Duration: 10 minutes

Plays: 106

Published: 5 months ago

Description

The 15 papers in this episode are:

[00:20] 🎥 TPDiff: Temporal Pyramid Video Diffusion Model

[00:58] 🎥 Reangle-A-Video: 4D Video Generation as Video-to-Video Translation

[01:42] 🧠 Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

[02:18] 🎯 RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling

[02:55] 🧠 GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training

[03:36] 📄 More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG

[04:19] 💃 Motion Anything: Any to Motion Generation

[05:15] 📊 WildIFEval: Instruction Following in the Wild

[05:49] 📹 VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary

[06:29] 🤖 Quantizing Large Language Models for Code Generation: A Differentiated Replication

[07:13] 🧠 Cost-Optimal Grouped-Query Attention for Long-Context LLMs

[07:53] 🧬 Multimodal Language Modeling for High-Accuracy Single Cell Transcriptomics Analysis and Generation

[08:33] 🔄 Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space

[09:15] 🔄 Self-Taught Self-Correction for Small Language Models

[09:49] 🧩 MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System

[Follow Us]

You can also find us on the following platforms for more information beyond the podcast:

Xiaohongshu: AI速递
