HuggingFace Daily AI Paper Digest - Episode List

2025.02.03 | Test-time scaling improves reasoning; reward-guided decoding reduces compute.

HuggingFace Daily AI Paper Digest

The 9 papers in this episode:
[00:26] 🧠 s1: Simple test-time scaling
[01:18] ⚡ Reward-Guided Speculative Decoding for Efficient LLM Reasoning
[02:00] 🧠 Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models
[02:41] 🛡 Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
[03:28] 🌍 DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning
[04:13] 🧠 Trading Inference-Time Compute for Adversarial Robustness
[04:54] 🧠 INT: Instance-Specific Negative Mining for Task-Generic Promptable Segmentation
[05:30] 📰 Unraveling the Capabilities of Language Models in News Summarization
[06:09] 🎥 Fast Encoder-Based 3D from Casual Videos via Point Track Processing

[Follow us] You can also find us on the following platforms for more information beyond the podcast. Xiaohongshu: AI速递. The transcript for this episode is available on Xiaoyuzhou.

7 min
92
1 year ago
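[Month-End Special] January's Hottest AI Papers | DeepSeek-R1 uses reinforcement learning to improve LLM reasoning; a breakthrough in long-text processing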

The 10 papers in this episode:
[00:40] TOP1 (🔥281) | 🧠 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
[03:13] TOP2 (🔥271) | ⚡ MiniMax-01: Scaling Foundation Models with Lightning Attention
[05:36] TOP3 (🔥249) | 🧠 rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
[08:13] TOP4 (🔥103) | 🧠 Evolving Deeper LLM Thinking
[10:28] TOP5 (🔥99) | 📚 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
[12:51] TOP6 (🔥90) | 🚀 REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
[15:15] TOP7 (🔥90) | 🧠 Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
[17:14] TOP8 (🔥89) | 📊 The Lessons of Developing Process Reward Models in Mathematical Reasoning
[19:33] TOP9 (🔥88) | 🤔 Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
[21:35] TOP10 (🔥87) | 🧠 The GAN is dead; long live the GAN! A Modern GAN Baseline

24 min
99+
1 year ago
2025.01.29 | RL generalizes better, SFT stabilizes output; FP4 quantization cuts cost while preserving accuracy.

The 8 papers in this episode:
[00:26] 🧠 SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
[01:07] ⚡ Optimizing Large Language Model Training Using FP4 Quantization
[01:47] 📚 Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling
[02:30] 🧠 Open Problems in Mechanistic Interpretability
[03:14] 🌐 DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
[03:58] 🔍 Low-Rank Adapters Meet Neural Architecture Search for LLM Compression
[04:41] 🌐 IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding
[05:27] 📚 Histoires Morales: A French Dataset for Assessing Moral Alignment

6 min
99+
1 year ago
2025.01.28 | Baichuan's multimodal model performs strongly; long-context processing costs drop.

The 9 papers in this episode:
[00:26] 🎙 Baichuan-Omni-1.5 Technical Report
[01:03] 📚 Qwen2.5-1M Technical Report
[01:47] 🤖 Towards General-Purpose Model-Free Reinforcement Learning
[02:25] 🗣 Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
[03:07] 🧠 ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
[03:52] 🧠 iFormer: Integrating ConvNet and Transformer for Mobile Application
[04:38] 🧠 Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
[05:19] 🧠 Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity
[06:09] 📊 Feasible Learning

7 min
82
1 year ago
2025.01.24 | SRMT improves multi-agent cooperation; VideoReward optimizes video generation quality.

The 15 papers in this episode:
[00:26] 🧠 SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
[01:05] 🎥 Improving Video Generation with Human Feedback
[01:40] ⚡ Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models
[02:20] 🖼 Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
[02:55] 🖼 IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
[03:32] 📚 Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos
[04:14] 🎥 DiffuEraser: A Diffusion Model for Video Inpainting
[04:50] 🎥 Temporal Preference Optimization for Long-Form Video Understanding
[05:29] 🎨 One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
[06:07] 🎥 EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion
[06:42] 🧠 Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
[07:17] 🧠 Debate Helps Weak-to-Strong Generalization
[07:53] 🤔 Evolution and The Knightian Blindspot of Machine Learning
[08:30] 🧪 Hallucinations Can Improve Large Language Models in Drug Discovery
[09:10] 🌀 GSTAR: Gaussian Surface Tracking and Reconstruction

10 min
77
1 year ago
2025.01.23 | DeepSeek-R1 uses reinforcement learning to improve reasoning; a multi-agent framework automates virtual filmmaking

The 9 papers in this episode:
[00:24] 🧠 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
[01:07] 🎬 FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces
[01:48] 🔄 Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
[02:25] 👁 VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
[03:03] 🚀 Kimi k1.5: Scaling Reinforcement Learning with LLMs
[03:40] 🧠 Autonomy-of-Experts Models
[04:18] 🏆 Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament
[05:01] ✂ O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
[05:34] 🤖 IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems

6 min
99+
1 year ago
2025.01.22 | Agent-R improves language models' real-time error correction; MMVU evaluates expert-level multi-discipline video understanding.

The 16 papers in this episode:
[00:24] 🤔 Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
[00:59] 🎥 MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
[01:35] ⚖ Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
[02:17] 🤖 UI-TARS: Pioneering Automated GUI Interaction with Native Agents
[02:55] 🤖 Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
[03:31] 🎨 TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space
[04:14] 🏆 InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
[04:57] 🎥 Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
[05:39] 🤖 Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments
[06:18] 🧠 Reasoning Language Models: A Blueprint
[06:58] 🎨 Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
[07:40] 🧠 Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
[08:21] 🎥 EMO2: End-Effector Guided Audio-Driven Avatar Video Generation
[08:55] 🎥 Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise
[09:32] 🌍 GPS as a Control Signal for Image Generation
[10:11] ⚠ MSTS: A Multimodal Safety Test Suite for Vision-Language Models

11 min
99+
1 year ago
