This episode covers the following 20 papers:
[00:26] 🤔 The Differences Between Direct Alignment Algorithms are a Blur
[01:07] 🤖 OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
[01:48] 💡 Process Reinforcement through Implicit Rewards
[02:36] ⚖ Preference Leakage: A Contamination Problem in LLM-as-a-judge
[03:14] 🛡 SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model
[04:02] 🚀 FastKV: KV Cache Compression for Fast Long-Context Processing with Token-Selective Propagation
[04:50] 🌍 AIN: The Arabic INclusive Large Multimodal Model
[05:39] 🧠 DeepRAG: Thinking to Retrieval Step by Step for Large Language Models
[06:30] 🤔 MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models
[07:19] 🛡 Almost Surely Safe Alignment of Large Language Models at Inference-Time
[08:04] 🤔 ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning
[08:49] 🤔 The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles
[09:38] 🎮 Improving Transformer World Models for Data-Efficient RL
[10:22] 💡 Improved Training Technique for Latent Consistency Models
[11:07] 🧠 Scaling Embedding Layers in Language Models
[11:42] 🎨 SliderSpace: Decomposing the Visual Capabilities of Diffusion Models
[12:24] 🤔 PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models
[13:08] 🧠 Lifelong Sequential Knowledge Editing without Model Degradation
[13:46] 🔬 Current Pathology Foundation Models are unrobust to Medical Center Differences
[14:37] 🫀 A Study on the Performance of U-Net Modifications in Retroperitoneal Tumor Segmentation
This episode covers the following 9 papers:
[00:26] 🧠 s1: Simple test-time scaling
[01:18] ⚡ Reward-Guided Speculative Decoding for Efficient LLM Reasoning
[02:00] 🧠 Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models
[02:41] 🛡 Constitutional Classifiers: Defending against Universal Jailbreaks across Thousands of Hours of Red Teaming
[03:28] 🌍 DINO-WM: World Models on Pre-trained Visual Features enable Zero-shot Planning
[04:13] 🧠 Trading Inference-Time Compute for Adversarial Robustness
[04:54] 🧠 INT: Instance-Specific Negative Mining for Task-Generic Promptable Segmentation
[05:30] 📰 Unraveling the Capabilities of Language Models in News Summarization
[06:09] 🎥 Fast Encoder-Based 3D from Casual Videos via Point Track Processing
This episode covers the following 10 papers:
[00:40] TOP1(🔥281) | 🧠 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
[03:13] TOP2(🔥271) | ⚡ MiniMax-01: Scaling Foundation Models with Lightning Attention
[05:36] TOP3(🔥249) | 🧠 rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
[08:13] TOP4(🔥103) | 🧠 Evolving Deeper LLM Thinking
[10:28] TOP5(🔥99) | 📚 2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
[12:51] TOP6(🔥90) | 🚀 REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
[15:15] TOP7(🔥90) | 🧠 Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
[17:14] TOP8(🔥89) | 📊 The Lessons of Developing Process Reward Models in Mathematical Reasoning
[19:33] TOP9(🔥88) | 🤔 Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
[21:35] TOP10(🔥87) | 🧠 The GAN is dead; long live the GAN! A Modern GAN Baseline
This episode covers the following 5 papers:
[00:35] TOP1(🔥53) | 🧠 SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
[03:02] TOP2(🔥48) | 🧠 Humanity's Last Exam
[05:21] TOP3(🔥47) | 🛡 GuardReasoner: Towards Reasoning-based LLM Safeguards
[07:44] TOP4(🔥45) | 🎙 Baichuan-Omni-1.5 Technical Report
[10:07] TOP5(🔥42) | 📚 Qwen2.5-1M Technical Report
This episode covers the following 8 papers:
[00:25] 🛡 GuardReasoner: Towards Reasoning-based LLM Safeguards
[01:04] 🩺 MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
[01:58] 🧠 Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
[02:40] 🌐 Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
[03:20] 🌍 PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding
[04:09] 🤖 WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
[05:04] 🛡 o3-mini vs DeepSeek-R1: Which One is Safer?
[05:41] 🤔 Large Language Models Think Too Fast To Explore Effectively
This episode covers the following 5 papers:
[00:25] 🧠 Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate
[01:10] 🌍 Exploring the sustainable scaling of AI dilemma: A projective study of corporations' AI environmental impacts
[01:50] 🌟 Atla Selene Mini: A General Purpose Evaluation Model
[02:27] ⚠ Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation
[03:06] 🦠 Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation
This episode covers the following 8 papers:
[00:26] 🧠 SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
[01:07] ⚡ Optimizing Large Language Model Training Using FP4 Quantization
[01:47] 📚 Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling
[02:30] 🧠 Open Problems in Mechanistic Interpretability
[03:14] 🌐 DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
[03:58] 🔍 Low-Rank Adapters Meet Neural Architecture Search for LLM Compression
[04:41] 🌐 IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding
[05:27] 📚 Histoires Morales: A French Dataset for Assessing Moral Alignment
This episode covers the following 9 papers:
[00:26] 🎙 Baichuan-Omni-1.5 Technical Report
[01:03] 📚 Qwen2.5-1M Technical Report
[01:47] 🤖 Towards General-Purpose Model-Free Reinforcement Learning
[02:25] 🗣 Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
[03:07] 🧠 ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
[03:52] 🧠 iFormer: Integrating ConvNet and Transformer for Mobile Application
[04:38] 🧠 Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
[05:19] 🧠 Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity
[06:09] 📊 Feasible Learning
This episode covers the following 9 papers:
[00:25] 🧠 Humanity's Last Exam
[01:06] 📊 Redundancy Principles for MLLMs Benchmarks
[01:45] 🔗 Chain-of-Retrieval Augmented Generation
[02:24] 📊 RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques
[03:12] 👤 Relightable Full-Body Gaussian Codec Avatars
[03:57] 📷 AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation
[04:40] 🌀 Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration
[05:20] 🌐 Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning
[06:01] 🌍 GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing
This episode covers the following 5 papers:
[00:37] TOP1(🔥167) | 🧠 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
[02:59] TOP2(🔥95) | 🧠 Evolving Deeper LLM Thinking
[05:07] TOP3(🔥73) | 🤔 Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
[07:15] TOP4(🔥73) | 🎥 MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
[09:29] TOP5(🔥64) | 👁 VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
This episode covers the following 15 papers:
[00:26] 🧠 SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
[01:05] 🎥 Improving Video Generation with Human Feedback
[01:40] ⚡ Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models
[02:20] 🖼 Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
[02:55] 🖼 IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
[03:32] 📚 Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos
[04:14] 🎥 DiffuEraser: A Diffusion Model for Video Inpainting
[04:50] 🎥 Temporal Preference Optimization for Long-Form Video Understanding
[05:29] 🎨 One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
[06:07] 🎥 EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion
[06:42] 🧠 Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
[07:17] 🧠 Debate Helps Weak-to-Strong Generalization
[07:53] 🤔 Evolution and The Knightian Blindspot of Machine Learning
[08:30] 🧪 Hallucinations Can Improve Large Language Models in Drug Discovery
[09:10] 🌀 GSTAR: Gaussian Surface Tracking and Reconstruction
This episode covers the following 9 papers:
[00:24] 🧠 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
[01:07] 🎬 FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces
[01:48] 🔄 Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
[02:25] 👁 VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
[03:03] 🚀 Kimi k1.5: Scaling Reinforcement Learning with LLMs
[03:40] 🧠 Autonomy-of-Experts Models
[04:18] 🏆 Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament
[05:01] ✂ O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
[05:34] 🤖 IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems

【Follow us】 You can also find us on the platform below for more information beyond the podcast. 小红书 (Xiaohongshu): AI速递