This episode covers the following 5 papers:
[00:35] TOP1(🔥53) | 🧠 SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
[03:02] TOP2(🔥48) | 🧠 Humanity's Last Exam
[05:21] TOP3(🔥47) | 🛡 GuardReasoner: Towards Reasoning-based LLM Safeguards
[07:44] TOP4(🔥45) | 🎙 Baichuan-Omni-1.5 Technical Report
[10:07] TOP5(🔥42) | 📚 Qwen2.5-1M Technical Report
[Follow us] You can also find us on the following platforms for more information beyond the podcast. Xiaohongshu: AI速递
This episode covers the following 8 papers:
[00:25] 🛡 GuardReasoner: Towards Reasoning-based LLM Safeguards
[01:04] 🩺 MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding
[01:58] 🧠 Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
[02:40] 🌐 Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
[03:20] 🌍 PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding
[04:09] 🤖 WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
[05:04] 🛡 o3-mini vs DeepSeek-R1: Which One is Safer?
[05:41] 🤔 Large Language Models Think Too Fast To Explore Effectively
This episode covers the following 5 papers:
[00:25] 🧠 Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate
[01:10] 🌍 Exploring the sustainable scaling of AI dilemma: A projective study of corporations' AI environmental impacts
[01:50] 🌟 Atla Selene Mini: A General Purpose Evaluation Model
[02:27] ⚠ Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation
[03:06] 🦠 Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation
This episode covers the following 8 papers:
[00:26] 🧠 SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
[01:07] ⚡ Optimizing Large Language Model Training Using FP4 Quantization
[01:47] 📚 Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling
[02:30] 🧠 Open Problems in Mechanistic Interpretability
[03:14] 🌐 DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
[03:58] 🔍 Low-Rank Adapters Meet Neural Architecture Search for LLM Compression
[04:41] 🌐 IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding
[05:27] 📚 Histoires Morales: A French Dataset for Assessing Moral Alignment
This episode covers the following 9 papers:
[00:26] 🎙 Baichuan-Omni-1.5 Technical Report
[01:03] 📚 Qwen2.5-1M Technical Report
[01:47] 🤖 Towards General-Purpose Model-Free Reinforcement Learning
[02:25] 🗣 Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation
[03:07] 🧠 ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
[03:52] 🧠 iFormer: Integrating ConvNet and Transformer for Mobile Application
[04:38] 🧠 Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
[05:19] 🧠 Mixture-of-Mamba: Enhancing Multi-Modal State-Space Models with Modality-Aware Sparsity
[06:09] 📊 Feasible Learning
This episode covers the following 9 papers:
[00:25] 🧠 Humanity's Last Exam
[01:06] 📊 Redundancy Principles for MLLMs Benchmarks
[01:45] 🔗 Chain-of-Retrieval Augmented Generation
[02:24] 📊 RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques
[03:12] 👤 Relightable Full-Body Gaussian Codec Avatars
[03:57] 📷 AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation
[04:40] 🌀 Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration
[05:20] 🌐 Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning
[06:01] 🌍 GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing
This episode covers the following 5 papers:
[00:37] TOP1(🔥167) | 🧠 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
[02:59] TOP2(🔥95) | 🧠 Evolving Deeper LLM Thinking
[05:07] TOP3(🔥73) | 🤔 Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
[07:15] TOP4(🔥73) | 🎥 MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
[09:29] TOP5(🔥64) | 👁 VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
This episode covers the following 15 papers:
[00:26] 🧠 SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
[01:05] 🎥 Improving Video Generation with Human Feedback
[01:40] ⚡ Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models
[02:20] 🖼 Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step
[02:55] 🖼 IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models
[03:32] 📚 Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos
[04:14] 🎥 DiffuEraser: A Diffusion Model for Video Inpainting
[04:50] 🎥 Temporal Preference Optimization for Long-Form Video Understanding
[05:29] 🎨 One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
[06:07] 🎥 EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion
[06:42] 🧠 Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
[07:17] 🧠 Debate Helps Weak-to-Strong Generalization
[07:53] 🤔 Evolution and The Knightian Blindspot of Machine Learning
[08:30] 🧪 Hallucinations Can Improve Large Language Models in Drug Discovery
[09:10] 🌀 GSTAR: Gaussian Surface Tracking and Reconstruction
This episode covers the following 9 papers:
[00:24] 🧠 DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
[01:07] 🎬 FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces
[01:48] 🔄 Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
[02:25] 👁 VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
[03:03] 🚀 Kimi k1.5: Scaling Reinforcement Learning with LLMs
[03:40] 🧠 Autonomy-of-Experts Models
[04:18] 🏆 Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament
[05:01] ✂ O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
[05:34] 🤖 IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems
This episode covers the following 16 papers:
[00:24] 🤔 Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
[00:59] 🎥 MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
[01:35] ⚖ Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models
[02:17] 🤖 UI-TARS: Pioneering Automated GUI Interaction with Native Agents
[02:55] 🤖 Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
[03:31] 🎨 TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space
[04:14] 🏆 InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
[04:57] 🎥 Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
[05:39] 🤖 Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments
[06:18] 🧠 Reasoning Language Models: A Blueprint
[06:58] 🎨 Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation
[07:40] 🧠 Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
[08:21] 🎥 EMO2: End-Effector Guided Audio-Driven Avatar Video Generation
[08:55] 🎥 Go-with-the-Flow: Motion-Controllable Video Diffusion Models Using Real-Time Warped Noise
[09:32] 🌍 GPS as a Control Signal for Image Generation
[10:11] ⚠ MSTS: A Multimodal Safety Test Suite for Vision-Language Models
This episode covers the following 2 papers:
[00:27] 🎮 GameFactory: Creating New Games with Generative Interactive Videos
[01:00] 🎥 VideoWorld: Exploring Knowledge Learning from Unlabeled Videos
This episode covers the following 9 papers:
[00:28] 🧠 Evolving Deeper LLM Thinking
[01:04] 🔍 PaSa: An LLM Agent for Comprehensive Academic Paper Search
[01:41] 🎨 Textoon: Generating Vivid 2D Cartoon Characters from Text Descriptions
[02:18] 🤔 Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong
[02:53] 🌍 Bridging Language Barriers in Healthcare: A Study on Arabic LLMs
[03:28] 🎬 X-Dyna: Expressive Dynamic Human Image Animation
[04:04] 🎙 HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution
[04:43] 🔍 ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario
[05:23] 🎭 GaussianAvatar-Editor: Photorealistic Animatable Gaussian Head Avatar Editor