Kant's footsteps measured the ground of Königsberg, and also the boundaries of human reason.
00:00:26 The self-cultivation of AI: when machines start setting their own questions
00:03:54 Giving AI a brain that second-guesses itself
00:08:57 Why does your AI customer-service bot keep getting led astray?
00:13:36 AI's crash course: one simple move that lets machines instantly read our intent
00:17:36 How do you get the AI around you to "get it"?
Papers covered in this episode:
[LG] Self-Questioning Language Models [CMU] https://arxiv.org/abs/2508.03682
[LG] Cognitive Loop via In-Situ Optimization: Self-Adaptive Reasoning for Science [Microsoft] https://arxiv.org/abs/2508.02789
[CL] Highlight & Summarize: RAG without the jailbreaks [Microsoft] https://arxiv.org/abs/2508.02872
[CL] Cropping outperforms dropout as an augmentation strategy for training self-supervised text embeddings [University of Tübingen] https://arxiv.org/abs/2508.03453
[LG] Agent Lightning: Train ANY AI Agents with Reinforcement Learning [Microsoft Research] https://arxiv.org/abs/2508.03680
Real growth, for humans and AI alike, comes not from smooth, effortless instruction, but from the deep marks left behind each time we overcome one of those "necessary evils".
00:00:30 AI's "notebook of mistakes": how are masters made?
00:03:50 The self-cultivation of AI: how to keep improving like a master?
00:07:10 AI as apprentice: how can machines learn to surpass their teachers?
00:11:25 AI's art of playing dumb: can we still trust it?
The four papers covered in this episode:
[CL] WarriorMath: Enhancing the Mathematical Ability of Large Language Models with a Defect-aware Framework [Microsoft & Peking University] https://arxiv.org/abs/2508.01245
[LG] Refine-n-Judge: Curating High-Quality Preference Chains for LLM-Fine-Tuning [Meta Reality Labs] https://arxiv.org/abs/2508.01543
[LG] CRINN: Contrastive Reinforcement Learning for Approximate Nearest Neighbor Search [University of Washington & DeepReinforce Team] https://arxiv.org/abs/2508.02091
[LG] LLMs Can Covertly Sandbag on Capability Evaluations Against Chain-of-Thought Monitoring [University College London] https://arxiv.org/abs/2508.00943
In an age when AI can write poetry, compose music, and plan everything for us, what we may need most is to master the one "emotional algorithm" AI can never compute.
00:00:32 Does AI have weak subjects too? How masters step out of their comfort zone
00:05:55 AI's theory of evolution: from asking for handouts to the secrets of masters
00:10:42 A new approach to AI "check-ups": how to see through a model's hidden intentions?
00:15:17 What a million students taught us: simple may well be optimal
The four papers covered in this episode:
[LG] RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization [Tongyi Lab, Alibaba Group & Peking University] https://arxiv.org/abs/2508.00222
[LG] MetaAgent: Toward Self-Evolving Agent via Tool Meta-Learning [BAAI] https://arxiv.org/abs/2508.00271
[LG] Watch the Weights: Unsupervised monitoring and control of fine-tuned LLMs [Carnegie Mellon University (CMU)] https://arxiv.org/abs/2508.00161
[LG] Learning to Optimize Feedback for One Million Students: Insights from Multi-Armed and Contextual Bandits in Large-Scale Online Tutoring [Carnegie Mellon University (CMU) & CK-12 Foundation] https://arxiv.org/abs/2508.00270
What we spend our lives searching for may not be a precise destination, but a process we embrace wholeheartedly.
00:00:37 The Lego revolution in AI: how to make your model "live" in the moment
00:05:41 AI's scalpel: how do we precisely "excise" its bad intentions?
00:10:22 Robots learning kung fu: take the shortcut, or put in the slow, honest practice?
00:15:03 AI as apprentice: teaching machines to read the boss's expression
00:20:18 A little surgery for AI, to cure its decision paralysis
Papers covered in this episode:
[LG] SequenceLayers: Sequence Processing and Streaming Neural Networks Made Easy [Google DeepMind] https://arxiv.org/abs/2507.23292
[LG] The Geometry of Harmfulness in LLMs through Subconcept Probing [Algoverse AI Research] https://arxiv.org/abs/2507.21141
[LG] Retrieve-Augmented Generation for Speeding up Diffusion Policy without Additional Training [The University of Tokyo] https://arxiv.org/abs/2507.21452
[LG] NPO: Learning Alignment and Meta-Alignment through Structured Human Feedback [Microsoft & Amrita Vishwa Vidyapeetham] https://arxiv.org/abs/2507.21131
[LG] TokenBlowUp: Resolving Representational Singularities in LLM Token Spaces via Monoidal Transformations [University of Washington] https://arxiv.org/abs/2507.19747
Is there a piece of "life code" in your own life that needs rewriting? The "bug" that wears you down may be hiding in your overly high expectations of the world.
00:00:37 Is your AI butler reliable? A security report from the future
00:04:40 AI "going mad"? Scientists have found its personality switch
00:09:33 More important than the result is the process of thinking things through
00:14:09 AI's "dimension-reduction strike": living simply in a complex world
00:18:23 AI's warm, caring persona may be a trap?
Papers covered in this episode:
[LG] Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition [Gray Swan AI] https://arxiv.org/abs/2507.20526
[CL] Persona Vectors: Monitoring and Controlling Character Traits in Language Models [Anthropic Fellows Program & Constellation] https://arxiv.org/abs/2507.21509
[LG] RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents [Tencent] https://arxiv.org/abs/2507.22844
[LG] Geometry of Neural Reinforcement Learning in Continuous State and Action Spaces [Brown University & Amazon Web Services] https://arxiv.org/abs/2507.20853
[CL] Training language models to be warm and empathetic makes them less reliable and more sycophantic [University of Oxford] https://arxiv.org/abs/2507.21919
[CL] On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey [Not explicitly stated, survey paper] https://arxiv.org/abs/2507.20783
"The quality of your mind is determined by what you think about. Your soul takes on the color of your thoughts."
00:00:31 How smart is your AI? It comes down to whether it can "rehearse"
00:05:06 In an age of AI hyper-competition, how do you find "the chosen one"?
00:09:38 The self-cultivation of AI: how can machines teach themselves?
00:13:34 Giving AI a steering wheel: it goes wherever you point
00:17:37 The unsung master inside the AI giants: without it, AI instantly turns into "artificial stupidity"
The five papers covered in this episode:
[LG] SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model [Mohamed bin Zayed University of Artificial Intelligence & Samsung Research] https://arxiv.org/abs/2507.237
[LG] Consensus-Driven Active Model Selection [MIT & UMass Amherst] https://arxiv.org/abs/2507.23771
[CL] CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks [FAIR at Meta & NYU] https://arxiv.org/abs/2507.237
[CL] Model Directions, Not Words: Mechanistic Topic Models Using Sparse Autoencoders [Columbia University] https://arxiv.org/abs/2507.23220
[CL] Unveiling Super Experts in Mixture-of-Experts Large Language Models [Meituan & Tsinghua University] https://arxiv.org/abs/2507.23279