[CL] Bridging Offline and Online Reinforcement Learning for LLMs [FAIR at Meta] https://arxiv.org/abs/2506.21495
[LG] The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas [Stanford University] https://arxiv.org/abs/2506.20803
[CL] Potemkin Understanding in Large Language Models [MIT & University of Chicago & Harvard University] https://arxiv.org/abs/2506.21521
[CL] Can Gradient Descent Simulate Prompting? [MIT CSAIL] https://arxiv.org/abs/2506.20989
[CL] OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling [Shanghai Jiao Tong University] https://arxiv.org/abs/2506.20512
[LG] Overtuning in Hyperparameter Optimization [LMU Munich] https://arxiv.org/abs/2506.19540
[LG] Distilling Normalizing Flows [University of Oregon & HSE University & Picsart AI Research] https://arxiv.org/abs/2506.21003
[LG] Gaussian Invariant Markov Chain Monte Carlo [Google DeepMind & UCL] https://arxiv.org/abs/2506.21511
[LG] Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards [FAIR at Meta] arxiv.org
[LG] Mastering Multiple-Expert Routing: Realizable H-Consistency and Strong Guarantees for Learning to Defer [Courant Institute of Mathematical Sciences & Google Research] arxiv.org
[CL] Inside you are many wolves: Using cognitive models to interpret value trade-offs in LLMs [Harvard University] arxiv.org
[LG] Language Modeling by Language Models [Allen Institute for AI] arxiv.org
[CL] DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation [Apple] arxiv.org
[CL] Data Efficacy for Language Model Training [Microsoft Research] https://arxiv.org/abs/2506.21545