Episode description
Source: Xiaoyuzhou (小宇宙)
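Have you ever felt that AI seems to understand English better and is half a beat slow with Chinese? In this episode, we start from several recent papers and talk about how a clever "patch" can win fairer treatment for our language. We also look at how AI "highlights the key points" of long texts the way we do when reading, and how, in learning from us, AI behaves like a large election that accidentally elects a mediocre "greatest common denominator". Finally, we reveal a striking phenomenon: why AI self-improvement, pushed to its limit, ends in complete collapse.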
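00:00:34 Your language is being "treated differently"
00:06:21 Needle in a haystack: how do you highlight the key points of a long text?
00:10:32 How are large AI models "voted" into being?
00:16:35 How does AI understand the world: as one point, or a cluster of points?
00:22:10 The "trying too hard" trap in AI: why does progress end in collapse?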
Papers covered in this episode:
[CL] LangMAP: A Language-Adaptive Approach to Tokenization
[EPFL & University of Cambridge]
https://arxiv.org/abs/2606.23566
---
[IR] Improving Long-Context Retrieval with Multi-Prefix Embedding
[University of Waterloo & University of Queensland]
https://arxiv.org/abs/2606.23642
---
[AI] AI Alignment From Social Choice Perspectives
[Google Research & University of Southern California & Harvard University]
https://arxiv.org/abs/2606.21550
---
[IR] Multi-Vector Embeddings are Provably More Expressive than Single Vector Embeddings
[Google Research]
https://arxiv.org/abs/2606.23475
---
[LG] Self-Improvement Can Self-Regress: The Rise-and-Collapse Failure Mode of LLM Self-Training
[MetaAI]
https://arxiv.org/abs/2606.21090