[LG] These are Not All the Features You are Looking For: A Fundamental Bottleneck In Supervised Pretraining [Facebook AI Research (FAIR) at Meta & New York University] https://arxiv.org/abs/2506.18221
The moment you offer a "new answer," you'll find that the problem which tormented you over and over has simply vanished into thin air.
[LG] Robust Reward Modeling via Causal Rubrics [Google DeepMind] https://arxiv.org/abs/2506.16507
[LG] Latent Concept Disentanglement in Transformer-based Language Models [Purdue University & University of Southern California] https://arxiv.org/abs/2506.16975
[CL] When Does Divide and Conquer Work for Long Context LLM? A Noise Decomposition Framework [University of Chicago & Together AI] https://arxiv.org/abs/2506.16411
[LG] On the Theoretical Understanding of Identifiable Sparse Autoencoders and Beyond [Peking University & MIT] https://arxiv.org/abs/2506.15963
[CL] EvoLM: In Search of Lost Language Model Training Dynamics [Harvard & Stanford & EPFL] https://arxiv.org/abs/2506.16029
True freedom begins with no longer explaining yourself. Only when you no longer need to prove anything to the world do you truly begin to own your life.
[CV] Align Your Flow: Scaling Continuous-Time Flow Map Distillation [NVIDIA] https://arxiv.org/abs/2506.14603
[LG] Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [Yale University & Shanghai Jiao Tong University] https://arxiv.org/abs/2506.14002
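Two of today's papers (the Peking University & MIT and Yale & SJTU entries above) give theoretical treatments of sparse autoencoders. For readers new to the setup, here is a minimal sketch of the standard SAE formulation they analyze: a ReLU encoder and linear decoder trained to reconstruct LLM activations under an L1 sparsity penalty. The architecture and loss follow the common interpretability-literature recipe; all dimensions and hyperparameters below are illustrative assumptions, not values from either paper.

```python
# Minimal sketch of the standard sparse autoencoder (SAE) setup: ReLU encoder,
# linear decoder, L1 sparsity penalty. Illustrative only; not either paper's method.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)   # h = ReLU(W_e x + b_e)
        self.decoder = nn.Linear(d_hidden, d_model)   # x_hat = W_d h + b_d

    def forward(self, x: torch.Tensor):
        h = torch.relu(self.encoder(x))   # sparse latent "feature" activations
        x_hat = self.decoder(h)           # reconstruction of the input activations
        return x_hat, h

def sae_loss(x, x_hat, h, l1_coeff: float = 1e-3):
    # Reconstruction error plus an L1 penalty that pushes the latent code to be
    # sparse, the property both papers study for identifiable feature recovery.
    recon = (x - x_hat).pow(2).mean()
    sparsity = h.abs().mean()
    return recon + l1_coeff * sparsity

# Usage: in practice, x would be residual-stream activations cached from an LLM;
# here a random tensor stands in so the sketch runs standalone.
sae = SparseAutoencoder(d_model=768, d_hidden=768 * 8)  # overcomplete dictionary
acts = torch.randn(64, 768)
x_hat, h = sae(acts)
loss = sae_loss(acts, x_hat, h)
loss.backward()
```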
[LG] Flat Channels to Infinity in Neural Loss Landscapes [EPFL & Flatiron Institute] https://arxiv.org/abs/2506.14951
[LG] GrokAlign: Geometric Characterisation and Acceleration of Grokking [Rice University & Brown University] https://arxiv.org/abs/2506.12284
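For context on the phenomenon GrokAlign studies, below is a minimal sketch of the classic grokking benchmark: a small network trained on modular addition with strong weight decay, where test accuracy jumps long after the training set is memorized. This reproduces the standard setup from the grokking literature, not the paper's GrokAlign method; the modulus, widths, and optimizer settings are illustrative assumptions.

```python
# Classic grokking benchmark sketch: learn (a + b) mod P from half of all pairs.
# Strong weight decay is the ingredient commonly credited with producing the
# delayed generalization ("grokking") that appears well after memorization.
import torch
import torch.nn as nn

P = 97                                                   # modulus for the task
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
train_idx, test_idx = perm[: len(perm) // 2], perm[len(perm) // 2:]

model = nn.Sequential(
    nn.Embedding(P, 128),                                # shared embedding for a and b
    nn.Flatten(start_dim=1),                             # (B, 2, 128) -> (B, 256)
    nn.Linear(2 * 128, 256), nn.ReLU(),
    nn.Linear(256, P),
)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)

# Grokking can take tens of thousands of steps to appear; 20k is a rough budget.
for step in range(20_000):
    idx = train_idx[torch.randint(len(train_idx), (512,))]
    loss = nn.functional.cross_entropy(model(pairs[idx]), labels[idx])
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            acc = (model(pairs[test_idx]).argmax(-1) == labels[test_idx]).float().mean()
        print(f"step {step}: train loss {loss.item():.3f}, test acc {acc.item():.3f}")
```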