[LG] HeurAgenix: Leveraging LLMs for Solving Complex Combinatorial Optimization Challenges [Microsoft Research Asia] https://arxiv.org/abs/2506.15196
[LG] Dense SAE Latents Are Features, Not Bugs [MIT & ETH Zürich] https://arxiv.org/abs/2506.156
[LG] Optimal Embedding Learning Rate in LLMs: The Effect of Vocabulary Size [UC Berkeley & Microsoft Research] https://arxiv.org/abs/2506.15025
[CL] Approximating Language Model Training Data from Weights [Cornell University] https://arxiv.org/abs/2506.155
Don't be swept along by outside noise, and don't be driven by inner inertia. Know clearly what you want and what you don't, then commit fully, becoming a clear-headed, deliberate "chooser of your own life."
[CL] Sampling from Your Language Model One Byte at a Time [University of Washington] https://arxiv.org/abs/2506.14123
[LG] Transformers Learn Faster with Semantic Focus [IBM Research] https://arxiv.org/abs/2506.14095
[CL] From Bytes to Ideas: Language Modeling with Autoregressive U-Nets [FAIR at Meta] https://arxiv.org/abs/2506.14761
[CL] Reasoning with Exploration: An Entropy Perspective [RUC & MSRA & SJTU] https://arxiv.org/abs/2506.14758
[LG] Less is More: Undertraining Experts Improves Model Upcycling [Université de Montréal & Concordia University] https://arxiv.org/abs/2506.14126
Work expands like a gas to fill whatever container of time it is given, so deliberately setting tight deadlines elicits peak efficiency and makes you the master of your own time.
[LG] Wanting to Be Understood Explains the Meta-Problem of Consciousness [Google DeepMind] https://arxiv.org/abs/2506.12086