[LG] HeurAgenix: Leveraging LLMs for Solving Complex Combinatorial Optimization Challenges [Microsoft Research Asia] https://arxiv.org/abs/2506.15196
[LG] Dense SAE Latents Are Features, Not Bugs [MIT & ETH Zürich] https://arxiv.org/abs/2506.156
[LG] Optimal Embedding Learning Rate in LLMs: The Effect of Vocabulary Size [UC Berkeley & Microsoft Research] https://arxiv.org/abs/2506.15025
[CL] Approximating Language Model Training Data from Weights [Cornell University] https://arxiv.org/abs/2506.155
Don't be swept along by outside noise or driven by inner inertia; know clearly what you want and what you don't, then go all in, and become a clear-headed, deliberate "chooser" of your own life.
[CL] Sampling from Your Language Model One Byte at a Time [University of Washington] https://arxiv.org/abs/2506.14123
[LG] Transformers Learn Faster with Semantic Focus [IBM Research] https://arxiv.org/abs/2506.14095
[CL] From Bytes to Ideas: Language Modeling with Autoregressive U-Nets [FAIR at Meta] https://arxiv.org/abs/2506.14761
The wisdom of life lies not in chasing a perfect, one-shot answer, but in always preserving more options and possibilities for yourself.
[LG] Spectral Estimation with Free Decompression [UC Berkeley & University of Melbourne] https://arxiv.org/abs/2506.11994
[LG] TreeRL: LLM Reinforcement Learning with On-Policy Tree Search [Tsinghua University & California Institute of Technology] https://arxiv.org/abs/2506.11902
[CL] Large Language Models and Emergence: A Complex Systems Perspective [Santa Fe Institute] https://arxiv.org/abs/2506.11135