[LG] Text-to-LoRA: Instant Transformer Adaption [Sakana AI] https://arxiv.org/abs/2506.06105
[LG] On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity [CNRS] https://arxiv.org/abs/2506.03719
[CL] Don't Pay Attention [Avey AI] https://arxiv.org/abs/2506.11305
True knowledge management is not about collecting, but about creating.
[LG] AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning [CMU] https://arxiv.org/abs/2506.15651
[LG] HeurAgenix: Leveraging LLMs for Solving Complex Combinatorial Optimization Challenges [Microsoft Research Asia] https://arxiv.org/abs/2506.15196
[LG] Dense SAE Latents Are Features, Not Bugs [MIT & ETH Zürich] https://arxiv.org/abs/2506.156
[LG] Optimal Embedding Learning Rate in LLMs: The Effect of Vocabulary Size [UC Berkeley & Microsoft Research] https://arxiv.org/abs/2506.15025
[CL] Approximating Language Model Training Data from Weights [Cornell University] https://arxiv.org/abs/2506.155
Don't be swept along by outside noise or driven by inner inertia; know clearly what you want and what you don't, then commit fully and become a clear-headed, proactive chooser of your own life.
[CL] Sampling from Your Language Model One Byte at a Time [University of Washington] https://arxiv.org/abs/2506.14123
[LG] Transformers Learn Faster with Semantic Focus [IBM Research] https://arxiv.org/abs/2506.14095