HuggingFace Daily AI Paper Digest (HuggingFace 每日AI论文速递) - 2025.04.23 | Improved Arabic performance; significant gains on reasoning tasks - EarsOnMe

Duration: 10 minutes

Plays: 109

Published: 4 months ago

Host: 拨号上网

Description

The 15 papers in this episode:

[00:22] 💡 Kuwain 1.5B: An Arabic SLM via Language Injection

[00:58] 🤖 TTRL: Test-Time Reinforcement Learning

[01:40] 🌍 The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks

[02:23] 🖼 Describe Anything: Detailed Localized Image and Video Captioning

[03:00] 💡 Learning Adaptive Parallel Reasoning with Language Models

[03:34] 🖼 IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs

[04:19] 📖 BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation

[05:10] 🚀 Efficient Pretraining Length Scaling

[05:49] 🩻 CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

[06:26] 🖼 Personalized Text-to-Image Generation with Auto-Regressive Models

[07:08] 🗣 LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

[07:47] 🎬 Vidi: Large Multimodal Models for Video Understanding and Editing

[08:27] 🖼 From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning

[09:03] 🤖 LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities

[09:44] 🤖 WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

[Follow Us]

You can also find us on the following platform for more beyond the podcast:

Xiaohongshu: AI速递
