The 15 papers in this episode are:
[00:19] 🧠 Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems
[01:01] 🖼 Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing
[01:41] 🖼 GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation
[02:25] 🤖 Rethinking RL Scaling for Vision Language Models: A Transparent, From-Scratch Framework and Comprehensive Evaluation Scheme
[03:08] 🗣 Scaling Analysis of Interleaved Speech-Text Language Models
[03:52] 🎬 SkyReels-A2: Compose Anything in Video Diffusion Transformers
[04:36] 🧊 ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers
[05:13] 📉 ZClip: Adaptive Spike Mitigation for LLM Pre-Training
[05:50] 🧠 Inference-Time Scaling for Generalist Reward Modeling
[06:32] 🗣 Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
[07:12] ⏱ Efficient Model Selection for Time Series Forecasting via LLMs
[07:55] 🤖 Scaling Laws in Scientific Discovery with AI and Robot Scientists
[08:35] 🧠 Instruction-Guided Autoregressive Neural Network Parameter Generation
[09:18] 🤖 GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning
[10:01] 🧠 Interpreting Emergent Planning in Model-Free Reinforcement Learning

[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递
