Duration: 10 minutes
Plays: 123
Published: 3 weeks ago
https://xiaoyuzhoufm.com

The 14 papers in this episode are:


[00:20] 🖥 D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI


[01:13] 📷 Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation


[01:56] 🎨 TAG: Tangential Amplifying Guidance for Hallucination-Resistant Diffusion Sampling


[02:31] 🧠 Multimodal Prompt Optimization: Why Not Leverage Multiple Modalities for MLLMs


[03:05] 🚀 AutoPR: Let's Automate Your Academic Promotion!


[03:39] 🧭 R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth?


[04:14] 🚀 Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels


[04:56] 🛰 SpaceVista: All-Scale Visual Spatial Reasoning from mm to km


[05:37] 🎥 StreamingVLM: Real-Time Understanding for Infinite Video Streams


[06:19] 🌐 KORMo: Korean Open Reasoning Model for Everyone


[06:42] ♻ Don't Waste Mistakes: Leveraging Negative RL-Groups via Confidence Reweighting


[07:25] 🧠 Bridging Reasoning to Learning: Unmasking Illusions using Complexity Out of Distribution Generalization


[08:16] ⚡ DISCO: Diversifying Sample Condensation for Efficient Model Evaluation


[08:56] 🚗 Progressive Gaussian Transformer with Anisotropy-aware Sampling for Open Vocabulary Occupancy Prediction





[Follow Us]


You can also find us on the following platform for more information beyond the podcast itself:


Xiaohongshu: AI速递
