Duration: 11 minutes
Plays: 168
Published: 2 weeks ago
Host...
Description...
https://xiaoyuzhoufm.com
The 15 papers covered in this episode:
[00:24] 🧠 Qwen3-VL Technical Report
[00:57] 🧠 PretrainZero: Reinforcement Active Pretraining
[01:36] 🎬 ViDiC: Video Difference Captioning
[02:24] 🧠 OneThinker: All-in-one Reasoning Model for Image and Video
[03:07] 🔄 Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation
[03:59] ⚙ Steering Vision-Language-Action Models as Anti-Exploration: A Test-Time Scaling Approach
[04:46] 🤖 SpaceTools: Tool-Augmented Spatial Reasoning via Double Interactive RL
[05:22] 🔧 Thinking with Programming Vision: Towards a Unified View for Thinking with Images
[06:01] 🔄 Flowing Backwards: Improving Normalizing Flows via Reverse Representation Alignment
[06:51] 🎮 RELIC: Interactive Video World Model with Long-Horizon Memory
[07:34] 🍳 CookAnything: A Framework for Flexible and Consistent Multi-Step Recipe Image Generation
[08:26] 🧠 SR-GRPO: Stable Rank as an Intrinsic Geometric Reward for Large Language Model Alignment
[09:01] 📊 AlignBench: Benchmarking Fine-Grained Image-Text Alignment with Synthetic Image-Caption Pairs
[09:38] 🧠 SkillFactory: Self-Distillation For Learning Cognitive Behaviors
[10:20] 📱 UniQL: Unified Quantization and Low-rank Compression for Adaptive Edge LLMs
[Follow us]
You can also find us on the following platforms for more information beyond the podcast:
Xiaohongshu: AI速递