The 15 papers in this episode are:
[00:23] 🛠 Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving
[01:07] 🧠 Agentic Knowledgeable Self-awareness
[01:49] 🧮 MegaMath: Pushing the Limits of Open Math Corpora
[02:32] 🤖 SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement
[03:20] 🖼 MME-Unify: A Comprehensive Benchmark for Unified Multimodal Understanding and Generation Models
[04:03] 🖼 VARGPT-v1.1: Improve Visual Autoregressive Large Unified Model via Iterative Instruction Tuning and Reinforcement Learning
[04:42] 🔄 TransMamba: Flexibly Switching between Transformer and Mamba
[05:21] 🤖 APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay
[05:59] 🧑 HumanDreamer-X: Photorealistic Single-image Human Avatars Reconstruction via Gaussian Restoration
[06:39] 💡 Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
[07:20] 👂 EvMic: Event-based Non-contact sound recovery from effective spatial-temporal modeling
[08:02] 🫁 MedSAM2: Segment Anything in 3D Medical Images and Videos
[08:47] ⚖ BEATS: Bias Evaluation and Assessment Test Suite for Large Language Models
[09:35] 🚄 Slow-Fast Architecture for Video Multi-Modal Large Language Models
[10:14] 🎨 SPF-Portrait: Towards Pure Portrait Customization with Semantic Pollution-Free Fine-tuning

[Follow us]
You can also find us on the following platform for more beyond the podcast:
Xiaohongshu: AI速递
