The 9 papers in this episode:
[00:27] 💻 Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale
[01:03] 🤖 Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
[01:37] 🖼 IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation
[02:13] 🖼 TextBoost: Towards One-Shot Personalization of Text-to-Image Models via Fine-tuning Text Encoder
[02:55] 🧑 DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors
[03:41] 🔄 Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources
[04:28] 🌐 FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally
[05:03] 🔍 Can OOD Object Detectors Learn from Foundation Models?
[05:38] 🎥 PiTe: Pixel-Temporal Alignment for Large Video-Language Model

[Follow Us]
You can also find us on the following platform for more information beyond the podcast:
Xiaohongshu: AI速递
