Duration: 10 minutes
Plays: 79
Published: 6 months ago
https://xiaoyuzhoufm.com

The 15 papers in this episode are as follows:


[00:25] 🎨 DDT: Decoupled Diffusion Transformer


[01:05] 🎬 GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography


[01:49] 🔍 OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens


[02:28] 🖼 A Unified Agentic Framework for Evaluating Conditional Image Generation


[03:11] 🤔 Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?


[03:57] 🗣 FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis


[04:34] 🧐 A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility


[05:15] 🖼 OmniCaptioner: One Captioner to Rule Them All


[05:57] 🧩 Are We Done with Object-Centric Learning?


[06:35] 🤖 Self-Steering Language Models


[07:09] 🇷🇺 RuOpinionNE-2024: Extraction of Opinion Tuples from Russian News Texts


[07:51] 🤖 Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding


[08:30] 👂 DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion


[09:05] 🤖 VideoChat-R1: Enhancing Spatio-Temporal Perception via Reinforcement Fine-Tuning


[09:47] 🤖 WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments





[Follow Us]


You can also find us on the following platform for more information beyond the podcast:


Xiaohongshu: AI速递
