Duration: 15 minutes
Plays: 138
Published: 8 months ago
Host...
Description...
https://xiaoyuzhoufm.com

The 21 papers covered in this episode are:


[00:22] 🌐 Region-Adaptive Sampling for Diffusion Transformers

[01:05] 🎥 Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

[01:48] 🌊 Large Language Diffusion Models

[02:31] 🧠 ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

[03:15] 🌟 MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

[03:58] 🖼 Precise Parameter Localization for Textual Generation in Diffusion Models

[04:40] 🧠 Diverse Inference and Verification for Advanced Reasoning

[05:22] 🧬 DarwinLM: Evolutionary Structured Pruning of Large Language Models

[06:02] 📈 AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting

[06:40] 🖼 ImageRAG: Dynamic Image Retrieval for Reference-Guided Image Generation

[07:23] 🤖 We Can't Understand AI Using our Existing Vocabulary

[08:03] 📊 FoNE: Precise Single-Token Number Embeddings via Fourier Features

[08:53] 🌍 Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages

[09:41] 🔓 Jailbreaking to Jailbreak

[10:23] 🤖 STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning

[11:05] 📊 Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding

[11:41] ⚡ MRS: A Fast Sampler for Mean Reverting Diffusion based on ODE and SDE Solvers

[12:26] 🚗 V2V-LLM: Vehicle-to-Vehicle Cooperative Autonomous Driving with Multi-Modal Large Language Models

[13:06] 🎵 CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages


[13:49] 🧩 Cluster and Predict Latent Patches for Improved Masked Image Modeling


[14:31] 🧬 Agentic End-to-End De Novo Protein Design for Tailored Dynamics Using a Language Diffusion Model


【Follow Us】

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递
