The 15 papers in this episode are:
[00:26] 🌐 OmniGen: Unified Image Generation
[01:02] 🌐 NVLM: Open Frontier-Class Multimodal LLMs
[01:41] 🔍 Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
[02:15] 🌐 Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
[02:59] 🎥 OSV: One Step is Enough for High-Quality Image to Video Generation
[03:38] 🤖 On the limits of agency in agent-based models
[04:17] 🔍 Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
[04:52] 📊 A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B
[05:38] 🎵 EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer
[06:21] 🤖 Agile Continuous Jumping in Discontinuous Terrains
[07:01] 🌐 SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction
[07:34] 📈 Single-Layer Learnable Activation for Implicit Neural Representation (SL$^{2}$A-INR)
[08:11] 📈 Implicit Neural Representations with Fourier Kolmogorov-Arnold Networks
[08:53] 🎵 PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing
[09:38] 🔍 Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

[Follow Us]
You can also find us on the following platform for more information beyond the podcast content:
Xiaohongshu: AI速递
