Description
https://xiaoyuzhoufm.com

The 19 papers covered in this episode:

[00:28] 🧠 HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks

[01:15] 🎥 VidEgoThink: Assessing Egocentric Video Understanding Capabilities for Embodied AI

[01:50] 🧠 The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

[02:31] 🤖 Revealing the Barriers of Language Agents in Planning

[03:15] 📄 DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

[03:56] ⚙ Large Language Model Evaluation via Matrix Nuclear-Norm

[04:38] 🧬 Exploring Model Kinship for Merging Large Language Models

[05:15] 📊 ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

[05:50] ⚡ ZipVL: Efficient Large Vision-Language Models with Dynamic Token Sparsification and KV Cache Compression

[06:31] 📄 Improving Long-Text Alignment for Text-to-Image Diffusion Models

[07:11] 🔄 Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models

[07:55] 🛡 Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements

[08:34] 🔍 Tracking Universal Features Through Fine-Tuning and Model Merging

[09:08] 🔄 Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse RL

[09:46] 🧠 Neural Metamorphosis

[10:25] 🌍 WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation

[11:09] 🌐 OMCAT: Omni Context Aware Transformer

[11:44] ⏳ ChroKnowledge: Unveiling Chronological Knowledge of Language Models in Multiple Domains

[12:22] 📚 DyVo: Dynamic Vocabularies for Learned Sparse Retrieval with Entities

【Follow Us】

You can also find us on the following platform for more information beyond the podcast:

Xiaohongshu: AI速递

Host: 拨号上网