Episode description
Source: Xiaoyuzhou (小宇宙)
【Sponsor】
Listen to AI每周谈 on your commute. Every week, AI每周谈 recaps the previous week's major AI news.
Link 🔗 https://www.xiaoyuzhoufm.com/podcast/688a34636f5a275f1cba40fd
【Contents】
The 15 papers in this episode are:
[00:29] 🧠 Query as Anchor: Scenario-Adaptive User Representation via Large Language Model
[01:14] ⚛ Qute: Towards Quantum-Native Database
[01:59] 🧠 InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem
[03:05] 🔍 REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents
[03:56] 🚀 BitDance: Scaling Autoregressive Generative Models with Binary Tokens
[04:38] 🧠 Experiential Reinforcement Learning
[05:24] 🧠 Embed-RL: Reinforcement Learning for Reasoning-Driven Multimodal Embeddings
[06:21] 🧩 UniWeTok: An Unified Binary Tokenizer with Codebook Size $2^{128}$ for Unified Multimodal Large Language Model
[07:13] 🔍 BrowseComp-$V^3$: A Visual, Vertical, and Verifiable Benchmark for Multimodal Browsing Agents
[08:18] 🧠 LaViDa-R1: Advancing Reasoning for Unified Multimodal Diffusion Language Models
[09:02] 🗣 Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision
[10:00] 🧠 Nanbeige4.1-3B: A Small General Model that Reasons, Aligns, and Acts
[10:49] 🎨 FireRed-Image-Edit-1.0 Techinical Report
[11:26] 🧬 Data Darwinism Part I: Unlocking the Value of Scientific Data for Pre-training
[12:04] 🌐 WebWorld: A Large-Scale World Model for Web Agent Training
【Follow us】
You can also find us on the following platform for more information beyond the podcast
Xiaohongshu (小红书): AI速递