AgentIR with Zijian Chen and Xueguang Ma - Weaviate Podcast #136!

April 27, 2026 · 1h 3m

Episode Description from the Publisher

Zijian Chen and Xueguang Ma from the University of Waterloo join the Weaviate Podcast to discuss AgentIR and why retrieval systems need to be redesigned from the ground up for AI agents. The conversation opens with a striking reframe: agents have become the primary consumers of search, inserting themselves as middleware between humans and information. Humans used to query search engines directly; now they delegate to ChatGPT, which searches on their behalf. This means retrieval algorithms are no longer optimized for their actual users.

The discussion distinguishes reasoning-intensive retrieval from reasoning-aware retrieval. Reasoning-intensive tasks like BRIGHT involve single-hop queries where the connection between query and document is obscure but still one step. AgentIR tackles a fundamentally different problem: extremely multi-hop queries from benchmarks like BrowseComp-Plus, where each hop strictly depends on the previous one. The key insight behind AgentIR is that agents reveal their entire reasoning process in their reasoning traces, unlike humans, who never write out their thought process. Existing retrievers discard this rich signal entirely. AgentIR jointly embeds the query and reasoning trace, training a retriever from scratch to exploit this agent-specific context.

From there, the conversation covers BrowseComp-Plus, which extends OpenAI's BrowseComp with a fixed corpus so that agents and retrievers can be evaluated separately, something impossible when both the web and the search provider are black boxes. Building the corpus required over 400 hours of human annotation to ensure every hop in every reasoning chain had its supporting documents present. The discussion then moves into agent context management, contrasting compaction approaches with just-in-time memory retrieval from paged memory, referencing InfoFlow and the AgentFold paper.

Xueguang shares a provocative take that neither single-vector nor multi-vector representations are optimal, arguing the field needs embeddings at the right granularity based on information density. The episode closes with Steven introducing AICI, Agent-Computer Interaction, as the successor to HCI, and Xueguang framing the open question of scaling search along two dimensions: deeper (more turns) versus wider (more parallel queries).
