Toward Whole-Session Relevance: Exploring Intrinsic Diversity in Web Search

KEC 1001
Monday, May 19, 2014 - 4:00pm to 4:50pm
Speaker Information
Paul Bennett
Senior Researcher
Context, Learning & User Experience for Search (CLUES) group
Microsoft Research

Current research on web search has focused on optimizing and evaluating single queries. However, a significant fraction of user queries are part of more complex tasks which span multiple queries across one or more search sessions. An ideal search engine would not only retrieve relevant results for a user's particular query but also be able to identify when the user is engaged in a more complex task and aid the user in completing that task. Toward optimizing whole-session or task relevance, we characterize and address the problem of intrinsic diversity (ID) in retrieval, a type of complex task that requires multiple interactions with current search engines. Unlike existing work on extrinsic diversity that deals with ambiguity in intent across multiple users, ID queries often have little ambiguity in intent but seek content covering a variety of aspects on a shared theme. In such scenarios, the underlying needs are typically exploratory, comparative, or breadth-oriented in nature. We identify and address three key problems for ID retrieval: identifying authentic examples of ID tasks from post-hoc analysis of behavioral signals in search logs; learning to identify initiator queries that mark the start of an ID search task; and given an initiator query, predicting which content to pre-fetch and rank.

This is joint work with Karthik Raman and Kevyn Collins-Thompson and was the winner of the SIGIR 2013 Best Student Paper Award.

Speaker Bio

Paul Bennett is a Senior Researcher in the Context, Learning & User Experience for Search (CLUES) group at Microsoft Research, where he focuses on the development, improvement, and analysis of machine learning and data mining methods as components of real-world, large-scale adaptive systems. His research has advanced techniques for ensemble methods and the combination of information sources, calibration, consensus methods for noisy supervision labels, active learning and evaluation, and supervised classification (with an emphasis on hierarchical classification) and ranking, with applications to information retrieval, crowdsourcing, behavioral modeling and analysis, and personalization. His recent work has been recognized with a SIGIR 2012 Best Paper Honorable Mention and a SIGIR 2013 Best Student Paper Award. He completed his dissertation on combining text classifiers using reliability indicators in 2006 at Carnegie Mellon, where he was advised by Profs. Jaime Carbonell and John Lafferty.