Naama Kraus, M.Sc. Thesis Seminar
Query auto completion is known to provide poor predictions
of the user's query when her input prefix is very short (e.g.,
one or two characters). In this work we show that context,
such as the user's recent queries, can be used to improve
the prediction quality considerably even for such short prefixes.
We propose a context-sensitive query auto completion
algorithm, NearestCompletion, which outputs the completions
of the user's input that are most similar to the context
queries. To measure similarity, we represent queries and contexts
as high-dimensional term-weighted vectors and resort
to cosine similarity. The mapping from queries to vectors
is done through a new query expansion technique that we
introduce, which expands a query by traversing the query
recommendation tree rooted at the query.
In order to evaluate our approach, we performed extensive
experimentation over the public AOL query log. We
demonstrate that when the recent user's queries are relevant
to the current query she is typing, then after typing a
single character, NearestCompletion's MRR is 48% higher
relative to the MRR of the standard MostPopularCompletion
algorithm on average. When the context is irrelevant,
however, NearestCompletion's MRR is essentially zero. To
mitigate this problem, we propose HybridCompletion, which
is a hybrid of NearestCompletion with MostPopularCompletion.
HybridCompletion is shown to dominate both NearestCompletion
and MostPopularCompletion, achieving a total
improvement of 31.5% in MRR relative to MostPopular-
Completion on average.