UCSC-SOE-11-05: Content promotion by utility measure optimization

Hungyu Henry Lin
02/04/2011 09:00 AM
Computer Science
One common approach to selecting the best or most relevant document for a query is to use a ranking model and then take only the first element of the sorted list. However, because ranking models optimize for all ranks, the model may sacrifice accuracy at the top rank for the sake of overall accuracy. This is an unnecessary trade-off when only the top result is used. Other approaches (e.g., graph-based methods) make similar trade-offs.

Instead of using existing models, we use a boosting algorithm that optimizes explicitly for the top rank. Our approach greatly simplifies the required objective function and allows us to use a tighter, more accurate approximation than those used by machine learning rankers. We also demonstrate that our algorithm outperforms a number of baselines on benchmark datasets as well as domain-specific datasets.
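
The trade-off between overall ranking accuracy and top-rank accuracy can be made concrete with a toy example. The sketch below is illustrative only and is not the algorithm from this report: the relevance labels, the two score vectors, and the helper names `pairwise_accuracy` and `top1_correct` are all assumptions chosen for the illustration. It compares two hypothetical scorers on a single query with four documents.

```python
from itertools import combinations

def pairwise_accuracy(relevance, scores):
    """Fraction of document pairs with different relevance that are ordered correctly."""
    pairs = [(i, j) for i, j in combinations(range(len(relevance)), 2)
             if relevance[i] != relevance[j]]
    correct = sum(
        (relevance[i] - relevance[j]) * (scores[i] - scores[j]) > 0
        for i, j in pairs
    )
    return correct / len(pairs)

def top1_correct(relevance, scores):
    """True if the highest-scored document has the maximum relevance."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return relevance[best] == max(relevance)

# Hypothetical data: graded relevance for four documents of one query.
relevance = [3, 2, 1, 0]

# Scorer A puts the best document first but scrambles the rest.
scores_a = [4, 1, 2, 3]
# Scorer B orders the tail well but swaps the top two documents.
scores_b = [3, 4, 2, 1]

for name, scores in [("A", scores_a), ("B", scores_b)]:
    print(name, pairwise_accuracy(relevance, scores), top1_correct(relevance, scores))
```

In this toy case scorer B orders more pairs correctly than scorer A, yet only scorer A returns the most relevant document at rank one. An objective that rewards overall ordering would prefer B, while an objective aimed at the top rank would prefer A.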