Date of Award
6-2025
Document Type
Dissertation
Publisher
Santa Clara : Santa Clara University, 2025
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science and Engineering
First Advisor
Yi Fang
Abstract
Neural ranking models have become central to modern information retrieval (IR) systems, powering applications such as web search, product recommendation, and question answering. However, their effectiveness often hinges on access to abundant labeled supervision, a condition that is rarely met in real-world scenarios. In many domains, including e-commerce, healthcare, and legal search, labeled user interactions (e.g., clicks, purchases, or expert annotations) are sparse, especially for long-tail queries and new content. This data sparsity challenges the generalization, adaptability, and fairness of traditional ranking approaches.
This thesis presents a unified investigation of neural ranking under sparse supervision, introducing novel methods and analytical frameworks across four key dimensions: query, label, model, and corpus. First, we propose Meta-Learning to Rank (MLTR), a meta-learning-based framework that enables fast adaptation to weakly supervised or unseen queries, enhancing query-level generalization. Second, we introduce a Multi-Task Learning (MTL) framework for product ranking in e-commerce, which jointly models diverse engagement signals, such as clicks, add-to-cart actions, and purchases, to improve supervision in imbalanced data settings. Third, we develop Passage-Specific Prompt Tuning (PSPT), a parameter-efficient method for adapting large language models (LLMs) to open-domain question answering tasks, where both task specificity and training data are limited. Finally, we conduct the first systematic fairness evaluation of Retrieval-Augmented Generation (RAG) systems, identifying how demographic biases emerge across retriever, refiner, and generator components under sparse or skewed retrieval conditions.
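To make the multi-task idea concrete, the following is a minimal sketch (not the implementation described in the thesis) of a shared-encoder ranker with one scoring head per engagement signal; the encoder architecture, head names, and loss weights are illustrative assumptions only.

# Illustrative sketch, assuming a shared (query, product) feature encoder
# with per-signal heads for clicks, add-to-cart actions, and purchases.
import torch
import torch.nn as nn

class MultiTaskRanker(nn.Module):
    def __init__(self, input_dim: int, hidden_dim: int = 128,
                 tasks=("click", "add_to_cart", "purchase")):
        super().__init__()
        # Shared representation of a (query, product) pair.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # One scoring head per engagement signal.
        self.heads = nn.ModuleDict({t: nn.Linear(hidden_dim, 1) for t in tasks})

    def forward(self, features: torch.Tensor) -> dict:
        shared = self.encoder(features)
        return {t: head(shared).squeeze(-1) for t, head in self.heads.items()}

def multi_task_loss(scores: dict, labels: dict, weights: dict) -> torch.Tensor:
    # Weighted sum of per-task binary cross-entropy losses; sparser but
    # stronger signals (e.g., purchases) can be up-weighted.
    bce = nn.BCEWithLogitsLoss()
    return sum(weights[t] * bce(scores[t], labels[t]) for t in scores)

# Usage with random tensors standing in for (query, product) features.
model = MultiTaskRanker(input_dim=64)
feats = torch.randn(32, 64)
labels = {t: torch.randint(0, 2, (32,)).float()
          for t in ("click", "add_to_cart", "purchase")}
loss = multi_task_loss(model(feats), labels,
                       weights={"click": 1.0, "add_to_cart": 1.5, "purchase": 2.0})
loss.backward()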
Together, these contributions form a comprehensive approach to improving the performance, adaptability, and trustworthiness of neural ranking systems in data-limited environments. By integrating meta-learning, multi-task optimization, LLM adaptation, and fairness-aware evaluation, this research offers both theoretical advances and practical insights toward building effective and responsible ranking models under sparse supervision.
Recommended Citation
Wu, Xuyang, "Neural Ranking in Sparse Data Environments" (2025). Engineering Ph.D. Theses. 58.
https://scholarcommons.scu.edu/eng_phd_theses/58
