Date of Award
6-2025
Document Type
Dissertation
Publisher
Santa Clara : Santa Clara University, 2025
Degree Name
Doctor of Philosophy (PhD)
Department
Computer Science and Engineering
First Advisor
Yi Fang
Abstract
Neural ranking models have become central to modern information retrieval (IR) systems, powering applications such as web search, product recommendation, and question answering. However, their effectiveness often hinges on access to abundant labeled supervision, a condition that is rarely met in real-world scenarios. In many domains, including e-commerce, healthcare, and legal search, labeled user interactions (e.g., clicks, purchases, or expert annotations) are sparse, especially for long-tail queries and new content. This data sparsity challenges the generalization, adaptability, and fairness of traditional ranking approaches.
This thesis presents a unified investigation of neural ranking under sparse supervision, introducing novel methods and analytical frameworks across four key dimensions: query, label, model, and corpus. First, we propose Meta-Learning to Rank (MLTR), a meta-learning-based framework that enables fast adaptation to weakly supervised or unseen queries, enhancing query-level generalization. Second, we introduce a Multi-Task Learning (MTL) framework for product ranking in e-commerce, which jointly models diverse engagement signals, such as clicks, add-to-cart actions, and purchases, to improve supervision in imbalanced data settings. Third, we develop Passage-Specific Prompt Tuning (PSPT), a parameter-efficient method for adapting large language models (LLMs) to open-domain question answering tasks, where both task specificity and training data are limited. Finally, we conduct the first systematic fairness evaluation of Retrieval-Augmented Generation (RAG) systems, identifying how demographic biases emerge across retriever, refiner, and generator components under sparse or skewed retrieval conditions.
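To make the multi-task idea concrete, the following is a minimal sketch (not the implementation described in the thesis) of a shared-encoder ranker with one scoring head per engagement signal; the encoder architecture, head names, and loss weights are illustrative assumptions only.

# Illustrative sketch, assuming a shared (query, product) feature encoder
# with per-signal heads for clicks, add-to-cart actions, and purchases.
import torch
import torch.nn as nn

class MultiTaskRanker(nn.Module):
    def __init__(self, input_dim: int, hidden_dim: int = 128,
                 tasks=("click", "add_to_cart", "purchase")):
        super().__init__()
        # Shared representation of a (query, product) pair.
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # One scoring head per engagement signal.
        self.heads = nn.ModuleDict({t: nn.Linear(hidden_dim, 1) for t in tasks})

    def forward(self, features: torch.Tensor) -> dict:
        shared = self.encoder(features)
        return {t: head(shared).squeeze(-1) for t, head in self.heads.items()}

def multi_task_loss(scores: dict, labels: dict, weights: dict) -> torch.Tensor:
    # Weighted sum of per-task binary cross-entropy losses; sparser but
    # stronger signals (e.g., purchases) can be up-weighted.
    bce = nn.BCEWithLogitsLoss()
    return sum(weights[t] * bce(scores[t], labels[t]) for t in scores)

# Usage with random tensors standing in for (query, product) features.
model = MultiTaskRanker(input_dim=64)
feats = torch.randn(32, 64)
labels = {t: torch.randint(0, 2, (32,)).float()
          for t in ("click", "add_to_cart", "purchase")}
loss = multi_task_loss(model(feats), labels,
                       weights={"click": 1.0, "add_to_cart": 1.5, "purchase": 2.0})
loss.backward()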
Together, these contributions form a comprehensive approach to improving the performance, adaptability, and trustworthiness of neural ranking systems in data-limited environments. By integrating meta-learning, multi-task optimization, LLM adaptation, and fairness-aware evaluation, this research offers both theoretical advances and practical insights toward building effective and responsible ranking models under sparse supervision.
Recommended Citation
Wu, Xuyang, "Neural Ranking in Sparse Data Environments" (2025). Engineering Ph.D. Theses. 58.
https://scholarcommons.scu.edu/eng_phd_theses/58
