Date of Award

9-2025

Document Type

Thesis

Publisher

Santa Clara : Santa Clara University, 2025

Degree Name

Master of Science (MS)

Department

Computer Science and Engineering

First Advisor

Younghyun Cho

Abstract

Choosing the best OpenMP parameters, such as thread count, scheduling type, and chunk size, is essential for optimizing parallel program performance. One promising approach is to query a pre-trained performance model at runtime to determine the parameters for executing OpenMP parallel regions. Such a performance prediction model can require a program's code information, such as its intermediate representation (IR), together with runtime information (e.g., input sizes) to make a prediction. In this scenario, extracting or querying the IR information at runtime can incur significant overhead. This thesis proposes a compiler-assisted tuning framework that shifts IR extraction and instrumentation to compile time using custom LLVM passes, embedding static code features and enabling lightweight runtime inference just before a parallel region executes. The approach reduces runtime overhead and supports selective tuning based on input size and problem complexity. Experimental evaluation on the Polybench and NPB benchmarks shows that compiler-assisted inference substantially reduces runtime overhead compared to runtime-only methods, making autotuning practical at scale. Optimizations including a Cython-based inference backend, batched inference calls, and static filtering further improve efficiency.
