Date of Award

6-12-2025

Document Type

Thesis

Publisher

Santa Clara : Santa Clara University, 2025

Departments

Computer Science and Engineering; Electrical and Computer Engineering

First Advisor

Hoeseok Yang

Second Advisor

Yi Fang

Abstract

Large Language Models (LLMs) are becoming increasingly popular in modern society. However, despite their popularity, deploying LLMs in real-world scenarios is extremely challenging due to substantial computational costs and memory constraints. Edge devices, such as smartphones and IoT devices, lack the resources needed to run these models locally and instead offload computation to the cloud. Cloud computing requires users to send their data over the internet, leading to numerous privacy and security concerns. In some domains, such as healthcare and finance, sending such sensitive information is not an option. Existing approaches to reducing model size or increasing inference speed include Small Language Models (SLMs), model compression techniques, and inference optimization strategies. However, all of these techniques require extensive human effort and manual tuning to find the optimal settings for increased speed without significant degradation in output quality. We propose an online hyperparameter finetuning method that autonomously discovers the optimal settings based on tangible metrics during inference. Our approach monitors performance metrics in real time and dynamically adjusts any tunable parameter without human intervention. We demonstrate this framework on dynamic sparsity prediction, achieving a 1.67× speedup while maintaining accuracy, but the method generalizes to any tunable parameter.
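The core idea described in the abstract — monitoring performance metrics during inference and nudging a tunable parameter toward better speed while a quality proxy stays acceptable — can be illustrated with a minimal sketch. This is a hypothetical greedy-search loop, not the thesis implementation; the function names (`online_tune`, `simulated_inference`), the latency/quality model, and the specific update rule are all illustrative assumptions.

```python
def online_tune(run_inference, param, lo, hi, step=0.05, rounds=20, quality_floor=0.9):
    """Greedy online tuner (illustrative sketch, not the thesis method):
    nudge one tunable parameter toward lower latency as long as a quality
    proxy stays above quality_floor.

    run_inference(p) -> (latency, quality) is measured during serving.
    """
    best = param
    best_lat, _ = run_inference(best)
    for _ in range(rounds):
        improved = False
        # Probe one step in each direction from the current best setting.
        for cand in (best - step, best + step):
            cand = min(hi, max(lo, cand))  # clamp to the allowed range
            lat, q = run_inference(cand)
            if q >= quality_floor and lat < best_lat:
                best, best_lat, improved = cand, lat, True
        if not improved:
            break  # local optimum under the quality constraint
    return best, best_lat

# Toy stand-in for real measurements: higher sparsity -> faster but lower quality.
def simulated_inference(p):
    return 1.0 - 0.5 * p, 1.0 - 0.3 * p

best_p, best_lat = online_tune(simulated_inference, param=0.1, lo=0.0, hi=1.0)
```

In a real deployment the tunable parameter could be, for example, a sparsity-prediction threshold, and `run_inference` would report measured latency and an accuracy proxy rather than the toy closed-form model above.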
