Author

Lebin Zhou

Date of Award

6-11-2024

Document Type

Thesis

Publisher

Santa Clara : Santa Clara University, 2024

Degree Name

Master of Science (MS)

Department

Computer Science and Engineering

First Advisor

Nam Ling

Abstract

This thesis presents a novel super-resolution model based on the Vector Quantized Generative Adversarial Network (VQGAN) to enhance image resolution. Inspired by recent advances in image reconstruction, we apply VQGAN to the super-resolution task, leveraging its powerful generative capabilities to produce higher-quality high-resolution images.

Building on the VQGAN framework, we propose an improved architecture that incorporates an additional ConvNeXt feature extractor based on Convolutional Neural Networks (CNNs) to effectively capture and refine features from low-resolution images. To further enhance model performance, we implement several strategies to optimize codebook utilization, including capacity optimization, improved initialization, and an Exponential Moving Average (EMA) dynamic updating strategy that ensures more efficient and diverse codebook usage. Additionally, we divide the training process into two phases to improve training stability and convergence. In the first phase, we train the codebook and decoder to create a strong high-resolution prior. In the second phase, we train the encoder, the ConvNeXt-based feature extractor, and the GAN. This phased approach enables the model to progressively learn and refine the complex details required for high-quality image reconstruction while enhancing model stability.
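To make the EMA codebook-update idea concrete, the following is a minimal sketch of a standard EMA update step for a vector-quantization codebook, as commonly used in VQ-VAE/VQGAN variants. This is an illustrative implementation of the general technique, not code from the thesis; all names (`ema_codebook_update`, `decay`, `eps`) are placeholders chosen here.

```python
import numpy as np

def ema_codebook_update(codebook, cluster_size, ema_embed, encodings, assignments,
                        decay=0.99, eps=1e-5):
    """One EMA update step for a VQ codebook of K codes with dimension D.

    codebook:     (K, D) current code vectors
    cluster_size: (K,)   EMA count of assignments per code
    ema_embed:    (K, D) EMA sum of encoder outputs assigned to each code
    encodings:    (N, D) encoder output vectors from the current batch
    assignments:  (N,)   index of the nearest code for each encoding
    """
    K = codebook.shape[0]
    one_hot = np.eye(K)[assignments]        # (N, K) hard assignment matrix
    counts = one_hot.sum(axis=0)            # how often each code was used
    embed_sum = one_hot.T @ encodings       # sum of assigned vectors per code

    # Exponential moving averages of usage counts and assigned vectors
    cluster_size = decay * cluster_size + (1 - decay) * counts
    ema_embed = decay * ema_embed + (1 - decay) * embed_sum

    # Laplace smoothing keeps rarely used codes from collapsing to zero
    n = cluster_size.sum()
    smoothed = (cluster_size + eps) / (n + K * eps) * n

    codebook = ema_embed / smoothed[:, None]
    return codebook, cluster_size, ema_embed
```

Because the codebook moves as a running average of the encoder outputs assigned to it, this update sidesteps the straight-through gradient path for the codes themselves, which is one reason EMA updates are popular for keeping codebook usage stable and diverse.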

Experimental results demonstrate that our model improves on existing methods in image quality, as measured by Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Learned Perceptual Image Patch Similarity (LPIPS). This work highlights the potential of combining VQGAN with advanced feature extractors in super-resolution models, paving the way for future research and development in this field.
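As a concrete reference for the first of these metrics, below is a minimal PSNR computation using the standard definition (10·log10 of peak-squared over mean squared error); this is the textbook formula, not evaluation code from the thesis.

```python
import numpy as np

def psnr(ref, test, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB between two same-shaped images."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: distortion-free
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Higher PSNR means lower pixel-wise error; SSIM and LPIPS complement it by capturing structural and perceptual similarity, respectively, which plain MSE-based PSNR does not.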
