Dissertation - SCU Access Only
Santa Clara : Santa Clara University, 2018.
Doctor of Philosophy (PhD)
In image and video coding applications, an image/frame or its difference from a predicted value (prediction residue) is divided into component blocks or patches and is then encoded using some type of transform. Traditional transforms such as the Discrete Cosine Transform (DCT) use a complete dictionary of basis functions that have the property of energy compaction. The original signal can then be represented by a smaller set of transform coefficients. These coefficients are quantized to improve coding efficiency. While there have been many improvements in the different sub-areas of image and video coding over the years, the performance of transform coding in terms of coding efficiency has plateaued.
With the growing advances in machine learning, there has been an interest in using overcomplete dictionaries of basis functions to improve transform coding efficiency. These dictionaries are trained using a large set of training images and capture some of the complex features of the images such as edges and shapes. Transforms using these dictionaries result in a sparse set of coefficients that can be used to represent the original signal. This is called sparse coding.
The current research aims to use sparse coding for encoding residual images/frames, with an objective of improving Rate Distortion (RD) performance over traditional transforms such as DCT. This research addresses three main problems in the area of sparse coding for image compression:
1. Achieving the best sparse representation of the given input: Orthogonal Matching Pursuit (OMP) is a greedy algorithm that is used to find the sparse approximation of an input signal. In this research, we consider a Rate Distortion Optimization (RDO) technique to select the best sparse representation of a signal subject to a given sparsity constraint. The performance of this method is verified using a number of standard test images.
2. Usage of sparse coding in conjunction with DCT and the quantization of sparse coefficients: A natural extension of sparse coding is to use it in conjunction with DCT. Some blocks or patches of an image/frame can be coded by DCT and the rest would be sparse coded. Such an adaptive sparse coding method, based on RD optimization for each patch, is researched. Quantization of coefficients in sparse coded patches is another topic of this research. Given an initial quantization parameter (QP) for the image/frame, an RDO-based search to find the optimal QP for each patch is proposed. Based on experimental results, a recommendation is made regarding this QP search.
3. Improving the effectiveness of the RDO process that we propose for problem 1: The RD optimization process used in the selection of sparse coefficients uses a Lagrange multiplier that is specified in the H.264 video coding standard, and one that is also similar to that specified in the current HEVC video coding standard. This multiplier is specified as a function of the quantization parameter (QP). The characteristics of the patch are not considered in calculating this multiplier. Assuming a Laplace distribution for sparse coefficients, and using analytical expressions for their rate and distortion, we derive an analytical expression for the Lagrange multiplier and refine it using experimental results. This multiplier is adaptive to the content of each patch. Experimental results show that the proposed adaptive Lagrange multiplier gives superior RD performance compared to the traditional value.
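As a rough illustration of the RDO-driven selection of a sparse representation described above, the following Python sketch runs OMP up to a maximum sparsity and keeps the sparsity level that minimizes the Lagrangian cost J = D + λR. It is not the dissertation's implementation. The function names (`omp_path`, `h264_lambda`, `rdo_select`) and the fixed `bits_per_coeff` rate model are hypothetical, and the multiplier λ = 0.85 · 2^((QP − 12)/3) is the form commonly used for H.264 mode decision, assumed here to be the QP-only multiplier the text refers to.

```python
import numpy as np


def omp_path(D, y, max_atoms):
    """Greedy OMP. Returns (support, coeffs) after each iteration.

    D is a dictionary with one atom per column (assumed unit-norm),
    y is the patch to approximate, flattened to a 1-D vector.
    """
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []
    path = []
    for _ in range(max_atoms):
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0  # never reselect an atom
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly by least squares.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
        path.append((list(support), coeffs.copy()))
    return path


def h264_lambda(qp):
    """QP-only Lagrange multiplier (assumed H.264 mode-decision form)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def rdo_select(D, y, max_atoms, qp, bits_per_coeff=12.0):
    """Choose the sparsity level that minimizes J = D + lambda * R.

    bits_per_coeff is a hypothetical flat rate per selected atom
    (index plus value); a real coder would use entropy-coded rates.
    """
    y = np.asarray(y, dtype=float)
    lam = h264_lambda(qp)
    # Start from the empty representation: zero rate, full distortion.
    best = ([], np.zeros(0), float(y @ y))
    for support, coeffs in omp_path(D, y, max_atoms):
        distortion = float(np.sum((y - D[:, support] @ coeffs) ** 2))
        rate = bits_per_coeff * len(support)
        cost = distortion + lam * rate
        if cost < best[2]:
            best = (support, coeffs, cost)
    return best
```

In this sketch λ depends only on QP, so a larger QP makes each extra coefficient more expensive and pushes the choice toward sparser representations. A content-adaptive multiplier of the kind proposed in the third problem would replace `h264_lambda` with a function of per-patch statistics.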
Kalluri, Madhusudan, "Rate-Distortion Optimization for Sparse Coding in Image Compression" (2018). Engineering Ph.D. Theses. 16.