Author

Cihan Ruan

Date of Award

5-2025

Document Type

Dissertation

Publisher

Santa Clara : Santa Clara University, 2025

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science and Engineering

First Advisor

Nam Ling

Abstract

DNA storage is rapidly emerging as a transformative solution for long-term, ultra-dense, and energy-efficient archival systems. This dissertation builds upon a deep understanding of image compression principles to design encoding frameworks that are uniquely attuned to the biochemical constraints of DNA storage. By fusing insights from traditional coding with molecular-level design, we develop a series of technically advanced solutions tailored for this emerging medium.

Our first contribution introduces Dynamic DNA-Fountain, a constrained codec that maps H.266/VVC-compressed bitstreams to quaternary DNA sequences, optimizing information density while satisfying biochemical synthesis constraints. Building upon this foundation, we shift focus toward robustness, leveraging the inherent redundancy in image data. To this end, we propose the first residual convolutional neural network (ResCNN)–based decoder tailored for DNA-stored images, pioneering the use of visual residual learning to correct insertion, deletion, and substitution (IDS) errors. This is further extended in DSI-ResCNN, a dropout-aware system that introduces a novel mechanism to mitigate molecular chain breakage, a failure mode long overlooked in the literature.

Progressing beyond modular encoding and correction, we construct two end-to-end compression frameworks that integrate semantic encoding with self-corrective capacity. HybridFlow-DNA employs a dual-stream generative pipeline to balance ultra-low bitrate compression and semantic preservation, while HDCompression-DNA incorporates diffusion-based recovery to enhance robustness under severe noise.

Finally, we extend the data domain beyond images into tactile sensory signals. The proposed Tactile-DNA system couples residual vector quantization (RVQGAN) with an updated, signal-aware dynamic DNA-Fountain mapping, enabling multimodal archival grounded in biological constraints.
