Genome Data Compression using Digital Chaos

Authors: Sai Venkatesh Balasubramanian

Efficient techniques for Genome Data handling and storage are the need of the hour in the present genetic engineering era. The present work describes the design and implementation of a Genome Sequence Data Compression Technique that does not rely on references or lookup. This is achieved by first generating a digital chaotic bit stream, formed by performing XOR operations on three square waves with mismatched frequencies. The generated bit stream is XORed with the Genome Sequence bit stream after the necessary data conditioning, and the result is stored as a 2D array (image). The PNG format is chosen, owing to its inherent lossless properties. Because the compression and decompression operations are perfectly reversible, compression ratios of around 2.6-3.5 are achieved with zero error. The use of digital chaos provides an additional layer of security, since the frequencies of the input square wave signals form a secure key: a mismatch of even 1 percent in these frequencies during decompression can result in error rates of up to 60 percent.
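
The XOR stage described above can be illustrated with a minimal Python sketch. It is not the author's implementation: the 2-bit base encoding, the sampling rate, the example frequencies, and all function names are assumptions made for illustration, and the step of storing the result as a PNG image is omitted. The sketch only shows that XORing with the same key stream twice recovers the sequence exactly, while a 1 percent change in one key frequency corrupts the output.

```python
# Hypothetical 2-bit encoding of bases; the source does not specify its data conditioning.
BASE_TO_BITS = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def square_wave_bit(freq, n, sample_rate):
    """Sample n of a 50% duty-cycle square wave: 1 in the first half of each period."""
    return 1 if (freq * n / sample_rate) % 1.0 < 0.5 else 0


def chaotic_bits(freqs, length, sample_rate=1000.0):
    """XOR three square waves with mismatched frequencies into one bit stream."""
    f1, f2, f3 = freqs
    return [
        square_wave_bit(f1, n, sample_rate)
        ^ square_wave_bit(f2, n, sample_rate)
        ^ square_wave_bit(f3, n, sample_rate)
        for n in range(length)
    ]


def genome_to_bits(sequence):
    """Flatten a string of A/C/G/T into a list of bits, two bits per base."""
    bits = []
    for base in sequence:
        bits.extend(BASE_TO_BITS[base])
    return bits


def bits_to_genome(bits):
    """Inverse of genome_to_bits."""
    return "".join(
        BITS_TO_BASE[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)
    )


def xor_with_key(bits, freqs):
    """XOR a bit stream with the key stream; applying it twice restores the input."""
    key = chaotic_bits(freqs, len(bits))
    return [b ^ k for b, k in zip(bits, key)]


# Assumed example data and key frequencies (not taken from the source).
sequence = "ACGTTGCAAGCTTAGGCTAACGT" * 20
key_freqs = (7.0, 11.3, 17.9)

scrambled = xor_with_key(genome_to_bits(sequence), key_freqs)
recovered = bits_to_genome(xor_with_key(scrambled, key_freqs))

# Decoding with one frequency off by 1 percent no longer reproduces the input.
wrong_freqs = (7.0 * 1.01, 11.3, 17.9)
garbled = bits_to_genome(xor_with_key(scrambled, wrong_freqs))
mismatches = sum(a != b for a, b in zip(sequence, garbled))
```

Here `recovered` equals `sequence`, while `mismatches` counts the bases that differ under the wrong key. The error rate in this toy setting depends on the assumed sampling rate and sequence length, so it should not be compared with the figures quoted above.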

