Algorithms, Vol. 13, No. 4 (2020)

A New Lossless DNA Compression Algorithm Based on a Single-Block Encoding Scheme

Deloula Mansouri, Xiaohui Yuan and Abdeldjalil Saidani

Abstract

With the rapid advance of DNA sequencing technology, a massive amount of genomic data, mainly DNA sequences, is produced every day, demanding ever more storage and bandwidth. Managing, analyzing and, in particular, storing these large amounts of data has become a major scientific challenge for bioinformatics, which makes compression a necessity. In this paper, we describe a new reference-free DNA compressor, abbreviated DNAC-SBE. DNAC-SBE is a lossless hybrid compressor that consists of three phases. First, starting from the most frequent base (Bi), the positions of each Bi are replaced with ones and the positions of the other bases, which have smaller frequencies than Bi, are replaced with zeros. Second, to encode the generated binary streams, we propose a new single-block encoding scheme (SBE) that exploits the positions of neighboring bits within a block using two different techniques. Finally, the proposed algorithm dynamically assigns the shorter-length code to each block. Results show that DNAC-SBE outperforms state-of-the-art compressors and is efficient with respect to special conditions imposed on compressed data, storage space and data transfer rate, regardless of the file format or the size of the data.
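As a rough illustration of the first phase only, the Python sketch below turns a DNA string into one binary stream per base, processing bases from most to least frequent. It is one reading of the abstract, not the authors' implementation: the function names `to_binary_streams` and `from_binary_streams` are hypothetical, ties in frequency are broken alphabetically by assumption, and the sketch assumes that positions already assigned to a more frequent base are dropped before the next stream is built. The single-block encoding of the streams (phases two and three) is not shown.

```python
from collections import Counter


def to_binary_streams(seq):
    """Split a DNA string into 0/1 streams, one per base, most frequent base first.

    Assumption: once a base has been marked with ones, its positions are removed
    before the next (less frequent) base is processed.
    """
    counts = Counter(seq)
    # Descending frequency; alphabetical tie-break is an assumption for determinism.
    order = sorted(counts, key=lambda b: (-counts[b], b))
    streams = []
    remaining = seq
    for base in order[:-1]:  # the least frequent base is implied by what is left
        streams.append("".join("1" if c == base else "0" for c in remaining))
        remaining = "".join(c for c in remaining if c != base)
    return order, streams


def from_binary_streams(order, streams, length):
    """Invert to_binary_streams; `length` is only needed for single-base input."""
    if not order:
        return ""
    if not streams:
        return order[0] * length
    # The zeros of the last stream are all occurrences of the least frequent base.
    current = order[-1] * streams[-1].count("0")
    for base, stream in zip(reversed(order[:-1]), reversed(streams)):
        rest = iter(current)
        current = "".join(base if bit == "1" else next(rest) for bit in stream)
    return current


order, streams = to_binary_streams("AACGTAAC")
restored = from_binary_streams(order, streams, len("AACGTAAC"))
```

Because each stream only covers the positions not yet claimed by a more frequent base, the streams get shorter at every step, and the round trip through `from_binary_streams` recovers the input exactly, which is what a lossless scheme requires.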
