Entropy 2010, 12, 34-52; doi:10.3390/
OPEN ACCESS
entropy
ISSN 1099-4300
/journal/entropy
Review
Data Compression Concepts and Algorithms and Their
Applications to Bioinformatics
Özkan U. Nalbantoğlu, David J. Russell and Khalid Sayood
Department of Electrical Engineering, University of Nebraska-Lincoln, NE 68588-0511, USA;
E-Mails: nalbantoglu@ (O.U.N.); drussell@ (D.J.R.)
Author to whom correspondence should be addressed; E-Mail: ksayood@;
Tel.: +1-402-472-6688; Fax: +1-402-472-4732.
Received: 04 December 2009 / Accepted: 17 December 2009 / Published: 29 December 2009
Abstract: Data compression at its base is concerned with how information is organized
in data. Understanding this organization can lead to efficient ways of representing the
information and hence data compression. In this paper we review the ways in which ideas
and approaches fundamental to the theory and practice of data compression have been used in
the area of bioinformatics. We look at how basic theoretical ideas from data compression, such
as the notions of entropy, mutual information, and complexity, have been used for analyzing
biological sequences in order to discover hidden patterns, infer phylogenetic relationships
between organisms and study viral populations. Finally, we look at how inferred grammars
for biological sequences have been used to uncover structure in biological sequences.
Keywords: bioinformatics; data co