Cochrane et al. GigaScience 2012, 1:2
/content/1/1/2
REVIEW Open Access
The future of DNA sequence archiving
Guy Cochrane*, Charles E Cook and Ewan Birney
Abstract
Archives operating under the International Nucleotide Sequence Database Collaboration currently preserve all
submitted sequences equally, but rapid increases in the rate of global sequence production will soon require
differentiated treatment of DNA sequences submitted for archiving. Here, we propose a graded system in which
the ease of reproduction of a sequencing-based experiment and the relative availability of a sample for
resequencing define the level of lossy compression applied to stored data.
Keywords: DNA, Sequence, Archive, Compression, Storage, Image
Background
The vast majority of living organisms utilise nucleic acid as their primary store of genetic information. The technology to sequence DNA routinely was developed in the 1970s, but advances over time have since reduced cost and increased output. As the cost of sequencing has fallen, the number of species for which partial or complete genetic information has been derived has risen at a corresponding pace; starting with the first complete sequence of the Phi X [...]
[...] DNaseI-seq. We have even witnessed the development of DNA sequencing-based methods with no direct biological role, such as the mathematical exploration of a combinatoric space and the development of unique synthetic tags for property tracking.
DNA sequences determined for research purposes have been routinely archived since 1982, when the EMBL Data Library was founded. This was closely followed by the formation o