Bainbridge et al. Genome Biology 2011, 12:R68
/2011/12/7/R68
RESEARCH Open Access
Targeted enrichment beyond the consensus
coding DNA sequence exome reveals exons with
higher variant densities
Matthew N Bainbridge1,2, Min Wang1, Yuanqing Wu1, Irene Newsham1, Donna M Muzny1, John L Jefferies3,
Thomas J Albert4, Daniel L Burgess4 and Richard A Gibbs1*
Abstract
Background: Enrichment of loci by DNA hybridization-capture, followed by high-throughput sequencing, is an
important tool in modern genetics. Currently, the most common targets for enrichment are the protein coding
exons represented by the consensus coding DNA sequence (CCDS). The CCDS, however, excludes many actual or
computationally predicted coding exons present in other databases, such as RefSeq and Vega, and non-coding
functional elements such as untranslated and regulatory regions. The number of variants per base pair (variant
density) and our ability to interrogate regions outside of the CCDS are consequently less well understood.
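As an illustration of the variant density defined above (variants per base pair), the following is a minimal sketch and not the authors' pipeline. It assumes variants are given as positions on a single chromosome and target regions as non-overlapping half-open (start, end) intervals; the function name `variant_density` and the example coordinates are hypothetical.

```python
def variant_density(variant_positions, regions):
    """Return variants per base pair over a set of target regions.

    Assumptions (not from the paper): all positions lie on one chromosome,
    and regions are non-overlapping half-open intervals (start, end).
    """
    total_bases = sum(end - start for start, end in regions)
    if total_bases == 0:
        raise ValueError("regions cover zero bases")
    # Count each variant once if it falls inside any target region.
    hits = sum(
        1
        for pos in variant_positions
        if any(start <= pos < end for start, end in regions)
    )
    return hits / total_bases


# Hypothetical example: two target regions covering 150 bases in total.
example_regions = [(100, 200), (300, 350)]
example_variants = [105, 150, 199, 200, 310]  # position 200 is outside both regions
density = variant_density(example_variants, example_regions)
```

Computing this value separately for each region class (for example, CCDS exons versus predicted exons) would allow the densities to be compared on an equal per-base footing.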
Results: We examine capture sequence data from outside of the CCDS regions and find that extremes of GC
content that are present in different subregions of the genome can reduce the local capture sequence coverage
to less than 50% relative to the CCDS. This effect is due to biases inherent in both the Illumina and SOLiD
sequencing platforms that are exacerbated by the capture process. Interestingly, for two subregion types,
microRNA and predicted exons, the capture process yields higher than expected coverage when compared to
whole genome sequencing. Lastly, we examine the variation present in non-CCDS regions and find that predicted
exons, as well as exonic regions specific to Re