

Open access, freely available online

PLoS BIOLOGY

Systematic Association of Genes to Phenotypes by Genome and Literature Mining

Jan O. Korbel¹, Tobias Doerks¹, Lars J. Jensen¹˒², Carolina Perez-Iratxeta³, Szymon Kaczanowski⁴, Sean D. Hooper¹, Miguel A. Andrade³, Peer Bork¹˒²*

1 European Molecular Biology Laboratory, Heidelberg, Germany, 2 Max Delbrück Center for Molecular Medicine, Berlin-Buch, Germany, 3 Ontario Genomics Innovation Centre, Ottawa Health Research Institute, Ottawa, Canada, 4 Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland

One of the major challenges of functional genomics is to unravel the connection between genotype and phenotype. So far no global analysis has attempted to explore those connections in the light of the large phenotypic variability seen in nature. Here, we use an unsupervised, systematic approach for associating genes and phenotypic characteristics that combines literature mining with comparative genome analysis. We first mine the MEDLINE literature database for terms that reflect phenotypic similarities of species. Subsequently we predict the likely genomic determinants: genes specifically present in the respective genomes. In a global analysis involving 92 prokaryotic genomes we retrieve 323 clusters containing a total of 2,700 significant gene–phenotype associations. Some clusters contain mostly known relationships, such as genes involved in motility or plant degradation, often with additional hypothetical proteins associated with those phenotypes. Other clusters comprise