Computational Bacterial Genome-Wide Analysis of
Phylogenetic Profiles Reveals Potential Virulence Genes
of Streptococcus agalactiae
Frank Po-Yen Lin1*, Ruiting Lan2, Vitali Sintchenko1,3, Gwendolyn L. Gilbert3, Fanrong Kong3,
Enrico Coiera1
1 Centre for Health Informatics, University of New South Wales, Sydney, Australia, 2 School of Biotechnology and Biomolecular Sciences, University of New South Wales,
Sydney, Australia, 3 Centre for Infectious Diseases and Microbiology-Public Health, Westmead Hospital and Sydney Medical School, The University of Sydney, Sydney,
Australia
Abstract
The phylogenetic profile of a gene is a reflection of its evolutionary history and can be defined as the differential presence
or absence of a gene in a set of reference genomes. It has been employed to facilitate the prediction of gene functions.
However, the hypothesis that the application of this concept can also facilitate the discovery of bacterial virulence factors
has not been fully examined. In this paper, we test this hypothesis and report a computational pipeline designed to identify
previously unknown bacterial virulence genes using group B streptococcus (GBS) as an example. Phylogenetic profiles of all
GBS genes across 467 bacterial reference genomes were determined by candidate-against-all BLAST searches, which were
then used to identify candidate virulence genes by machine learning models. Evaluation experiments with known GBS
virulence genes suggested good functional and model consistency in cross-validation analyses (areas under ROC curve, 0.80
and 0.98 respectively). Inspection of the top-10 genes in each of the 15 virulence functional groups revealed at least 15 (of
119) homologous genes implicated in
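
To make the definition of a phylogenetic profile concrete, the following minimal sketch shows one way a binary presence/absence profile could be assembled from BLAST hits. It is an illustration under stated assumptions, not the pipeline used in this paper: the function names `build_profiles` and `profile_similarity`, the E-value cutoff of 1e-5, the input tuple layout, and the use of Jaccard similarity to compare profiles are all hypothetical choices made for the example.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed input: one tuple per BLAST hit, (query_gene, reference_genome, e_value).
Hit = Tuple[str, str, float]


def build_profiles(
    hits: Iterable[Hit],
    genomes: List[str],
    max_evalue: float = 1e-5,  # hypothetical cutoff, not taken from the paper
) -> Dict[str, List[int]]:
    """Return a binary presence/absence vector per gene, ordered like `genomes`."""
    column = {genome: i for i, genome in enumerate(genomes)}
    profiles: Dict[str, List[int]] = {}
    for gene, genome, evalue in hits:
        # Every gene that appears in the hits gets a row, even if all hits are weak.
        row = profiles.setdefault(gene, [0] * len(genomes))
        if genome in column and evalue <= max_evalue:
            row[column[genome]] = 1  # gene counted as present in this genome
    return profiles


def profile_similarity(a: List[int], b: List[int]) -> float:
    """Jaccard similarity of two profiles (an assumed, illustrative measure)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


if __name__ == "__main__":
    # Toy data with made-up gene and genome labels.
    genomes = ["genome_A", "genome_B", "genome_C"]
    hits = [
        ("gene1", "genome_A", 1e-30),
        ("gene1", "genome_B", 0.5),  # too weak, treated as absent
        ("gene2", "genome_A", 1e-10),
        ("gene2", "genome_C", 1e-8),
    ]
    profiles = build_profiles(hits, genomes)
    print(profiles)
    print(profile_similarity(profiles["gene1"], profiles["gene2"]))
```

In a sketch like this, each gene's vector would have one entry per reference genome (467 in the study), and such vectors could then serve as feature inputs to a classifier.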