Text Mining Improves Prediction of Protein Functional Sites
Karin M. Verspoor (1,*), Judith D. Cohn (2), Komandur E. Ravikumar (1), Michael E. Wall (2,3,*)
(1) University of Colorado School of Medicine, Aurora, Colorado, United States of America
(2) Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
(3) Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
Abstract
We present an approach that integrates protein structure analysis and text mining for protein functional site prediction,
called LEAP-FS (Literature Enhanced Automated Prediction of Functional Sites). The structure analysis was carried out using
Dynamics Perturbation Analysis (DPA), which predicts functional sites at control points where interactions greatly perturb
protein vibrations. The text mining extracts mentions of residues in the literature, and predicts that residues mentioned are
functionally important. We assessed the significance of each of these methods by analyzing their performance in finding
known functional sites (specifically, small-molecule binding sites and catalytic sites) in about 100,000 publicly available
protein structures. The DPA predictions recapitulated many of the functional site annotations and preferentially recovered
binding sites annotated as biologically relevant vs. those annotated as potentially spurious. The text-based predictions were
also substantially supported by the functional site annotations: compared to other residues, residues mentioned in text
were roughly six times more likely to be found in a functional site. The