
Sequence Homology Search Based on Database Indexing Using the Profile Hidden Markov Model.pdf

Published: 2015-09-23, about 53,600 characters, 6 pages


Sequence Homology Search Based on Database Indexing Using the Profile Hidden Markov Model

Qiang Xue, James Cole, Sakti Pramanik
Department of Computer Science and Engineering
Department of Microbiology
Michigan State University, East Lansing, Michigan 48824, USA
Email: xueqiang@, pramanik@, colej@

Abstract: The Profile Hidden Markov Model (PHMM) has received increasing attention in the field of protein homology detection, since profile-based methods are much more sensitive in detecting distant homologous relationships than pairwise methods. Pure dynamic-programming-based systems are often used for PHMM searches. However, these dynamic-programming-based systems are very time consuming for a large database. For instance, it may take approximately 15 minutes to search a short model of length 12 in the GenBank protein sequence database. Instead of searching the database sequentially, we search the database based on a tree-structured database indexing, called the HD-tree. The HD-tree is able to reduce the PHMM search time significantly with ...

... pairwise methods (e.g., BLAST or FASTA) that use a position-independent scoring system.

B. The Profile Hidden Markov Model

Functional biological sequences typically come in families. Just as a pairwise alignment captures the relationship between two sequences, a multi-sequence alignment can show how the sequences in a family relate to each other. It is desirable to provide a consensus model for a multi-sequence alignment, so that the relationship between a new sequence and the family ...
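
To illustrate the contrast between position-independent scoring and a consensus model built from a multi-sequence alignment, the following is a minimal sketch of an ungapped position-specific scoring profile. It is not the PHMM or the HD-tree described in the paper: it has no insert or delete states, and the toy alignment, the uniform background distribution, the pseudocount value, and the function names `build_profile` and `score` are all assumptions made for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_profile(alignment, pseudocount=1.0):
    """Return one dict of log-odds scores per alignment column.

    Assumes an ungapped alignment (all rows the same length, no '-')
    and a uniform background distribution; both are simplifications.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(ALPHABET)
    total = len(alignment) + pseudocount * len(ALPHABET)
    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return profile


def score(profile, sequence):
    """Sum the column-specific scores of a sequence of matching length."""
    if len(sequence) != len(profile):
        raise ValueError("sequence length must equal profile length")
    return sum(column[aa] for column, aa in zip(profile, sequence))


# Hypothetical four-member family, used only to exercise the functions.
toy_family = ["ACDE", "ACDF", "ACEE", "GCDE"]
profile = build_profile(toy_family)
```

In this sketch the same residue can score differently depending on the column it appears in, which is the property a position-independent substitution matrix lacks. For example, `score(profile, "ACDE")` is higher than `score(profile, "WWWW")` because the first sequence matches the residues that dominate each column of the toy family.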

