Statistical alignment based on fragment insertion and deletion models

  • Motivation: The topic of this paper is the estimation of alignments and mutation rates based on stochastic sequence-evolution models that allow insertions and deletions of subsequences ("fragments") and not just single bases. The model we propose is a variant of a model introduced by Thorne, Kishino, and Felsenstein (1992). The computational tractability of the model depends on certain restrictions in the insertion/deletion process; we discuss the possible effects of these restrictions.
    Results: The process of fragment insertion and deletion in the sequence-evolution model induces a hidden Markov structure at the level of alignments and thus makes efficient statistical alignment algorithms possible. As an example, we apply a sampling procedure to assess the variability in alignment and mutation parameter estimates for HVR1 sequences of human and orangutan, improving on the results of previous work. Simulation studies give evidence that estimation methods based on the proposed model also give satisfactory results when applied to data for which the restrictions in the insertion/deletion process do not hold.
    Availability: The source code of the software for sampling alignments and mutation rates for a pair of DNA sequences according to the fragment insertion and deletion model is freely available under the terms of the GNU public license (GPL, 2000).
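
To illustrate why a hidden Markov structure at the level of alignments makes statistical alignment efficient, the following is a minimal sketch of the forward recursion for a generic three-state pair HMM (match, gap in the second sequence, gap in the first sequence). It sums the probability of a sequence pair over all alignments in time proportional to the product of the sequence lengths. This is not the fragment insertion and deletion model of the paper: the function name `pair_hmm_likelihood`, the transition parameters `delta`, `epsilon`, `tau`, and the simple substitution scheme are all assumptions chosen for illustration.

```python
def pair_hmm_likelihood(x, y, delta=0.1, epsilon=0.3, tau=0.1, match=0.7):
    """Forward algorithm for a generic pair HMM (illustrative parameters only).

    Hypothetical parameters, not taken from the paper:
      delta   -- probability of opening a gap from the match state
      epsilon -- probability of extending an existing gap
      tau     -- probability of terminating from any state
      match   -- probability that an aligned base is unchanged
    """
    q = 0.25                            # uniform base frequency (assumption)
    mismatch = (1.0 - match) / 3.0      # spread the rest over the other 3 bases

    def p(a, b):
        # Joint emission probability of an aligned pair (a, b).
        return q * (match if a == b else mismatch)

    n, m = len(x), len(y)
    # fM/fX/fY[i][j]: probability of emitting x[:i], y[:j] and ending in
    # the match state / gap-in-y state / gap-in-x state, respectively.
    fM = [[0.0] * (m + 1) for _ in range(n + 1)]
    fX = [[0.0] * (m + 1) for _ in range(n + 1)]
    fY = [[0.0] * (m + 1) for _ in range(n + 1)]
    fM[0][0] = 1.0                      # the start state behaves like a match state

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                fM[i][j] = p(x[i - 1], y[j - 1]) * (
                    (1 - 2 * delta - tau) * fM[i - 1][j - 1]
                    + (1 - epsilon - tau) * (fX[i - 1][j - 1] + fY[i - 1][j - 1])
                )
            if i > 0:
                fX[i][j] = q * (delta * fM[i - 1][j] + epsilon * fX[i - 1][j])
            if j > 0:
                fY[i][j] = q * (delta * fM[i][j - 1] + epsilon * fY[i][j - 1])

    # Terminate from whichever state the alignment ends in.
    return tau * (fM[n][m] + fX[n][m] + fY[n][m])


if __name__ == "__main__":
    print(pair_hmm_likelihood("ACGT", "AGT"))
```

The same dynamic-programming tables can in principle be traversed backwards to draw alignments at random in proportion to their probability, which is the general idea behind sampling-based assessments of alignment variability.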

Author: Dirk Metzler
Pubmed Id:
Document Type: Preprint
Year of Completion: 2002
Date of first Publication: 2002/10/07
Publishing Institution: Universitätsbibliothek Johann Christian Senckenberg
Release Date: 2018/01/11
Tag: FID model; Markov chain Monte Carlo Method; Thorne Kishino Felsenstein model; hidden Markov model; hypervariable region; mutation parameter estimation; pair HMM; sequence alignment; statistical alignment
Page Number: 18
First Page: 1
Last Page: 18
Preprint, published in: Bioinformatics, vol. 19 (2003), no. 4, pp. 490–499, doi:10.1093/bioinformatics/btg026
Institutes: Computer Science and Mathematics / Mathematics
Dewey Decimal Classification: 5 Natural sciences and mathematics / 51 Mathematics / 510 Mathematics
Licence (German): German copyright law (Deutsches Urheberrecht)