A comparison of scoring functions for protein sequence profile alignment

Bibliographic Details
Published in: Bioinformatics 2004-05, Vol. 20 (8), p. 1301-1308
Main Authors: Edgar, Robert C.; Sjölander, Kimmen
Format: Article
Language: English
Description
Summary:
Motivation: In recent years, several methods have been proposed for aligning two protein sequence profiles, with reported improvements in alignment accuracy and homolog discrimination versus sequence–sequence methods (e.g. BLAST) and profile–sequence methods (e.g. PSI-BLAST). Profile–profile alignment is also the iterated step in progressive multiple sequence alignment algorithms such as CLUSTALW. However, little is known about the relative performance of different profile–profile scoring functions. In this work, we evaluate the alignment accuracy of 23 different profile–profile scoring functions by comparing alignments of 488 pairs of sequences with identity ≤30% against structural alignments. We optimize parameters for all scoring functions on the same training set and use profiles of alignments from both PSI-BLAST and SAM-T99. Structural alignments are constructed from a consensus between the FSSP database and CE structural aligner. We compare the results with sequence–sequence and sequence–profile methods, including BLAST and PSI-BLAST.
Results: We find that profile–profile alignment gives an average improvement over our test set of typically 2–3% over profile–sequence alignment and ∼40% over sequence–sequence alignment. No statistically significant difference is seen in the relative performance of most of the scoring functions tested. Significantly better results are obtained with profiles constructed from SAM-T99 alignments than from PSI-BLAST alignments.
Availability: Source code, reference alignments and more detailed results are freely available at http://phylogenomics.berkeley.edu/profilealignment/
ISSN: 1367-4803, 1460-2059, 1367-4811
DOI: 10.1093/bioinformatics/bth090