
Improved sensitivity of biological sequence database searches

Bibliographic Details
Published in:	Bioinformatics 1990-07, Vol.6 (3), p.237-245
Main Authors:	Brutlag, Douglas L.; Dautricourt, Jean-Pierre; Maulik, Sunil; Relph, John
Format:	Article
Language:	English
Subjects:	Amino Acid Sequence; Base Sequence; Biological and medical sciences; Databases, Factual; Diverse techniques; Fundamental and applied biological sciences. Psychology; Mathematical Computing; Molecular and cellular biology; Software; User-Computer Interface

Description
Summary:	We have increased the sensitivity of DNA and protein sequence database searches by allowing similar but non-identical amino acids or nucleotides to match. In addition, one can match k-tuples or words instead of matching individual residues in order to speed the search. A matching matrix specifies which k-tuples match each other. The matching matrix can be calculated from a similarity matrix of amino acids and a threshold of similarity required for matching. This permits amino acid similarity matrices or replacement matrices (PAM matrices) to be used in the first step of a sequence comparison rather than in a secondary scoring phase. The concept of matching non-identical k-tuples also increases the power of DNA database searches. For example, a matrix that specifies that any 3-tuple in a DNA sequence can match any other 3-tuple encoding the same amino acid permits a DNA database search using a DNA query sequence for regions that would encode a similar amino acid sequence.
ISSN:	1367-4803 0266-7061 1460-2059
DOI:	10.1093/bioinformatics/6.3.237
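
The two ideas in the summary, a matching matrix derived from a similarity matrix plus a threshold, and DNA 3-tuples that match when they encode the same amino acid, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `SIMILARITY`, `score`, `matching_matrix`, and `codons_match` are hypothetical, the similarity values are toy numbers rather than a real PAM matrix, and scoring a k-tuple pair by summing per-position similarities is an assumption made here for illustration. The codon table is the standard genetic code.

```python
from itertools import product

# Toy similarity scores for a four-letter alphabet (illustrative values only,
# NOT taken from any PAM matrix).
SIMILARITY = {
    ("I", "I"): 5, ("L", "L"): 5, ("V", "V"): 5, ("D", "D"): 5,
    ("I", "L"): 2, ("I", "V"): 3, ("L", "V"): 2,
    ("D", "I"): -3, ("D", "L"): -3, ("D", "V"): -3,
}


def score(a, b):
    """Look up a symmetric residue similarity score."""
    return SIMILARITY[(a, b)] if (a, b) in SIMILARITY else SIMILARITY[(b, a)]


def matching_matrix(alphabet, k, threshold):
    """Return the set of k-tuple pairs that are allowed to match.

    Assumption of this sketch: a pair of k-tuples matches when the sum of
    its per-position similarity scores is at least `threshold`.
    """
    tuples = ["".join(t) for t in product(alphabet, repeat=k)]
    return {
        (s, t)
        for s in tuples
        for t in tuples
        if sum(score(a, b) for a, b in zip(s, t)) >= threshold
    }


# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def codons_match(c1, c2):
    """Two DNA 3-tuples match when they encode the same amino acid."""
    return CODON_TABLE[c1] == CODON_TABLE[c2]


if __name__ == "__main__":
    pairs = matching_matrix("ILVD", 2, 7)
    print(("IL", "VL") in pairs)       # similar 2-tuples
    print(codons_match("CTG", "TTA"))  # both codons encode leucine
```

Raising the threshold shrinks the set of matching pairs toward exact identity, while lowering it admits more distant k-tuples, which is the sensitivity trade-off the summary describes.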