Efficient exact pattern-matching in proteomic sequences
Sérgio Deusdado¹, Paulo Carvalho²
¹ Escola Superior Agrária
Instituto Politécnico de Bragança
P-5300 Bragança, Portugal
E-mail: sergiod at ipb.pt
² Universidade do Minho
Departamento de Informática
P-4710-057 Braga, Portugal
E-mail: pmc at di.uminho.pt
Abstract
This paper proposes a novel algorithm for complete exact
pattern-matching that targets the specificities of protein sequences
(an alphabet of 20 symbols) and remains highly efficient for larger
alphabets. The search strategy uses large search windows, allowing
multiple alignments per iteration. A new filtering heuristic, named the
compatibility rule, contributes decisively to the efficiency
improvement. On average, the new algorithm outperforms its
best-rated competitors.
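
To make the problem setting concrete, the sketch below reports every occurrence (overlaps included) of a pattern in a protein sequence over the 20-symbol amino-acid alphabet. It is only a baseline illustration: it uses a Horspool-style bad-character shift as a stand-in and does not implement the large search windows or the compatibility rule proposed in this paper. The names `AMINO_ACIDS` and `find_all` are assumptions introduced for this example.

```python
# Baseline illustration only: Horspool-style exact matching,
# NOT the algorithm proposed in this paper.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)


def find_all(pattern, text):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Bad-character table: how far the window may slide, based on the
    # text symbol aligned with the last pattern position.
    shift = {c: m for c in AMINO_ACIDS}
    for i, c in enumerate(pattern[:-1]):
        shift[c] = m - 1 - i

    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Symbols outside the table (e.g. ambiguity codes) allow a full shift.
        pos += shift.get(text[pos + m - 1], m)
    return hits


if __name__ == "__main__":
    print(find_all("AKA", "AKAKAMAKA"))
```

With only 20 symbols, the symbol that drives the shift appears in the pattern fairly often, so single-alignment shifts of this kind tend to be short; that is the kind of limitation a wider search window is meant to address.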
IWPACBB'09, Salamanca, Spain, June 2009