GRASPm: An efficient algorithm for exact pattern-matching in
genomic sequences
Sérgio Deusdado1, Paulo Carvalho2
Instituto Politécnico de Bragança1
P-5300 Bragança, Portugal
E-mail: sergiod at ipb.pt
Universidade do Minho2
Departamento de Informática
P-4710-057 Braga, Portugal
E-mail: pmc at di.uminho.pt
Abstract
In this paper, we propose Genomic-oriented Rapid Algorithm for String
Pattern-match (GRASPm), an algorithm centred on overlapped 2-grams
analysis, which introduces a novel filtering heuristic – the
compatibility rule – achieving significant efficiency gain. GRASPm’s
foundations rely especially on a wide searching window having the
central duplet as reference for fast filtering of multiple alignments.
Subsequently, superfluous detailed verifications are summarily avoided
by filtering the incompatible alignments using the idcd (involving
duplet of central duplet) concept combined with pre-processed
conditions, allowing fast parallel testing for multiple alignments.
Comparative performance analysis, using diverse genomic data, shows
that GRASPm is faster than its competitors.
in
International Journal of Bioinformatics Research and Applications,
InderScience Publishers, vol.5, n.4, 2009