String matching is a problem that occurs in many application domains
such as text editing, term rewriting, computational molecular biology.
A basic matching problem is to find the locations of a word
in a text. This problem has been generalized to considering
a {\em set of words} by Aho and Corasick who proposed an elegant
automata technique to solve it efficiently.
In this talk we shall discuss a further extension that
considers {\em patterns} rather than words.
Given an alphabet $A$, a {\em pattern} $p$ is
a word $v_1@\ldots@v_m$, where $v_i\in A^*$ and $@\notin A$ is a
distinguished symbol
called {\em variable length don't care symbol}. Pattern $p$ is
said to {\em match} a text $t\in A^*$ if $t=u_0 v_1 u_1 \ldots u_{m-1}
v_m u_m$ for some $u_0,\ldots,u_m\in A^*$.
We adress the following problem: given a set $P$ of patterns and a
text $t$, test whether one of the patterns of $P$ matches $t$.
We shall describe an algorithm that solves the problem in time
$\O{(|t|+|P|)\log |P|}$.
In contrast to most of the existing string matching
algorithms this algorithm is not composed of two
successive stages -- preprocessing the pattern (resp. the text) and reading
through the text (resp. the pattern) -- but has these two
stages essentially interleaved.
The basic data structure that it used is the
DAWG (Directed Acyclic Word Graph) used by Blumer Blumer, Haussler,
Ehrenfeucht, Chen, Seiferas)
for constructing the minimal {\em factor
automaton} for a (set of) word(s) in linear time.
We shall discuss a strong relation of our problem with the so-called
dynamic dictionary matching problem.