Sunday, November 02, 2008

Suffix Tree

Tries and suffix tree

wiki - Suffix Trees

http://allisons.org/ll/AlgDS/Tree/Suffix/

image

 

image

The suffix tree can be built in O(n) time (linear time regarding the length of strings) due to Ukkonen (1995).

Applications: [1,2,3] from http://allisons.org/ll/AlgDS/Tree/Suffix/

  1. String Search: Searching for a substring, pat[1..m], in txt[1..n], can be solved in O(m) time (after the suffix tree for txt has been built in O(n) time).
  2. Palindromes: The longest palindrome of txt[1..n] can be found in O(n) time, e.g. by building the suffix tree for txt$reverse(txt)# or by building the generalized suffix tree for txt and reverse(txt).
  3. Longest Common String:  The longest common substring of two strings, txt1 and txt2, can be found by building a generalized suffix tree for txt1 and txt2: Each node is marked to indicate if it represents a suffix of txt1 or txt2 or both. The deepest node marked for both txt1 and txt2 represents the longest common substring.  Equivalently, one can build a (basic) suffix tree for the string txt1$txt2#, where `$' is a special terminator for txt1 and `#' is a special terminator for txt2. The longest common substring is indicated by the deepest fork node that has both `...$...' and `...#...' (no $) beneath it.