1. Introduction
A Suffix Tree is a compressed trie of all the suffixes of a given string. It is a powerful data structure that can be used in various string processing and pattern matching tasks, including substring checks, pattern matching, and other applications. Suffix trees provide faster query times than suffix arrays but take more space and are more complex to construct.
2. Program Steps
1. Create a Node class that will represent each node in the tree.
2. Initialize the Suffix Tree with a given string and build the tree.
3. For every suffix of the string, insert it into the tree.
3. Code Program
class SuffixTree {
private static final int ALPHABET_SIZE = 256;
Node root = new Node();
static class Node {
Node[] children = new Node[ALPHABET_SIZE];
int start;
int end;
Node() {
this.start = -1;
this.end = -1;
}
Node(int start, int end) {
this.start = start;
this.end = end;
}
}
public SuffixTree(String str) {
for (int i = 0; i < str.length(); i++) {
insertSuffix(str, i);
}
}
private void insertSuffix(String str, int idx) {
Node currentNode = root;
for (int j = idx; j < str.length(); j++) {
if (currentNode.children[str.charAt(j)] == null) {
currentNode.children[str.charAt(j)] = new Node(j, str.length() - 1);
return;
}
currentNode = currentNode.children[str.charAt(j)];
}
}
public boolean contains(String pattern) {
Node currentNode = root;
for (char c : pattern.toCharArray()) {
if (currentNode.children[c] == null) return false;
currentNode = currentNode.children[c];
}
return true;
}
public static void main(String[] args) {
SuffixTree tree = new SuffixTree("banana");
System.out.println(tree.contains("ana")); // true
System.out.println(tree.contains("ban")); // true
System.out.println(tree.contains("apple")); // false
}
}
Output:
true true false
4. Step By Step Explanation
1. We define the SuffixTree class which will be used to construct the tree.
2. The Node class represents each node in our tree. Each node has an array of children representing characters from the alphabet. Each child points to another node in the tree.
3. The start and end in each node are used to denote the substring's start and end positions represented by the node.
4. The SuffixTree constructor iterates over the string, creating a suffix from each character to the end of the string and inserts this suffix into the tree.
5. The insertSuffix function inserts a given suffix into the tree.
6. The contains function checks if a pattern exists within the tree. It traverses the tree according to the characters in the pattern. If it can traverse the tree without any missing nodes, the tree contains the pattern.
7. In the main function, we create a SuffixTree for the string "banana" and then check for the presence of a few substrings in the tree.
Note: The construction of the Suffix Tree in this example is simplified for the sake of clarity. Advanced construction algorithms like Ukkonen's can build the tree in linear time.