1. Introduction
A Suffix Array is a sorted array of all the suffixes of a given string. It is a powerful data structure that can be used in various string processing and pattern matching tasks. An example of its use is in the implementation of the Burrows-Wheeler Transform, which is used in data compression. In this post, we will discuss the construction of a suffix array and demonstrate its utility.
2. Program Steps
1. Generate all the suffixes of the given string.
2. Sort the generated suffixes.
3. Store the indices of the sorted suffixes in an array. This array is the Suffix Array.
3. Code Program
public class SuffixArray {
private String[] suffixes;
private int[] sa;
private String text;
public SuffixArray(String text) {
this.text = text;
int n = text.length();
this.suffixes = new String[n];
this.sa = new int[n];
for (int i = 0; i < n; i++) {
suffixes[i] = text.substring(i);
}
for (int i = 0; i < n; i++) {
sa[i] = i;
}
java.util.Arrays.sort(sa, (a, b) -> suffixes[a].compareTo(suffixes[b]));
}
public int[] getSuffixArray() {
return sa;
}
public static void main(String[] args) {
String str = "banana";
SuffixArray sa = new SuffixArray(str);
int[] suffixArr = sa.getSuffixArray();
for (int i : suffixArr) {
System.out.println(i + " : " + str.substring(i));
}
}
}
Output:
5 : a 3 : ana 1 : anana 0 : banana 4 : na 2 : nana
4. Step By Step Explanation
1. We start by initializing the SuffixArray class with a text. The goal is to generate a suffix array for this text.
2. In the constructor, we generate all the suffixes of the given string.
3. We then initialize the sa (suffix array) with indices ranging from 0 to n-1.
4. The suffix array (sa) is populated with indices in such a way that the suffixes referenced by these indices are in sorted order.
5. The getSuffixArray method returns the constructed suffix array.
6. In the main function, we create a SuffixArray for the string "banana" and then print each index from the suffix array followed by its corresponding suffix.
7. The output shows the indices of the sorted suffixes of the string "banana".
Note: Constructing a suffix array in this way (by sorting all the suffixes) has a time complexity of O(n^2 log n) where n is the length of the string. There are more efficient algorithms to construct a suffix array in O(n log n) time, but they are more complex and are beyond the scope of this simple introduction.