In this work, we explore a novel data structure called the suffix array, which, when combined with an index for direct access on a text $T$, allows for efficient handling of various pattern matching queries.
We demonstrate how to compute the smallest suffix array for $T[1\dots n]$ in $O(\frac{n\log \sigma}{\sqrt{\log n}}+\min(r,\bar{r})\log^\epsilon n)$ time, where $\sigma$ is the alphabet size of $T$, and $r$ and $\bar{r}$ are the counts of equal-letter runs in the Burrows-Wheeler transforms of $T$ and its reverse $\overline{T}$, respectively.
This time complexity becomes sublinear when $\sigma$ is small enough and $\min(r,\bar{r})=o(\frac{n}{\log^\epsilon n})$, yielding a significant improvement over state-of-the-art algorithms. We also present a series of related algorithmic results.
Blogger's Review: The proposed suffix array structure offers an innovative solution to pattern matching problems, showing significant performance advantages in cases of small alphabets, making it a promising avenue for further exploration in practical applications.