[CS.DS] Linear-Time Algorithm for Smallest Suffixient Sets

Recent studies have highlighted suffix arrays as a novel data structure, offering pattern matching functionalities with less asymptotic space compared to the traditional Run-Length BWT. Various algorithms exist for constructing these from suffix array data structures. We present the first construction algorithm that meets the following criteria: (i) linear time complexity, (ii) one-pass over the structures, and (iii) practical implementation. This makes the algorithm particularly useful for large text collections.

We empirically demonstrate its advantage in the space/time tradeoff map, showcasing its superiority in practical applications.

Blogger's Review: The linear-time algorithm proposed in this paper provides a fresh perspective on constructing suffix arrays, significantly enhancing efficiency when dealing with large datasets, and holds substantial practical value.