In string processing, a non-empty string is closed if its length is one or its longest border appears exactly twice. A closed substring's occurrence is a maximal closed substring (MCS) if it cannot be extended to the left or right while retaining closedness. MCSs can be viewed as a general class of maximal repetitive structures, including runs. This paper studies the online computation of MCSs of a string, where one character is appended at a time.
Our algorithm detects newly formed MCSs using the rightmost previous occurrences of suffixes. To support this efficiently, we introduce a novel data structure called the link-cut suffix tree (LCST), which combines an online suffix tree with a link-cut tree. The LCST maintains rightmost occurrence information for substrings represented in the suffix tree in $O(n \log n)$ total time and $O(n)$ space, where $n$ is the length of the input string.
Using the LCST, we obtain an $O(n \log n)$-time online algorithm for computing all MCSs, which is worst-case optimal. Additionally, direct applications of the LCST include online algorithms for rightmost LZ77 factorizations and most recent match queries.
Blogger's Review: The LCST data structure presented in this paper offers a novel approach to online string processing, particularly excelling in real-time computation of maximal closed substrings. Its optimization in both space and time enhances efficiency when dealing with long strings, indicating a wide range of potential applications.