Algorithms for the Lyndon Unique Maximal Factorization.

David E. Daykin1
1Deptartment of Mathematics, University of Reading, U.K

Abstract

Let \(\Sigma\) be a totally ordered set. We work on finite strings \(b = b_1 b_2 \ldots b_m\) of \(b_i\) elements from \(\Sigma\). Such a \(b\) is a lyn(Lyndon word) if \(m \geq 1\), and \(b\) is the unique first in lex(lexicographic order) among the \(m\) rows of the \(m \times m\) circulant matrix with \(b\) as first row.

A classic result is that every string \(b\) has a unique max factorization \(umf(b)\) into Lyns , each Lyn of maximum possible size in \(b\).

In 1983 J. P. Duval [6] published Algorithm 1, which finds \(umf(b)\). It was studied in 1991 by A. Apostolico and M. Crochemore [1]. Then their work was studied in 1994 by J.W. Daykin, C.S. Iliopoulos, and W.F. Smyth [5].

Since Duval used a programming language, we start by giving a new simple account of his Algorithm 1. Then our Algorithm 2 given here modifies Duval’s Algorithm 1 to find \(umf(a)\), when \(a\) is a string \(a = A_1 A_2 \ldots A_p\) of Lyndon words \(A_I\).

Our Algorithm 3 is also for a string \(a = A_1 A_2 \ldots A_p\) of Lyndon words \(A_I\). It is completely different from Algorithms 1 and 2. It snakes right, left, right, and so on. It revealed the fact that Lyndon words have a special structure. We give an example where Algorithm 3 needs almost \(2m\) tests; we think that is the most needed, but cannot give a rigorous proof.

We find interesting properties of Lyns, some of which may be new.