Algorithms for Two Versions of LCS Problem for Indeterminate Strings

Iliopoulos, Costas; Rahman, M. Sohel; Rytter, Wojciech

Abstract

References

Journal of Combinatorial Mathematics and Combinatorial Computing

Volume 071
Pages: 155-172

Research article

Algorithms for Two Versions of LCS Problem for Indeterminate Strings

^¹,², ^³,⁴, ^¹,⁴

¹Department of Mathematics and Informatics Copernicus University, Torun, Poland

²Algorithm Design group Department of Computer Science King’s College London Strand, London WO2R 2LS, England

³Department of Computer Science & Engineering Bangladesh University of Engineering & Technology Dhaka-1000, Bangladesh

⁴Institute of Informatics Warsaw University Warsaw, Poland

Published: 30/11/2009

Copyright Link
License

Abstract

We study the complexity of the longest common subsequence (LCS) problem from a new perspective. By an indeterminate string (i-string, for short) we mean a sequence \( \tilde{X} = \tilde{X}[1]\tilde{X}[2]\ldots \tilde{X}[n] \), where \( \tilde{X}[i] \subseteq \Sigma \) for each \( i \), and \( \Sigma \) is a given alphabet of potentially large size. A subsequence of \( \tilde{X} \) is any usual string over \( \Sigma \) which is an element of the finite (but usually of exponential size) language \( \tilde{X}[i_1]\tilde{X}[i_2]\ldots \tilde{X}[i_p] \), where \( 1 \leq i_1 < i_2 < i_3 \ldots < i_p \leq n \), \( p \geq 0 \). Similarly, we define a supersequence of \( x \). Our first version of the LCS problem is Problem ILCS: for given i-strings \( \tilde{X} \) and \( \tilde{Y} \), find their longest common subsequence. From the complexity point of view, new parameters of the input correspond to \( |\Sigma| \), and maximum size \( \ell \) of the subsets in \( \tilde{X} \) and \( \tilde{Y} \). There is also a third parameter \( \mathcal{R} \), which gives a measure of similarity between \( \tilde{X} \) and \( \tilde{Y} \). The smaller the \( \mathcal{R} \), the lesser is the time for solving Problem ILCS. Our second version of the LCS problem is Problem CILCS (constrained ILCS): for given i-strings \( \tilde{X} \) and \( \tilde{Y} \) and a plain string \( Z \), find the longest common subsequence of \( \tilde{X} \) and \( \tilde{Y} \) which is, at the same time, a supersequence of \( Z \). In this paper, we present several efficient algorithms to solve both ILCS and CILCS problems. The efficiency in our algorithms is obtained in particular by using an efficient data structure for special types of range maxima queries and fast multiplication of boolean matrices.

Contents

Journal of Combinatorial Mathematics and Combinatorial Computing

Algorithms for Two Versions of LCS Problem for Indeterminate Strings

Abstract

Information

Guidelines

CP Initiatives

Follow CP