2

Suppose I have the data set {A,B,C,D}, of arbitrary type, and I want to compare it to another data set. I want the comparison to be true for {A,B,C,D}, {B,C,D,A}, {C,D,A,B}, and {D,A,B,C}, but not for {A,C,B,D} or any other set that is not ordered similarly. What is a fast way to do this?

Storing them in arrays, rotating, and comparing after each rotation is an O(n^2) task, so that's not very good.

My first intuition would be to store the data as a doubled sequence like {A,B,C,D,A,B,C} and then search it for the other sequence as a contiguous run, which is only O(n). Can this be done any faster?

Dr. John A Zoidberg
    Possible duplicate of [How to check whether two lists are circularly identical in Python](http://stackoverflow.com/questions/26924836/how-to-check-whether-two-lists-are-circularly-identical-in-python) – Salvador Dali Jun 20 '16 at 06:34

3 Answers

6

There is a fast algorithm for finding the minimum rotation of a string - https://en.wikipedia.org/wiki/Lexicographically_minimal_string_rotation. So you can store and compare the minimum rotation.
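As a rough illustration of "store and compare the minimum rotation", here is a minimal sketch. It uses a simple two-pointer minimal-rotation scan rather than Booth's algorithm from the linked article, and it assumes the elements can be ordered with `<` and `>`. The names `least_rotation_index`, `canonical` and `same_cycle` are made up for this example.

```python
def least_rotation_index(s):
    """Return the start index of the lexicographically smallest rotation of s.

    Two-pointer scan: i and j are candidate start positions, k is the
    length of the prefix on which both candidates currently agree.
    """
    n = len(s)
    i, j, k = 0, 1, 0
    while i < n and j < n and k < n:
        a = s[(i + k) % n]
        b = s[(j + k) % n]
        if a == b:
            k += 1
            continue
        if a > b:
            i += k + 1   # candidate i loses; skip past the compared prefix
        else:
            j += k + 1   # candidate j loses
        if i == j:
            j += 1
        k = 0
    return min(i, j)


def canonical(s):
    """Rotate the list s so that its minimal rotation comes first."""
    i = least_rotation_index(s)
    return s[i:] + s[:i]


def same_cycle(a, b):
    """True if list a is a rotation of list b (assumes orderable elements)."""
    return len(a) == len(b) and canonical(a) == canonical(b)
```

If the canonical form is computed once and stored with each data set, every later comparison is a plain equality check.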

mcdowella
2

One option is to use a directed graph. Set up a graph with the following transitions:

A -> B
B -> C
C -> D
D -> A

All other transitions will put you in an error state. Thus, as long as each member is unique (which is implied by your use of the word set), a candidate of the same length matches if every transition is valid and you end on the same graph node on which you started.

If a value can appear multiple times in your search, you'll need a smarter set of states and transitions.

This approach is useful if you precompute a single search and then match it to many data points. It's not so useful if you have to constantly regenerate the graph. It could also be cache-inefficient if your state table is large.
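One way to sketch the transition table described above is a dictionary mapping each element to its successor. This assumes the elements are unique and hashable; `build_successors` and `follows_cycle` are hypothetical names used only for this illustration.

```python
_MISSING = object()  # sentinel for "no such state", i.e. the error state


def build_successors(reference):
    """Precompute the transition table element -> next element (cyclically)."""
    n = len(reference)
    succ = {reference[i]: reference[(i + 1) % n] for i in range(n)}
    if len(succ) != n:
        raise ValueError("reference elements must be unique")
    return succ


def follows_cycle(succ, candidate):
    """True if candidate walks the whole cycle once and returns to its start."""
    n = len(candidate)
    if n != len(succ):
        return False
    # Every step, including the wrap-around from last to first, must be
    # a transition that exists in the table.
    return all(
        succ.get(candidate[i], _MISSING) == candidate[(i + 1) % n]
        for i in range(n)
    )
```

The table is built once per reference set, so each candidate is then checked in a single O(n) pass.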

paddy
0

Well Dr Zoidberg, if you are interested in order, as you are, then you need to store your data in a structure that preserves order and also allows for easy rotation. In Python a list would do.

Find the smallest element of each list you want to compare, then rotate that list until its smallest element is at the beginning. Note: this is a rotation, not a sort. With all the lists normalised this way, a straightforward list comparison between any two tells you whether they are the same up to rotation, provided the smallest element occurs only once in each list.

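>>> def rotcomp(lst1, lst2):
    # Rotate each list so its smallest element comes first, then compare.
    # Only reliable when the smallest element occurs once in each list.
    if len(lst1) != len(lst2):
        return False
    if not lst1:
        return True
    i, j = lst1.index(min(lst1)), lst2.index(min(lst2))
    return lst1[i:] + lst1[:i] == lst2[j:] + lst2[:j]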

>>> rotcomp(list('ABCD'), list('CDAB'))
True
>>> rotcomp(list('ABCD'), list('CDBA'))
False
>>> 
>>> rotcomp(list('AABC'), list('ABCA'))
False
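>>> def rotcomp2(lst1, lst2):
    # The lengths must match, and lst1 must occur inside lst2 + lst2.
    # This compares reprs as text, so it can be fooled when one element's
    # repr is a substring of another's (e.g. 1 and 11).
    return len(lst1) == len(lst2) and repr(lst1)[1:-1] in repr(lst2 + lst2)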

>>> rotcomp2(list('ABCD'), list('CDAB'))
True
>>> rotcomp2(list('ABCD'), list('CDBA'))
False
>>> rotcomp2(list('AABC'), list('ABCA'))
True
>>> 

NEW SECTION: WITH DUPLICATES?

If the input may contain duplicates, then (borrowing from the possible duplicate mentioned under the question) one algorithm is to check whether one list is a sub-list of the other list repeated twice.

The function rotcomp2 uses that algorithm, with a textual comparison of the repr of the list contents.
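For elements whose repr is not a reliable stand-in, the same "repeated twice" idea can be done element by element. The following is a minimal sketch, not a tuned implementation: it assumes only that elements support `==`, uses a Knuth-Morris-Pratt style search to stay linear, and the name `is_rotation` is made up here. It also requires the two lists to have equal length, since a shorter list could otherwise match as a fragment.

```python
def is_rotation(a, b):
    """Return True if list a is a rotation of list b (duplicates allowed)."""
    n = len(a)
    if n != len(b):
        return False
    if n == 0:
        return True

    # KMP failure table for the pattern a: fail[i] is the length of the
    # longest proper prefix of a[:i + 1] that is also a suffix of it.
    fail = [0] * n
    k = 0
    for i in range(1, n):
        while k and a[i] != a[k]:
            k = fail[k - 1]
        if a[i] == a[k]:
            k += 1
        fail[i] = k

    # Scan b + b without building it, by indexing modulo n. The final
    # element of b + b is never needed, hence 2 * n - 1 steps.
    k = 0
    for i in range(2 * n - 1):
        x = b[i % n]
        while k and x != a[k]:
            k = fail[k - 1]
        if x == a[k]:
            k += 1
        if k == n:
            return True
    return False
```

Because it compares elements directly, this version does not confuse `[1]` with `[11]` the way a substring test on reprs can.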

Paddy3118