1

Surely there is something out there but googling doesn't give me what I'm looking for. Perhaps it's because I don't know the name of the algorithm to lookout for?

Basically, I have two lists with the same content and size.

List1: {10, 30, 2, 4, 4}
List2: {4, 4, 10, 30, 2}

Notice that the sequence is the same for both lists. ie: List2 can be seen as a starting from the previous to last position in List1 and continuing to iterate from the start of List1 until back at the starting position.

List1: {10, 30, 2, 4, 4} 10, 30, 2
                   |  |  |   |   |
List2:            {4, 4, 10, 30, 2}

The two lists are then considered equivalent.

The following two lists are not:

List1: {10, 30, 2, 4, 3} 10, 30, 2
                   |  |  X   X   |
List2:            {4, 4, 30, 10, 2}

Update

What I am doing right now is having List1 concatenated to itself and searching List2 inside of it.

I feel this is innefficient.

Suppose I want to iterate each list once?

Update

Ok, in the end I went with the algo described at: Check if a string is rotation of another WITHOUT concatenating and adapted it to my data types.

Community
  • 1
  • 1
Stécy
  • 11,951
  • 16
  • 64
  • 89
  • You could do the general n^2 algorithm (check every element between the lists) – But I'm Not A Wrapper Class Sep 04 '14 at 18:42
  • Yes I could do that but I was looking for the algo name and/or existing implementation. – Stécy Sep 04 '14 at 18:43
  • This is called a rotation and there is a fairly simple algorithm for detecting it in strings - see [this answer](http://stackoverflow.com/a/2553533/152602) for example. Finding it for general sequences would require detecting whether one sequence is a subsequence of another and I don't think there's a built-in linq operator for doing so. – Lee Sep 04 '14 at 18:45
  • 1
    [This](http://stackoverflow.com/questions/6105087/check-if-two-arrays-are-cyclic-permutations) should help. – But I'm Not A Wrapper Class Sep 04 '14 at 18:46
  • @Lee That won't probably work. The problem is with items that are in both lists - they will fail the comparison on concatenation. – Eugene Podskal Sep 04 '14 at 18:48
  • @Stécy Try at first to do it in a bruteforce way. If it will be too slow, then you can always try to optimize it adapting some clever substring search algorithm. – Eugene Podskal Sep 04 '14 at 18:50
  • Just curious, how long are these lists? If they're not very long efficiency is almost meaningless. – RenniePet Sep 04 '14 at 18:53
  • Lists can be several thousands of items... – Stécy Sep 04 '14 at 18:53
  • 2
    You can use an adaptation of the KMP string matching algorithm to match the elements in the list. – Mike Dinescu Sep 04 '14 at 18:59
  • The technique linked to by Lee could be optimized hugely if it was implemented in machine code. Turn the two lists into memory areas A and BB, and search for A in BB by using the machine instruction that compares a specified number of bytes. – RenniePet Sep 04 '14 at 18:59
  • Something similar was covered here on SO yesterday (I think it was): You need to find the head of each list, and you can do that by splitting each array in half. If the low element in one half is less than the high element, that part is sorted, so look at the other half. Repeat recursively until you've found the head. – 500 - Internal Server Error Sep 04 '14 at 19:20
  • The terminology is, like in the thread @CyberneticTwerkGuruOrc linked to, that you want to test if one list is a _cyclic permutation_ of the other list. – Jeppe Stig Nielsen Sep 04 '14 at 19:58
  • You could get faster if you know what type of your input data actually is. Perhaps you get much better results with a probabilistic compare where you check the first and last number of a potential permutation. If the data is structured this way you have a fast way to exit early. Or you could compare 16 integers in a row and skip then 16 integers and so on. That would be the length of a cache line (64 bytes on modern CPUs) which should give you pretty good performance when walking through the array. – Alois Kraus Sep 04 '14 at 21:18
  • If you have anything above 10 thousand items, a O(n^2) algorithm is a poor choice. – Juan Lopes Sep 04 '14 at 22:59
  • possible duplicate of [Check if a string is rotation of another WITHOUT concatenating](http://stackoverflow.com/questions/12028416/check-if-a-string-is-rotation-of-another-without-concatenating) – Peter O. Sep 05 '14 at 04:27

4 Answers4

5

Search for List2 inside List1+List1. You can use KMP to do that in linear time.

UPDATE: optimized the code to be at least as fast as other solutions in the best case and amazingly faster in the worst-case

Some code. The code uses a modified version of KMP, that receives List1, but actually considers as it was already doubled (for performance).

using System;
using System.Collections.Generic;
using System.Linq;

namespace Test
{
    public class CyclicKMP<T> {
        private readonly IList<T> P;
        private readonly int[] F;
        private readonly EqualityComparer<T> comparer;

        public CyclicKMP(IList<T> P) {
            this.comparer = EqualityComparer<T>.Default;
            this.P = P;
            this.F = new int[P.Count+1];

            F[0] = 0;  F[1] = 0;  
            int i = 1, j = 0;
            while(i<P.Count) {
                if (comparer.Equals(P[i], P[j]))
                    F[++i] = ++j;
                else if (j == 0)
                    F[++i] = 0;
                else
                    j = F[j];
            }
        }

        public int FindAt(IList<T> T, int start=0) {
            int i = start, j = 0;
            int n = T.Count, m = P.Count;

            while(i-j <= 2*n-m) {
                while(j < m) {
                    if (comparer.Equals(P[j], T[i%n])) {
                        i++; j++;
                    } else break;
                }
                if (j == m) return i-m;
                else if (j == 0) i++;
                j = F[j];
            }
            return -1;
        }
    }

    class MainClass
    {
        public static bool Check<T>(IList<T> list1, IList<T> list2) {
            if (list1.Count != list2.Count)
                return false;
            return new CyclicKMP<T> (list2).FindAt (list1) != -1;
        }

        public static void Main (string[] args)
        {
            Console.WriteLine (Check(new[]{10, 30, 2, 4, 4}, new[]{4, 4, 10, 30, 2}));
            Console.WriteLine (Check(new[]{10, 30, 2, 4, 3}, new[]{4, 4, 10, 30, 2}));
        }
    }
}
Juan Lopes
  • 10,143
  • 2
  • 25
  • 44
  • Small aside, this would also require testing for cardinality. Otherwise if L1 = "1, 2, 3" and L2 = "2", just performing the search would incorrectly return true. – Justin R. Sep 04 '14 at 20:00
0

So to start with we need a method to shift a list some number of times, wrapping the items while doing so. This is actually fairly straightforward to do:

public static IEnumerable<T> Shift<T>(this IEnumerable<T> source, int number)
{
    return source.Skip(number)
        .Concat(source.Take(number));
}

Here we're able to iterate over a sequence with a given shift.

With this all we need to do is see if one of the lists, shifted to any number of times up to its size, is equal to the other.

public static bool EqualWithShifting<T>(this IList<T> first, IList<T> second)
{
    return first.Count == second.Count &&
        Enumerable.Range(0, first.Count-1)
        .Any(shift => first.SequenceEqual(second.Shift(shift)));
}

Note that while this algorithm does have n^2 complexity in the worst case, due to the short circuiting of both SequenceEqual and Any this won't actually do nearly that much work for pretty much any real data.

Servy
  • 202,030
  • 26
  • 332
  • 449
  • Interesting but Shift is enumerating the source list twice. – Stécy Sep 04 '14 at 18:56
  • @Stécy Well, sometimes. Only if it gets through the entirety of the un-skipped items. But it can, yes, which is why I typed the parameter of the equality check as an `IList`, to ensure that it's only ever a materialized list that shouldn't have any problems being iterated multiple times. – Servy Sep 04 '14 at 19:20
0

How about:

void Main()
{

bool same = true;

var list1 = new List<int> {10, 30, 2, 4, 4};
var list2 = new List<int> {4, 4, 10, 30, 2};

var start = -1;

for(int i=0; i<list1.Count();i++)
{
 if(list1[0] == list2[i])
 { 
  start = i;
  break;
 }
}

start.Dump("Start");

if(-1 == start)
{ 
 same = false;
}
else
{
int x = 0;
int y = x + start;

int scanned = 0;

while(scanned < list2.Count)
{
 scanned++; 

 if(list1[x] == list2[y])
 {
    x++;
    y++;     if(y >= list2.Count)  {  y = 0; }
 }
 else
 {
   same = false;
   break;
 }
}
}

same.Dump("same");
}

// Define other methods and classes here
Ismail Hawayel
  • 2,167
  • 18
  • 16
0

Update

The more efficient algo (so far) when having a lot of unequal lists is the one from Juan.

I updated the answer.


Here's a LinqPad code to show the perf comparison between several algorithm.

It uses LinqPad extensions from @GitHub : zaus / LinqPad.MyExtensions.cs

void Main()
{
    var first = "abcabcdefabcabcabcabcdefabcabc".ToList();
    var second = "abcdefabcabcabcabcdefabcabcabc".ToList();

    var p = new Perf<bool>
    {
        { "AreEqualByRotation", n => AreEqualByRotation (first, second) },
        { "IsEqualWithShifting", n => IsEqualWithShifting (first, second) },
        { "EqualWithShifting", n => EqualWithShifting (first, second) },
        { "CheckIt", n => CheckIt (first, second) },
    }.Vs("Shifting", 100000);
}

// Define other methods and classes here
bool AreEqualByRotation<T> (List<T> s1, List<T> s2)
{
    if (s1.Count != s2.Count)
        return false;

    for (int i=0; i<s1.Count; ++i)
    {
        bool res = true;
        int index = i;
        for (int j=0; j<s1.Count; ++j)
        {
            if (!s1[j].Equals(s2[index]))
            {
                res = false;
                break;
            }
            index = (index+1)%s1.Count;
        }
        
        if (res == true)
            return true;
    }
    return false;
}

bool IsEqualWithShifting<T> (List<T> list1, List<T> list2)
{
    if (list1.Count != list2.Count) 
        return false;

    for (var i = 0; i < list1.Count; i++)
    {
        for (var j = 0; j < list2.Count; j++)
        {
            int countFound = 0;

            while (list1[(i + countFound) % list1.Count].Equals(list2[(j + countFound) % list2.Count]) && countFound < list1.Count)
                countFound++;

            if (countFound == list1.Count)
                return true;
        }
    }

    return false;
}

public static IEnumerable<T> Shift<T>(IEnumerable<T> source, int number)
{
    return source.Skip(number)
        .Concat(source.Take(number));
}

public static bool EqualWithShifting<T>(IList<T> first, IList<T> second)
{
    return first.Count == second.Count &&
        Enumerable.Range(0, first.Count-1)
        .Any(shift => first.SequenceEqual(Shift(second, shift)));
}

public class KMP<T> {
   private readonly IList<T> P;
   private readonly int[] F;

   public KMP(IList<T> P) {
       this.P = P;
       this.F = new int[P.Count+1];

       F[0] = 0;  F[1] = 0;  
       int i = 1, j = 0;
       while(i<P.Count) {
           if (Object.Equals(P[i], P[j]))
               F[++i] = ++j;
           else if (j == 0)
               F[++i] = 0;
           else
               j = F[j];
       }
   }

   public int FindAt(IList<T> T, int start=0) {
       int i = start, j = 0;
       int n = T.Count, m = P.Count;

       while(i-j <= n-m) {
           while(j < m) {
               if (Object.Equals(P[j], T[i])) {
                   i++; j++;
               } else break;
           }
           if (j == m) return i-m;
           else if (j == 0) i++;
           j = F[j];
       }
       return -1;
   }
}

public static bool CheckIt<T> (IList<T> list1, IList<T> list2)
{
    return Check (list1, list2) != -1;
}

public static int Check<T>(IList<T> list1, IList<T> list2)
{
    if (list1.Count != list2.Count)
        return -1;
    return new KMP<T> (list2).FindAt (new List<T>(Enumerable.Concat(list1, list1)));
}

Result is :

|      Algorithm      |              Time spent             | Result |  
|:-------------------:|:-----------------------------------:|--------|  
| AreEqualByRotation  | 1262852 ticks elapsed (126.2852 ms) | True   |  
| IsEqualWithShifting | 1508527 ticks elapsed (150.8527 ms) | True   |  
| EqualWithShifting   | 4691774 ticks elapsed (469.1774 ms) | True   |  
| CheckIt             | 4079400 ticks elapsed (407.94 ms)   | True   |  

winner: AreEqualByRotation

Update

Juan Lopes remarked that his solution is faster when the strings are not equal.

Here's with this input and the results for 1000 iterations:

var first = new string('a', 200).ToList();
var second = (new string('a', 199)+'b').ToList();

|      Algorithm      |              Time spent                | Result |  
|:-------------------:|:--------------------------------------:|--------|  
| AreEqualByRotation  | 187613414 ticks elapsed (18761.3414 ms)| False  |  
| IsEqualWithShifting | Too long !!!                           | False  |  
| EqualWithShifting   | 766858561 ticks elapsed (76685.8561 ms)| False  |  
| CheckIt             | 28222332 ticks elapsed (2822.2332 ms)  | False  |  

winner: CheckIt

Community
  • 1
  • 1
Stécy
  • 11,951
  • 16
  • 64
  • 89
  • You said your lists can be several thousand items long. With an O(n^2) algorithm you may have a hard time. Try: `var a = new int[100000]; var b = new int[100000]; a[a.Length-1] = 1; Console.WriteLine (AreEqualByRotation(a, b));`. I keep my point. KMP is still the best option. – Juan Lopes Sep 04 '14 at 22:57
  • I compared several algo including yours and my timings indicate yours comes 3rd out of 4 algo! – Stécy Sep 05 '14 at 12:31
  • You said your lists may contain "several thousands of items". Your algorithm performs poorly in worst-case as the lists grow. Your benchmark proves nothing and is totally missing the point: the complexity of all algorithms here is O(n^2), while mine has a O(n) complexity. – Juan Lopes Sep 05 '14 at 16:38
  • Using a simple 200 char string may reveal surprising results: `var first = new string('a', 200).ToList(); var second = (new string('a', 199)+"b").ToList();` – Juan Lopes Sep 05 '14 at 17:04
  • Holy cow! Your method is really faster on the unhappy path! I will update this answer and mark yours as the one to use. It is sad that we can't have the best of both worlds (fast for the happy and unhappy paths). – Stécy Sep 05 '14 at 17:13
  • Updated my code to avoid creating the doubled List1 and unnecessary boxing of primitive types. It seems to be running as fast as the other solutions in the original test. – Juan Lopes Sep 05 '14 at 17:30