3

I have a List<string> like this:

List<string> MyList = new List<string>
{ 
    "A-B", 
    "B-A", 
    "C-D", 
    "C-E", 
    "D-C",
    "D-E",
    "E-C",
    "E-D",
    "F-G",
    "G-F"
};

I need to remove duplicates from the list, i.e. if both "A-B" and "B-A" exist, I need to keep only "A-B" (the first entry).

So the result should be:

"A-B"   
"C-D"
"C-E"   
"D-E"
"F-G"

Is there any way to do this using LINQ?

abatishchev
Thorin Oakenshield

6 Answers

14

Implement an IEqualityComparer<string> whose Equals("A-B", "B-A") returns true, and pass it to the Enumerable.Distinct method.
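
A minimal sketch of such a comparer, assuming every entry has the form "X-Y" with '-' as the separator. The class name UnorderedPairComparer and the Normalize helper are hypothetical names chosen for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: treats "A-B" and "B-A" as equal.
// Assumes each entry is "X-Y" with '-' as the separator.
sealed class UnorderedPairComparer : IEqualityComparer<string>
{
    // Canonical form: the parts in ordinal order, e.g. "B-A" -> "A-B".
    private static string Normalize(string s)
    {
        if (s == null) return null;
        string[] parts = s.Split('-');
        Array.Sort(parts, StringComparer.Ordinal);
        return string.Join("-", parts);
    }

    public bool Equals(string x, string y)
    {
        return string.Equals(Normalize(x), Normalize(y), StringComparison.Ordinal);
    }

    // Must agree with Equals: equal entries share a canonical form,
    // so hashing the canonical form gives them the same hash code.
    public int GetHashCode(string s)
    {
        string key = Normalize(s);
        return key == null ? 0 : StringComparer.Ordinal.GetHashCode(key);
    }
}

static class Program
{
    static void Main()
    {
        var myList = new List<string>
        {
            "A-B", "B-A", "C-D", "C-E", "D-C", "D-E", "E-C", "E-D", "F-G", "G-F"
        };

        foreach (string s in myList.Distinct(new UnorderedPairComparer()))
            Console.WriteLine(s);
    }
}
```

With the question's list this leaves five entries. In practice Distinct yields the first occurrence of each pair in source order, although the documentation describes the result as unordered.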

gandjustas
  • Example of an IEqualityComparer implementation will bring you an answer approval ;) – abatishchev Sep 21 '10 at 06:47
  • @abatischev: I don't know - giving an IEqualityComparer implementation would feel like doing someone else's homework... – Niki Sep 21 '10 at 07:12
12

This returns the sequence you are looking for:

var result = MyList
    .Select(s => s.Split('-').OrderBy(s1 => s1))
    .Select(a => string.Join("-", a.ToArray()))
    .Distinct();

foreach (var str in result)
{
    Console.WriteLine(str);
}

In short: split each string on the - character into two-element arrays. Sort each array, and join them back together. Then you can simply use Distinct to get the unique values.

Update: on reflection, one of the Select calls can easily be removed:

var result = MyList
    .Select(s => string.Join("-", s.Split('-').OrderBy(s1 => s1).ToArray()))
    .Distinct();

Disclaimer: this solution will always keep the value "A-B" over "B-A", regardless of the order in which they appear in the original sequence.
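
If the first entry as originally written has to be kept instead, one possible variation is to group by the normalized key and take the first element of each group. This is only a sketch under the same "X-Y" assumption; the PairKey helper is a hypothetical name, not part of the code above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{
    // Hypothetical helper: order-independent key, e.g. "B-A" -> "A-B".
    static string PairKey(string s)
    {
        return string.Join("-", s.Split('-').OrderBy(p => p, StringComparer.Ordinal).ToArray());
    }

    static void Main()
    {
        var myList = new List<string> { "B-A", "A-B", "C-D", "D-C" };

        // Group by the normalized key, then keep the first original
        // entry of each group rather than the normalized string.
        var result = myList
            .GroupBy(s => PairKey(s))
            .Select(g => g.First());

        foreach (var s in result)
            Console.WriteLine(s);   // prints "B-A", then "C-D"
    }
}
```

GroupBy yields groups in the order in which each key first appears, and the elements inside a group keep their source order, so g.First() is the earliest spelling of each pair.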

Fredrik Mörk
  • Downvoters, please leave a comment so any errors can be corrected. – Fredrik Mörk Sep 21 '10 at 07:03
  • +1, minor quibble though. The `Distinct` method is defined as returning an unordered collection, so to be 100% correct you'd need to sort the result to get the order specified by the OP. Then again, it's implemented as an ordered collection, so it's at best a nitpick. – JaredPar Sep 21 '10 at 07:16
4

You can use the Enumerable.Distinct(IEnumerable<TSource>, IEqualityComparer<TSource>) overload.

Now you just need to implement IEqualityComparer. Here's something for you to get started:

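class Comparer : IEqualityComparer<String>
{
    public bool Equals(String s1, String s2)
    {
        // Still needs null checks.
        // Reverse() is not defined here; see the linked question.
        return s1 == s2 || Reverse(s1).Equals(s2);
    }

    public int GetHashCode(String s)
    {
        // Still to be implemented; it must return the same value
        // for any two strings that Equals treats as equal.
        throw new NotImplementedException();
    }
}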

For a Reverse() implementation, see this question

Community
NullUserException
  • This assumes that the bits before and after the `-` are always single characters (or that you want palindromic equality). The asker has indicated that this assumption does not hold. If you fix the answer, I’ll upvote. (While you’re at it, in the interest of good education, please give it a more descriptive name than simply “Comparer”.) – Timwi Sep 21 '10 at 07:05
1

Very basic, and it could be written better, but it works:

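class Comparer : IEqualityComparer<string>
{
    // Assumes every entry is exactly three characters, e.g. "A-B".
    public bool Equals(string x, string y)
    {
        return (x[0] == y[0] && x[2] == y[2]) || (x[0] == y[2] && x[2] == y[0]);
    }

    // A constant hash code puts every entry in one bucket, so Distinct
    // falls back to calling Equals for each comparison.
    public int GetHashCode(string obj)
    {
        return 0;
    }
}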

var MyList = new List<String>
{ 
    "A-B", 
    "B-A", 
    "C-D", 
    "C-E", 
    "D-C",
    "D-E",
    "E-C",
    "E-D",
    "F-G",
    "G-F"
}
.Distinct(new Comparer());

foreach (var s in MyList)
{
    Console.WriteLine(s);
}
  • There's a subtle bug: if you run the debugger, you should notice that E-C equals E-D using your code, which is incorrect... – code4life Sep 21 '10 at 07:10
1

You need to implement the IEqualityComparer like this:

public class CharComparer : IEqualityComparer<string>
{
    #region IEqualityComparer<string> Members

    public bool Equals(string x, string y)
    {
        if (x == y)
            return true;

        // Treat "X-Y" and "Y-X" as equal; only valid for three-character entries.
        if (x != null && y != null && x.Length == 3 && y.Length == 3)
        {
            if (x[0] == y[2] && x[2] == y[0])
                return true;
        }

        return false;
    }

    public int GetHashCode(string obj)
    {
        // Return a constant so that all entries share one hash bucket and Equals is always called.
        return 0;
    }

    #endregion
}
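
Returning a constant hash code is correct but makes Distinct compare each entry against every entry it has already kept. As a hedged alternative, the sketch below uses a hash that is symmetric in the first and last characters, so it stays consistent with Equals. SymmetricCharComparer is a hypothetical name, and the three-character assumption from the comparer above still applies:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of the comparer above with a non-constant hash.
// Assumes three-character entries such as "A-B".
public class SymmetricCharComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        if (x == y)
            return true;
        if (x == null || y == null || x.Length != 3 || y.Length != 3)
            return false;
        return x[0] == y[2] && x[2] == y[0];
    }

    public int GetHashCode(string obj)
    {
        if (obj == null)
            return 0;
        // XOR is symmetric, so "A-B" and "B-A" get the same hash code,
        // which keeps GetHashCode consistent with Equals.
        if (obj.Length == 3)
            return obj[0] ^ obj[2];
        return obj.GetHashCode();
    }
}

public static class SymmetricCharComparerDemo
{
    public static void Main()
    {
        var list = new List<string> { "A-B", "B-A", "C-D", "D-C" };
        foreach (string s in list.Distinct(new SymmetricCharComparer()))
            Console.WriteLine(s);
    }
}
```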

The sample program:

class Program
{
    static void Main(string[] args)
    {
        List<string> MyList = new List<string>
        { 
            "A-B", 
            "B-A", 
            "C-D", 
            "C-E", 
            "D-C",
            "D-E",
            "E-C",
            "E-D",
            "F-G",
            "G-F"
        };

        var distinct = MyList.Distinct(new CharComparer());
        foreach (string s in distinct)
            Console.WriteLine(s);

        Console.ReadLine();
    }
}

The result:

A-B
C-D
C-E
D-E
F-G
code4life
-2
// In-place alternative without LINQ: for each entry, remove every later
// entry that is identical to it or is its reversed pair.
int checkIndex = 0;
while (checkIndex < MyList.Count)
{
    string item = MyList[checkIndex];
    string[] parts = item.Split('-');
    string inverted = parts[1] + "-" + parts[0];

    int i = checkIndex + 1;
    while (i < MyList.Count)
    {
        if (MyList[i] == item || MyList[i] == inverted)
        {
            // Do not advance i: the next element has shifted into this slot.
            MyList.RemoveAt(i);
            continue;
        }
        i++;
    }

    checkIndex++;
}
Minh Nguyen