I have a 2D list (a nested list) named "fnc". I want to remove the duplicate inner lists and display the result in a rich text box using LINQ. I tried "GroupBy", but it does not work. How can I do this? Thanks for the help.

var grptest = fnc.GroupBy(x=>x);
foreach (var sublist in grptest)
{
    foreach (var value in sublist)
    {
        ResultRTB.AppendText(value.ToString());
    }
}

The list contains exactly these values:

fnc = ((C1,C2,C3),(C2),(C1,C3),(C1,C3),(C1,C2,C3),(C3)) 

After removing the duplicates, the result should be:

fnc = ((C1,C2,C3),(C2),(C1,C3),(C3)) 
acarlon
Mustyby2
  • Can you explain why you chose `GroupBy` and not, perhaps, `Distinct`? What do you expect `GroupBy` to do? – Magus Jan 20 '14 at 20:54
  • Do you want to just remove items duplicated within their inner array, and ignore any duplicates in another group? If you want all items from all groups to be distinct, how do you plan to determine which duplicate should be kept? – Servy Jan 20 '14 at 20:55
  • How are you defining a duplicate list in your outer list vs. a duplicate in your inner list? Suppose you have a list of int lists `var myList = new List<List<int>>()`. Are `myList[0]` and `myList[1]` duplicates if they have the same data or only if they share the same reference? You need to be clearer about what `fnc` is and what your requirements for identifying a duplicate are before this question can be answered. – akousmata Jan 20 '14 at 20:57
  • I have exactly these values inside fnc = ((C1,C2,C3),(C2),(C1,C3),(C1,C3),(C1,C2,C3),(C3)) I want to remove duplicates result will be fnc = ((C1,C2,C3),(C2),(C1,C3),(C3)) – Mustyby2 Jan 20 '14 at 20:59
  • Did you try `Distinct`? It may not work in your case, but it probably will. – Magus Jan 20 '14 at 21:01
  • No I didn't. How can I do this? Like 1D lists? Nested lists are confusing. – Mustyby2 Jan 20 '14 at 21:07
  • Exactly the same way you use `GroupBy`. It's another LINQ extension method. Just call `.Distinct()`, and if that doesn't work, try using some lambda to specify 'all elements are the same'. – Magus Jan 20 '14 at 21:12

2 Answers

The default GroupBy() and Distinct() won't do what you want, because you need to compare whole lists rather than the individual values in them. List<T> is a reference type and does not override Equals(), so the default comparer treats two lists as equal only when they are the same object. Two separate lists with identical elements therefore both survive Distinct().

If you want (as you say in your comments):

fnc = ((C1,C2,C3),(C2),(C1,C3),(C1,C3),(C1,C2,C3),(C3)) I want to remove duplicates result will be fnc = ((C1,C2,C3),(C2),(C1,C3),(C3))

Then you can use Distinct(), but pass it your own equality comparer: a class that implements IEqualityComparer<List<int>>. It needs an Equals() method (to say when the values in both lists are the same) and a GetHashCode() method that returns the same hash for any two lists that Equals() treats as equal. Code for GetHashCode() taken from here. If you want to know more about why you need Equals and GetHashCode then see here.

The output of the following code is this:

Fill
( 1 2 )
( 1 )
( 1 2 )
( 1 )
( 1 2 )
De-dupe
( 1 2 )
( 1 )

Code:

using System;
using System.Collections.Generic;
using System.Linq;

static class Program
{

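    class ListEqualityComparer : IEqualityComparer<List<int>>
    {
        // Two lists are equal when they hold the same values in the same order.
        // This has to agree with GetHashCode() below, which is order-dependent.
        public bool Equals( List<int> x, List<int> y )
        {
            if( ReferenceEquals( x, y ) ) return true;
            if( x == null || y == null ) return false;
            return x.SequenceEqual( y );
        }

        public int GetHashCode( List<int> list )
        {
            if( list == null ) return 0;
            unchecked // wrap on overflow even if the project enables checked arithmetic
            {
                int hash = 19;
                foreach( int val in list )
                {
                    hash = hash * 31 + val.GetHashCode();
                }
                return hash;
            }
        }
    }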
    //Fill is just a method to fill the original lists for testing
    static void Fill( List<List<int>> toFill )
    {
        Console.WriteLine( "Fill" );
        for( int i = 1; i <= 5; i++ )
        {
            Console.Write( "( " );
            List<int> newList = new List<int>();
            toFill.Add(newList);
            for( int j = 1; j <= i%2 + 1; j++ )
            {
                newList.Add( j );
                Console.Write( j + " " );                    
            }
            Console.WriteLine( ")" );
        }
    }
    static void Main( string[] args )
    {
        List<List<int>> fnc = new List<List<int>>();
        Fill( fnc );
        var results = fnc.Distinct( new ListEqualityComparer() );
        Console.WriteLine( "De-dupe" );
        foreach( var list in results )
        {
            Console.Write( "( " );
            foreach( var element in list )
            {
                Console.Write( element + " " );
            }
            Console.WriteLine( ")" );
        }
    }
}
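
The same idea can be sketched for the string data from the question. This is only an illustration under assumptions: `SequenceComparer<T>`, `DedupeDemo` and `Format` are made-up names, the values C1, C2, C3 are assumed to be strings, and two lists are treated as duplicates only when they hold the same elements in the same order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: compares two lists element by element, in order.
class SequenceComparer<T> : IEqualityComparer<List<T>>
{
    public bool Equals(List<T> x, List<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    // Order-dependent hash, so it agrees with the order-sensitive Equals above.
    public int GetHashCode(List<T> list)
    {
        if (list == null) return 0;
        unchecked
        {
            int hash = 19;
            foreach (T item in list)
            {
                hash = hash * 31 + (item == null ? 0 : item.GetHashCode());
            }
            return hash;
        }
    }
}

static class DedupeDemo
{
    // Formats a nested list in the notation used in the question.
    public static string Format(IEnumerable<List<string>> lists)
    {
        return "(" + string.Join(",", lists.Select(l => "(" + string.Join(",", l) + ")")) + ")";
    }

    static void Main()
    {
        var fnc = new List<List<string>>
        {
            new List<string> { "C1", "C2", "C3" },
            new List<string> { "C2" },
            new List<string> { "C1", "C3" },
            new List<string> { "C1", "C3" },
            new List<string> { "C1", "C2", "C3" },
            new List<string> { "C3" }
        };

        List<List<string>> distinct = fnc.Distinct(new SequenceComparer<string>()).ToList();

        // In the WinForms app this string could be passed to ResultRTB.AppendText.
        Console.WriteLine(Format(distinct)); // ((C1,C2,C3),(C2),(C1,C3),(C3))
    }
}
```

Current .NET implementations of Distinct() yield elements in first-seen order, which is why the output matches the order asked for in the question. If (C1,C3) and (C3,C1) should also count as duplicates, both Equals and GetHashCode would have to become order-insensitive together.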
acarlon

Based on your comment, I will assume that you are using strings, since you didn't specify otherwise. Lists are reference types, so by default two lists are compared by reference rather than by content. So in the following example,

var fnc = new List<List<string>>
    {
        new List<string>{"a", "b", "c"},
        new List<string>{"a", "b", "c"}
    };

var groupedFnc = fnc.GroupBy(x => x);
var distinctFnc = fnc.Distinct();

the calls to GroupBy and Distinct will not remove anything, because each inner list is a separate object and its reference is already unique. What you would need to do in that situation is write a custom function to pass to GroupBy that produces the same key for inner lists with the same contents, so that duplicates land in the same group.
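
One way to sketch such a key function is to join the elements of each list into a single string and group on that. The names `GroupByKeyDemo`, `KeyOf` and `DistinctByContent` are hypothetical, and the sketch assumes that no element contains the separator `|` (an arbitrary choice) and that element order matters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupByKeyDemo
{
    // Hypothetical helper: builds a key that is equal for lists with the same
    // elements in the same order. Assumes no element contains '|'.
    public static string KeyOf(List<string> list)
    {
        return string.Join("|", list);
    }

    // Groups the inner lists by their content key and keeps the first list of each group.
    public static List<List<string>> DistinctByContent(List<List<string>> lists)
    {
        return lists.GroupBy(l => KeyOf(l))
                    .Select(g => g.First())
                    .ToList();
    }

    static void Main()
    {
        var fnc = new List<List<string>>
        {
            new List<string> { "a", "b", "c" },
            new List<string> { "a", "b", "c" },
            new List<string> { "a", "c" }
        };

        foreach (List<string> list in DistinctByContent(fnc))
        {
            Console.WriteLine("(" + string.Join(",", list) + ")");
        }
    }
}
```

A string key is quick to write but fragile: it only works while the separator cannot appear inside an element, which is one reason a dedicated comparer is usually the safer choice.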

Another alternative is to write a custom comparer class that implements IEqualityComparer<T> with your own Equals and GetHashCode, and pass it to GroupBy. The lists are then compared by their contents instead of by reference, so duplicates are identified. More information on how to do that here:

How to group by on a reference type property in linq?
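
As a rough sketch of the comparer approach with GroupBy, the example below groups the inner lists by content and reports how often each one occurs. `StringListComparer`, `GroupByComparerDemo` and `CountGroups` are illustrative names rather than part of any library, and lists count as equal only when their elements match in order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical comparer: content-based, order-sensitive equality for string lists.
class StringListComparer : IEqualityComparer<List<string>>
{
    public bool Equals(List<string> x, List<string> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(List<string> list)
    {
        if (list == null) return 0;
        unchecked
        {
            int hash = 19;
            foreach (string s in list)
            {
                hash = hash * 31 + (s == null ? 0 : s.GetHashCode());
            }
            return hash;
        }
    }
}

static class GroupByComparerDemo
{
    // Returns each distinct inner list together with its number of occurrences.
    public static List<Tuple<List<string>, int>> CountGroups(List<List<string>> lists)
    {
        return lists.GroupBy(l => l, new StringListComparer())
                    .Select(g => Tuple.Create(g.Key, g.Count()))
                    .ToList();
    }

    static void Main()
    {
        var fnc = new List<List<string>>
        {
            new List<string> { "C1", "C2", "C3" },
            new List<string> { "C2" },
            new List<string> { "C1", "C3" },
            new List<string> { "C1", "C3" },
            new List<string> { "C1", "C2", "C3" },
            new List<string> { "C3" }
        };

        foreach (Tuple<List<string>, int> group in CountGroups(fnc))
        {
            Console.WriteLine("(" + string.Join(",", group.Item1) + ") x" + group.Item2);
        }
    }
}
```

Taking only the key of each group gives the de-duplicated list; the counts are an extra that Distinct() alone would not provide.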

akousmata