So I'm nearly done making a reading game for children when I see that I sometimes have duplicate words. So I think to myself, let's do a Distinct. But it's a bit more complex.

So I understand how to use Distinct on a list of integers or strings and how to use DistinctBy() on a property within a list.

But what I can't figure out is how to do this on a list of strings that is itself a property of each object in a list of objects.

Plan B is a whole lot of foreach loops, but I'm looking for something more elegant.

So this is the structure:

// The data is a List<MyObject>, where each MyObject looks like this:
class MyObject
{
    OtherObject BsObject1;
    OtherObject BsObject2;
    List<String> Strings; // the words
}

So the data looks like this:

MyObject1: List<String>: "word1", "word2"
MyObject2: List<String>: "word3", "word4"
MyObject3: List<String>: "word1", "word5"
MyObject4: List<String>: "word5", "word6"

As soon as one word of an object also appears in any of the other lists, that object can be deleted.

Any ideas?

  • "As soon as 1 word is in within any of the other list that word that object can be deleted." Do you really mean the whole object can be deleted? What if you had `MyObject5: List: "word6", "word7"`? `MyObject4` already has a "word6", but "word7" doesn't exist anywhere else. Which object do you delete in that case? `MyObject4` and lose "word5" along with the delete, or `MyObject5` and lose "word7"? If you just meant the word can be deleted, why not just `.Concat()` all the words and then `.Distinct()`? – itsme86 Nov 30 '16 at 17:51
  • Note that, given your definition, the output is undefined. Consider this data set `[ [A, B], [B, C], [C, D] ]`. The first two objects share an item (B), and the second and third share an item, (C), but the first and third don't. If you compare the first two before you compare the second two, you end up with `[ [A, B], [C, D] ]`, if you compare the second and third first you end up with `[ [A, B], [B, C] ]`, and if you do a second pass, then you end up with just `[ [A, B] ]`. Now the order that you compare items changes the result. – Servy Nov 30 '16 at 18:10
  • I would recommend keeping track of the used words in a `HashSet` and then add object to the list (and its words to the used words list) only if none of its words are in the used words list. – Slai Nov 30 '16 at 18:46
  • *itsme86 In that case MyObject5 would be deleted. If I concat and then distinct, I lose the structure that I need, and that's not an option. - *Servy Didn't even think of that. I guess the whole problem is flawed and I will have to fix it before I add all the data, like Slai suggested. - *Slai That would indeed be a good solution. Thanks – SpittingLlama Dec 01 '16 at 08:38
  • You can use @userNameWithoutSpaces in comments for the person to be notified (or click help below the Add Comment button for examples). If by any chance you pick the words at random from a list of words, you can just shuffle the list (swapping words at random positions of the list) and then take them in order http://stackoverflow.com/questions/273313/randomize-a-listt – Slai Dec 01 '16 at 12:29
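
The `HashSet` idea from the comments can be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the `Name` property, the `KeepFirstSeen` helper, and the `Strings` property name are assumptions made for the example, and the unrelated `BsObject1`/`BsObject2` members are left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's class; only the word list matters here.
class MyObject
{
    public string Name { get; set; }          // assumed, used only for printing
    public List<string> Strings { get; set; } // the words
}

static class Program
{
    // Keeps an object only if none of its words was already used
    // by an object that was kept earlier in the sequence.
    static List<MyObject> KeepFirstSeen(IEnumerable<MyObject> objects)
    {
        var usedWords = new HashSet<string>();
        var kept = new List<MyObject>();

        foreach (var obj in objects)
        {
            if (usedWords.Overlaps(obj.Strings))
                continue; // shares at least one word with a kept object

            usedWords.UnionWith(obj.Strings);
            kept.Add(obj);
        }

        return kept;
    }

    static void Main()
    {
        var objects = new List<MyObject>
        {
            new MyObject { Name = "MyObject1", Strings = new List<string> { "word1", "word2" } },
            new MyObject { Name = "MyObject2", Strings = new List<string> { "word3", "word4" } },
            new MyObject { Name = "MyObject3", Strings = new List<string> { "word1", "word5" } },
            new MyObject { Name = "MyObject4", Strings = new List<string> { "word5", "word6" } },
        };

        foreach (var obj in KeepFirstSeen(objects))
            Console.WriteLine(obj.Name); // prints MyObject1, MyObject2, MyObject4
    }
}
```

Because this is a single greedy pass, the result depends on the order of the input, which is the ambiguity Servy points out: here MyObject4 survives only because MyObject3 was dropped before its "word5" was recorded.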

1 Answer

Based on your question, I assume that in your example the only remaining object would be MyObject2, since word1 exists in object 1 and 3, and word5 exists in object 3 and 4.

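// Requires: using System.Collections.Generic; using System.Linq;
// Populate your data here. Strings is the name of the List<String> property.
List<MyObject> objects = new List<MyObject>();

// Keep an object only if none of its strings occur in any other object.
var distinctObjects = objects.Where(
    mo => !mo.Strings.Intersect(
        objects.Except(new List<MyObject>() { mo }).SelectMany(mobj => mobj.Strings))
    .Any()).ToList();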

This statement checks, for each item, whether any of the strings in its Strings list also appear in the combined Strings of all other items. If that intersection is empty (!Any()), the item remains.
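
To see that query run against the sample data from the question, here is a small self-contained sketch. The `Name` property is an assumption added only so the result can be printed, and the `BsObject1`/`BsObject2` members are omitted because they play no part in the check.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's class.
class MyObject
{
    public string Name { get; set; }          // assumed, used only for printing
    public List<string> Strings { get; set; } // the words
}

static class Program
{
    static void Main()
    {
        var objects = new List<MyObject>
        {
            new MyObject { Name = "MyObject1", Strings = new List<string> { "word1", "word2" } },
            new MyObject { Name = "MyObject2", Strings = new List<string> { "word3", "word4" } },
            new MyObject { Name = "MyObject3", Strings = new List<string> { "word1", "word5" } },
            new MyObject { Name = "MyObject4", Strings = new List<string> { "word5", "word6" } },
        };

        // Keep an object only if none of its words occur in any other object.
        var distinctObjects = objects.Where(
            mo => !mo.Strings.Intersect(
                objects.Except(new List<MyObject>() { mo }).SelectMany(mobj => mobj.Strings))
            .Any()).ToList();

        foreach (var obj in distinctObjects)
            Console.WriteLine(obj.Name); // prints MyObject2
    }
}
```

Note that the query rebuilds the combined word list of all other objects for every object, so the work grows roughly quadratically with the number of objects. That is probably fine for a word list in a small game, but worth keeping in mind for larger data.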

PartlyCloudy