1

I have a List of strings like this:

  • String1
  • String1.String2
  • String1.String2.String3
  • Other1
  • Other1.Other2
  • Test1
  • Stuff1.Stuff1
  • Text1.Text2.Text3
  • Folder1.Folder2.FolderA
  • Folder1.Folder2.FolderB
  • Folder1.Folder2.FolderB.FolderC

Now I would like to group this into:

  • String1.String2.String3
  • Other1.Other2
  • Test1
  • Stuff1.Stuff1
  • Text1.Text2.Text3
  • Folder1.Folder2.FolderA
  • Folder1.Folder2.FolderB.FolderC

If "String1" is contained in the next item "String1.String2", I want to ignore the first one, and if the second item is contained in the third, I only want the third, "String1.String2.String3", and so on for n items. Each string is structured like a node path and can be split on the dot.

As you can see in the Folder example, Folder2 has two different subfolders, so I need both strings.

Do you know how to handle this with LINQ? I would prefer VB.NET, but C# is also OK.

Regards Athu


4 Answers

0

Pretty simple one. Try this:

var lst = new List<string> { /*...*/ };

var sorted =
    from item in lst
    where lst.Last() == item || !lst[lst.IndexOf(item) + 1].Contains(item)
    select item;
Jan P.
  • Highly inefficient! `Last()` will enumerate whole collection every time! `IndexOf()` is linear as well. – MarcinJuraszek Mar 22 '13 at 12:27
  • You'll get an ArgumentOutOfRange exception. – Phil Mar 22 '13 at 12:29
  • @Phil: I added `list.Last() == item` to not get an exception, works in LinqPad. – Jan P. Mar 22 '13 at 12:30
  • @MarcinJuraszek: Not correct. If `source` implements `IList`, `Last` uses `Count` and the indexer. However, this is an implementation detail, so maybe one shouldn't rely on it... – Daniel Hilgarth Mar 22 '13 at 12:31
  • @jaydotnet: you'll need to make an edit to your post so I can remove my down vote, although I think you should use StartsWith not Contains. – Phil Mar 22 '13 at 12:40
  • @Phil: http://stackoverflow.com/questions/3120056/contains-is-faster-than-startswith – Jan P. Mar 22 '13 at 12:46
  • @jaydotnet - ok that's interesting. However what if the list was {"String1", "String2.String1"} or {"String1", "String11.String2"}? Actually the same question applies to my solution :( – Phil Mar 22 '13 at 12:57
  • @Phil: You are right... to solve this problem you have to define some string splitting, which is not covered by any of the given solutions. – Jan P. Mar 26 '13 at 12:03
  • @yaydotnet Shall I undelete my answer - I thought it ended up a bit overcomplicated? – Phil Mar 26 '13 at 12:17
0

LINQ isn't really the correct approach here, because you need to access more than one item at a time.

I would go with something like this:

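// Extension methods must be declared inside a static class.
public static IEnumerable<string> Filter(this IEnumerable<string> source)
{
    string previous = null;
    foreach (var current in source)
    {
        // Emit the previous item only if the current one does not contain it.
        if (previous != null && !current.Contains(previous))
            yield return previous;
        previous = current;
    }
    // The last item is always kept; nothing is emitted for an empty source.
    if (previous != null)
        yield return previous;
}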

Usage:

var result = strings.Filter();
Daniel Hilgarth
0
    Dim r = input.Where(Function(e, i) i = input.Count - 1 OrElse Not input(i + 1).StartsWith(e + ".")).ToList()

The condition in the Where call keeps an element if it is the last one in input, or if the next element does not start with the current element followed by a dot.

This relies on input being a List(Of String), so Count and input(i + 1) are both available in O(1) time.
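
For readers who prefer C#, the same indexed Where can be sketched roughly as follows. The class and method names (`AdjacentFilter`, `KeepDeepest`) are hypothetical, and `StringComparison.Ordinal` is an assumption added here to avoid culture-sensitive matching. Like the VB.NET line, it assumes each parent path sits directly before its children, as in the question's list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AdjacentFilter
{
    // Hypothetical helper: keeps an element if it is the last one, or if the
    // next element is not a child path of it (does not start with "<element>.").
    public static List<string> KeepDeepest(IList<string> input)
    {
        return input
            .Where((e, i) => i == input.Count - 1
                          || !input[i + 1].StartsWith(e + ".", StringComparison.Ordinal))
            .ToList();
    }

    public static void Main()
    {
        // Part of the sample data from the question.
        var input = new List<string>
        {
            "String1", "String1.String2", "String1.String2.String3",
            "Other1", "Other1.Other2", "Test1",
            "Folder1.Folder2.FolderA", "Folder1.Folder2.FolderB",
            "Folder1.Folder2.FolderB.FolderC"
        };

        Console.WriteLine(string.Join(Environment.NewLine, KeepDeepest(input)));
    }
}
```

Appending the dot before calling StartsWith is what keeps "String1" from being treated as a parent of "String11.String2".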

MarcinJuraszek
0

The following simple line can do the trick, though I'm not sure about the performance cost:

List<string> someStuff = new List<string>();
// Code to add the strings goes here; omitted for brevity.
IEnumerable<string> result = someStuff.Where(s => someStuff.Count(x => x.StartsWith(s)) == 1);
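
A variant of this idea that does not depend on the order of the list, and that compares whole dot-separated segments, might look like the sketch below. The names `PathFilter` and `LeafPaths` are hypothetical, and both the `Distinct()` call and `StringComparison.Ordinal` are assumptions added here, not part of the line above. It still scans the whole list once per element, so the cost grows quadratically with the list size.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PathFilter
{
    // Hypothetical helper: keeps only the paths that have no child path
    // anywhere in the list, regardless of ordering.
    public static List<string> LeafPaths(IEnumerable<string> paths)
    {
        // Assumption: duplicates are collapsed rather than dropped.
        var all = paths.Distinct().ToList();

        // "s + \".\"" never matches s itself, so no Count(...) == 1 check is needed.
        return all
            .Where(s => !all.Any(x => x.StartsWith(s + ".", StringComparison.Ordinal)))
            .ToList();
    }
}
```

Calling `PathFilter.LeafPaths(someStuff)` would then keep "String1" when the only similar entry is "String11.String2", since that entry does not start with "String1.".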
Vamsi