I have a list of models of this type:

public class TourDude {
    public int Id { get; set; }
    public string Name { get; set; }
}

And here is my list:

    public IEnumerable<TourDude> GetAllGuides {
        get {
            List<TourDude> guides = new List<TourDude>();
            guides.Add(new TourDude() { Name = "Dave Et", Id = 1 });
            guides.Add(new TourDude() { Name = "Dave Eton", Id = 1 });
            guides.Add(new TourDude() { Name = "Dave EtZ5", Id = 1 });
            guides.Add(new TourDude() { Name = "Danial Maze A", Id = 2 });
            guides.Add(new TourDude() { Name = "Danial Maze B", Id = 2 });
            guides.Add(new TourDude() { Name = "Danial", Id = 3 });
            return guides;

        }
    }

I want to retrieve these records:

{ Name = "Dave Et", Id = 1 } 
{ Name = "Danial Maze", Id = 2 }
{ Name = "Danial", Id = 3 }

The goal is mainly to collapse duplicates and near-duplicates (identifiable by the Id), taking the shortest possible value (when compared) as the name.

Where do I start? Is there a complete LINQ query that will do this for me? Do I need to code up an equality comparer?

Edit 1:

        var result = from x in GetAllGuides
                     group x.Name by x.Id into g
                     select new TourDude {
                         Test = Exts.LongestCommonPrefix(g),
                         Id = g.Key,
                     };

        IEnumerable<IEnumerable<char>> test = result.First().Test;

        string str = test.First().ToString();
Smithy
  • AFAIK nothing built-in will do this. You might want to group by Id and then write your own code to find the Name you want to use. – Tim S. May 21 '13 at 16:53

2 Answers

If you want to group the items by Id and then find the longest common prefix of the Names within each group, then you can do so as follows:

var result = from x in guides
             group x.Name by x.Id into g
             select new TourDude
             {
                 Name = LongestCommonPrefix(g),
                 Id = g.Key,
             };

This uses the `LongestCommonPrefix` method shown after the result.

Result:

{ Name = "Dave Et", Id = 1 }
{ Name = "Danial Maze ", Id = 2 }
{ Name = "Danial", Id = 3 }

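// Needs System.Linq plus a Transpose extension method, which is not part of LINQ
// (the one used here is linked in the comments on this answer).
static string LongestCommonPrefix(IEnumerable<string> xs)
{
    return new string(xs
        .Transpose()
        // Keep going while every character at this position is the same.
        .TakeWhile(s => s.All(d => d == s.First()))
        .Select(s => s.First())
        .ToArray());
}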
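For anyone who would rather not pull in `Transpose`, here is a minimal self-contained sketch of the same idea using a plain loop instead. This is a swapped-in technique, not the method above: the `PrefixSketch` class and the `Collapse` helper are hypothetical names made up for illustration, and the `TrimEnd()` call is an assumption added so that Id 2 comes out as "Danial Maze" (as asked for in the question) rather than "Danial Maze " with a trailing space.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TourDude
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class PrefixSketch
{
    // Returns the longest prefix shared by every string in xs
    // (an empty string when xs is empty).
    public static string LongestCommonPrefix(IEnumerable<string> xs)
    {
        List<string> names = xs.ToList();
        if (names.Count == 0) return string.Empty;

        string first = names[0];
        int length = first.Length;

        foreach (string name in names.Skip(1))
        {
            // The prefix can never be longer than the shortest name seen so far.
            length = Math.Min(length, name.Length);

            // Shrink the prefix to the part this name shares with the first one.
            int i = 0;
            while (i < length && name[i] == first[i]) i++;
            length = i;
        }

        return first.Substring(0, length);
    }

    // Groups by Id and collapses each group to its common prefix.
    // TrimEnd() is an assumption: it drops the trailing space in "Danial Maze ".
    public static List<TourDude> Collapse(IEnumerable<TourDude> guides)
    {
        return guides
            .GroupBy(x => x.Id, x => x.Name)
            .Select(g => new TourDude
            {
                Id = g.Key,
                Name = LongestCommonPrefix(g).TrimEnd(),
            })
            .ToList();
    }
}
```

Calling `PrefixSketch.Collapse(GetAllGuides)` on the list from the question should give one `TourDude` per Id, in the order the Ids first appear, since LINQ to Objects `GroupBy` yields groups in first-occurrence order.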
dtb
  • How do I get the LongestCommonPrefix from the function? In my edit 1 str == "System.Char[]" I can't seem to get the actual, longest common prefix from out of all the IEnumerables :\ – Smithy May 21 '13 at 18:50
  • I've added a `LongestCommonPrefix` method that uses the `Transpose` method from [here](http://stackoverflow.com/a/2070434/76217). – dtb May 21 '13 at 19:01
  • How do I use the true x.Id rather than g.Key? My Ids are being overwritten in my real-world application – Smithy May 22 '13 at 10:55
  • @Smithy: I'm not sure what you're talking about. Maybe [open a new question](http://stackoverflow.com/questions/ask) with an example? – dtb May 22 '13 at 15:03
  • Sorry, my real-world data is in a different language, one I don't speak, and I mixed up the columns. Your example is perfect, thanks again :) – Smithy May 24 '13 at 11:42

I was able to achieve this by grouping the records by Id, then selecting the first record from each group when ordered by Name length:

var result = GetAllGuides.GroupBy(td => td.Id)
    .Select(g => g.OrderBy(td => td.Name.Length).First());

foreach (var dude in result)
{
    Console.WriteLine("{{Name = {0}, Id = {1}}}", dude.Name, dude.Id);
}
wgraham
  • That way of getting the name won't quite work as intended in the case of `Danial Maze A` and `Danial Maze B`. – Tim S. May 21 '13 at 16:55
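To make that comment concrete, here is a small sketch that runs the shortest-name query on the data from the question. The `ShortestNameDemo` class and `ShortestPerId` method are hypothetical names used only for illustration, and anonymous objects stand in for `TourDude` to keep the snippet self-contained.

```csharp
using System;
using System.Linq;

// Hypothetical demo class, for illustration only.
public static class ShortestNameDemo
{
    // Returns the shortest Name in each Id group, in order of first appearance.
    public static string[] ShortestPerId()
    {
        var guides = new[]
        {
            new { Name = "Dave Et", Id = 1 },
            new { Name = "Dave Eton", Id = 1 },
            new { Name = "Dave EtZ5", Id = 1 },
            new { Name = "Danial Maze A", Id = 2 },
            new { Name = "Danial Maze B", Id = 2 },
            new { Name = "Danial", Id = 3 },
        };

        return guides
            .GroupBy(td => td.Id)
            // OrderBy is a stable sort, so ties on length keep their original order.
            .Select(g => g.OrderBy(td => td.Name.Length).First().Name)
            .ToArray();
    }
}
```

For Id 2 both names have the same length, so the query returns "Danial Maze A" (the first of the tie) rather than the "Danial Maze" the question asks for. Picking the shortest existing name only matches the desired output when one name in the group is itself the common prefix, as with "Dave Et".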