3

User input will be like 'BY1 2PX', which will split and stored into list like below

var items = new List<string> {'BY1 2PX', 'BY12', 'BY1', 'BY'};

I have source list of Products

public class Product
{
    public string Name {get;set;}
    public string Id {get;set;}
}

Below is a sample product list. There is no guarentee on ordering, it could be in any order.

var products = new List<Product>{ 
    new Product("1", "BY1 2PX"),
    new Product("2", "BY12"),
    new Product("3", "BY1"),
    new Product("4", "BY"),
    new Product("5", "AA2 B2X"),
    //...etc
}

my output should fetch 1, because its most specific match. If Id = 1 is not there then it should have fetched Id =2 like that...etc Could anyone help me in writing a linq query. I have tried something like below, is this fine?

var result =  items.Select(x => products.FirstOrDefault(p =>
                      string.Equals(p.Name.Trim(), x, StringComparison.OrdinalIgnoreCase)))
                   .FirstOrDefault();
Gilad Green
  • 36,708
  • 7
  • 61
  • 95
Bhaskar
  • 1,680
  • 7
  • 26
  • 40
  • 1
    Is your product list always naturally sorted in the order where your "most specific" matches occur before lesser specific matches? If not, how do you you define the order of preference for matches? – rory.ap Aug 03 '16 at 14:08
  • Why do you need `List`, wouldn't it simplify things if you use `Dictionary` ? Dictionary lookups are much faster. – Fabjan Aug 03 '16 at 14:12
  • "Most specific" is slightly unclear, since you are using `string.Equals`. This will find find the first *exact* match. – vgru Aug 03 '16 at 14:15
  • @roryap ProductList will be in any order. – Bhaskar Aug 03 '16 at 14:19
  • @Groo I want to find the exact match avaialble in items list. – Bhaskar Aug 03 '16 at 14:20
  • 1
    In that case a dictionary would be a more appropriate collection, like @Fabjan wrote above. It gives constant lookup time ("jumps" to the correct product), vs `O(n)` when iterating through a list. – vgru Aug 03 '16 at 14:24
  • @Lamps - can you have a case like this? `new Product("1", "aaaaaaa BY1 2PX"),`. If so will product 1 still be "most specific"? – Gilad Green Aug 03 '16 at 15:04

4 Answers4

2

Well, you can use dictionary with its fast lookups :

var productsDict = products.ToDictionary(p => p.Name, p => p);

var key = items.FirstOrDefault(i => productsDict.ContainsKey(i));

Product result = key != null ? productsDict[key] : null;

Or as Tim suggested, if you have multiple elements with same names you can use Lookup :

var productsDict = products.ToLookup(p => p.Name, p => p);

var key = items.FirstOrDefault(i => productsDict.Contains(i));

Product result = key != null ? productsDict[key] : null;
Fabjan
  • 13,506
  • 4
  • 25
  • 52
1

If you want to select the best-matching product you need to select from the product- not the string-list. You could use following LINQ approach that uses List.FindIndex:

Product bestProduct = products
    .Select(p => new {
        Product = p,
        Index = items.FindIndex(s => String.Equals(p.Name, s, StringComparison.OrdinalIgnoreCase))
     })
    .Where(x => x.Index != -1)
    .OrderBy(x => x.Index)   // ensures the best-match logic
    .Select(x => x.Product)
    .FirstOrDefault();

The Where ensures that you won't get an arbitrary product if there is no matching one.


Update:

A more efficient solution is this query:

Product bestProduct = items
    .Select(item => products.FirstOrDefault(p => String.Equals(p.Name, item, StringComparison.OrdinalIgnoreCase)))
    .FirstOrDefault(p != null);   // ensures the best-match logic
Tim Schmelter
  • 450,073
  • 74
  • 686
  • 939
  • How does this address the OP's need to get the "most specific" element, i.e. if an exact match isn't found, it will pick the next-most-speicifc match? – rory.ap Aug 03 '16 at 14:22
  • @roryap, IMO, as index of most specific *seach criteria* in `items` list is the smallest, ordering list by that index will solve it. – Adil Mammadov Aug 03 '16 at 14:24
  • @AdilMammadov -- The OP clarified in comments that the list isn't ordered like that... – rory.ap Aug 03 '16 at 14:25
  • @roryap: maybe i was wrong but the question suggested that the item-list indexes determine the sort criterium, so the index of the item specifies how exact the match is. If that was clarified meanwhile my answer doesn't help much. – Tim Schmelter Aug 03 '16 at 14:26
  • 2
    @roryap, I think OP stated that, `Product` list can be in any order, not `items` list. – Adil Mammadov Aug 03 '16 at 14:28
  • @AdilMammadov yes you are right. Items list order can be guarenteed but not the product list. – Bhaskar Aug 03 '16 at 14:37
  • @TimSchmelter Thanks for your answers. Regarding the updated solution - resharpers suggests to remove the where condition and add null check inside FirstOrDefault – Bhaskar Aug 03 '16 at 14:45
  • @Lamps: doesn't make a great difference, but if you find that more readable, it's the same approach. Edited my answer accordingly. – Tim Schmelter Aug 03 '16 at 14:52
0

You can try to find resemblance of words by using a specific algorythm called Levenshtein's distance algorythm, which is mostly used on "Did you mean 'word'" on most search websites.

This solution can be found here: https://stackoverflow.com/a/9453762/1372750

Once you find the distance difference, you can measure which word or phrase is more "like" the searched one.

Community
  • 1
  • 1
Rafael A. M. S.
  • 637
  • 1
  • 6
  • 27
0

This will find for each product what is the "most specific" (the longest) match in items and will return the product with the longest match (regardless to order of either of the collections)

var result = products
            .Select(p => new
                {
                    Product = p,
                    MostSpecific = items.Where(item => p.Name.Contains(item))
                                        .OrderByDescending(match => match.Length
                                        .FirstOrDefault()
                })
            .Where(x => x.MostSpecific != null)
            .OrderByDescending(x => x.MostSpecific.Length)
            .Select(x => x.Product)
            .FirstOrDefault();
Gilad Green
  • 36,708
  • 7
  • 61
  • 95