3

How can I find the closest string(s) in a list?

 var list = new List<string>
 {
    "hello how are you",
    "weather is good today",
    "what is your name",
    "what time is it",
    "what is your favorite color",
    "hello world",
    "how much money you got",
    "where are you",
    "like you"
 };

Given this input:

  string input = "how are you";

and another one with a typo:

  string input = "how are ytou";

For both cases it would be good to get this:

hello how are you
where are you

or even this result:

hello how are you
where are you
how much money you got

or at least just:

hello how are you

I need this so that a small typo in a user's request does not prevent me from producing a response.

  • It's always about the percentage of matched characters in words, extending to the sequence of words. That is a little above basic regex capability. –  Jun 09 '17 at 22:00
  • @Wiktor Stribiżew edited –  Jun 09 '17 at 22:37

2 Answers

6

A simple approach would be to use String.Compare to get the "lexical relationship between the two comparands".

Order the available items by comparing each one with the input and take the best match, like this:

string bestMatch = list.OrderBy(s => string.Compare(s, input)).First();

This is only a first approach, because String.Compare reflects sort order rather than similarity, and the order of words should be ignored. Let's improve this toward a full solution. After splitting the strings

string[] splittedInput = input.Split(' ');

you can compare the individual words using an IEqualityComparer. You are free to define how many characters may differ per word (in this case 2).

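private class NearMatchComparer : IEqualityComparer<string>
{
    // Note: string.Compare returns only the sign of the sort order
    // (negative, zero, positive), not a count of differing characters.
    public bool Equals(string x, string y)
    {
        return string.Compare(x, y) < 2;
    }

    // Intersect looks up hash codes first, so in practice only words
    // with equal hash codes ever reach Equals.
    public int GetHashCode(string obj)
    {
        return obj.GetHashCode();
    }
}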

Use this comparer to compare the words of the input with the words of each entry in your list. If at least two words (adjust the threshold as required) match, in any order, select the string.

List<string> matches = list.Where(s => s.Split(' ')
    .Intersect(splittedInput, new NearMatchComparer()).Count() >= 2)
    .ToList();

The result is a list of potential matches.
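
The comparer above relies on string.Compare, which reports sort order rather than how many characters differ. A minimal sketch of the same word-matching idea with an actual per-word edit distance might look like the following. The names FuzzyWordMatch, EditDistance, CountNearWords and FindMatches, and the parameters maxEdits and minWords, are hypothetical and not part of the answer; it also uses Any instead of Intersect, since a "near match" relation does not fit a hash-based comparer well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FuzzyWordMatch
{
    // Levenshtein edit distance: insertions, deletions and substitutions cost 1 each.
    public static int EditDistance(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // Reuse the two rows instead of allocating a full matrix.
            var swap = previous; previous = current; current = swap;
        }
        return previous[b.Length];
    }

    // Counts the input words that are within maxEdits of some word in the candidate.
    public static int CountNearWords(string candidate, string[] inputWords, int maxEdits)
    {
        string[] candidateWords = candidate.Split(' ');
        return inputWords.Count(w => candidateWords.Any(c => EditDistance(w, c) <= maxEdits));
    }

    // Keeps candidates with at least minWords near-matching words, best score first.
    // The defaults (1 edit per word, 2 matching words) are assumptions to tune.
    public static List<string> FindMatches(IEnumerable<string> list, string input,
                                           int maxEdits = 1, int minWords = 2)
    {
        string[] inputWords = input.Split(' ');
        return list
            .Select(s => new { Text = s, Score = CountNearWords(s, inputWords, maxEdits) })
            .Where(x => x.Score >= minWords)
            .OrderByDescending(x => x.Score)
            .Select(x => x.Text)
            .ToList();
    }
}
```

With the list from the question and these defaults, FuzzyWordMatch.FindMatches(list, "how are ytou") returns "hello how are you", "how much money you got" and "where are you", in that order, and "how are you" gives the same three entries. Raising minWords to 3 leaves only "hello how are you".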

Fruchtzwerg
  • Hello, so your output with `string input = "how are you";` is `hello how are you, how much money you got, where are you`, and with `string input = "how are ytou";` it is only `hello how are you`. If I use `OrderBy` on the first result, it is equivalent to my own desired result. This seems like the proper way to get a result with this solution, even if it does not equal the output that would theoretically exclude strings sharing only a single equal word and do the same for a word containing a typo. As I'm not asking for a direct solution, I'm going to accept it, because it answers the goal –  Jun 09 '17 at 22:06
4

I would use the Levenshtein distance. It gives you a value for how different two strings are. Just choose the entry in your set with the minimum distance.

How to calculate distance similarity measure of given 2 strings?
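
As a rough illustration of this idea, here is a minimal sketch that computes the Levenshtein distance between the input and every entry, then orders the entries from closest to farthest. The class name ClosestString and the method names Levenshtein and RankByDistance are hypothetical, chosen only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ClosestString
{
    // Classic dynamic-programming edit distance:
    // insertions, deletions and substitutions cost 1 each.
    public static int Levenshtein(string a, string b)
    {
        var previous = new int[b.Length + 1];
        var current = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) previous[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            current[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(current[j - 1] + 1, previous[j] + 1),
                    previous[j - 1] + cost);
            }
            // Swap the rows so only two arrays are needed.
            var swap = previous; previous = current; current = swap;
        }
        return previous[b.Length];
    }

    // Returns the candidates ordered from smallest to largest distance to the input.
    public static List<string> RankByDistance(IEnumerable<string> candidates, string input)
    {
        return candidates.OrderBy(s => Levenshtein(s, input)).ToList();
    }
}
```

Note that the distance over whole strings also counts extra words: "hello how are you" is 6 edits away from "how are you" because of the leading "hello ". If that is undesirable, comparing word by word or normalizing by string length may suit the use case better.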

Gabriel Littman