0

Stored procedure with a string parameter that will contain a few words.

Table with about 30k rows with a field that will have words

We want to pull the BEST match based on this Ranking (order of importance and if more than one qualify keep testing downward)

  • exact match
  • full match inside string
  • Most words matching
  • first word closest to front of string
  • Matches with least words left over
  • Order of Matches
  • Top 1

Here's the some sample data: http://sqlfiddle.com/#!6/c8665/1

Using SQL Azure. Might User Functions be an option for this?

We were using this LINQ code but need to move to a Stored procedure:

public PKB Best(string words)
{
    if (string.IsNullOrEmpty(words))
    {
        words = "pkb-not-found";
    }

    try
    {
        var ctx = new MiamiDbContext();
        //var allRows = ctx.PKB.ToList();

        var allRows = ctx.PKB.Where(x => x.PKBTitle.Contains("$")).ToList();

        var max = allRows.Max(x => CountChars(x.PKBTitle, words));

        if (max == 0)
        {
            throw new Exception("No records found");
        }

        var allMax = allRows.Where(x => CountChars(x.PKBTitle, words) == max).OrderByDescending(x => x.PKBDate).ToList();
        return allMax.First();
    }
    catch (SqlException ex)
    {
        if (ex.Number == 40615)
        {
            throw new Exception("Unable to connect to remote DB");
        }

        throw;
    }
}

public double CountChars(string match, string words)
{
    double result = 0;
    string correctMatch = match.ToLower();

    foreach (var word in words.Split('-'))
    {
        var correctWord = word.ToLower();

        if (correctMatch.Contains(correctWord))
        {
            result = result + correctWord.Length - (double)(correctMatch.IndexOf(correctWord, StringComparison.Ordinal)) / 1000;
        }
    }

    return result;
}
Hell.Bent
  • 1,667
  • 9
  • 38
  • 73
  • Based off of your "best match" description, I feel like this is possibly a use case for a case statement. You could use a case statement in the SQL to check which of the conditions it matches from highest weight to lowest weight. Then to get the one that matches the most, you'd sort the results by the weight and take the highest/lowest, which ever weight you gave to the most matches. Related: https://stackoverflow.com/questions/4622/sql-case-statement-syntax – Taplar Dec 21 '17 at 23:48
  • The parameter is going to be in what form? (e.g. a comma separated string?). There is some tiny amount of data at the sqlfiddle, but where are your queries? How do you anticipate achieving "order of matches" or "first word closest to front" without dividing the data into individual words (as well as the parameter for that matter). what version of SQL Server are you using? – Paul Maxwell Dec 22 '17 at 01:32
  • Paramater is a string with words space delimited. We have some code in LINQ we were using that worked pretty well. Not sure how to best approach this in SQL DB is in Azure SQL – Hell.Bent Dec 22 '17 at 03:50
  • 1
    Have you looked into the [FullText] (https://learn.microsoft.com/en-us/sql/relational-databases/search/full-text-search) search capabilities of SQL Server? Could save you the headache of attempting to codify all of the possible match conditions. – Forty3 Dec 22 '17 at 19:07

0 Answers0