0

Let's say I'm parsing a document and am trying to find Test©/Testing© Keyword Finder by taking some keywords and matching them to the aforementioned string.

If I were using a document where each block of text is separate and clearly defined so that there wouldn't be any confusion as to what words were connected in a sentence/title, is there a way in Swift to compare such a block of words with keywords to find a match?

ch1maera
  • 1,369
  • 5
  • 20
  • 42

1 Answers1

3

The Levenshtein distance, aka edit distance, is a metric for measuring the difference between two strings. The implementation of it is here:

func levenshteinDist(test: String, key: String) -> Int {
    let empty = Array<Int>(repeating:0, count: key.count)
    var last = [Int](0...key.count)

    for (i, testLetter) in test.enumerated() {
        var cur = [i + 1] + empty
        for (j, keyLetter) in key.enumerated() {
            cur[j + 1] = testLetter == keyLetter ? last[j] : min(last[j], last[j + 1], cur[j]) + 1
        }
        last = cur
    }
    return last.last!
}

You can use it as follows:

let test1 = "Test©/Testing© Keyword Finder"
let test2 = "Test© Word Finder"
let test3 = "Number Categorizer"
let test4 = "Alphabet in Order"

let key = "Test Testing Keyword"

print(levenshteinDist(test: test1, key: key)) // 10
print(levenshteinDist(test: test2, key: key)) // 14
print(levenshteinDist(test: test3, key: key)) // 18
print(levenshteinDist(test: test4, key: key)) // 15
print(levenshteinDist(test: key, key: key)) // 0

As you can see, levensthlindist(test: key, key: key) outputs 0, since the strings are same. Also, the minimum output is 10 and it corresponds to the expected test string.

Dorukhan Arslan
  • 2,676
  • 2
  • 24
  • 42