
I have scanned text:

Mils, chiiese, wh_ite ch$col_te

And a list of expressions, for example:

- cheese
- bread
- white chocolate
- etc.

I need to compare a broken expression with the expressions from my list, e.g. "white chocolate" with "wh_ite ch$col_te".

Can you recommend any frameworks for this?

Artem Krachulov

4 Answers


String distance - Levenshtein distance

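What you need to do is measure the difference between two strings. To do that, you can use the Levenshtein distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other.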
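For reference, the distance is usually defined by the recurrence below. The notation is chosen here for illustration and is not taken from the linked implementation: $a$ and $b$ are the two strings, and $d_{i,j}$ is the distance between the first $i$ characters of $a$ and the first $j$ characters of $b$.

```latex
d_{i,0} = i, \qquad d_{0,j} = j, \qquad
d_{i,j} = \min\bigl(
    d_{i-1,j} + 1,\;            % delete a_i
    d_{i,j-1} + 1,\;            % insert b_j
    d_{i-1,j-1} + [a_i \neq b_j] % substitute (cost 0 if the characters match)
\bigr)
```

The distance between the full strings is $d_{|a|,|b|}$.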

Luckily, somebody has already implemented this algorithm in Swift HERE.

To make it work in Swift 1.2, you'll just have to apply the compiler's automatic fixes for the errors that occur, nothing too fancy.

You can then use it like this:

println(levenshtein("wh_ite ch$col_te", bStr: "white chocolate")) // prints 3: one deletion and two substitutions turn aStr into bStr

println(levenshtein("wh_ite ch$col_te", bStr: "whsdfdsite chosdfsdfcolate")) // prints 13: it takes 13 single-character edits to get from aStr to bStr

Then you just pick a tolerance (the largest distance you still accept as a match) and you are done!
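As a rough sketch of the tolerance step, here is one way to match each scanned item against the list from the question. It is written for current Swift rather than Swift 1.2, and both `levenshtein(_:_:)` and `closestMatch(for:in:tolerance:)` are hypothetical helpers written for this example, not the linked implementation. The tolerance of 3 is an assumption you would tune for your data.

```swift
// Levenshtein distance with two rolling rows (illustrative helper, not the linked code).
func levenshtein(_ a: String, _ b: String) -> Int {
    let s = Array(a), t = Array(b)
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }

    // previous[j] is the distance between the first i-1 characters of s and the first j of t.
    var previous = Array(0...t.count)
    var current = [Int](repeating: 0, count: t.count + 1)

    for i in 1...s.count {
        current[0] = i
        for j in 1...t.count {
            let substitution = previous[j - 1] + (s[i - 1] == t[j - 1] ? 0 : 1)
            let deletion = previous[j] + 1
            let insertion = current[j - 1] + 1
            current[j] = min(substitution, deletion, insertion)
        }
        swap(&previous, &current)
    }
    return previous[t.count]
}

// Returns the candidate closest to `scanned`, or nil when even the best
// candidate needs more than `tolerance` edits.
func closestMatch(for scanned: String, in candidates: [String], tolerance: Int) -> String? {
    let scored = candidates.map { (word: $0, distance: levenshtein(scanned, $0)) }
    guard let best = scored.min(by: { $0.distance < $1.distance }),
          best.distance <= tolerance else { return nil }
    return best.word
}

let expressions = ["cheese", "bread", "white chocolate"]
let scannedItems = ["Mils", "chiiese", "wh_ite ch$col_te"]
for item in scannedItems {
    let match = closestMatch(for: item, in: expressions, tolerance: 3)
    print(item, "->", match ?? "no match")
}
```

With this list, "Mils" falls outside the tolerance because nothing resembling it is in the list, while the other two items resolve to "cheese" and "white chocolate".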

Dejan Skledar

Dejan Skledar's on the right track -- you want to make use of Levenshtein distance. The implementation he points to needs tweaking to work in Swift 1.2, and it tends to be slow. Here's a Swift 1.2-compatible, faster implementation.

Simply include the Tools class in your project. Once you've done that, you can get a number representing the difference between two strings this way:

Tools.levenshtein("cheese", bStr: "chee_e") // returns 1
Tools.levenshtein("butter", bStr: "b_tt_r") // returns 2
Tools.levenshtein("milk", bStr: "butter")   // returns 6
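A raw distance means different things for short and long strings: 2 edits is a lot for "milk" but little for "white chocolate". One possible refinement, sketched here in current Swift, is to divide the distance by the length of the longer string. `normalizedDistance` is a hypothetical helper and not part of the Tools class, and the 0.4 cutoff is an arbitrary assumption.

```swift
// Hypothetical helper: scales an edit distance into 0...1 using the longer string's length.
func normalizedDistance(_ distance: Int, _ a: String, _ b: String) -> Double {
    let longest = max(a.count, b.count)
    return longest == 0 ? 0 : Double(distance) / Double(longest)
}

// The distance of 2 is taken from the "butter" / "b_tt_r" example above.
let ratio = normalizedDistance(2, "butter", "b_tt_r")
print(ratio < 0.4 ? "close enough" : "too different")
```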
Joey deVilla

Please find the Swift 4 implementation of Joey deVilla's answer here.

Call the function like this:

Tools.levenshtein(aStr: "Example", bStr: "Examples")
ViruMax

Use StringMetric and be happy

https://github.com/autozimu/StringMetric.swift

import StringMetric

...

"kitten".distance(between: "sitting")    // => 0.746
"君子和而不同".distance(between: "小人同而不和")    // => 0.555
Caio Santos