My problem is that we want our users to enter a code like this:
639195-EM-66-XA-53-WX
somewhere in their input, so the result may look like this: The code is 639195-EM-66-XA-53-WX, let me in.
We still want to match the string if they make a small error in the code (Levenshtein distance of 1). For example: The code is 739195-EM-66-XA-53-WX, let me in. (Here 6 was changed to 7 in the first character of the code.)
The algorithm should match even if the user skips dashes, and it should ignore letter case. These requirements are easy to fulfil, because I can remove all dashes from both the code and the input and call to_uppercase on them.
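To make that normalization step concrete, here is a minimal sketch in Python. The function name normalize is hypothetical, and Python's str.upper() stands in for to_uppercase; it assumes dashes are the only separator that needs to be dropped.

```python
def normalize(text: str) -> str:
    """Drop dashes and fold case so the code and the input are comparable."""
    return text.replace("-", "").upper()


code = normalize("639195-EM-66-XA-53-WX")
print(code)  # 639195EM66XA53WX

# The user skipped some dashes and typed lowercase letters.
user_input = normalize("The code is 639195em66xa-53-wx, let me in")
print(code in user_input)  # True
```

Note that this also strips dashes from the text around the code, which is harmless for an exact substring check but is worth keeping in mind.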
Is there an algorithm for something like that?
Generating all strings at a distance of 1 from the original code is computationally expensive.
I was also thinking about using something like Levenshtein distance, but ignoring the extra letters that the user added in the second string. However, that would also allow wrong letters in the middle of the code.
Searching for the code in the user input seems a little bit better, but still not very clean.
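To illustrate what I mean by searching for the code in the input, here is a rough Python sketch of one possible approach: the usual Levenshtein dynamic programme, except that starting and ending anywhere in the input costs nothing, while every edit inside the matched region still counts. I believe this is known as approximate substring matching (Sellers' variant), but treat that as an assumption. All names below (normalize, min_substring_distance, contains_code, max_errors) are hypothetical.

```python
def normalize(text: str) -> str:
    """Drop dashes and fold case (assumed normalization)."""
    return text.replace("-", "").upper()


def min_substring_distance(pattern: str, text: str) -> int:
    """Smallest Levenshtein distance between pattern and any substring of text."""
    # prev[i]: distance between pattern[:i] and the best substring of the
    # text ending at the previous position. Against empty text it costs i.
    prev = list(range(len(pattern) + 1))
    best = prev[-1]
    for ch in text:
        cur = [0]  # an empty pattern prefix matches anywhere for free
        for i, p in enumerate(pattern, start=1):
            cur.append(min(
                prev[i - 1] + (p != ch),  # match or substitution
                prev[i] + 1,              # extra character in the text
                cur[i - 1] + 1,           # pattern character missing from the text
            ))
        best = min(best, cur[-1])  # the match may end at any position
        prev = cur
    return best


def contains_code(code: str, user_input: str, max_errors: int = 1) -> bool:
    """True if user_input contains code with at most max_errors edits."""
    distance = min_substring_distance(normalize(code), normalize(user_input))
    return distance <= max_errors


code = "639195-EM-66-XA-53-WX"
print(contains_code(code, "The code is 639195-EM-66-XA-53-WX, let me in"))  # True
print(contains_code(code, "The code is 739195-EM-66-XA-53-WX, let me in"))  # True
print(contains_code(code, "let me in"))  # False
```

This runs in time proportional to len(code) × len(input) and never enumerates the neighbouring strings. One caveat: a code with its first or last character missing also counts as distance 1, which may or may not be acceptable.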