12

I want to create a function in Delphi that computes different levels of two strings. If two strings are equal (ignoring case), then it should return 0, but if they are not equal, it should return the number of different characters. This function can be very useful for checking spelling.

function GetDiffStringLevel(S1,S2:string):Integer;
begin
  if SameText(S1,S2) then Exit(0);
  // i want get different chars count
end

samples code:

Diff:=GetDiffStringLevel('Hello','Hello');// Diff:=0;
Diff:=GetDiffStringLevel('Hello','2Hello');// Diff:=1;
Diff:=GetDiffStringLevel('Hello','H2ello');// Diff:=1;
Diff:=GetDiffStringLevel('Hello','Hello W');// Diff:=2;
Diff:=GetDiffStringLevel('Hello','World');// Diff:=6; or 5
Rob Kennedy
  • 161,384
  • 21
  • 275
  • 467
MajidTaheri
  • 3,813
  • 6
  • 28
  • 46
  • 2
    See also: [Need a routine to detect strings that are similar but not identical](http://stackoverflow.com/q/10402858/576719). – LU RD May 14 '12 at 09:14

2 Answers2

16

Fast and compact implementation.

About 3 times as fast as smasher's implementation with normal strings. More than 100 times as fast if one of the strings is empty.

Smasher's function is case insensitive though, which can be useful as well.

function LevenshteinDistance(const s, t: string): integer;inline;
var
  d   : array of array of integer;
  n, m, i, j : integer;
begin
  n := length(s);
  m := length(t);
  if n = 0 then Exit(m);
  if m = 0 then Exit(n);

  SetLength(d, n + 1, m + 1);
  for i := 0 to n do d[i, 0] := i;
  for j := 0 to m do d[0, j] := j;

  for i := 1 to n do
    for j := 1 to m do
      d[i, j] := Min(Min(d[i-1, j]+1, d[i,j-1]+1), d[i-1,j-1]+Integer(s[i] <> t[j]));

  Result := d[n, m];
end;

Note: the inline directive reduces the execution time to less than 70% on my machine, but only for the win32 target platform. If you compile to 64bits (Delphi XE2), inlining actually makes it a tiny bit slower.

Wouter van Nifterick
  • 23,603
  • 7
  • 78
  • 122
9

What you want is known as the Levenshtein distance (the minimum number of edits to transform one string into the other, where an edit is either a character insertion, character deletion or character substitution). The wikipedia site has a pseudo code implementation.

Delphi implementation:

function LevenshteinDistance(String1 : String; String2 : String) : Integer;

var
  Length1, Length2      : Integer;
  WorkMatrix            : array of array of Integer;
  I, J                  : Integer;
  Cost                  : Integer;
  Val1, Val2, Val3      : Integer;

begin
String1 := TCharacter.ToUpper (String1);
String2 := TCharacter.ToUpper (String2);
Length1 := Length (String1);
Length2 := Length (String2);
SetLength (WorkMatrix, Length1+1, Length2+1);
for I := 0 to Length1 do
  WorkMatrix [I, 0] := I;
for J := 0 to Length2 do
  WorkMatrix [0, J] := J;
for I := 1 to Length1 do
  for J := 1 to Length2 do
    begin
    if (String1 [I] = String2 [J]) then
      Cost := 0
    else
      Cost := 1;
    Val1 := WorkMatrix [I-1, J] + 1;
    Val2 := WorkMatrix [I, J-1] + 1;
    Val3 := WorkMatrix[I-1, J-1] + Cost;
    if (Val1 < Val2) then
      if (Val1 < Val3) then
        WorkMatrix [I, J] := Val1
      else
        WorkMatrix [I, J] := Val3
    else
      if (Val2 < Val3) then
        WorkMatrix [I, J] := Val2
      else
        WorkMatrix [I, J] := Val3;
    end;
Result := WorkMatrix [Length1, Length2];
end;
jpfollenius
  • 16,456
  • 10
  • 90
  • 156
  • 2
    @MajidTaheri: You asked for a function that would calculate the difference between two words, and Smasher's function is the answer to your question. You never said in your question *how exactly* you would use the function. – Andriy M May 14 '12 at 09:34
  • 2
    @MajidTaheri, you can try [this](http://stackoverflow.com/a/54798/576719) implementation of `Levenshtein Distance`. – LU RD May 14 '12 at 09:36
  • @LU RD, EditDistance function is better. – MajidTaheri May 14 '12 at 10:00
  • You can quit the function if one of the two strings is empty. That's a lot faster when there's an empty string, and it doesn't cost much otherwise. Just add: `if Length1=0 then exit(Length2); if length2=0 then exit(length1);` – Wouter van Nifterick May 15 '12 at 03:20