I'm posting this in the spirit of answering your own questions.
The question I had was: How can I implement the Levenshtein algorithm for calculating edit-distance between two strings, as described here, in Delphi?
Just a note on performance: This thing is very fast. On my desktop (2.33 GHz dual-core, 2 GB RAM, WinXP), I can run through an array of 100K strings in less than one second.
function EditDistance(s, t: string): integer;
var
  d : array of array of integer;
  i,j,cost : integer;
begin
  {
  Compute the edit-distance between two strings.
  Algorithm and description may be found at either of these two links:
  http://en.wikipedia.org/wiki/Levenshtein_distance
  http://www.google.com/search?q=Levenshtein+distance
  }
  try
    //initialize our cost array
    SetLength(d,Length(s)+1);
    for i := Low(d) to High(d) do begin
      SetLength(d[i],Length(t)+1);
    end;

    for i := Low(d) to High(d) do begin
      d[i,0] := i;
      for j := Low(d[i]) to High(d[i]) do begin
        d[0,j] := j;
      end;
    end;

    //store our costs in a 2-d grid
    for i := Low(d)+1 to High(d) do begin
      for j := Low(d[i])+1 to High(d[i]) do begin
        if s[i] = t[j] then begin
          cost := 0;
        end
        else begin
          cost := 1;
        end;

        //to use "Min", add "Math" to your uses clause!
        d[i,j] := Min(Min(
                   d[i-1,j]+1,      //deletion
                   d[i,j-1]+1),     //insertion
                   d[i-1,j-1]+cost  //substitution
                   );
      end;  //for j
    end;  //for i

    //now that we've stored the costs, return the final one
    Result := d[Length(s),Length(t)];
  finally
    //cleanup
    for i := Low(d) to High(d) do begin
      for j := Low(d[i]) to High(d[i]) do begin
        SetLength(d[i],0);
      end;  //for j
    end;  //for i
    SetLength(d,0);
  end;  //try-finally
end;
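For anyone who wants to try the idea on a list of strings, here is a minimal sketch of a small console program. It is not part of the function above: the name EditDistanceTwoRows, the TIntRow type, and the sample word list are all hypothetical. It uses the same recurrence as EditDistance, but assumes you only need the final distance, so it keeps just two rows of the cost grid instead of the whole table. It also assumes 1-based string indexing, as in classic desktop Delphi.

```pascal
program EditDistanceDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Math; // provides Min

type
  TIntRow = array of Integer; // hypothetical helper type for one grid row

// Hypothetical two-row variant of the edit-distance recurrence.
// Only the previous and the current row of the cost grid are kept.
function EditDistanceTwoRows(const s, t: string): Integer;
var
  prev, curr, tmp: TIntRow;
  i, j, cost: Integer;
begin
  SetLength(prev, Length(t) + 1);
  SetLength(curr, Length(t) + 1);

  // row 0: building the first j characters of t from nothing costs j insertions
  for j := 0 to Length(t) do
    prev[j] := j;

  for i := 1 to Length(s) do
  begin
    curr[0] := i; // deleting the first i characters of s
    for j := 1 to Length(t) do
    begin
      if s[i] = t[j] then
        cost := 0
      else
        cost := 1;
      curr[j] := Min(Min(
                   prev[j] + 1,        // deletion
                   curr[j - 1] + 1),   // insertion
                   prev[j - 1] + cost  // substitution
                   );
    end;
    // swap the row references so the finished row becomes "prev"
    tmp := prev;
    prev := curr;
    curr := tmp;
  end;

  // after the final swap, the last finished row is in prev
  Result := prev[Length(t)];
end;

const
  // hypothetical sample data, standing in for a larger array of strings
  Words: array[0..2] of string = ('sitting', 'mitten', 'kitchen');
var
  k: Integer;
begin
  // print the distance from 'kitten' to each candidate word
  for k := Low(Words) to High(Words) do
    WriteLn(Words[k], ': ', EditDistanceTwoRows('kitten', Words[k]));
end.
```

The first line printed should show a distance of 3 for 'sitting', which is the usual textbook example. Whether the two-row version is actually faster than the full grid on a given machine is something to measure rather than assume; the main saving is memory when the strings are long.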
From JosephStyons