Levenshtein Distance

Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other. It is named after Vladimir Levenshtein, who considered this distance in 1965.



									function MIN3($a, $b, $c)
{
	return (($a) < ($b) ? (($a) < ($c) ? ($a) : ($c)) : (($b) < ($c) ? ($b) : ($c)));
}

function LevenshteinDistance($s1, $s2)
{
	$x; $y; $lastdiag; $olddiag;
	$s1len = strlen($s1);
	$s2len = strlen($s2);
	$column = array();

	for ($y = 1; $y <= $s1len; ++$y)
		$column[$y] = $y;

	for ($x = 1; $x <= $s2len; ++$x)
	{
		$column[0] = $x;

		for ($y = 1, $lastdiag = $x - 1; $y <= $s1len; ++$y)
		{
			$olddiag = $column[$y];
			$column[$y] = MIN3($column[$y] + 1, $column[$y - 1] + 1, ($lastdiag + ($s1[$y - 1] == $s2[$x - 1] ? 0 : 1)));
			$lastdiag = $olddiag;
		}
	}

	return $column[$s1len];
}
								


Example

									$result = LevenshteinDistance("kitten", "sitting");
								


Output

									result: 3