Last active
December 29, 2015 09:59
-
-
Save guns/7653838 to your computer and use it in GitHub Desktop.
A naive implementation of minimum edit distance (Levenshtein distance) in Clojure.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
(ns minimum-edit-distance | |
"A naive implementation of minimum edit distance in Clojure. | |
From: https://www.stanford.edu/class/cs124/lec/med.pdf | |
Initialization: | |
D(i,0) = i | |
D(0,j) = j | |
Recurrence Relation: | |
For each i = 1…M | |
For each j = 1…N | |
⎧ D(i-1,j) + 1 | |
D(i,j) = min ⎨ D(i,j-1) + 1 | |
⎪ ⎧ 2: if X(i) ≠ Y(j) | |
⎩ D(i-1,j-1) + ⎨ | |
⎩ 0: if X(i) = Y(j) | |
Termination: | |
D(N,M) is a distance") | |
(defn D | |
"Simple recursive implementation for strings." | |
([^String s₁ ^String s₂] | |
(D s₁ (.length s₁) s₂ (.length s₂))) | |
([^String s₁ ^Integer l₁ ^String s₂ ^Integer l₂] | |
(cond | |
(zero? l₂) l₁ | |
(zero? l₁) l₂ | |
:else | |
(min (inc (D s₁ (dec l₁) s₂ l₂)) | |
(inc (D s₁ l₁ s₂ (dec l₂))) | |
(+ (D s₁ (dec l₁) s₂ (dec l₂)) | |
(if (not= (.charAt s₁ (dec l₁)) (.charAt s₂ (dec l₂))) | |
2 ;; Number of operations per substitution | |
0)))))) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment