JaroWinkler

class JaroWinkler(threshold: Double = 0.7)

The Jaro-Winkler distance metric is designed for, and best suited to, short strings such as person names, and for detecting typos. It is (roughly) a variation of Damerau-Levenshtein in which the substitution of two close characters is considered less important than the substitution of two characters that are far from each other.

Jaro-Winkler was developed in the area of record linkage (duplicate detection) (Winkler, 1990). It returns a value in the interval [0.0, 1.0].

Parameters

threshold

The current value of the threshold used for adding the Winkler bonus. The default value is 0.7.
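
To illustrate how such a threshold typically gates the Winkler bonus, here is a minimal sketch of the textbook formulation. It is not taken from this class: `winklerBonus` is a hypothetical helper, the scaling factor of 0.1 and the prefix cap of 4 are the conventional values and are assumptions here, and whether the real implementation compares against the threshold strictly may differ.

```kotlin
// Hypothetical helper, not part of the documented class: applies the textbook
// Winkler prefix bonus to an already computed Jaro score.
fun winklerBonus(
    jaro: Double,
    first: String,
    second: String,
    threshold: Double = 0.7, // same default as the documented parameter
    scaling: Double = 0.1,   // assumed conventional scaling factor
    maxPrefix: Int = 4       // assumed conventional cap on the shared prefix
): Double {
    // Assumption: scores at or below the threshold are returned unchanged.
    if (jaro <= threshold) return jaro

    // Count the leading characters shared by both strings, capped at maxPrefix.
    val limit = minOf(maxPrefix, first.length, second.length)
    var prefix = 0
    while (prefix < limit && first[prefix] == second[prefix]) prefix++

    // Move the score towards 1.0 in proportion to the shared prefix length.
    return jaro + prefix * scaling * (1.0 - jaro)
}
```

Under these assumptions, a Jaro score of 0.9444 for "MARTHA" and "MARHTA" (shared prefix "MAR") becomes roughly 0.961, while a score of 0.5 is returned as is because it does not exceed the threshold.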

Jaro–Winkler distance

Constructors

constructor(threshold: Double = 0.7)

Types

object Companion

Functions

fun distance(first: String, second: String): Double

Returns the Jaro-Winkler distance, computed as 1 - similarity.

fun similarity(first: String, second: String): Double

Compute Jaro-Winkler similarity.
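
For orientation, the base Jaro score that Jaro-Winkler builds on can be sketched as follows. This is an illustrative standalone function under the usual textbook definition (matching window of half the longer length minus one, transpositions counted as half the out-of-order matches); the name `jaro` is hypothetical and the class's actual internals may differ.

```kotlin
// Hypothetical standalone function: plain Jaro similarity, without the Winkler bonus.
fun jaro(first: String, second: String): Double {
    if (first == second) return 1.0
    if (first.isEmpty() || second.isEmpty()) return 0.0

    // Two characters match only if they are equal and at most `window` positions apart.
    val window = maxOf(maxOf(first.length, second.length) / 2 - 1, 0)
    val firstMatched = BooleanArray(first.length)
    val secondMatched = BooleanArray(second.length)

    var matches = 0
    for (i in first.indices) {
        val lo = maxOf(0, i - window)
        val hi = minOf(second.length - 1, i + window)
        for (k in lo..hi) {
            if (!secondMatched[k] && first[i] == second[k]) {
                firstMatched[i] = true
                secondMatched[k] = true
                matches++
                break
            }
        }
    }
    if (matches == 0) return 0.0

    // Walk both match sequences in order; every two out-of-order matches form one transposition.
    var outOfOrder = 0
    var j = 0
    for (i in first.indices) {
        if (!firstMatched[i]) continue
        while (!secondMatched[j]) j++
        if (first[i] != second[j]) outOfOrder++
        j++
    }
    val transpositions = outOfOrder / 2

    val m = matches.toDouble()
    return (m / first.length + m / second.length + (m - transpositions) / m) / 3.0
}
```

For "MARTHA" and "MARHTA" all six characters match with one transposition (T and H), so this sketch yields 17/18, about 0.944. The corresponding distance is one minus that value.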