SorensenDice
Sorensen-Dice coefficient, also known as the Sørensen index, Dice's coefficient, or Czekanowski's binary (non-quantitative) index.
Each string is first converted to a boolean set of k-shingles (sequences of k consecutive characters). The similarity is then computed as 2 * |A ∩ B| / (|A| + |B|), where A and B are the shingle sets of the two strings.
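The computation described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the library's implementation: the function names `shingles` and `sorensen_dice_similarity`, the default `k=2`, and the convention of returning 1.0 when neither string yields a shingle are all assumptions made for the example.

```python
def shingles(s, k):
    """Return the set of all substrings of s that are k characters long."""
    # For strings shorter than k the range is empty, giving an empty set.
    return {s[i:i + k] for i in range(len(s) - k + 1)}


def sorensen_dice_similarity(s1, s2, k=2):
    """Hypothetical helper: 2 * |A ∩ B| / (|A| + |B|) over k-shingle sets."""
    a, b = shingles(s1, k), shingles(s2, k)
    if not a and not b:
        # Assumption: two strings with no shingles are treated as identical.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# "night" -> {ni, ig, gh, ht}, "nacht" -> {na, ac, ch, ht}; one shared shingle.
print(sorensen_dice_similarity("night", "nacht", k=2))  # 2 * 1 / (4 + 4) = 0.25
```

Because shingles are stored in a set, repeated substrings count only once, which matches the "boolean set" wording above.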
Attention: the Sorensen-Dice distance (and similarity) does not satisfy the triangle inequality, so it is not a metric in the strict sense.
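A small counterexample makes the triangle-inequality failure concrete. The sketch below assumes the distance is defined as 1 minus the similarity and uses k = 1 so that the shingles are single characters; `dice_distance` is a hypothetical helper written for this example, not a library function.

```python
def dice_distance(s1, s2, k=1):
    """Hypothetical helper: 1 - Sorensen-Dice similarity over k-shingle sets."""
    a = {s1[i:i + k] for i in range(len(s1) - k + 1)}
    b = {s2[i:i + k] for i in range(len(s2) - k + 1)}
    if not a and not b:
        # Assumption: two strings with no shingles are at distance 0.
        return 0.0
    return 1 - 2 * len(a & b) / (len(a) + len(b))


d_direct = dice_distance("a", "b")                              # no shared shingle: 1.0
d_detour = dice_distance("a", "ab") + dice_distance("ab", "b")  # about 1/3 + 1/3

# The direct distance exceeds the detour through "ab",
# which the triangle inequality would forbid.
print(d_direct > d_detour)  # True
```

In practice this means the distance should not be used with data structures that rely on the triangle inequality for pruning, such as metric trees.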
Parameters
k
Length of each shingle, in characters.