Document Similarity Measures
String Matching
- Edit Distance
- Levenshtein
- Smith-Waterman
- Affine
- Alignment
- Jaro-Winkler
- Soft-TFIDF
- Monge-Elkan
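
As a rough illustration of the edit-distance family above, here is a minimal sketch of Levenshtein distance using the standard dynamic-programming recurrence. The function name `levenshtein` and the unit cost for every operation are assumptions made for this example. Affine-gap and Smith-Waterman variants use different scoring schemes and are not shown.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b (unit costs assumed)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # Row 0: distance from the empty prefix of a to each prefix of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Only two rows of the table are kept at a time, so memory grows with the shorter string rather than with the product of both lengths.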
Distance Matching
- Euclidean
- Manhattan
- Minkowski
- Text Analytics
- Jaccard
- TFIDF
- Cosine Similarity
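
The vector-space measures above can be sketched in a few lines. This is an illustrative example only: it assumes documents have already been turned into equal-length numeric vectors (for instance term counts or TF-IDF weights), and the names `minkowski` and `cosine_similarity` are chosen for this sketch. Manhattan and Euclidean distance are the Minkowski distance with p = 1 and p = 2.

```python
import math


def minkowski(x, y, p=2):
    """Minkowski distance of order p between two equal-length vectors.
    p=1 gives Manhattan distance, p=2 gives Euclidean distance."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


def cosine_similarity(x, y):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


doc_a = [1, 2, 0]  # hypothetical term-count vectors
doc_b = [2, 4, 0]
print(minkowski(doc_a, doc_b, p=1))
print(minkowski(doc_a, doc_b, p=2))
print(cosine_similarity(doc_a, doc_b))
```

Cosine similarity ignores vector length, so a document and a longer document with the same term proportions score as near-identical, while the distance measures still separate them.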
Relational Matching
- Set Based
- Dice
- Tanimoto (Jaccard)
- Common Neighbors
- Adamic-Adar Weighted
- Aggregates
- Average values
- Max/Min values
- Medians
- Frequency (Mode)
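
For the set-based measures listed above, a minimal sketch of the Jaccard (Tanimoto) and Dice coefficients follows. It assumes each record is represented as a collection of hashable items, such as tokens or neighbor IDs. The function names and the convention that two empty sets score 1.0 are assumptions of this example.

```python
def jaccard(a, b):
    """Jaccard (Tanimoto) coefficient: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # assumed convention: two empty sets are identical
    return len(a & b) / len(a | b)


def dice(a, b):
    """Dice coefficient: 2 * |A & B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # same assumed convention as above
    return 2 * len(a & b) / (len(a) + len(b))


left = "data science lab".split()
right = "science lab notes".split()
print(jaccard(left, right))
print(dice(left, right))
```

Both coefficients depend only on the overlap and the set sizes. Dice weights the shared items more heavily, so it is never lower than Jaccard for the same pair.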
Other Matching
- Numeric distance
- Boolean equality
- Fuzzy matching
- Domain specific
- Gazetteers
- Lexical matching
- Named Entities (NER)