Match Score
Introduction
We will be using the following two names in the example:
- Johnny Michael Smith
- John Smith
Calculate the weights
Weight
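The weight of each part of a name is the number of letters in that part divided by the total number of letters in the whole name. Percentages in the tables below are truncated to whole numbers, so they may not add up to exactly 100%.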
Johnny Michael Smith
Name | Letters | Percentage |
---|---|---|
Johnny | 6 | 33% |
Michael | 7 | 38% |
Smith | 5 | 27% |
= | ||
Total | 18 | 100% |
John Smith
Name | Letters | Percentage |
---|---|---|
John | 4 | 44% |
Smith | 5 | 55% |
= | ||
Total | 9 | 100% |
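The weight calculation can be sketched in a few lines of Python. This is a minimal illustration, not the actual implementation: the function name `name_weights` is hypothetical, and truncating to two decimals is an assumption inferred from the values shown in the tables above.

```python
def name_weights(full_name: str) -> list[tuple[str, float]]:
    """Weight each part of a name by its share of the total letter count."""
    parts = full_name.split()
    total = sum(len(part) for part in parts)
    # Assumption: weights are truncated (not rounded) to two decimals,
    # which reproduces 0.33 / 0.38 / 0.27 from the table above.
    return [(part, (len(part) * 100 // total) / 100) for part in parts]


print(name_weights("Johnny Michael Smith"))
print(name_weights("John Smith"))
```

A list of pairs is used instead of a dictionary so that a name containing the same part twice keeps both entries.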
Building the Matrix
A matrix is built that pairs every part of one name with every part of the other. Each part's weight is shown in parentheses.
 | Johnny(0.33) | Michael(0.38) | Smith(0.27) |
---|---|---|---|
John(0.44) | ? | ? | ? |
Smith(0.55) | ? | ? | ? |
Filling in the question marks
The following rules apply:
- Exactly the same: 100%
- Only diacritics differ: 95% (Sofie vs. Sofié)
- The first letter matches a name: 80% (A. vs. Adam)
- Letters must be added, removed, or substituted to turn one word into the other:
  - 1 change -> 85% (Rank vs. Bank)
  - 2 changes -> 70%
  - 3 changes -> 40%
- Sounds similar:
  - Very similar: 30%
  - Slightly similar: 15%
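The rules above can be sketched as a single scoring function. This is an illustrative sketch under several assumptions: the names `strip_diacritics`, `edit_distance`, and `similarity` are hypothetical, the order in which the rules are checked is a guess, case is ignored, and the "sounds similar" rules are left out because the text does not say how phonetic similarity is measured. The number of changes is computed as the Levenshtein distance.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks so that 'Sofié' becomes 'Sofie'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Score two name parts between 0.0 and 1.0 (phonetic rules omitted)."""
    a, b = a.casefold(), b.casefold()
    if a == b:
        return 1.0
    if strip_diacritics(a) == strip_diacritics(b):
        return 0.95
    # Assumption: an initial is a single letter, optionally followed by a dot.
    short, long_ = sorted((a.rstrip("."), b.rstrip(".")), key=len)
    if len(short) == 1 and long_.startswith(short):
        return 0.80
    # More than three changes is treated as no match.
    return {1: 0.85, 2: 0.70, 3: 0.40}.get(edit_distance(a, b), 0.0)


print(similarity("Johnny", "John"))
print(similarity("Rank", "Bank"))
```

The diacritics check runs before the edit-distance check because a single accented letter would otherwise count as one change and score 85% instead of 95%.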
In this example we have two matches:
- Johnny with John (2 changes, 70%)
- Smith with Smith (exact match, 100%)
 | Johnny(0.33) | Michael(0.38) | Smith(0.27) |
---|---|---|---|
John(0.44) | 70% | 0 | 0 |
Smith(0.55) | 0 | 0 | 100% |
The highest possible score is calculated; each name part may be used only once.
First, the first name is compared with the second name, using the weights of the first name.
0.33 * 0.7 + 0.38 * 0 + 0.27 * 1 = 0.501
Then the second name is compared with the first name, using the weights of the second name.
0.44 * 0.7 + 0.55 * 1 = 0.858
The final score is the average of the two results: (0.501 + 0.858) / 2 = 0.6795, which truncates to 67%.
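The final calculation can be sketched as follows. This is a minimal sketch, assuming that "highest possible score" means trying every way of pairing the parts (each part used at most once) and keeping the pairing with the best average. The function name `match_score` and the brute-force search are illustrative choices, not a description of the real implementation.

```python
from itertools import permutations


def match_score(weights_a, weights_b, sim):
    """Best average of the two directed scores over all one-to-one pairings.

    sim[i][j] is the similarity between part i of name A and part j of name B.
    """
    n_a, n_b = len(weights_a), len(weights_b)
    if n_a >= n_b:
        pairings = (zip(perm, range(n_b)) for perm in permutations(range(n_a), n_b))
    else:
        pairings = (zip(range(n_a), perm) for perm in permutations(range(n_b), n_a))

    best = 0.0
    for pairing in pairings:
        pairs = list(pairing)
        # Score from A's point of view, then from B's point of view.
        score_a = sum(weights_a[i] * sim[i][j] for i, j in pairs)
        score_b = sum(weights_b[j] * sim[i][j] for i, j in pairs)
        best = max(best, (score_a + score_b) / 2)
    return best


# Name A: Johnny, Michael, Smith. Name B: John, Smith.
weights_a = [0.33, 0.38, 0.27]
weights_b = [0.44, 0.55]
sim = [
    [0.7, 0.0],  # Johnny vs. John, Smith
    [0.0, 0.0],  # Michael vs. John, Smith
    [0.0, 1.0],  # Smith vs. John, Smith
]
print(round(match_score(weights_a, weights_b, sim), 4))
```

Trying every pairing is fine for names with a handful of parts; for longer inputs an assignment algorithm would scale better.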