Match Score

Introduction

We will be using the following two names in the example:

  • Johnny Michael Smith
  • John Smith

Calculate the weights

Weight

The weight of each part of a name is the number of letters in that part divided by the total number of letters in the whole name. The percentages below are truncated to whole numbers.

Johnny Michael Smith

Name      Letters   Percentage
Johnny    6         33%
Michael   7         38%
Smith     5         27%
Total     18        100%

John Smith

Name      Letters   Percentage
John      4         44%
Smith     5         55%
Total     9         100%
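The weight calculation above can be sketched in Python as follows. This is a minimal illustration, not the product's actual implementation: the function name name_weights and the choice to split a name on whitespace are assumptions made for the example.

```python
def name_weights(full_name: str) -> list[tuple[str, float]]:
    """Return (part, weight) pairs for a full name.

    Assumption: name parts are separated by whitespace, and every
    character of a part counts as a letter.
    """
    parts = full_name.split()
    total_letters = sum(len(part) for part in parts)
    # Weight = letters in this part / letters in the whole name.
    return [(part, len(part) / total_letters) for part in parts]


# Truncate to whole percentages, as in the tables above.
for name in ("Johnny Michael Smith", "John Smith"):
    print([(part, int(weight * 100)) for part, weight in name_weights(name)])
```

For the two example names this prints 33, 38 and 27 for Johnny Michael Smith, and 44 and 55 for John Smith.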

Building the Matrix

A matrix is built that pairs every part of one name with every part of the other. The weight of each part is shown next to it.

               Johnny (0.33)   Michael (0.38)   Smith (0.27)
John (0.44)    ?               ?                ?
Smith (0.55)   ?               ?                ?

Filling in the question marks

The following rules apply:

  • Exactly the same: 100%
  • Only diacritics differ: 95% (Sofie vs. Sofié)
  • An initial matches the first letter of a name: 80% (A. vs. Adam)
  • One word can be turned into the other by adding, removing, or substituting letters:
    • 1 change -> 85% (Rank vs. Bank)
    • 2 changes -> 70%
    • 3 changes -> 40%
  • Sounds similar:
    • Very similar: 30%
    • Slightly similar: 15%
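The rules above can be sketched as a single scoring function. This is only an illustration under several assumptions that the text does not confirm: the comparison ignores case, the rules are checked in the order listed, letter changes are counted with the standard Levenshtein edit distance, and the "sounds similar" rules are left out because the phonetic method is not described. All function names are hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks, e.g. 'Sofié' -> 'Sofie'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def is_initial_of(short: str, long: str) -> bool:
    """True if `short` is a single letter (optionally 'A.') starting `long`."""
    letter = short.rstrip(".")
    return len(letter) == 1 and len(long) > 1 and long.startswith(letter)


def part_similarity(a: str, b: str) -> float:
    """Score two name parts using the rules listed above.

    Assumption: the 'sounds similar' rules (30% / 15%) are not
    implemented here; such pairs fall through to 0.
    """
    a, b = a.casefold(), b.casefold()  # assumption: case is ignored
    if a == b:
        return 1.00
    if strip_diacritics(a) == strip_diacritics(b):
        return 0.95
    if is_initial_of(a, b) or is_initial_of(b, a):
        return 0.80
    return {1: 0.85, 2: 0.70, 3: 0.40}.get(edit_distance(a, b), 0.0)


print(part_similarity("Johnny", "John"))  # two letters removed
print(part_similarity("Rank", "Bank"))    # one letter substituted
```

With these assumptions, Johnny vs. John needs two changes and scores 0.7, while Rank vs. Bank needs one change and scores 0.85.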

In this example we have two matches:

  • Johnny with John (2 changes, 70%)
  • Smith with Smith (exact match, 100%)

               Johnny (0.33)   Michael (0.38)   Smith (0.27)
John (0.44)    70%             0%               0%
Smith (0.55)   0%              0%               100%

The highest possible score is calculated; each name part may be used only once.

The first name is compared with the second name.

0.33 * 0.7 + 0.38 * 0 + 0.27 * 1 = 0.501

Then the second name is compared with the first name.

0.44 * 0.7 + 0.55 * 1 = 0.858

The final score is the average of those two calculations: (0.501 + 0.858) / 2 = 0.6795, which is shown as 67% (truncated, like the percentages in the weight tables).
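The whole calculation can be sketched in Python as follows. This is a hedged illustration rather than the actual implementation: the function name match_score is hypothetical, and it assumes that "highest possible score" means trying every one-to-one pairing of name parts and keeping the pairing with the best averaged score. The weights and match percentages are taken directly from the example matrix.

```python
from itertools import permutations


def match_score(weights_a, weights_b, sim):
    """Best averaged score over all one-to-one pairings of name parts.

    weights_a[i] -- weight of part i of the first name
    weights_b[j] -- weight of part j of the second name
    sim[i][j]    -- match score (0..1) between part i and part j
    """
    n_a, n_b = len(weights_a), len(weights_b)
    # Enumerate pairings so that each part is used at most once.
    if n_a <= n_b:
        pairings = (list(enumerate(p)) for p in permutations(range(n_b), n_a))
    else:
        pairings = ([(i, j) for j, i in enumerate(p)]
                    for p in permutations(range(n_a), n_b))

    best = 0.0
    for pairs in pairings:
        # First name compared with the second, weighted by the first name.
        score_a = sum(weights_a[i] * sim[i][j] for i, j in pairs)
        # Second name compared with the first, weighted by the second name.
        score_b = sum(weights_b[j] * sim[i][j] for i, j in pairs)
        best = max(best, (score_a + score_b) / 2)
    return best


# Johnny Michael Smith vs. John Smith, using the truncated weights.
weights_a = [0.33, 0.38, 0.27]   # Johnny, Michael, Smith
weights_b = [0.44, 0.55]         # John, Smith
sim = [
    [0.7, 0.0],   # Johnny  vs. John, Smith
    [0.0, 0.0],   # Michael vs. John, Smith
    [0.0, 1.0],   # Smith   vs. John, Smith
]
score = match_score(weights_a, weights_b, sim)
print(f"{int(score * 100)}%")
```

Brute-force enumeration is fine for names with a handful of parts. For longer inputs, an assignment algorithm would be the usual replacement.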