Comparing Similarity Between Multiple Strings With A Random Starting Point

I have a bunch of people names that are tied to their respective Identifying Numbers (e.g. Social Security Number/National ID/Passport Number). Due to duplication though, one Ident

Solution 1:

Take the function from https://stackoverflow.com/a/14631287/1082673, as you mentioned, and apply it to every pair of entries in your list. This works well if you don't have too many entries. A list of n entries needs n(n-1)/2 comparisons, so the computation time grows quickly as the list gets longer.

Here is how to generate the pairs for a given list:

import itertools

persons = ['person1', 'person2', 'person3']

for p1, p2 in itertools.combinations(persons, 2):
    print("Compare", p1, "and", p2)
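
To show how the pair loop and a similarity function fit together, here is a minimal sketch. It uses `difflib.SequenceMatcher` from the standard library as a stand-in scorer, which is an assumption and not necessarily the function from the linked answer. The sample names, the helper names `similarity` and `similar_pairs`, and the 0.8 threshold are hypothetical too, so tune them to your data.

```python
import itertools
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a ratio between 0 and 1; 1.0 means the strings match exactly.

    Stand-in scorer (assumption): swap in the function you actually use.
    The comparison is case-insensitive.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similar_pairs(names, threshold=0.8):
    """Return (name1, name2, score) for every pair scoring at or above threshold."""
    matches = []
    for n1, n2 in itertools.combinations(names, 2):
        score = similarity(n1, n2)
        if score >= threshold:
            matches.append((n1, n2, score))
    return matches


# Hypothetical sample data: two spellings of one name plus an unrelated name
persons = ['John Smith', 'Jon Smith', 'Mary Jones']

for n1, n2, score in similar_pairs(persons):
    print("Similar:", n1, "and", n2, "score:", round(score, 2))
```

With this sample list, only the two spellings of the first name clear the threshold. If the names are already grouped by identifying number, running `similar_pairs` on each group separately keeps the number of comparisons much smaller than comparing the whole list at once.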
