I have an array of team names from NCAA, along with statistics associated with them. The school names are often shortened or left out entirely, but there is usually a common element in all variations of the name (like Alabama Crimson Tide vs Crimson Tide). These names are all contained in an array in no particular order. I would like to be able to take all variations of a team name by fuzzy matching them and rename all variants to one name. I'm working in python 2.7 and I have a numpy array with all of the data. Any help would be appreciated, as I have never used fuzzy matching before.
I have considered fuzzy matching through a for-loop, which would (despite being unbelievably slow) compare each element in the column of the array to every other element, but I'm not really sure how to build it.
Currently, my array looks like this:
{Names , info1, info2, info 3}
The array is a few thousand rows long, so I'm trying to make the program as efficient as possible.