I have two data files to merge, and both of them have the key fund_name. The fund names in the two files may be spelled differently, and some rows may have no match at all. I therefore want to do fuzzy matching that returns the best match for each row.
I've read a related thread, "agrep: only return best matches", and I've tried the amatch(string, stringVector, maxDist = Inf) function from the stringdist package, and it worked well.
I saw that amatch() supports many different methods (i.e. string distance metrics), such as "osa", "lv", "dl", and so on. I wonder if I can combine them and return a value only when all of them find the same match. If so, how should I write the algorithm?
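To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes amatch() is called once per method and the results are compared row by row. The helper name consensus_match and the sample fund names are made up for illustration, and I'm not sure this is the best way to do it.

```r
library(stringdist)

# Hypothetical helper: return the index in `table` that every listed method
# agrees on, or NA when the methods disagree or any of them finds no match.
consensus_match <- function(x, table, methods = c("osa", "lv", "dl"), maxDist = Inf) {
  # One column of match indices per method (rows correspond to elements of x).
  idx <- vapply(
    methods,
    function(m) amatch(x, table, method = m, maxDist = maxDist),
    integer(length(x))
  )
  idx <- matrix(idx, nrow = length(x))  # keep matrix shape when length(x) == 1
  # Keep the index only if all methods returned the same non-NA value.
  agree <- apply(idx, 1, function(r) !anyNA(r) && length(unique(r)) == 1)
  ifelse(agree, idx[, 1], NA_integer_)
}

# Made-up sample data, only to show the call.
funds_a <- c("Alpha Growth Fund", "Beta Income Fund")
funds_b <- c("Beta Income Fd", "Alpha Growth Fund Inc", "Gamma Value Fund")

m <- consensus_match(funds_a, funds_b)
data.frame(fund_a = funds_a, fund_b = funds_b[m])
```

With maxDist = Inf every method always returns some nearest neighbour, so agreement alone will not reject rows that have no true counterpart. I assume a finite maxDist would also be needed for those rows to come back as NA.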
For this fuzzy matching work, I care more about the accuracy of each match than about finding a match for every row. Many thanks for your help!