0

I need some suggestions on comparing a few different workbooks. I have a final one that has 4 sheets, Verified contacts, removed contacts, Unverified Contacts, and not on any tabs. The other two that I have, I will need to compare the records with the final one and determine what the final workbook is missing.

The main issue that I have found is I don't have a common denominator. The closest thing I can see is the Name field. The final workbook has A full name field. The other two work books separate the name out. Now I can concatenate them to create a full name, but I may still be missing a middle initial, or someone may be using a nickname. I need a reliable way to compare these workbooks. Any suggestion will be greatly appreciated.

Thank you.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Zakayen
  • 9
  • 1
  • This is a VERY common problem with no good solution. Your data is poor so your solution will be ugly. – JNevill Jan 11 '18 at 18:50
  • you would have to use a programming language like Python and have it do matching with some level of error. See stuff like fuzzywuzzy or levenshtein distance. It won't be perfect but might be close enough. – sniperd Jan 11 '18 at 18:56
  • You could implement a [levenshtein distance](https://stackoverflow.com/questions/560709/levenshtein-distance-in-t-sql) function in SQL Server, which is not totally uncommon, and join on the results of that (result > .90, for instance). It will be slow, but it's a decent start. – JNevill Jan 11 '18 at 18:57
  • I'll look into it, Thanks! – Zakayen Jan 11 '18 at 19:11

0 Answers0