I have a large spreadsheet (upwards of 119K rows) of mismatched data. Column A contains a list of names in full (and occasionally a Trustee or company name), and Column B contains initialized first/middle names with full names (and occasionally Trustee or company names).
I do not currently have a way to compare them short of doing so manually as there are many variable, and am looking for some assistance.
So far I have tried using a VBA script from (How do I fuzzy match just adjacent cells?) to see if it can output the difference (which would allow me to eliminate the cells in Column 2 that had no matching data), but this did not function as intended.
I have also tried various LEFT/RIGHT to trim the names from Column A and then match this to Column B, but this has also not worked due to variance in text in Column A.
Here are some examples of the cells. Note that the names in Column A are not always in alphabetical order, but Column B is:
Example (complete match):
Column A: Column B:
Smith Marcus John J M Smith
Page Binder Book, Quoth Nevermore Raven B B Page, R N Quoth
Orange Apple Banana, Orange Pear Plum A B & P P Orange
Koala Bear, Koala Marsupial Pouch, Koala Gum Tree B, P M & T G Koala
S & P Limited S & P Limited
S & P Limited A D Cumin (S & P Limited)
Example (partial):
Column A: Column B:
Page Binder Book, Quoth Nevermore Raven B B Page
Orange Apple Banana, Orange Pear Plum A B & P P Orange (Fruit 2019 Limited)
Koala Bear, Koala Marsupial Pouch, Goanna Gumtree, Koala Gum Tree B, P M & T G Koala
Example (no match):
Column A: Column B:
Smith Marcus John H J Hyde
Sheppard Garrus Thane B B Page, R N Quoth
What I am hoping to do:
Firstly, I am hoping to correctly mark each cell in Column B as complete/partial/no match with a fill (green/yellow/red). Secondly, for partial matches (whether Column A has extra information, or Column B is missing information) I want to output in Column C the missing information, like so:
Column A: Column B:
Page Binder Book, Quoth Nevermore Raven B B Page
Orange Apple Banana, Orange Pear Plum A B & P P Orange (Fruit 2019 Limited)
Column C:
Quoth Nevermore Raven
(Fruit 2019 Limited)
Is this kind of thing even possible, or are there just too many variations in the way the data is presented in each column?
Very new to both this site and excel functions in general, this is my first task!
Thank you for your assistance/knowledge/time.