I have two tables with an unequal number of columns but in the same order; let's call them old and new. old has more columns than new.
The difference between them is that the spelling of the column names has changed: spaces get replaced by _
and names get shortened, for example from "Item name" to "Item".
Example:
old = ['Item number', 'Item name', 'Item status', 'Stock volume EUR', 'Stock volume USD', 'Location']
new = ['Item_number', 'Item', 'Item_status', 'Stock volume EUR', 'Location']
In reality I have a list about 50 columns long, and the new list has 4 fewer columns.
Currently I have made lists of the column headers and applied the Levenshtein distance divided by string length through a nested loop to find the most similar strings.
My next step, I assume, is to change the nested loop so that it keeps only the best result (the lowest normalized distance) for each item of the outer loop, but I do not know how to go about that or whether that is the right step.
import jellyfish
distance = [jellyfish.levenshtein_distance(x, y) / len(x) for x in old for y in new]
I want to apply the new column headers to the old table and remove the columns that have no match in the new table.
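A rough sketch of what "keep only the best result" could look like, under a few assumptions: the loop runs over the new headers (the shorter list) and picks the closest old header for each, so that any old header left unpicked is the one to drop. The `levenshtein` helper is a plain-Python stand-in for `jellyfish.levenshtein_distance`, and `score`, `mapping` and `dropped` are illustrative names, not part of any library. Dividing by the longer of the two lengths (instead of `len(x)`) is also an assumption, made to keep the score between 0 and 1.

```python
def levenshtein(s, t):
    """Plain-Python edit distance; stand-in for jellyfish.levenshtein_distance."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def score(o, n):
    # Normalized distance: 0.0 means identical, values near 1.0 mean unrelated.
    return levenshtein(o, n) / max(len(o), len(n))


old = ['Item number', 'Item name', 'Item status',
       'Stock volume EUR', 'Stock volume USD', 'Location']
new = ['Item_number', 'Item', 'Item_status', 'Stock volume EUR', 'Location']

# For each new header, keep only the old header with the lowest score.
mapping = {}
for n in new:
    best_old = min(old, key=lambda o: score(o, n))
    mapping[best_old] = n  # old name -> new name

# Old headers that no new header picked have no match and can be removed.
dropped = [o for o in old if o not in mapping]

print(mapping)
print(dropped)
```

Looping over new rather than old matters here: 'Stock volume USD' is very close to 'Stock volume EUR', so a per-old-column search would give it a confident but wrong match. This sketch does not guard against two new headers choosing the same old header, so that case would need a check on the real 50-column list. If the tables happen to be pandas DataFrames (an assumption), something like `df_old[list(mapping)].rename(columns=mapping)` would then keep and rename the matched columns.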