Thank you in advance for the help. I'm in a bit of a pickle with my current problem: I have several data sets in CSV format that all represent the same data, but the column names vary to a certain degree. For example:
- ME_loard_MW
- ME_loard
- ME_load
would be the heading names for three separate data sets. I'm trying to develop a function that parses the column names (using pandas) and renames them to one specific, standard set for any uploaded data set. One approach I've tried is a regex function such as:
import re

def renamefunc(col_name):
    # myregex is the pattern for one family of header variants (defined elsewhere)
    if re.match(myregex, col_name, flags=re.I):
        return "FLOW202"
    else:
        return col_name
I've also considered using the difflib module (get_close_matches), since the column names are distinct enough that the first element of the returned list will be the one I'm targeting. Finally, I've been considering a dictionary-based approach or algorithm, but that is a bit beyond my current ability since I only started programming in April. Any input, feedback, or criticism is more than welcome; my goal is to improve! Attached is an image of the type of data sets I expect to encounter.
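To make the difflib and dictionary ideas concrete, this is roughly what I have in mind. It is only a minimal sketch under some assumptions: the `CANONICAL` dictionary, the `standardize` function, the `cutoff` value of 0.6, and the "Timestamp" header are all hypothetical names and values I made up for illustration, and I lowercase the names so the comparison ignores case.

```python
import difflib

# Hypothetical mapping: one reference spelling per column -> the standard name.
CANONICAL = {"me_load": "FLOW202"}

def standardize(col_name, cutoff=0.6):
    """Return the standard name of the closest reference header,
    or the original name when nothing is similar enough."""
    matches = difflib.get_close_matches(
        col_name.lower(), list(CANONICAL), n=1, cutoff=cutoff
    )
    return CANONICAL[matches[0]] if matches else col_name

# Example headers; "Timestamp" is an invented column that should be left alone.
headers = ["ME_loard_MW", "ME_loard", "ME_load", "Timestamp"]
renamed = [standardize(h) for h in headers]
print(renamed)
```

As far as I understand, the same function could be handed to pandas with `df.rename(columns=standardize)`, so each uploaded data set would be renamed in one call. The cutoff would probably need tuning against the real headers.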