I need a smart way to join together multiple dataframes when various combinations of them are present in my session. It doesn't matter the order in which things are done via the joins.
In this simple example, I have 4 dataframes with every combination of which may or may not exist in my environment.
For example with each comma seperation referring to a dataframe representing df1, df2, df3, df4. My data frames will always have this similar sort of spelling if that helps.
My data frames will always join on the "ID" field in a data frame.
Example scenario setup (remember there are many permutations of this):
df1 = pd.DataFrame({'ID': ['jack', 'mary', 'jill'], 'Score1': [1,2,3]})
df3 = pd.DataFrame({'ID': ['jack', 'mary','katy'], 'Score3': [111,222,333]}
df4 = pd.DataFrame({'ID': ['jack', 'mary','katy'], 'Score4': [1111,2222,3333]}
The next time around I could just have df1 & df4...etc Further example of how data frames could be setup:
- exists, exists, exists,exists
- no, no, no, exists
- no, no, exists,exists
- no, exists, exists, exists
- etc.... I believe it is 4! x 3! combinations? There will always be at least one data frame in which cause no join would be attempted
Solution looking for:
df_final = df1.merge(.....) for all combinations of df#s that may or may not exist