I have n data frames that represent weekly time periods. I'd like to do something like an inner join (based on id and id2), but for appending rows from all n datasets, not additional columns (as they're all identical).
They all look like this
DF1:
id id2 A B C PERIOD
1 50 0.1 0.2 0.3 1
1 100 0.1 0.2 0.3 1
2 2 0.1 0.2 0.3 1
DF2:
id id2 A B C PERIOD
1 50 0.5 0.7 0.9 2
1 100 0.6 0.8 0.9 2
1 105 0.1 0.2 0.3 2
2 2 0.3 0.4 0.5 2
2 3 0.1 0.2 0.3 2
...repeats for DFn
I'd like a data frame that looks like this
id id2 A B C PERIOD
1 50 0.1 0.2 0.3 1
1 50 0.5 0.7 0.9 2
... n
1 100 0.1 0.2 0.3 1
1 100 0.6 0.8 0.9 2
... n
2 2 0.1 0.2 0.3 1
2 2 0.3 0.4 0.5 2
... n
So it discards all combinations of id, id2 that don't appear in ALL of my n datasets. Is there a quick way of doing this?
I was thinking of first going over all n data frames, grabbing sets of pairs id, id2, and doing an intersection of all these sets, and then reduce the data frames by .isin
, and then pd.concat
on the list of reduced data frames. This seems tedious though.