1

df1: | A | B | C |ID | |---|---|---|---| | 1 | 5 | 2 | Y | | 4 | 6 | 4 | Z |

df2: | A | B | C |ID | |---|---|---|---| | 2 | 1 | 2 | Y | | 4 | 6 | 4 | Z |

Merged: | case | A | B | C |ID | |------|---|---|---|---| |before| 1 | 5 | 2 | Y | |before| 4 | 6 | 4 | Z | |after | 2 | 1 | 2 | Y | |after | 4 | 6 | 4 | Z |

desired pivot for column A: |ID |before|after| |- |------|-----| | Y | 1 | 2| | Z | 4 | 4|

I want to use a for loop to create a pivot table for each column in dfs 1 and 2. The rows of these pivots will be the ID, the columns will be 'case'.

I would like to create a new df for each column's pivot table using a for loop.

Later, I will drop the rows in each pivot table where the before and after values are the same (only keeping the rows where they are different).

EMC
  • 13
  • 4
  • Welcome to stack overflow! Please have a look at [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and edit to include your sample input and expected output as text in the body of your question, rather than an image or link, to make a [mcve] so that we can copy and paste. It would also be helpful to see code for what you've tried already based on your own research and what went wrong with your attempts so that we can offer more specific help and answers – G. Anderson Jul 19 '22 at 17:43
  • Pandas DataFrames have a couple methods called "pivot_table()" https://pandas.pydata.org/docs/reference/api/pandas.pivot_table.html and "pivot()" https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot.html#pandas.DataFrame.pivot which may do what you want. Please try out those examples and update your question if you hav trouble modifying them for your use case. – Sarah Messer Jul 19 '22 at 18:30

1 Answers1

0

I hope I've understood you correctly:

df1["case"] = "before"
df2["case"] = "after"

df = pd.concat([df1, df2]).pivot(index="ID", columns="case")

for c in df.columns.get_level_values(0).unique():
    print(df.xs(c, axis=1))
    print("-" * 80)

Prints:

case  after  before
ID                 
Y         2       1
Z         4       4
--------------------------------------------------------------------------------
case  after  before
ID                 
Y         1       5
Z         6       6
--------------------------------------------------------------------------------
case  after  before
ID                 
Y         2       2
Z         4       4
--------------------------------------------------------------------------------

EDIT: To add dataframes into a dictionary:

df1["case"] = "before"
df2["case"] = "after"

df = pd.concat([df1, df2]).pivot(index="ID", columns="case")

all_dataframes = {}
for c in df.columns.get_level_values(0).unique():
    all_dataframes[c] = df.xs(c, axis=1).copy()

for key, dataframe in all_dataframes.items():
    print("Key:", key)
    print("DataFrame:", dataframe)
    print("-" * 80)
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • How would I create new dataframes for those pivots? – EMC Jul 19 '22 at 19:22
  • @EMC `df.xs(c, axis=1)` returns dataframe, so you can add them to list for example. – Andrej Kesely Jul 19 '22 at 19:25
  • I'm new to for loops... could you explain how to do that? I really appreciate your help you have already given! Thank you! – EMC Jul 19 '22 at 19:42
  • @EMC See my edit. – Andrej Kesely Jul 19 '22 at 19:45
  • Thank you I appreciate it - I have a new problem - this works for all columns except the columns which have repeated values. For example, if A were to have the same value for IDs Y and Z, there is an error (KeyError: 'A') – EMC Jul 19 '22 at 20:26
  • I fixed my key error - had to remove duplicate rows anyways for my report I'm building - and I'm really appreciating your super elegant solution to my problem. Thank you again! – EMC Jul 20 '22 at 15:20