I have a DataFrame df which has 3 columns. Each row has a unique combination of values from the first two columns, and the third column is a numerical value attributed to that combination:
>>> df
  col1 col2  col3
0    a    x     1
1    a    y     5
2    a    z     4
3    b    x     3
4    b    y     2
5    b    z     5
6    c    x     7
7    c    y     9
8    c    z     1
So the pair 'a' and 'x' refers specifically to 1, 'c' and 'y' to 9, etc.
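For reproducibility, the example frame above can be rebuilt with a short snippet. This is a minimal sketch that assumes col3 holds plain integers, as the printed output suggests:

```python
import pandas as pd

# Rebuild the example DataFrame shown above: each (col1, col2) pair
# appears exactly once and maps to a single value in col3.
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "col2": ["x", "y", "z"] * 3,
    "col3": [1, 5, 4, 3, 2, 5, 7, 9, 1],
})
```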
What I'm trying to do is convert this into a new DataFrame where the unique values in col2 are the index, the unique values in col1 are the column names, and the values in col3 sit at the intersection of these. So the finished DataFrame df2 looks like this:
>>> df2
   a  b  c
x  1  3  7
y  5  2  9
z  4  5  1
Currently I have a functional solution using nested for loops:
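import pandas as pd

# Empty frame: unique col2 values as the index, unique col1 values as the columns
df2 = pd.DataFrame(index=df['col2'].unique(), columns=df['col1'].unique())
for var1 in df2.columns.values:
    for var2 in df2.index.values:
        # Look up the single col3 value for this (col1, col2) pair
        df2.loc[var2, var1] = df.loc[(df['col1'] == var1) & (df['col2'] == var2), 'col3'].values[0]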
However, this solution is pretty inefficient. Is there a more efficient and neatly organized way to accomplish the same task without manually looping through each cell of the DataFrame that I need to populate?
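One loop-free candidate for this kind of reshaping is pandas' DataFrame.pivot. The following is only a minimal sketch: it rebuilds the example data so it runs on its own, it assumes every (col1, col2) pair is unique as described above (pivot raises an error on duplicate pairs), and the rename_axis call is just there to drop the axis labels so the result prints like the target df2.

```python
import pandas as pd

# Example data as described in the question (assumed integer values).
df = pd.DataFrame({
    "col1": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "col2": ["x", "y", "z"] * 3,
    "col3": [1, 5, 4, 3, 2, 5, 7, 9, 1],
})

# Reshape long -> wide: col2 values become the index, col1 values become
# the columns, and col3 fills the cells at each intersection.
df2 = df.pivot(index="col2", columns="col1", values="col3")

# Remove the "col2"/"col1" axis names so only the labels themselves remain.
df2 = df2.rename_axis(index=None, columns=None)
```

If duplicate (col1, col2) pairs could occur, pivot_table with an explicit aggregation function would be the variant to look at instead, since it needs a rule for combining the duplicates.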