I have a DataFrame df which has 3 columns. Each row has a unique combination of values from the first two columns, and the third column is just a numerical value that is attributed to the unique combination of the first two:

>>> df

   col1  col2  col3
0    a     x     1
1    a     y     5
2    a     z     4
3    b     x     3
4    b     y     2
5    b     z     5
6    c     x     7
7    c     y     9
8    c     z     1

So the pair 'a' and 'x' maps specifically to 1, 'c' and 'y' to 9, etc.

What I'm trying to do is convert this into a new DataFrame where the unique values in col1 become the column names, the unique values in col2 become the index, and the values from col3 sit at the intersections. So the finished DataFrame df2 looks like this:

>>> df2       
     a     b     c
x    1     3     7
y    5     2     9
z    4     5     1

Currently I have a functional solution using nested for loops:

import pandas as pd
df2 = pd.DataFrame(index=df['col2'].unique(), columns=df['col1'].unique())
for var1 in df2.columns:
    for var2 in df2.index:
        # Select the single row matching this (col1, col2) pair
        mask = (df['col1'] == var1) & (df['col2'] == var2)
        df2.loc[var2, var1] = df.loc[mask, 'col3'].iloc[0]

However, this solution is pretty inefficient. Is there a more efficient and neatly organized way to accomplish the same task without manually looping through each cell of the DataFrame that I need to populate?

Pyronix

1 Answer

In [8]: df.pivot_table(index="col1", columns="col2")
Out[8]:
     col3
col2    x  y  z
col1
a       1  5  4
b       3  2  5
c       7  9  1
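
Note that the output above has col1 as the index and a two-level column header (col3 over col2), whereas the question asks for col2 as the index and col1 as the columns. A minimal sketch of one way to get that exact shape, assuming every (col1, col2) pair occurs only once as the question states, is to use pivot with the axes swapped and values named explicitly. The frame construction below simply rebuilds the example data from the question.

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    "col1": list("aaabbbccc"),
    "col2": list("xyzxyzxyz"),
    "col3": [1, 5, 4, 3, 2, 5, 7, 9, 1],
})

# col2 becomes the index, col1 the columns, col3 the cell values.
# pivot raises a ValueError if any (col2, col1) pair is duplicated.
df2 = df.pivot(index="col2", columns="col1", values="col3")

# Optional: clear the axis names so the frame prints like df2 in the question
df2.index.name = None
df2.columns.name = None

print(df2)
```

If duplicate pairs were possible, pivot_table(index="col2", columns="col1", values="col3") would aggregate them instead of raising (the default aggregation is the mean), so which call fits depends on whether silent aggregation is acceptable.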
Randy