I have a dataset like this:

ID Type Value
01 A    $10
01 B    $12
01 C    $14
02 B    $20
02 C    $21
03 B    $11

I want to convert this into:

ID TypeA TypeB TypeC
01 $10   $12   $14
02 $0    $20   $21
03 $0    $11   $0

The only solution I have is a bunch of if statements and loops; I don't have a few-liner. Can anyone help me with this Python (pandas) problem?

Thanks

realkes

2 Answers

Convert the first two columns to a MultiIndex (a two-level index). Unstack the dataframe (convert the second-level row index into a column index). Fill in the gaps with "$0":

d = df.set_index(['ID', 'Type']).unstack().fillna('$0')

Update column names:

d.columns = 'Type' + d.columns.levels[1]

The result:

Type TypeA TypeB TypeC
ID                    
01     $10   $12   $14
02      $0   $20   $21
03      $0   $11    $0

If the column index name 'Type' bothers you, remove it:

d.columns.name = None

Finally, if you want the ID to be a column rather than an index, reset the index:

d.reset_index(inplace=True)

Final result:

   ID TypeA TypeB TypeC
0  01   $10   $12   $14
1  02    $0   $20   $21
2  03    $0   $11    $0
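
For reference, the steps above can be put together into one self-contained sketch. It is a slight variation rather than the exact code above: it selects the 'Value' column before unstacking, so the columns come out single-level and add_prefix can rename them. The variable name wide and the hand-built sample frame are assumptions for illustration only.

```python
import pandas as pd

# Sample data from the question; ID is stored as a string so "01" keeps its leading zero.
df = pd.DataFrame({
    "ID": ["01", "01", "01", "02", "02", "03"],
    "Type": ["A", "B", "C", "B", "C", "B"],
    "Value": ["$10", "$12", "$14", "$20", "$21", "$11"],
})

# Selecting "Value" first gives a Series, so unstack() returns plain columns A, B, C.
wide = df.set_index(["ID", "Type"])["Value"].unstack().fillna("$0")

# Prefix the column labels, drop the leftover column index name, and turn ID back into a column.
wide = wide.add_prefix("Type")
wide.columns.name = None
wide = wide.reset_index()

print(wide)
```

Note that unstack raises an error if the same (ID, Type) pair appears more than once, so this assumes each pair is unique, as it is in the sample data.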
DYZ

You can do this with pandas. Try this:

import pandas as pd

I created a data file for your sample data called psort.txt.

ID,Type,Value
01,A,$10
01,B,$12
01,C,$14
02,B,$20
02,C,$21
03,B,$11

and import it (note that read_csv parses the ID column as integers by default, so the leading zeros are dropped),

df = pd.read_csv('psort.txt', header=0)

   ID Type Value
0   1    A   $10
1   1    B   $12
2   1    C   $14
3   2    B   $20
4   2    C   $21
5   3    B   $11

I then pivot the dataframe,

df = df.pivot(index='ID', columns='Type', values='Value')

Below I fill the NaN values with 0, but I could also fill them with '$0' to match the requested output.

df = df.fillna(0)

print(df)

Type    A    B    C
ID                 
1     $10  $12  $14
2       0  $20  $21
3       0  $11    0
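
If you want the output to match the question exactly (IDs with leading zeros, '$0' in the gaps, and TypeA-style column names), something along these lines should work. This is only a sketch: it reads the sample data from an in-memory string instead of psort.txt, and the names csv_text and wide are made up for the example.

```python
import io
import pandas as pd

# Same contents as psort.txt, held in memory so the example is self-contained.
csv_text = """ID,Type,Value
01,A,$10
01,B,$12
01,C,$14
02,B,$20
02,C,$21
03,B,$11
"""

# dtype={"ID": str} stops read_csv from parsing "01" as the integer 1.
df = pd.read_csv(io.StringIO(csv_text), dtype={"ID": str})

# Pivot to one column per Type and fill missing combinations with "$0".
wide = df.pivot(index="ID", columns="Type", values="Value").fillna("$0")

# Rename A/B/C to TypeA/TypeB/TypeC, drop the "Type" axis label, and make ID a column again.
wide = wide.add_prefix("Type").rename_axis(None, axis=1).reset_index()

print(wide)
```

Like unstack, pivot expects each (ID, Type) pair to occur only once; with duplicates you would need pivot_table and an aggregation function instead.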
merit_2