Suppose I have the following pandas DataFrame:

    Wafer_Id  v1  v2
0          0   9   6
1          0   7   8
2          0   1   5
3          1   6   6
4          1   0   8
5          1   5   0
6          2   8   8
7          2   2   6
8          2   3   5
9          3   5   1
10         3   5   6
11         3   9   8

I want to group it by Wafer_Id so that each wafer becomes a single row, with the successive values of v1 and v2 spread across columns. I would like to get something like:

w
Out[60]: 
   Wafer_Id  v1_1  v1_2  v1_3  v2_1  v2_2  v2_3
0         0     9     7     1     6   ...   ...
1         1     6     0     5     6
2         2     8     2     3     8
3         3     5     5     9     1

I think I can obtain this result with the pivot function, but I am not sure how to do it.

Donbeo

1 Answer

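Possible solution: add a helper column id that numbers the rows within each Wafer_Id, then pivot with Wafer_Id as the index and id as the columns.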

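    import numpy as np
    import pandas as pd

    oes = pd.DataFrame()
    oes['Wafer_Id'] = [0,0,0,1,1,1,2,2,2,3,3,3]
    oes['v1'] = np.random.randint(0, 10, 12)
    oes['v2'] = np.random.randint(0, 10, 12)
    # Position of each row within its wafer (0, 1, 2)
    oes['id'] = [0, 1, 2] * 4

    # One row per wafer, one column per (variable, id) pair
    oes.pivot(index='Wafer_Id', columns='id')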

oes
Out[74]: 
    Wafer_Id  v1  v2  id
0          0   8   7   0
1          0   3   3   1
2          0   8   0   2
3          1   2   5   0
4          1   4   1   1
5          1   8   8   2
6          2   8   6   0
7          2   4   7   1
8          2   4   3   2
9          3   4   6   0
10         3   9   2   1
11         3   7   1   2

oes.pivot(index='Wafer_Id', columns='id')
Out[75]: 
         v1       v2      
id        0  1  2  0  1  2
Wafer_Id                  
0         8  3  8  7  3  0
1         2  4  8  5  1  8
2         8  4  4  6  7  3
3         4  9  7  6  2  1
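
To get the flat v1_1, v1_2, ... column names from the question, the two-level columns produced by pivot can be joined into single strings. The following is a minimal sketch using the data from the question; the names df and wide are only illustrative, and deriving id with groupby(...).cumcount() is a swapped-in alternative to the hard-coded [0, 1, 2] * 4 list, assuming the rows of each wafer are already in the desired order.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "Wafer_Id": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "v1": [9, 7, 1, 6, 0, 5, 8, 2, 3, 5, 5, 9],
    "v2": [6, 8, 5, 6, 8, 0, 8, 6, 5, 1, 6, 8],
})

# Number the rows within each wafer: 1, 2, 3, ...
df["id"] = df.groupby("Wafer_Id").cumcount() + 1

# One row per wafer; columns become (variable, id) pairs
wide = df.pivot(index="Wafer_Id", columns="id")

# Flatten ('v1', 1) -> 'v1_1' and turn Wafer_Id back into a regular column
wide.columns = [f"{name}_{i}" for name, i in wide.columns]
wide = wide.reset_index()
print(wide)
```

cumcount() also copes with wafers that have different numbers of rows; pivot then fills the missing positions with NaN.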
Donbeo
  • Nice solution. See [here](http://stackoverflow.com/questions/14507794/python-pandas-how-to-flatten-a-hierarchical-index-in-columns) how to flatten the columns. – Ami Tavory Jun 11 '15 at 14:04
  • I would like a solution that is more memory efficient. – Donbeo Jun 11 '15 at 18:19
  • Why do you think that it is memory inefficient? – Ami Tavory Jun 11 '15 at 19:52
  • My dataset is quite large: 300,000 rows and 1,700 columns. If I run this code I get a memory error. Ideally I would get the same result in place. – Donbeo Jun 11 '15 at 23:46
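
On the memory concern raised in the comments: one possible alternative is to skip pivot and reshape the underlying NumPy array directly. This is only a sketch under strong assumptions that the thread does not confirm: the rows are sorted by Wafer_Id, every wafer has exactly the same number of rows (k below), and all value columns share one dtype. It is not in place either, and it has not been tried on data of the size mentioned above. The names df, k, value_cols and wide_np are illustrative.

```python
import numpy as np
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "Wafer_Id": [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3],
    "v1": [9, 7, 1, 6, 0, 5, 8, 2, 3, 5, 5, 9],
    "v2": [6, 8, 5, 6, 8, 0, 8, 6, 5, 1, 6, 8],
})

value_cols = ["v1", "v2"]
k = 3                      # assumed: rows per wafer, identical for every wafer
n_wafers = len(df) // k

# (rows, cols) -> (wafer, row-in-wafer, col) -> (wafer, col, row-in-wafer)
# -> (wafer, col * row-in-wafer), i.e. v1_1, v1_2, v1_3, v2_1, ...
values = df[value_cols].to_numpy()
wide_values = (
    values.reshape(n_wafers, k, len(value_cols))
          .transpose(0, 2, 1)
          .reshape(n_wafers, -1)
)

columns = [f"{c}_{i}" for c in value_cols for i in range(1, k + 1)]
wide_np = pd.DataFrame(wide_values, columns=columns)

# Take the first Wafer_Id of each block of k rows
wide_np.insert(0, "Wafer_Id", df["Wafer_Id"].to_numpy()[::k])
print(wide_np)
```

If the wafers do not all have the same number of rows, the reshape raises an error or mixes wafers together, so the pivot approach is the safer choice in that case.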