numpy array to pandas pivot table

Question

I'm new to pandas and am trying to create a pivot table from a numpy array.

The variable npArray is just that, a numpy array:

>>> npArray
array([(1, 3), (4, 3), (1, 3), ..., (1, 4), (1, 12), (1, 12)], 
      dtype=[('MATERIAL', '<i4'), ('DIVISION', '<i4')])

I'd like to count occurrences of each material by division, with divisions as rows and materials as columns.

What I have:

#numpy array to pandas data frame
pandaDf = pandas.DataFrame (npArray)

#pivot table - guessing here
pandas.pivot_table (pandaDf, index = "DIVISION", 
                    columns = "MATERIAL", 
                    aggfunc = numpy.sum) #<--- want count, not sum

Results:

Empty DataFrame
Columns: []
Index: []

Sample of pandaDf:

>>> print pandaDf
         MATERIAL  DIVISION
0               1         3
1               4         3
2               1         3
3               1         3
4               1         3
5               1         3
6               1         3
7               1         3
8               1         3
9               1         3
10              1         3
11              1         3
12              4         3
...           ...       ...
3845291         1         4
3845292         1         4
3845293         1         4
3845294         1        12
3845295         1        12

[3845296 rows x 2 columns]

Any help would be appreciated.

score 2 · Accepted Answer · answered Jul 12 '18 at 23:37

Something similar has already been asked: https://stackoverflow.com/a/12862196/9754169

Bottom line: just use aggfunc=lambda x: len(x), which counts the entries in each group instead of summing them.
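
A minimal sketch of that suggestion, using a small made-up array in place of the asker's data. One assumption worth flagging: the frame holds only the two key columns, so the sketch adds a hypothetical helper column ("N") for len to work on. Without some value column, pivot_table has nothing left to aggregate, which is most likely why the original call came back empty.

```python
import numpy as np
import pandas as pd

# Made-up sample standing in for the asker's npArray (same dtype layout).
np_array = np.array(
    [(1, 3), (4, 3), (1, 3), (1, 4), (1, 12), (1, 12)],
    dtype=[("MATERIAL", "<i4"), ("DIVISION", "<i4")],
)
df = pd.DataFrame(np_array)

# Hypothetical helper column (not in the original data): gives
# pivot_table a value column to aggregate.
df["N"] = 1

counts = pd.pivot_table(
    df,
    values="N",
    index="DIVISION",            # one row per division
    columns="MATERIAL",          # one column per material
    aggfunc=lambda x: len(x),    # count entries instead of summing them
    fill_value=0,                # pairs that never occur show 0, not NaN
)
print(counts)
```

The fill_value=0 argument is optional; leaving it out should give NaN for division/material pairs that do not occur.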

Yuca

score 1 · Answer 2 · answered Jul 12 '18 at 23:47

@GerardoFlores is correct. Another solution I found was adding a column for frequency.

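import pandas

#numpy array to pandas data frame
pandaDf = pandas.DataFrame (npArray)

#add a constant column so the pivot table has a value to count
print ("adding frequency column")
pandaDf ["FREQ"] = 1

#pivot table: count FREQ entries for each DIVISION/MATERIAL pair
pivot = pandas.pivot_table (pandaDf, values = "FREQ", 
                            index = "DIVISION", columns = "MATERIAL", 
                            aggfunc = "count")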
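
For comparison, the same frequency table can probably be built without a helper column using pandas.crosstab. This is a different technique from the one in either answer, shown here only as a sketch on a small made-up array, not on the asker's actual data.

```python
import numpy as np
import pandas as pd

# Made-up sample standing in for the asker's npArray (same dtype layout).
np_array = np.array(
    [(1, 3), (4, 3), (1, 3), (1, 4), (1, 12), (1, 12)],
    dtype=[("MATERIAL", "<i4"), ("DIVISION", "<i4")],
)
df = pd.DataFrame(np_array)

# crosstab counts co-occurrences by default: first argument becomes
# the rows, second argument becomes the columns.
table = pd.crosstab(df["DIVISION"], df["MATERIAL"])
print(table)
```

Pairs that never occur come out as 0 here, so no fill value is needed.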
