I'm new to pandas and am trying to create a pivot table from a numpy array.
variable npArray
is just that, a numpy array:
>>> npArray
array([(1, 3), (4, 3), (1, 3), ..., (1, 4), (1, 12), (1, 12)],
dtype=[('MATERIAL', '<i4'), ('DIVISION', '<i4')])
I'd to count occurrences of each material by division, with division being rows and material being columns. Example:
What I have:
#numpy array to pandas data frame
pandaDf = pandas.DataFrame (npArray)
#pivot table - guessing here
pandas.pivot_table (pandaDf, index = "DIVISION",
columns = "MATERIAL",
aggfunc = numpy.sum) #<--- want count, not sum
Results:
Empty DataFrame
Columns: []
Index: []
Sample of pandaDf
:
>>> print pandaDf
MATERIAL DIVISION
0 1 3
1 4 3
2 1 3
3 1 3
4 1 3
5 1 3
6 1 3
7 1 3
8 1 3
9 1 3
10 1 3
11 1 3
12 4 3
... ... ...
3845291 1 4
3845292 1 4
3845293 1 4
3845294 1 12
3845295 1 12
[3845296 rows x 2 columns]
Any help would be appreciated.