I have a dataset that looks like the following:
customer products Sales
1 a 10
1 a 10
2 b 20
3 c 30
How can I reshape it into the layout below using Python and pandas? I've tried the pivot tools, but since I have duplicated CUSTOMER ID values it isn't working.
             Products
customerID   a    b    c
1            10
1            10
2                 20
3                      30
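To make the toy case reproducible, here is a minimal sketch that builds the small table above and produces the wide layout I'm after. The column names come from the toy table. Pivoting on the existing row index instead of on the customer ID is only an assumption about one way to avoid the duplicate-ID problem, not necessarily the right approach for the real data.

```python
import pandas as pd

# Toy data from the first table; customer 1 appears twice on purpose.
df = pd.DataFrame({
    "customer": [1, 1, 2, 3],
    "products": ["a", "a", "b", "c"],
    "Sales": [10, 10, 20, 30],
})

# Pivot on the (unique) row index rather than on "customer", so the
# duplicated customer IDs do not collide; one output row per input row.
wide = df.pivot(columns="products", values="Sales")

# Put the customer ID back as the first column.
wide.insert(0, "customer", df["customer"])

print(wide)
```

Cells with no sale come out as NaN here, so the product columns are floats rather than ints.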
Here is a sample of my real data as a dict:
{' update': {209: 'Originator', 211: 'Originator', 212: 'Originator', 213: 'Originator', 214: 'Originator'},
 'CUSTOMER ID': {209: 1000368, 211: 1000368, 212: 1000968, 213: 1000968, 214: 1000968},
 'NET SALES VALUE SANOFI': {209: 426881.0, 211: 332103.0, 212: 882666.0, 213: 882666.0, 214: 294222.0},
 'PRODUCT FAMILY': {209: 'APROVEL', 211: 'APROVEL', 212: 'APROVEL', 213: 'APROVEL', 214: 'APROVEL'},
 'CHANNEL DEFINITION': {209: 'PHARMACY', 211: 'PHARMACY', 212: 'PHARMACY', 213: 'PHARMACY', 214: 'PHARMACY'},
 'index': {209: 209, 211: 211, 212: 212, 213: 213, 214: 214}}
CUSTOMER ID             1228675 non-null int64
DISTRIBUTOR ID          1228675 non-null float64
PRODUCT FAMILY          1228675 non-null object
GROSS SALES QUANTITY    1228675 non-null int64
GROSS SALES VALUE       1228675 non-null int64
NET SALES VALUE         1228675 non-null int64
DISCOUNT VALUES         1228675 non-null int64
CHANNEL DEFINITION      1228675 non-null object
What I also tried:
ONLY_PHARMA.pivot_table(values="NET SALES VALUE", index=["CUSTOMER ID"], columns="PRODUCT FAMILY").reset_index()
What I'm getting now is a mix of floats and ints. Why?
ID       A    B             C
1000167  NaN  2.380122e+05  244767.466667
Or I get:
ValueError: negative dimensions are not allowed
I've also tried the following, which again returns a mix of floats and ints:
pvt = pd.pivot_table(ONLY_PHARMA.reset_index(), index=['CUSTOMER ID'],
                     columns='PRODUCT FAMILY', values='NET SALES VALUE',
                     fill_value='').reset_index()
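For reference, the float behaviour can be reproduced on the toy data. As far as I can tell, pivot_table averages duplicate rows by default (aggfunc is the mean) and fills missing customer/product combinations with NaN, and both of those give float columns. The sketch below assumes that summing the duplicates and filling gaps with 0 would be acceptable, which may not match what I actually want for the real data.

```python
import pandas as pd

# Same toy data as the first table in the question.
df = pd.DataFrame({
    "customer": [1, 1, 2, 3],
    "products": ["a", "a", "b", "c"],
    "Sales": [10, 10, 20, 30],
})

# Default call: duplicates are averaged and gaps become NaN,
# so every product column ends up as float64.
default = df.pivot_table(index="customer", columns="products", values="Sales")
print(default.dtypes)

# Assumption: sum the duplicates and fill gaps with 0 (a number, not '').
# With an integer fill value the columns can stay integer-typed.
summed = df.pivot_table(index="customer", columns="products",
                        values="Sales", aggfunc="sum", fill_value=0)
print(summed.reset_index())
```

Using fill_value='' instead puts strings next to numbers in the same column, which is presumably why that attempt still shows mixed values.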