0

I am trying to make a pivot table using pandas

ns = pd.read_csv('latest_sg.csv')
sfdc = pd.read_csv('report1.csv')

Matching the column to lookup

ns.rename(columns={'accountid':'Account ID'},inplace = True)

A vlookup

lookup = pd.merge(ns,sfdc[['Account ID','Account Status']],on=['Account ID'],how='left')
lookup = lookup.dropna(axis=0, subset=['Account Status'])

The pivot table

pivot = lookup.pivot_table(index='Account ID',values='gb',aggfunc='sum',fill_value=0)

The problem is that with print(pivot), the output does not return the sum of the column 'gb' values but concatenates the values.

Ex. if an Account ID say 'qwer' has 3 rows with values in 'gb' as 3,4 and 5 the output should be

qwer               12 

but the output is

qwer               345

Please help!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459

1 Answers1

1

Your column is most likely an object and an integer, the strings being concatenated is the expected behaviour, use

df[col] = pd.to_numeric(df[col])

and print(df.dtypes) to check your column datatypes.

if you need to ignore strings in your numerical column you can add a further argument to cast non numeric values into nan

pd.to_numeric(df[col], errors='coerce')

to further illustrate this

1 + 1 
>> 2
'1' + '1'
>> '11'
Umar.H
  • 22,559
  • 7
  • 39
  • 74
  • Thanks a ton for the help, you are a savior! –  Apr 08 '20 at 22:55
  • @Kimberly_Jennifer no problem, read the links above in the comments and accept this answer if it answered your question (green tick under the arrows) best of luck – Umar.H Apr 08 '20 at 22:56