0

As the title mentions,

diag_code = df.pivot_table('PayabletoProvider',rows='DiagnosisCode',aggfunc=sum)

After applying the pivot function to my df, I am returned with data that dont make sense:

DiagnosisCode ------ PayabletoProvider

002 -------------------- NaN

003 -------------------- NaN

004 -------------------- 1824.70143.80220.001640.2540.00282.00946.31250...

005 -------------------- NaN

006 -------------------- 3324.95432.482400.65

Dont understand why this happens, as results in Excel seem to be ok and sum accordingly.

Would appreciate if somebody can help me fix this problem.

BernardL
  • 5,162
  • 7
  • 28
  • 47
  • What are the data types? If you have mixed types that could cause the problem. For example, you can't sum a mix of strings and floats in pandas but Excel would silently drop the string value and sum the floats. – jmz Mar 03 '14 at 09:11
  • @rauparaha I used Excel to check whether IsNumber and all True. Might it be different number types that will also cause this problem? Is there anyway I can replicate the function like Excel where it drops string types? I have started with a small set of data, but I am using Pandas because the data size will span to a couple million where Excel will just fail. – BernardL Mar 03 '14 at 09:14
  • I don't know how you're importing but a mistake in eg. a CSV import where you don't specify all NA markers can cause incorrect detection of types. Use `df.info()` to check that your columns are the correct type and not `object`. Dropping strings without identifying the root of the problem should be a last resort but [look here](http://stackoverflow.com/questions/12725417/drop-non-numeric-columns-from-a-pandas-dataframe) for how. – jmz Mar 03 '14 at 09:21
  • @rauparaha I guess then to avoid, the first step in importing is to ensure all data types are correctly declared and all NA markers are specified? Thanks for your help. – BernardL Mar 03 '14 at 09:27

1 Answers1

0

Old question but the answer might be that aggfunc expects "np.sum" instead of sum.

Pelican
  • 193
  • 2
  • 17