2

Using the code below, the results should show 0 for Dollar_Different. It works on all other rows except for the 3 below. Any ideas why? I've manually checked in Excel and my result is 0.

df['Dollar_Different'] = df['R_Revenue'] - df['S_Revenue']

Current Results - Wrong

Mark
  • 177
  • 3
  • 12
  • 3
    Those numbers probably have some truncated decimals etc. Plus those differences are technically 0 by most computer standards. Maybe this will give some context https://docs.python.org/3.8/tutorial/floatingpoint.html – Buckeye14Guy Nov 13 '19 at 14:22
  • Thanks. I will look into correcting the decimals and see if that makes a difference. – Mark Nov 13 '19 at 14:26

3 Answers3

1

Presumably you are using floating-point types (which is the default). Floating point cannot represent every real number exactly. Probably, the numbers in the two columns are actually slightly different, but you are displaying them with only 2 decimal digits.

In applications that deal with financial figures, floating-point types are avoided, for exactly the reason you see here. There are other datatypes designed for properly representing decimal quantities with accuracy. But IIRC pandas does not support any of these.

A simple workaround that you could use would be to represent the revenue quantities as integer amounts: either a single integer field representing the total quantity in cents, or two fields, one for the dollars component and another for the cents component.

Dave Costa
  • 47,262
  • 8
  • 56
  • 72
0

Are you sure that R_Revenue and S_Revenue are exactly these values and are not truncated to 2 decimals? You can try first to round using https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.round.html and then subtract again, see if that yields 0.

drakedog
  • 307
  • 4
  • 11
0

I changed the source data types to integers and everything matches now.

Mark
  • 177
  • 3
  • 12