0

I am trying to do a pivot in python using pandas pivot_table. When I am aggregating a column with huge(more than 10 digits with 3-4 numbers beyond decimals point) floating-point numbers using both "np.sum" and just "sum", getting wrong results. Below sharing an example.

data = pd.DataFrame({"store":["A","B","C","D"],
                      "sales":[11046021.3675,9222589.4978,3851017.2855,8284985.4983],
                      "place":["P","P","Q","Q"]})

code1 = pd.pivot_table(data,index="place",aggfunc={"sales":"sum"})

code2 = pd.pivot_table(data,index="place",aggfunc={"sales":np.sum})

Ideally, we should get sum of "P" as 20268610.8653 but with the above codes, it comes out 20268600. Does anyone know what is wrong here and can anyone please guide me on how to resolve the above discrepancy?

Prasad Patil
  • 45
  • 1
  • 8

2 Answers2

0

I think you can just try pip3 install --upgrade pandas to upgrade the version and run the code. Your code is fine. It might be a version issue

Shubham Periwal
  • 2,198
  • 2
  • 8
  • 26
  • 1
    I have upgraded to pandas "1.2.4" but the issue still persists. – Prasad Patil May 01 '21 at 11:24
  • https://stackoverflow.com/questions/3730019/why-not-use-double-or-float-to-represent-currency this must answer it for you. Create a big decimal data type and increase your precision levels – Shreyas Moolya May 01 '21 at 19:07
0

So there is a loophole here, I updated to the latest python & spyder version I get the correct answer when "I am printing the variable" not when I am seeing the answer in variable explorer. In variable explorer, the value is in exponentials and when I do copy-paste it in excel I find the discrepancy.

Prasad Patil
  • 45
  • 1
  • 8
  • 1
    I highly suggest saving and loading the data via files (e.g., `to_csv`). Copy-pasting relies on display formatting which is never totally precise. Even without the sci notation issue, you would lose raw float precision by copy-pasting. – tdy May 06 '21 at 03:33
  • 1
    Point taken, I was just checking at each step whether I was getting the correct number or not on variable explorer and there I came across this. – Prasad Patil May 06 '21 at 07:12