
What am I missing? I tried appending .round(3) to the end of the API call, but it doesn't work, and it also doesn't work as a separate call. The data type of every column is numpy.float32.

>>> summary_data = api._get_data(units=list(units.id),
                             downsample=downsample,
                             table='summary_tb',
                             db=db).astype(np.float32)
>>> summary_data.head()


     id  asset_id  cycle   hs     alt     Mach        TRA          T2
0  10.0       1.0    1.0  1.0  3081.0  0.37945  70.399887  522.302124
1  20.0       1.0    1.0  1.0  3153.0  0.38449  70.575668  522.428162
2  30.0       1.0    1.0  1.0  3229.0  0.39079  70.575668  522.645020
3  40.0       1.0    1.0  1.0  3305.0  0.39438  70.575668  522.651184
4  50.0       1.0    1.0  1.0  3393.0  0.39690  70.663559  522.530090

>>> summary_data = summary_data.round(3)
>>> summary_data.head()

     id  asset_id  cycle   hs     alt   Mach        TRA          T2
0  10.0       1.0    1.0  1.0  3081.0  0.379  70.400002  522.302002
1  20.0       1.0    1.0  1.0  3153.0  0.384  70.575996  522.427979
2  30.0       1.0    1.0  1.0  3229.0  0.391  70.575996  522.645020
3  40.0       1.0    1.0  1.0  3305.0  0.394  70.575996  522.651001
4  50.0       1.0    1.0  1.0  3393.0  0.397  70.664001  522.530029


>>> print(type(summary_data))

pandas.core.frame.DataFrame

>>> print([type(summary_data[col][0]) for col in summary_data.columns])

[numpy.float32,
 numpy.float32,
 numpy.float32,
 numpy.float32,
 numpy.float32,
 numpy.float32,
 numpy.float32,
 numpy.float32]

It does in fact look like some form of rounding is taking place, but something weird is happening. Thanks in advance.

EDIT

The point of this is to use 32-bit floating-point numbers, not 64-bit. I have since used pd.set_option('precision', 3), but according to the documentation this only affects the display, not the underlying value. As mentioned in a comment below, I am trying to minimize the number of atomic operations. Calculations on 70.575996 vs 70.57600 are more expensive, and this is the issue I am trying to tackle. Thanks in advance.

darrahts

1 Answer


Hmm, this might be a floating-point issue. You could change the dtype to float instead of np.float32:

>>> summary_data.astype(float).round(3)
     id  asset_id  cycle   hs     alt   Mach     TRA       T2
0  10.0       1.0    1.0  1.0  3081.0  0.379  70.400  522.302
1  20.0       1.0    1.0  1.0  3153.0  0.384  70.576  522.428
2  30.0       1.0    1.0  1.0  3229.0  0.391  70.576  522.645
3  40.0       1.0    1.0  1.0  3305.0  0.394  70.576  522.651
4  50.0       1.0    1.0  1.0  3393.0  0.397  70.664  522.530
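
One caveat with this workaround: `float` is a 64-bit type, so every column takes twice the memory. A minimal sketch of the difference, assuming a small made-up frame (`df32` is a hypothetical stand-in for `summary_data`, with a few values copied from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for summary_data, built from values in the question.
df32 = pd.DataFrame(
    {"TRA": [70.399887, 70.575668], "T2": [522.302124, 522.428162]},
    dtype=np.float32,
)

# Same workaround as above: widen to 64-bit floats, then round.
df64 = df32.astype(float).round(3)

# Bytes used by the column data only (index excluded).
bytes32 = int(df32.memory_usage(index=False).sum())
bytes64 = int(df64.memory_usage(index=False).sum())
print(bytes32, bytes64)  # 16 32
```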

If you change it back to np.float32 afterwards, it re-exhibits the issue:

>>> summary_data.astype(float).round(3).astype(np.float32)
     id  asset_id  cycle   hs     alt   Mach        TRA          T2
0  10.0       1.0    1.0  1.0  3081.0  0.379  70.400002  522.302002
1  20.0       1.0    1.0  1.0  3153.0  0.384  70.575996  522.427979
2  30.0       1.0    1.0  1.0  3229.0  0.391  70.575996  522.645020
3  40.0       1.0    1.0  1.0  3305.0  0.394  70.575996  522.651001
4  50.0       1.0    1.0  1.0  3393.0  0.397  70.664001  522.530029
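
As far as I can tell, the rounding itself is working in both cases. The leftover digits appear because a value like 70.576 has no exact binary representation, so a float32 column stores the closest value it can hold. Here is a small sketch to check that, using two values from the question (the names `values`, `rounded` and `targets` are just for illustration):

```python
import numpy as np

# Two float32 values taken from the question's TRA and T2 columns.
values = np.array([70.575668, 522.428162], dtype=np.float32)
rounded = values.round(3)  # the result is still float32

# The float32 values closest to the intended 3-decimal results.
targets = np.array([70.576, 522.428], dtype=np.float32)

# If rounding worked, the rounded values match those closest float32 values.
print(rounded.dtype, np.array_equal(rounded, targets))

# The stored value is not exactly 70.576, because 70.576 is not representable.
print(float(rounded[0]) == 70.576)

# Gap between neighbouring float32 values near 70.576 (2**-17).
print(np.spacing(np.float32(70.576)))
```

In other words, 70.575996 and 70.576 refer to the same 32-bit value here. There is no shorter float32 to store, and the extra digits only show up when the value is printed.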
  • Thanks for your answer. From what I know, `float` is a c double type and uses 64 bytes of memory, and I'm trying to use 32 bytes. It looks like the second link provided above by the moderator "seems" to solve my problem, but I have further questions regarding computation expense, which is what I am trying to reduce. Does 70.575996 result in more atomic operations than 70.576? Yes, it does. But according to pandas, `pd.set_option('precision', 3)` only affects the display, not the underlying value. So my question it would seem is still valid, in a sense. – darrahts Dec 05 '21 at 16:17
  • Okay, sounds interesting. If your question _is_ unique, then clarify that in the question body, and we can probably get it reopened. (By the way, Henry Ecker is not a moderator - he's just a normal user who has the gold Python badge, which allows him to close questions with the Python tag as duplicates of others with that tag, without needing other users' approval, because he's assumed to be experienced enough :) –  Dec 05 '21 at 16:19
  • bits not bytes ^^ – darrahts Dec 05 '21 at 16:23
  • Looks like the question was opened back up, thanks! I'm coming back to this now in my project and remembered why I needed it. Any updates? – darrahts Jan 27 '22 at 14:16