Python modulo on np array

Question

Another question, Python modulo on floats, explains why Python's modulo operator has trouble with floats (I also understand that computer numbers are not exact numbers, as covered in this essay, especially for multiples of 3). However, I am having a similar but different problem with modulo, and the accepted answer there does not work for me.

Here is a reproducible example in Python 3.7.1 on a MacBook Pro:

import numpy as np
x = np.array([2.012, 2.012 *2, 2.012 *3, 2.012 *4, 2.012 *5, 2.012 *6])
x % 2.012
# array([0.0000000e+00, 0.0000000e+00, 2.0120000e+00, 0.0000000e+00,
#        4.4408921e-16, 2.0120000e+00])

If I try the accepted answer, I still get the same problem:

from decimal import Decimal
Decimal(3*2.012) % Decimal(2.012)
# Decimal('2.011999999999999566568931186')

My specific question: How do I calculate the modulo and get zero for series such as my example?

Edited based upon feedback:

As a specific example, how do I apply the modulo operator if my data looks like this (the specific values will be read in from actual data):

from decimal import Decimal as D
## x is read in from real data
x = np.array([2.012, 4.024, 6.036, 8.048, 1, 2, 3, 1.2, 1.3, 1.4, 1, 5, 6, 2.012, 2.012, 2.012])
frequency = 2.012  # the divisor from the first example
[D(xi) % D(frequency) for xi in x]
#[Decimal('0E-51'), Decimal('0E-51'), Decimal('2.011999999999999566568931186'), Decimal('0E-51'), Decimal('1.000000000000000000000000000'), Decimal('2.000000000000000000000000000'), Decimal('0.9879999999999999893418589636'), Decimal('1.199999999999999955591079015'), Decimal('1.300000000000000044408920985'), Decimal('1.399999999999999911182158030'), Decimal('1.000000000000000000000000000'), Decimal('0.9759999999999999786837179272'), Decimal('1.975999999999999978683717927'), Decimal('0E-51'), Decimal('0E-51'), Decimal('0E-51')]

Are the initial values provided as floating-point numbers or just as text in the actual data? You need to consider converting the values to decimals from the very beginning. — GZ0, Aug 09 '19 at 15:44
They'll be read in with Pandas or numpy. I haven't gotten to that part of the code yet. — Richard Erickson, Aug 09 '19 at 15:47
I don't think either of them has decimals as a built-in data type. In that case you might need to consider either (1) using an integer representation, e.g. by multiplying the values by 1000; or (2) keeping the floating-point representation and rounding the modulo results. — GZ0, Aug 09 '19 at 15:55

unutbu · Answer 1 · 2019-08-09T16:48:48.730

In [6]: from decimal import Decimal as D
In [7]: D('3')*D('2.012') % D('2.012')
Out[7]: Decimal('0.000')

Python evaluates arguments before passing them to functions (or callables such as Decimal). In Decimal(3*2.012), the product 3*2.012 is therefore computed as a binary float first, and Decimal only ever sees that already-rounded value. To preserve the decimal value, you must avoid evaluating the float 3*2.012 at all. For this reason, always pass string representations of decimals to Decimal rather than floats.
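As a minimal sketch of that difference (the names from_float and from_str are only for illustration), the same modulo can be computed once from float arguments and once from string arguments:

```python
from decimal import Decimal as D

# Float arguments: 3 * 2.012 is evaluated in binary floating point first,
# so Decimal receives the already-rounded float, not the decimal 6.036.
from_float = D(3 * 2.012) % D(2.012)

# String arguments: the decimal digits are preserved exactly.
from_str = D('3') * D('2.012') % D('2.012')

print(from_float)  # a nonzero remainder, as in the question
print(from_str)    # 0.000
```

Only the string-based version yields an exact zero, because no float is ever created along the way.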

If you have a text file containing

2.012, 4.024, 6.036, 8.048, 1, 2, 3, 1.2, 1.3, 1.4, 1, 5, 6, 2.012, 2.012, 2.012

then load the values as strings and then convert them to Decimal so as to avoid ever representing them as floats.

import csv
from decimal import Decimal as D
freq = D('2.012')
with open('data', 'r') as f:
    reader = csv.reader(f)
    X = [D(item) for item in next(reader)]
    print([xi % freq for xi in X])

yields

[Decimal('0.000'), Decimal('0.000'), Decimal('0.000'), Decimal('0.000'), Decimal('1.000'), Decimal('2.000'), Decimal('0.988'), Decimal('1.200'), Decimal('1.300'), Decimal('1.400'), Decimal('1.000'), Decimal('0.976'), Decimal('1.976'), Decimal('0.000'), Decimal('0.000'), Decimal('0.000')]

Since Decimal is not a native NumPy/Pandas dtype, storing Decimals in a NumPy array or Pandas Series would require the generic object dtype. NumPy and Pandas do not offer any vectorized operations for Decimals, so there would be no performance gain from using NumPy/Pandas here.
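If vectorized NumPy arithmetic is needed, one possible alternative is the integer representation GZ0 suggested in the comments on the question. This is a different technique from the Decimal approach above, not part of it. The sketch below assumes the data has at most three decimal places; the names scale, x_int and freq_int are hypothetical.

```python
import numpy as np

x = np.array([2.012, 4.024, 6.036, 8.048, 1, 2, 3, 1.2, 1.3, 1.4,
              1, 5, 6, 2.012, 2.012, 2.012])

# Assumption: values have at most three decimal places, so scaling
# by 1000 and rounding recovers exact integers.
scale = 1000
x_int = np.round(x * scale).astype(np.int64)
freq_int = round(2.012 * scale)  # 2012

# Integer modulo is exact and fully vectorized.
remainder = x_int % freq_int
print(remainder)
# [   0    0    0    0 1000 2000  988 1200 1300 1400 1000  976 1976    0    0    0]

# Scale back to the original units if floats are acceptable for display.
print(remainder / scale)
```

The integer remainders agree with the Decimal results above (for example 988 for 0.988). The approach breaks down if the data carries more decimal places than the chosen scale accounts for.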

Thank you. Your answer helps me understand my problem better. I'm about to edit my question based upon your answer — Richard Erickson, Aug 09 '19 at 15:38
