With regard to efficiency, how can we create a large NumPy array whose values are floats within a specific range?

For example, for a 1-D NumPy array of fixed size whose values lie between 0 and 200,000,000.00 (i.e. values in [0, 200,000,000.00]), I can create the array using the smallest data type for floats (float16) and then validate any new value (from user input) before inserting it into the array:

import numpy as np

a = np.empty(shape=(1000,), dtype=np.float16)
pos = 0

new_value = input('Enter new value: ')

# validate
new_value = round(new_value, 2)
if new_value in np.arange(0.00, 200000000.00, 0.01):
    # fill in new value
    a[pos] = new_value
    pos = pos + 1

The question is: can we enforce the validity of new_value (in terms of the already-known minimum/maximum values and number of decimals) based on the dtype of the array?

In other words, since we know the range and the number of decimals at the time of creating the array, does this give us any opportunity to insert valid values into the array more efficiently?

Yannis
  • The `new_value` will be cast to the datatype of `a`, i.e. `float16`; you just need to make sure it is in your range. If you want to control how the rounding is done, it might be useful to handle that in a separate function. You don't need to create the array and check whether your value is in it; a simple range check with `if` should do it: `if 0 <= value < 2000000: #append... ` – 53RT Feb 19 '20 at 06:51
  • Plus a check that the value is compatible with the step size (something like `new_value % step_size == 0`), but be aware of the problems with using [modulo on floats](https://stackoverflow.com/questions/14763722/python-modulo-on-floats), [or floating point math more generally](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) – scleronomic Feb 19 '20 at 09:49
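
To illustrate the step-size check from the comment above without relying on float modulo, here is a minimal sketch that validates the number of decimals on the raw input text using the standard-library `decimal` module. The function name `has_at_most_two_decimals` is hypothetical and not part of the original post; treat this as one possible approach rather than the recommended one.

```python
from decimal import Decimal, InvalidOperation


def has_at_most_two_decimals(text):
    """Return True if `text` is a finite number with at most two decimals.

    Hypothetical helper: it works on the raw string, so it avoids the
    binary floating point issues of `value % 0.01 == 0`.
    """
    try:
        d = Decimal(text)
        if not d.is_finite():
            return False  # rejects NaN and Infinity
        # Rounding to two places must not change the value.
        return d == d.quantize(Decimal("0.01"))
    except InvalidOperation:
        # Raised for non-numeric text, and by quantize() for values
        # too large for the default decimal context precision.
        return False
```

Because the check is done before converting to a binary float, it is independent of the dtype of the array.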

1 Answer

I am a bit confused about how your code even runs, because it does not work as presented here.

It is also a bit unclear why you want to append new values to an empty array you have created beforehand. Did you mean to fill the created array with the new incoming values instead of appending?

np.arange(0.00, 200000000.00, 0.01)

This line is causing problems: just to check whether new_value is in a certain range, it creates a huge array of values, which leads to a MemoryError in my environment.

Extending my comment and fixing the issues with your code, my solution would look like this:
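
As a rough illustration of why that line runs out of memory, the size of the array can be estimated without allocating it. This is only a back-of-the-envelope sketch; it assumes the default `float64` result dtype of `np.arange` for float arguments, and the variable names are made up for the example.

```python
import numpy as np

start, stop, step = 0.00, 200_000_000.00, 0.01

# Approximate number of elements np.arange(start, stop, step) would produce.
n_elements = int(round((stop - start) / step))

# np.arange with float arguments yields float64, i.e. 8 bytes per element.
bytes_needed = n_elements * np.dtype(np.float64).itemsize

print(f"{n_elements:,} elements, about {bytes_needed / 1e9:.0f} GB")
```

That is roughly 20 billion elements and 160 GB, all to answer a question that two comparisons can answer.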

import numpy as np

max_value = 200000000
arr = np.empty(shape=(1000,), dtype=np.float16)

new_value = float(input('Enter new value: ')) # More checks might be useful if input is numeric

# validate
if 0 <= new_value <= max_value:
    new_value = round(new_value, 2) # round only if range criterion is fulfilled
    arr = np.append(arr, new_value) # in case you really want to append your value
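
If the goal is to fill the preallocated array rather than append to it, a variant along the following lines may fit better. It is a sketch under a few assumptions that are not in the original code: the helper name `insert_value` is hypothetical, and the dtype is switched to `float64` because the largest finite `float16` value is 65504, so values up to 200,000,000 would overflow to `inf` in a `float16` array.

```python
import numpy as np

MAX_VALUE = 200_000_000.00

# Largest finite float16 value; far below MAX_VALUE.
float16_max = float(np.finfo(np.float16).max)  # 65504.0

# Assumption: float64 is used so the full range and two decimals fit.
arr = np.zeros(1000, dtype=np.float64)
pos = 0


def insert_value(arr, pos, value, max_value=MAX_VALUE):
    """Store `value` at arr[pos] if it is in range; return the next position.

    Hypothetical helper for illustration only.
    """
    if pos >= arr.size:
        raise IndexError("array is already full")
    if not (0 <= value <= max_value):  # also rejects NaN
        raise ValueError(f"value {value!r} is outside [0, {max_value}]")
    arr[pos] = round(value, 2)  # round only after the range check passed
    return pos + 1


pos = insert_value(arr, pos, 123456789.129)
```

Note that the dtype itself does not enforce the range or the number of decimals; the check still has to happen before the assignment.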
53RT
  • You are right about the append - I edited the question to correctly state that I am just filling (inserting) new values. Fixed some typos as well. Apologies! Let me know if it now makes sense. – Yannis Feb 19 '20 at 11:44
  • Just to note that your answer is very helpful - thanks. A question regarding your comment on line 6: the input should be numeric but it may not be; when you say `more checks might be useful (...)`, do you mean checks for whether it is a string or similar? – Yannis Feb 19 '20 at 12:01
  • I think `input()` always returns a string, but it could contain characters that make casting it to float (or something else) fail. That is what I meant by checking the input :-) – 53RT Feb 19 '20 at 12:48
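
The input check discussed in the last two comments could look roughly like the sketch below. The function name `parse_float` is hypothetical; the only behavior relied on is that `float()` raises `ValueError` for text it cannot parse, and that it accepts strings such as "nan" and "inf", which are filtered out separately here.

```python
import math


def parse_float(text):
    """Return `text` converted to a finite float, or None if that fails.

    Hypothetical helper for illustration only.
    """
    try:
        value = float(text)
    except ValueError:
        return None  # e.g. "abc" or an empty string
    if not math.isfinite(value):
        return None  # float() accepts "nan" and "inf", so reject them here
    return value
```

It would be used as `value = parse_float(input('Enter new value: '))`, followed by the range check only when `value is not None`.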