Initializing numpy masked array from Python int list with None values

Question

As shown in the answer to the question "Convert python list with None values to numpy array with nan values", it is straightforward to initialize a masked numpy array from a list with None values if we enforce dtype=float. The None values get converted to nan and we can simply do:

ma.masked_invalid(np.array(a, dtype=float), copy=False)

This, however, will not work for int, as in:

ma.masked_invalid(np.array(a, dtype=int), copy=False)

since the intermediate np.array cannot be created from a list that contains None values (there is no int nan).
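A minimal sketch of the difference, assuming a recent NumPy (the exact exception type and message may vary between versions, so both TypeError and ValueError are caught here):

```python
import numpy as np

a = [1, 2, 3, None, 4]

# With dtype=float, None is stored as nan.
as_float = np.array(a, dtype=float)

# With dtype=int there is no value that can stand in for None,
# so the conversion raises instead of producing an array.
try:
    np.array(a, dtype=int)
    int_conversion_failed = False
except (TypeError, ValueError):
    int_conversion_failed = True
```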

What is the most efficient way to initialize a masked array from a Python list of ints that also contains None values, in such a way that those None values become masked?

I don't think there's going to be a good way to do this without a temporary array of object dtype. Make sure the solution you pick doesn't wind up with an object dtype at the end of it. — user2357112, May 20 '15 at 02:42

score 4 · Accepted Answer · answered May 25 '15 at 20:22

The most elegant solution I have found so far (and it is not elegant at all) is to initialize a masked array of type float and convert it to int afterwards:

ma.masked_invalid(np.array(a, dtype=float), copy=False).astype(int)

This generates a proper NumPy masked array in which the None values of the initial list a are masked. For instance, for:

a = [1, 2, 3, None, 4]
ma.masked_invalid(np.array(a, dtype=float), copy=False).astype(int)

we get:

masked_array(data = [1 2 3 -- 4],
             mask = [False False False  True False],
       fill_value = 999999)

Also, the data stored under the mask is whatever casting nan to int produces. Here it is the minimum int64 value, i.e.

ma.masked_invalid(np.array(a, dtype=float), copy=False).astype(int).data

gives:

array([                   1,                    2,                    3,
       -9223372036854775808,                    4])
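One caveat with the float round trip is worth checking before relying on it: float64 represents integers exactly only up to 2**53, so larger values may be rounded on the way through. The sketch below is an illustration with a made-up input list, not part of the original answer. It uses np.int64 explicitly and silences the warning that recent NumPy versions may emit when nan is cast to an integer.

```python
import numpy as np
import numpy.ma as ma

# 2**53 + 1 is the smallest positive integer a float64 cannot represent exactly.
big = [2**53 + 1, None]

# Casting nan to an integer may emit a RuntimeWarning on recent NumPy versions.
with np.errstate(invalid="ignore"):
    roundtrip = ma.masked_invalid(np.array(big, dtype=float)).astype(np.int64)

# The unmasked value was rounded to 2**53 when it passed through float64.
value_survived = int(roundtrip[0]) == 2**53 + 1
```

For lists whose values stay well below 2**53 in magnitude this does not matter, and the one-liner above remains convenient.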

score 0 · Answer 2 · edited May 23 '17 at 12:14

You can't. However, you can create a numpy array of object dtype:

ma.masked_invalid(np.array(a, dtype=object), copy=False)

EDIT

Otherwise, you can take a look at the question "NumPy or Pandas: Keeping array type as integer while having a NaN value".

answered May 20 '15 at 02:48

farhawa

Yes, but I do want int eventually. The complete answer should provide me with a masked array of type int. What's the most efficient way to achieve that? – Andrzej Pronobis May 20 '15 at 02:52
It seems that nan's still did not make it to numpy int. What's the best way through object then? – Andrzej Pronobis May 20 '15 at 03:05

score 0 · Answer 3 · answered Mar 04 '19 at 00:43

It's possible to do this by first creating two empty arrays, one with data type int that will hold the data of the masked array and another with data type bool that will become the mask itself.

Then we traverse the Python list. In arr_without_none we replace every occurrence of None with the default value, and in mask_mat we record whether the original value in the list was None. At the end we build a masked array out of these two components.

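import numpy as np
import numpy.ma as ma

def masked_int_array(arr, default=0):
    # Preallocate the data array and the mask array.
    arr_without_none = np.empty(len(arr), dtype=int)
    mask_mat = np.empty(len(arr), dtype=bool)
    for i, value in enumerate(arr):
        # Store the placeholder where the input is None and mark that slot as masked.
        arr_without_none[i] = default if value is None else value
        mask_mat[i] = value is None
    return ma.array(data=arr_without_none, dtype=int, mask=mask_mat, copy=False)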
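The same idea can be sketched without indexing into preallocated arrays by letting np.fromiter consume two generators, one for the data and one for the mask. This is only an illustrative variant: the helper name masked_int_array_from_list and its fill parameter are hypothetical, and np.int64 is chosen here as an assumption about the desired integer width.

```python
import numpy as np
import numpy.ma as ma

def masked_int_array_from_list(values, fill=0):
    """Hypothetical helper: build an int64 masked array, masking None entries."""
    n = len(values)
    # True wherever the input holds None.
    mask = np.fromiter((v is None for v in values), dtype=bool, count=n)
    # Replace None by a placeholder so the data array stays a plain integer array.
    data = np.fromiter(
        (fill if v is None else v for v in values), dtype=np.int64, count=n
    )
    return ma.array(data, mask=mask)

result = masked_int_array_from_list([1, 2, 3, None, 4])
```

Because the values never pass through float, large integers are kept exactly, and the data under the mask is the chosen placeholder rather than an arbitrary value.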
