
A NumPy array z is constructed from two Python lists x and y, where values of y can be 0 and values of x do not increase contiguously (i.e. some indices are skipped).

Since y values can also be 0, it will be confusing to assign missing values in z to be 0 as well.

What is the best practice to avoid this confusion?

import numpy as np

# Construct `z`
x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]
z = np.zeros(max(x)+1, dtype=np.uint32)  # missing values become 0 (np.ndarray would leave them uninitialized)
for i in range(len(x)):
    z[x[i]] = y[i]

print(z)        # [ 0 12 34 56  0  0  0  0 78  0  0  0  0  0]
print(z[4])     # missing value but is assigned 0
print(z[13])    # non-missing value but also assigned 0
Athena Wisdom
  • Can you accept signed integers? What do you want to do with the missing values later? – David Hoffman Aug 21 '20 at 02:42
  • @DavidHoffman Best to stick to unsigned integers, but it is probably beneficial to also know the solution when signed integers can be used. When a missing value is detected when reading from the array, a different logic may be used in the main program, such as raising an error or accessing the value at another index until a non-missing element is found – Athena Wisdom Aug 21 '20 at 12:56

1 Answer


Solution

You could typically assign np.nan or any other value for the non-existing indices in x.

Also, there is no need for the for loop. You can assign all values of y in one line by indexing z with the list x, as in the code below.

However, since you are casting to uint32, you cannot use np.nan: NaN is a floating-point value, and integer dtypes have no representation for it. Instead, you could use a large sentinel value of your choice (for example, 999999) which, by design, will never show up in y. For more details, please refer to the links shared in the References section below.

import numpy as np

x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]
# cannot use np.nan with uint32, as np.nan is a float
# choose some large value instead: 999999
z = np.full(max(x)+1, 999999, dtype=np.uint32)
z[x] = y
z

# array([999999,     12,     34,     56, 999999,      0, 999999, 999999,
#            78, 999999, 999999, 999999, 999999,      0], dtype=uint32)
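To address the follow-up in the comments about detecting a missing value on read, here is a minimal sketch of one way it could work. It assumes the largest uint32 value never occurs in y and uses it as the sentinel instead of 999999. The names MISSING and next_present are hypothetical and only serve the illustration.

```python
import numpy as np

x = [1, 2, 3, 5, 8, 13]
y = [12, 34, 56, 0, 78, 0]

# Assumption: the largest uint32 value never occurs in y, so it can act as the sentinel.
MISSING = np.iinfo(np.uint32).max  # 4294967295

z = np.full(max(x) + 1, MISSING, dtype=np.uint32)
z[x] = y

# Boolean mask of the indices that were never assigned.
missing = z == MISSING


def next_present(arr, start):
    """Hypothetical helper: return the first non-missing value at or after `start`."""
    for i in range(start, len(arr)):
        if arr[i] != MISSING:
            return int(arr[i])
    raise LookupError(f"no non-missing value at or after index {start}")


print(missing[4], missing[13])  # True False
print(next_present(z, 4))       # 0  (value stored at index 5)
print(next_present(z, 6))       # 78 (value stored at index 8)
```

If signed integers are acceptable, the same pattern should work with a signed dtype such as np.int64 and -1 as the sentinel, provided y never contains negative values.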

References

CypherX