In Flax, we typically initialize a model by passing in a dummy input and letting the library figure out the correct parameter shapes via shape inference. For example, this is what the tutorial does:
import jax.numpy as jnp
import optax
from flax.training import train_state

def create_train_state(rng, learning_rate, momentum):
  """Creates initial `TrainState`."""
  cnn = CNN()  # the model class defined earlier in the tutorial
  params = cnn.init(rng, jnp.ones([1, 28, 28, 1]))['params']  # dummy input, used only for its shape
  tx = optax.sgd(learning_rate, momentum)
  return train_state.TrainState.create(
      apply_fn=cnn.apply, params=params, tx=tx)
It is worth noting that the concrete value of jnp.ones([1, 28, 28, 1]) does not matter, since shape inference relies only on its shape. I could replace it with jnp.zeros([1, 28, 28, 1]) or jax.random.normal(jax.random.PRNGKey(42), [1, 28, 28, 1]), and it would give me exactly the same result.
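A minimal check of that claim (a sketch, assuming CNN and rng from the snippet above, and that the tutorial's default initializers depend only on the input's shape):

import jax

rng = jax.random.PRNGKey(0)
cnn = CNN()

# Initialize twice with dummy inputs of the same shape but different values.
params_ones = cnn.init(rng, jnp.ones([1, 28, 28, 1]))['params']
params_zeros = cnn.init(rng, jnp.zeros([1, 28, 28, 1]))['params']

# Every parameter leaf should be bitwise identical, since only the shape is used.
same = jax.tree_util.tree_all(
    jax.tree_util.tree_map(lambda a, b: bool((a == b).all()),
                           params_ones, params_zeros))
print(same)  # True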
My question is: can I use jnp.empty([1, 28, 28, 1]) instead? I want to use jnp.empty to make it clear that we don't care about the value (it could also be faster, though the speedup is negligible). However, C has the notion of a trap representation, and it looks like reading from jnp.empty without overwriting it first could trigger undefined behavior. Since NumPy is a thin wrapper around C, should I worry about that?
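Concretely, the replacement I have in mind (same create_train_state as above, assuming cnn and rng from that snippet):

# What I'd like to write instead, provided the read is safe:
params = cnn.init(rng, jnp.empty([1, 28, 28, 1]))['params']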
Bonus question: let's forget about JAX and focus on vanilla NumPy. Is it safe to read from np.empty([...])? Again, I don't care about the value, but I do care about not getting a segfault.
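To make "reading" concrete, this is the kind of access I mean (a toy shape, purely illustrative):

import numpy as np

a = np.empty([2, 3])  # allocated but deliberately not overwritten
print(a)              # values are arbitrary garbage; is this read well-defined?
total = a.sum()       # a second read; I want a guarantee this never crashes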