
Is there an efficient way to remove Nones from numpy arrays and resize the array to its new size?

For example, how would you remove the None from this array without iterating through it in Python? I can easily iterate through it, but this is for an API call that could be called many times.

import numpy as np
a = np.array([1, 45, 23, 23, 1234, 3432, -1232, -34, 233, None])
Michael WS

2 Answers

In [17]: a[a != np.array(None)]
Out[17]: array([1, 45, 23, 23, 1234, 3432, -1232, -34, 233], dtype=object)

This works because a != np.array(None) compares elementwise and produces a boolean array that is True wherever the element is not None:

In [20]: a != np.array(None)
Out[20]: array([ True,  True,  True,  True,  True,  True,  True,  True,  True, False], dtype=bool)

Selecting elements of an array in this manner is called boolean array indexing.
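As a minimal sketch of the same idea in script form, the mask can be built and applied in two steps. The names `mask`, `filtered`, and `as_ints` are illustrative only, and the final cast to `np.int64` assumes every remaining value is an integer.

```python
import numpy as np

a = np.array([1, 45, 23, 23, 1234, 3432, -1232, -34, 233, None])

# Elementwise comparison: one boolean per element, False where the element is None.
mask = a != np.array(None)

# Indexing with the mask returns a new, shorter array; `a` itself is not modified.
filtered = a[mask]

# The result still has dtype=object. Cast it if numeric operations are needed
# (assumption: all remaining values are integers).
as_ints = filtered.astype(np.int64)
```

Because the filtered array is a copy, assigning the result back (`a = a[mask]`) is what "resizes" the array in practice.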

John1024
  • If you don't mind me asking, if there are n items in the array, is this method faster than O(n)? – wookie919 Aug 12 '14 at 02:04
  • This is saving me 80% of the time on a large array. – Michael WS Aug 12 '14 at 02:11
  • @wookie919 If I understand the internals of numpy correctly, this is copying out the array, and removing the None's one-by-one. So, I don't think so. – Nick ODell Aug 12 '14 at 02:29
  • But most of this will live in C. It's much faster than filter. – Michael WS Aug 12 '14 at 02:31
  • Does using a comparison inside of an access (index) operator have a name? Like the colon in the accessor is called the slice. – Kevin Feb 19 '18 at 15:53
  • @Kevin It's called [boolean array indexing](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#boolean-array-indexing). – John1024 Feb 19 '18 at 16:55
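The complexity point raised in the comments can be checked with a rough sketch like the one below. Both approaches visit every element, so both are O(n); the difference is where the loop runs. The test data and the helper names `with_mask` and `with_loop` are hypothetical, and the printed timings will vary by machine and NumPy version.

```python
import timeit

import numpy as np

# Hypothetical test data: 1000 integers with every tenth entry replaced by None.
values = np.arange(1000).astype(object)
values[::10] = None


def with_mask(arr):
    # One elementwise comparison plus one boolean-index copy;
    # the loop over elements runs inside NumPy's compiled code.
    return arr[arr != np.array(None)]


def with_loop(arr):
    # The same filter written as a Python-level loop over every element.
    return np.array([v for v in arr if v is not None], dtype=object)


# Both approaches should keep exactly the same elements.
assert with_mask(values).tolist() == with_loop(values).tolist()

print("mask:", timeit.timeit(lambda: with_mask(values), number=200))
print("loop:", timeit.timeit(lambda: with_loop(values), number=200))
```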

I use the following which I find simpler than the accepted answer:

a = a[a != None]

Caveat: PEP8 warns against using the equality operator with singletons such as None. I didn't know about this when I posted this answer. That said, for numpy arrays I find this too Pythonic and pretty to not use. See discussion in comments.

eric
  • While it is simpler, this does not work with PEP8 autoformatters like flake8, which would want to turn `a[a != None]` into `a[a is not None]` (which is not semantically equivalent). – David Slater Feb 03 '20 at 16:33
  • There is nothing in PEP8 about not using `!=` that I have seen. Consider: sometimes it is apt to use `is`, other times `==`. Similarly, sometimes you need `!=` and other times `is not`. You just have to know what you are doing. Note also this question/answer pertains to numpy arrays -- for Python lists you need something else. – eric Feb 03 '20 at 19:21
  • From [PEP8](https://www.python.org/dev/peps/pep-0008/#programming-recommendations): "Comparisons to singletons like None should always be done with is or is not, never the equality operators." I understand that this is talking about numpy, not Python lists. – David Slater Feb 03 '20 at 19:43
  • Dang, I hadn't noticed that. The construction in the answer works well, but caveat noted! :) I'm happy to flout the advice in this case but glad you pointed it out. Discussion here: https://stackoverflow.com/questions/11166748/identity-versus-equality-for-none-in-python – eric Feb 03 '20 at 20:12
  • Your answer is how I used to do this all the time... until we started using flake8 (and autopep8) in our repos. This resulted in code being changed (sometimes automatically) to `a = a[a is not None]`, which caused some weird bugs. Thanks for the link - good discussion! – David Slater Feb 03 '20 at 22:45
  • @DavidSlater I learned something new about PEP8, and added a caveat to my answer. Thanks! – eric Feb 03 '20 at 23:26
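The bug described in these comments can be reproduced with a small sketch. The array and variable names here are illustrative. `a != None` is evaluated per element, while `a is not None` tests the array object itself and yields a single Python bool, so indexing with it does not filter anything.

```python
import numpy as np

a = np.array([1, 45, None], dtype=object)

# Elementwise comparison: one boolean per element.
elementwise = a != None  # noqa: E711 (intentional elementwise comparison)

# Identity test on the array object itself: a single Python bool, always True here.
identity = a is not None

good = a[elementwise]  # the None entry is removed
bad = a[identity]      # indexing with True only adds a leading axis; None is still there

print(good)
print(bad.shape)
```

If a linter rewrite is a concern, the form from the first answer, `a[a != np.array(None)]`, does not compare against the literal `None`, so it seems less likely to be flagged or rewritten; that is an assumption about linter behavior, not something verified here.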