1

What is the most efficient way to implement MATLAB's ismember(A, b) in Python, where A is any NumPy ndarray and b is a list of values? It should return a boolean ndarray mask of the same shape as A, in which an element is True if the corresponding value in A is in b.

I want to replace every element of A whose value appears in the list b with some other value.

I expected A[A in b] = 0 to work, but it throws the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

If an equivalent of ismember exists, the following would do what I need:

A[ismember(A, b)] = 0

Note: I don't want solutions that loop over all elements of A in Python.

Based on the answer of ajcr, one solution is:

import numpy as np

def ismember(A, b):
  return np.in1d(A, b).reshape(A.shape)

However, this is quite slow and runs out of memory. In my case, A is an image as large as 512 x 512 x 1200, and b has about 1000 elements.

cdeepakroy

2 Answers

4

You may be looking for np.in1d:

>>> A = np.arange(9)
>>> B = [4, 6, 7]
>>> np.in1d(A, B)
array([False, False, False, False,  True, False,  True,  True, False])

Note that for multidimensional arrays A, the input is flattened so you'll need to reshape the boolean array:

>>> A = np.arange(9).reshape(3, 3)
>>> np.in1d(A, B).reshape(A.shape)
array([[False, False, False],
       [False,  True, False],
       [ True,  True, False]], dtype=bool)
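On NumPy 1.13 and later, np.isin performs the same membership test and keeps the shape of A, so the reshape step should not be needed. A minimal sketch, where the wrapper name ismember is taken from the question and is not a NumPy function:

```python
import numpy as np

def ismember(A, b):
    # np.isin tests every element of A against b and returns
    # a boolean mask with the same shape as A
    return np.isin(A, b)

A = np.arange(9).reshape(3, 3)
B = [4, 6, 7]

mask = ismember(A, B)

# Overwrite the matching elements in place using the mask
A[mask] = 0
```

This only removes the explicit reshape. It is not expected to change the speed or memory behaviour much for very large inputs.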
DSM
Alex Riley
  • Bingo! Thanks! In parallel, I arrived at def ismember(A, b): return np.in1d(A.flatten(), b).reshape(A.shape) – cdeepakroy Mar 12 '15 at 17:02
  • It is very slow and runs out of memory for A of size 512 x 512 x 1200 and b of size 3800. I see some sorting going on inside [in1d](https://github.com/numpy/numpy/blob/v1.9.1/numpy/lib/arraysetops.py#L296) – cdeepakroy Mar 12 '15 at 17:12
  • I think that's one of the inevitabilities of working with very large NumPy arrays - there isn't an obvious solution to this problem which avoids looping over the array. If memory is a problem, you could consider trying other non-NumPy data structures or using memory-mapped arrays... – Alex Riley Mar 12 '15 at 17:24
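If the memory spike comes from building the mask for the whole array at once, one possible workaround is to process A one slab at a time along its first axis, so temporaries stay proportional to a slab rather than to all of A. This is only a sketch: replace_members_by_slab and its slab parameter are hypothetical names, it assumes NumPy 1.13+ for np.isin, and it still loops in Python, although only over slabs (512 iterations for a 512 x 512 x 1200 image), not over elements.

```python
import numpy as np

def replace_members_by_slab(A, b, value=0, slab=1):
    """Overwrite, in place, every element of A that occurs in b.

    Hypothetical helper: handles `slab` indices of the first axis per
    iteration, so each temporary mask covers one slab instead of all of A.
    """
    b = np.unique(b)  # deduplicate the lookup values once, up front
    for start in range(0, A.shape[0], slab):
        block = A[start:start + slab]     # basic slicing returns a view of A
        block[np.isin(block, b)] = value  # masked assignment writes through to A
    return A

# Small stand-in for the large image described in the question
A = np.arange(24).reshape(2, 3, 4)
replace_members_by_slab(A, [4, 6, 7, 23])
```

Because basic slicing returns views, the same loop should also work on a memory-mapped array opened in a writable mode, which keeps the full image out of RAM.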
0

The ismember library on PyPI may be useful.

A speed comparison is given here: Python equivalent of MATLAB's "ismember" function

pip install ismember
erdogant
  • If you find that a question is answered by an answer elsewhere, don’t post an answer linking the other question. Instead, flag the question to be closed as duplicate. – Cris Luengo Jun 22 '20 at 00:45