I'd like to know how to search over N vectors for the differences between them and a reference vector. In other words, I want all elements of reference_arr that are not present in any of the N vectors.

Example:

reference_arr = [0,1,2,3,4,5,6,7,8,9,10]

A = [0, 4, 5]

B = [0, 10]

C = [1, 5, 7]

Desired output would be: [2,3,6,8,9]

My solution would be to concatenate everything into a single vector, cast it to a set to remove duplicate values, and then use np.setdiff1d(). I don't know how well this scales (or how many vectors N I will have), so I can't say it's the best solution. Any guidance would be appreciated.
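For concreteness, here is a minimal sketch of the approach described above, using the example data from the question. The names `vectors`, `combined`, and `missing` are only illustrative. As far as I can tell, np.setdiff1d() deduplicates its inputs by default (`assume_unique=False`), so the explicit cast to a set may not be needed:

```python
import numpy as np

reference_arr = np.arange(11)  # [0, 1, ..., 10]

# The N vectors, collected in a list so the code does not depend on N.
vectors = [
    np.array([0, 4, 5]),   # A
    np.array([0, 10]),     # B
    np.array([1, 5, 7]),   # C
]

# Join all vectors into one 1-D array; duplicates are harmless here.
combined = np.concatenate(vectors)

# Sorted unique values of reference_arr that do not appear in combined.
missing = np.setdiff1d(reference_arr, combined)

print(missing)  # [2 3 6 8 9]
```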

Thanks in advance.

heresthebuzz
  • 'Best' is vague and depends on your exact needs and use cases. – Julien Jan 28 '21 at 00:18
  • Sorry for the vague word, but I mean "performance-wise". The arrays can be very large and I'm afraid that my solution isn't good enough. – heresthebuzz Jan 28 '21 at 00:33
  • Make a realistic example, and show your current code and performance. Then we can see if we can improve. – Julien Jan 28 '21 at 00:38
  • If you join your N lists using `itertools.chain(A, B, C)`, then [this question](https://stackoverflow.com/questions/41125909/python-find-elements-in-one-list-that-are-not-in-the-other/41125939) should be equivalent. Also, it won't get faster than O(n) in the average case. – naicolas Jan 28 '21 at 00:40
  • Also, "I'm afraid that my solution isn't good enough". Don't solve an issue you don't have. First make sure this is an actual issue. – Julien Jan 28 '21 at 00:40
  • It is an issue; I think there's enough information to describe it. – heresthebuzz Jan 28 '21 at 01:01
  • Thanks @naicolas, I will take a look. – heresthebuzz Jan 28 '21 at 01:01
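
The `itertools.chain` suggestion from the comments could look roughly like the following pure-Python sketch. It assumes the values are hashable (plain ints here); the names `seen` and `missing` are illustrative and not taken from the linked question. Building the set is linear in the total number of elements, and each membership test is O(1) on average:

```python
from itertools import chain

reference_arr = list(range(11))  # [0, 1, ..., 10]
A = [0, 4, 5]
B = [0, 10]
C = [1, 5, 7]

# Lazily iterate over all vectors and collect every value that occurs.
seen = set(chain(A, B, C))

# Keep the reference elements that never occur, preserving reference order.
missing = [x for x in reference_arr if x not in seen]

print(missing)  # [2, 3, 6, 8, 9]
```

With a variable number of vectors held in a list, `chain.from_iterable(list_of_vectors)` would play the same role as `chain(A, B, C)`.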