
Needle: ['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']

Haystack: [['', '', 'yes', 'yes', '', '', 'yes', 'yes', '', 'yes'], ['', '', '', 'yes', 'yes', '', '', '', 'yes', 'yes']]

Needle matches Haystack[0] at indices 2, 6 and 7, and matches Haystack[1] at index 8. I'd like to be able to create these lists of matching indices.

Currently, my code returns [1, 2, 6, 7, 8] for both sublists, so it doesn't tell me where the matches actually are. I'm not sure why it finds a match at 1:

for sublist in (haystack):
    print(needle)
    print(sublist)
    print([i for i, item in enumerate(needle) if item in sublist and item != ''])

and my output looks like

['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']
['', '', 'yes', 'yes', '', '', 'yes', 'yes', '', 'yes']
[1, 2, 6, 7, 8]
['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']
['', '', '', 'yes', 'yes', '', '', '', 'yes', 'yes']
[1, 2, 6, 7, 8]

Full reproducible example:

needle = ['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']
haystack = [['', '', 'yes', 'yes', '', '', 'yes', 'yes', '', 'yes'], ['', '', '', 'yes', 'yes', '', '', '', 'yes', 'yes']]

for sublist in (haystack):
    print(needle)
    print(sublist)
    print([i for i, item in enumerate(needle) if item in sublist and item != ''])
user61871
    You're saying: for each item in needle, if "yes" is in the haystack sublist (which is always true), print the index of that "yes" in needle. So `[1,2,6,7,8]` will be the answer for any haystack that contains at least one `"yes"`. – TemporalWolf Feb 20 '18 at 20:22
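The behaviour described in this comment can be checked directly. This is a minimal sketch using the question's data and only the first haystack sublist. The names `membership_hits` and `positional_hits` are illustrative and do not appear in the original post.

```python
needle = ['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']
sublist = ['', '', 'yes', 'yes', '', '', 'yes', 'yes', '', 'yes']

# `item in sublist` asks "does this value occur anywhere in sublist?",
# so every non-empty needle entry passes once sublist holds a single 'yes'.
membership_hits = [i for i, item in enumerate(needle)
                   if item in sublist and item != '']

# Comparing the same position in both lists is what the question intends.
positional_hits = [i for i, item in enumerate(needle)
                   if item != '' and item == sublist[i]]

print(membership_hits)  # [1, 2, 6, 7, 8]
print(positional_hits)  # [2, 6, 7]
```

The first result is just the list of positions where needle itself is non-empty, which is why it is identical for both sublists.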

4 Answers


Use enumerate and zip:

for sublist in haystack:
    print(needle)
    print(sublist)
    print([i for i, (x, y) in enumerate(zip(needle, sublist)) if x and y and x == y])

Output:

['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']
['', '', 'yes', 'yes', '', '', 'yes', 'yes', '', 'yes']
[2, 6, 7]
['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']
['', '', '', 'yes', 'yes', '', '', '', 'yes', 'yes']
[8]
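If the goal is to collect the index lists rather than print them, the same comprehension could be wrapped in a function. This is only a sketch: the function name `match_indices` is hypothetical and not part of the original answer.

```python
def match_indices(needle, haystack):
    """For each sublist, return the positions where needle and the
    sublist hold the same non-empty value."""
    return [
        # zip pairs items by position; it stops at the shorter list.
        [i for i, (x, y) in enumerate(zip(needle, sublist)) if x and x == y]
        for sublist in haystack
    ]


needle = ['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']
haystack = [['', '', 'yes', 'yes', '', '', 'yes', 'yes', '', 'yes'],
            ['', '', '', 'yes', 'yes', '', '', '', 'yes', 'yes']]

result = match_indices(needle, haystack)
print(result)  # [[2, 6, 7], [8]]
```

Because `zip` truncates to the shorter input, a sublist that is shorter than needle is compared only over the overlapping positions instead of raising an `IndexError`.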
juanpa.arrivillaga

As TemporalWolf pointed out, I was enumerating the wrong thing... the following works!

for sublist in (haystack):
    print([i for i, item in enumerate(sublist) if needle[i]=='yes' and sublist[i]=='yes'])
user61871
  • In my opinion, the entire approach of enumeration inside another loop etc. is very cumbersome and confusing (inefficient + unclear to a future reader of that code), especially if what is really happening here is a simple logical AND between arrays. I've suggested another answer below, for your consideration. – Yaniv Feb 20 '18 at 21:21

This is what you are looking for:

needle = ['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']
haystack = [['', '', 'yes', 'yes', '', '', 'yes', 'yes', '', 'yes'],
            ['', '', '', 'yes', 'yes', '', '', '', 'yes', 'yes']]

for sublist in (haystack):
    print(needle)
    print(sublist)
    print([i for i, item in enumerate(needle) if item == sublist[i] and item != ''])
Abhisek Roy

If I understand you correctly, then you're looking for a logical AND between the arrays, where "yes" is 1 and "" is 0.

So we first convert your data to binary (you can skip this step if your data is already binary):

import numpy as np
def convert_to_binary(arr):
  return 1 * (np.array(arr) == 'yes')
needle = convert_to_binary(needle)
# array([0, 1, 1, 0, 0, 0, 1, 1, 1, 0])
haystack = np.array([convert_to_binary(h_arr) for h_arr in haystack])
# array([[0, 0, 1, 1, 0, 0, 1, 1, 0, 1],
#        [0, 0, 0, 1, 1, 0, 0, 0, 1, 1]])

Their logical AND:

their_logical_and = needle & haystack
# array([[0, 0, 1, 0, 0, 0, 1, 1, 0, 0],
#        [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]])

To get the non-zero indices, you can use numpy.nonzero:

indices = [np.nonzero(arr)[0].tolist() for arr in their_logical_and]
# [[2, 6, 7], [8]]
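The conversion step can also be skipped by working with boolean masks directly. This is a hedged sketch rather than the answer's own code: it assumes every haystack sublist has the same length as needle, and the names `needle_mask`, `haystack_mask` and `both` are illustrative.

```python
import numpy as np

needle = ['', 'yes', 'yes', '', '', '', 'yes', 'yes', 'yes', '']
haystack = [['', '', 'yes', 'yes', '', '', 'yes', 'yes', '', 'yes'],
            ['', '', '', 'yes', 'yes', '', '', '', 'yes', 'yes']]

# Element-wise comparison gives boolean arrays: True where the entry is 'yes'.
needle_mask = np.array(needle) == 'yes'      # shape (10,)
haystack_mask = np.array(haystack) == 'yes'  # shape (2, 10)

# Broadcasting applies the needle mask to every row of the haystack.
both = needle_mask & haystack_mask

# flatnonzero returns the positions of the True entries in each row.
indices = [np.flatnonzero(row).tolist() for row in both]
print(indices)  # [[2, 6, 7], [8]]
```

Calling `.tolist()` yields plain Python integers, so the printed result does not depend on how NumPy displays its scalar types.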
Yaniv
  • Thanks for your feedback, @TemporalWolf. Where exactly is the indentation messed up? If you're referring to my first paragraph of code, that entire paragraph is redundant, as I wrote above, and left there for completeness (to convert OP's original data to the more convenient binary format), in a short form of writing (but not messed up). – Yaniv Feb 20 '18 at 21:18
  • Ah, I didn't read carefully enough. I would discourage the usage of inline function definitions, although it is correct. – TemporalWolf Feb 20 '18 at 21:24
  • Well, I edited it to try to be clearer. But it's not the important part anyway. The important observation here is that the OP really wants to do a logical AND between arrays (see my comment to his answer). So any loop / enumerate / zip etc. are redundant and confusing. – Yaniv Feb 20 '18 at 21:29