If-statement with list inside for-loop over same list

Question

I am appending similarity scores of all pairs in a list.

data = []

for i1, i2 in list: 
    data.append([i1, i2, cosine_similarity([X[df.index.get_loc(i1)]],[X[df.index.get_loc(i2)]]).ravel()[0]])

However, I need it to only append scores that are non-zero.

I put in an if statement, but it produces an error since it is not of int type.

for i1, i2 in list:
    if [cosine_similarity([X[df.index.get_loc(i1)]], [X[df.index.get_loc(i2)]])] > 0:
        data.append([i1, i2, cosine_similarity([X[df.index.get_loc(i1)]], [X[df.index.get_loc(i2)]]).ravel()[0]])

Is there a way to append only the non-zero scores as part of the iteration?

What does "produces an error" mean? Do you get an exception? If so, show us the whole exception. — abarnert, Apr 01 '18 at 20:48
If it's [this error](https://stackoverflow.com/questions/34472814/use-a-any-or-a-all), you'll also need to explain more of what you're trying to do, because the answer _might_ be exactly what's in the error message, but it might be something else like using a mask, and there's no way we can know which one you want without sample input and desired output and why you want that output. — abarnert, Apr 01 '18 at 20:50
I don't see anything called "score". You have something called `df` which is a ... what? Are i1 and i2 indices? Are you wanting to skip ones that are zero? How about a running example? And how about trimming it down to just what's useful for the question. Does `cosine_similarity` make any difference to the problem? — tdelaney, Apr 01 '18 at 20:57
`[cosine_similarity([X[df.index.get_loc(i1)]], [X[df.index.get_loc(i2)]])] > 0` should likely not have the result of the call wrapped in a list `cosine_similarity([X[df.index.get_loc(i1)]], [X[df.index.get_loc(i2)]]) > 0`. — Dan D., Apr 01 '18 at 21:02
@DanD. That was it! I copied it from the for-loop to the if-statement. Thanks! — user6453877, Apr 01 '18 at 21:28
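
To illustrate the fix from the comments: on Python 3, comparing a list to an int raises a `TypeError`, which fits the "not of int type" error described in the question. The sketch below uses a hypothetical `scores` dict as a stand-in for the similarity values (it is not the asker's data), reproduces the failure, and then compares the bare value instead, looking each score up only once per pair.

```python
# Hypothetical stand-in scores keyed by pair; not the asker's data.
scores = {("a", "b"): 0.75, ("a", "c"): 0.0, ("b", "c"): 0.2}

# Wrapping the value in a list reproduces the failure on Python 3:
# a list cannot be ordered against an int.
try:
    [scores[("a", "b")]] > 0
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Compare the bare value instead, and fetch it once per pair
# rather than once for the test and again for the append.
data = []
for (i1, i2), score in scores.items():
    if score > 0:
        data.append([i1, i2, score])
```

Here `data` keeps only the pairs whose score is greater than zero, in the same `[i1, i2, score]` layout the question uses.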

score 0 · Answer 1 · answered Apr 01 '18 at 21:21

The general pattern for conditional iteration is `(a for a in b if a)`. Pulling your calculation into a helper function for readability, this should work:

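def calc_sim(X, df, i1, i2):
    # Unwrap the 1x1 result so the comparison below sees a plain scalar.
    return cosine_similarity([X[df.index.get_loc(i1)]],
        [X[df.index.get_loc(i2)]]).ravel()[0]

# The inner generator computes each score once; `list` is the pair list from the question.
data = [(i1, i2, sim)
    for (i1, i2, sim) in ((i1, i2, calc_sim(X, df, i1, i2))
        for i1, i2 in list)
    if sim > 0]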
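
As a self-contained sketch of the same pattern, the example below swaps in a plain-Python `cosine` helper and a small `vectors` dict. Both are hypothetical stand-ins for `cosine_similarity`, `X` and `df`, chosen only so the snippet runs without extra libraries. The inner generator scores each pair once, and the outer comprehension filters on that score.

```python
import math

def cosine(u, v):
    """Plain-Python cosine similarity; a hypothetical stand-in for the library call."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical sample data: "x" and "y" are orthogonal, so their score is 0.
vectors = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0]}
pairs = [("x", "y"), ("x", "z"), ("y", "z")]

# Inner generator: compute each score exactly once.
scored = ((i1, i2, cosine(vectors[i1], vectors[i2])) for i1, i2 in pairs)

# Outer comprehension: keep only the pairs with a positive score.
data = [(i1, i2, sim) for i1, i2, sim in scored if sim > 0]
```

Naming the inner generator separately is a readability choice; it is equivalent to nesting it inside the comprehension as in the answer.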
