Converting "for" loop with nested "for" loop and "If" statement into List Comprehension?

Question

How can I convert the following code into a list comprehension?

for i in range(xy-1):
    for b in range(i+1, xy):
        if fuzz.token_set_ratio(Names[i], Names[b]) >= 90:
            FuzzNames[b].append(ID[i])

Thanks for helping.

I always refer to this [answer](https://stackoverflow.com/questions/18072759/list-comprehension-on-a-nested-list/45079294#45079294) when converting back and forth. — quamrana, Aug 09 '22 at 20:39
This is made a little more tricky because it doesn't return a single list. It's filling in some values, and leaving other values alone. — Tim Roberts, Aug 09 '22 at 20:43
I don't think this can be a list comprehension. It's not creating new lists, it's appending to existing lists. And it's appending to a different list each time through the loop. I thought it could be done by swapping the order of the `for` loops, but `b` is dependent on `i`. — Barmar, Aug 09 '22 at 20:46
I am running a fuzzy search across 32,000 lines of data. The code ran for 7 hours without finishing. I read about ways to make my code faster, and one of the suggestions was list comprehension. This is the main block in my code, so I wondered how to do it. — Abdulrahman Hocaoglu, Aug 09 '22 at 20:53
The list processing is NOT your bottleneck. Have you timed the individual fuzzy searches? If a single search takes 1 second, then 32,000 searches will take 9 hours. — Tim Roberts, Aug 09 '22 at 20:59
I did time the fuzzy search for 100, 200, 400 and 800 searches to make an approximation. It will take more than 9 hours. I made some changes to my code, and the list processing was the last one. — Abdulrahman Hocaoglu, Aug 09 '22 at 21:21
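
As a rough sketch of why the run time grows so quickly: the nested loop in the question compares every pair of rows once, so the number of `fuzz.token_set_ratio` calls is n*(n-1)/2 rather than n. The helper names below are hypothetical and only count comparisons; they say nothing about how long a single comparison takes.

```python
def pair_count(n: int) -> int:
    """Closed-form number of (i, b) pairs with i < b among n rows."""
    return n * (n - 1) // 2


def loop_pair_count(n: int) -> int:
    """Count the pairs by walking the same loop shape as the question."""
    return sum(1 for i in range(n - 1) for b in range(i + 1, n))


if __name__ == "__main__":
    # Sanity check on a small size: both methods should agree.
    print(loop_pair_count(10), pair_count(10))
    # Number of comparisons for the 32,000 rows mentioned in the comments.
    print(pair_count(32_000))
```

Multiplying the last number by a measured per-comparison time gives an estimate of the total run time, which is why timing a single fuzzy comparison is the useful first step.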

score 0 · Answer 1 · answered Aug 09 '22 at 20:54

You could do something like this:

indices = [(i,b) for i in range(xy-1) for b in range(i+1, xy) if fuzz.token_set_ratio(Names[i], Names[b]) >= 90]

for i, b in indices:
  FuzzNames[b].append(ID[i])

However, the way you originally wrote it is more readable and easier to understand, and it's not likely that you need that list of indices later on.
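
To check that the two forms do the same thing, here is a minimal self-contained sketch. It assumes a stand-in `similarity` function built on the standard library's `difflib.SequenceMatcher` in place of `fuzz.token_set_ratio`, and uses made-up sample `names` and `ids`; none of these come from the question, and the scores will differ from what the fuzzy-matching library returns.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Hypothetical stand-in for fuzz.token_set_ratio, scaled to 0-100."""
    return 100 * SequenceMatcher(None, a, b).ratio()


# Made-up sample data, for illustration only.
names = ["Acme Corp", "Acme Corp.", "Globex", "Acme Corp"]
ids = [101, 102, 103, 104]
n = len(names)


def with_loops() -> list[list[int]]:
    """Original nested-loop form: append matches as they are found."""
    out = [[] for _ in range(n)]
    for i in range(n - 1):
        for b in range(i + 1, n):
            if similarity(names[i], names[b]) >= 90:
                out[b].append(ids[i])
    return out


def with_comprehension() -> list[list[int]]:
    """Comprehension form: collect matching index pairs, then append."""
    out = [[] for _ in range(n)]
    pairs = [
        (i, b)
        for i in range(n - 1)
        for b in range(i + 1, n)
        if similarity(names[i], names[b]) >= 90
    ]
    for i, b in pairs:
        out[b].append(ids[i])
    return out


if __name__ == "__main__":
    print(with_loops())
    print(with_comprehension())
```

Both functions call the similarity function once per pair, so any speed difference comes only from loop overhead, not from doing fewer comparisons.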

Joshua Bigler

This is a little faster. The data I am working with has 11 columns, each with tens of thousands of lines. This will save me hours, thank you. – Abdulrahman Hocaoglu Aug 09 '22 at 21:14