I have the following piece of code:

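def is_it_bad(word):
    try:
        # First entry whose name contains the word.
        res = next(item for item in all_names if str(word) in str(item["name"]))
    except StopIteration:
        # No entry matched: fall back to a default record with gender 2.
        res = {'name': word, 'gender': 2}
    return res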

It looks like it is blocking the async function that calls is_it_bad. I'm not very familiar with async; is there any way to make this function non-blocking?

Function calling is_it_bad:

import asyncio
import re

from unidecode import unidecode


async def get_genders_by_dict(res):
    # Strip everything except ASCII letters and spaces.
    letters = re.compile(r'[^a-zA-Z ]')
    fname = unidecode(str(letters.sub('', res['full_name'])).lower())
    fname = letters.sub('', res['username']).lower() + ' ' + fname + ' ' + fname.replace(' ', '')
    fname = fname.split(' ')
    genders = []
    for j in fname:
        if len(j) > 2:
            print(j)
            genders.append(is_it_bad('_' + j + '_')['gender'])
            for k in genders:
                if int(k) != 2:
                    gender = k
                    print('GOOD: ', '_' + j + '_', gender)


async def get_genders_by_dict_main(loop):
    tasks = [get_genders_by_dict(res) for res in results]
    await asyncio.gather(*tasks)


loop = asyncio.get_event_loop()
loop.run_until_complete(get_genders_by_dict_main(loop))

2 Answers

make this function non-blocking?

In the context of asyncio, a blocking function is one that spends a long time waiting for network-related operations (for example, when you request something from the web) or one that uses a lot of CPU time (long calculations).
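
To illustrate what "blocking" means here, the following is a minimal sketch rather than code from the question: the names no_await, with_await, run and order are made up for the example, and it assumes Python 3.7+ for asyncio.run. A coroutine that never awaits anything holds the event loop until it returns, so gather() ends up running such coroutines one after another.

```python
import asyncio

order = []  # records (coroutine tag, step) in execution order


async def no_await(tag):
    # No await inside: once started, this runs to the end without yielding.
    for step in range(3):
        order.append((tag, step))


async def with_await(tag):
    for step in range(3):
        order.append((tag, step))
        await asyncio.sleep(0)  # hands control back to the event loop


async def run(worker):
    # Run two copies of the worker "concurrently" and return the order of steps.
    order.clear()
    await asyncio.gather(worker("a"), worker("b"))
    return list(order)


if __name__ == "__main__":
    print(asyncio.run(run(no_await)))    # all "a" steps, then all "b" steps
    print(asyncio.run(run(with_await)))  # "a" and "b" steps alternate
```

Declaring a function with async def does not by itself make it non-blocking; only the await points give other tasks a chance to run.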

Usually you use asyncio to run network-related operations concurrently, which lets you get results faster. asyncio cannot speed up CPU-bound operations, other than by running them in an executor (a pool of processes) to take advantage of multiple cores. The latter, however, can be achieved with a plain ProcessPoolExecutor, without asyncio at all.
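
As a rough sketch of the "plain ProcessPoolExecutor" route, with no asyncio involved: count_primes and count_all are hypothetical names, and the prime counting is only a stand-in for some genuinely CPU-heavy work, not anything from the question.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """CPU-bound stand-in: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_all(limits):
    # Each limit is handled in a separate worker process, so the
    # calculations can run on different cores at the same time.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    print(count_all([10_000, 20_000, 30_000]))
```

Arguments and results are pickled and sent between processes, so this only pays off when each call does substantially more work than that transfer costs.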

As far as I can tell, your code fits neither of these situations: get_genders_by_dict has nothing to do with the network, and it doesn't seem to contain long-running calculations that could be parallelized across multiple cores. Read this answer for detailed explanations.

Long story short: unless I'm missing something, you don't need asyncio at all, and there is no point in using it here. Just make get_genders_by_dict a plain function and call it as such.

Mikhail Gerasimov

From what I can see in your code, you are making a CPU-bound call, which can block the reactor (the event loop). I think the better way to solve your problem is to use multiprocessing, or simply a wrapper that runs the tasks in an executor (another process).

https://docs.python.org/3/library/asyncio-eventloop.html#executor

https://docs.python.org/3/library/concurrent.futures.html#processpoolexecutor
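
A minimal sketch of the executor wrapper, under some assumptions: ALL_NAMES is a tiny made-up stand-in for the question's all_names (the gender codes 0 and 1 are invented; only the fallback value 2 comes from the question), lookup_all is a hypothetical helper, and asyncio.run / get_running_loop need Python 3.7+.

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

# Hypothetical stand-in for the question's all_names list.
ALL_NAMES = [{"name": "_anna_", "gender": 0}, {"name": "_john_", "gender": 1}]


def is_it_bad(word):
    """Synchronous lookup with the same fallback record as in the question."""
    for item in ALL_NAMES:
        if str(word) in str(item["name"]):
            return item
    return {"name": word, "gender": 2}


async def lookup_all(words, executor=None):
    # run_in_executor hands each call to the pool and returns an awaitable,
    # so the event loop stays free while the lookups run elsewhere.
    # executor=None means the loop's default thread pool.
    loop = asyncio.get_running_loop()
    futures = [loop.run_in_executor(executor, is_it_bad, w) for w in words]
    return await asyncio.gather(*futures)


if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        print(asyncio.run(lookup_all(["_anna_", "_bob_"], pool)))
```

For a lookup this small, the cost of shipping each call to another process may well outweigh the gain, so it is worth timing this against a plain loop before adopting it.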

fabiocerqueira
  • I have tried with ThreadPoolExecutor - no difference in time compared to plain for loop. I believe I need an async generator function to replace current *is_it_bad* – Edgard Gomez Sennovskaya Mar 28 '18 at 09:40