I am trying to run a for loop asynchronously in Python, the way you can in JavaScript with `map` and `Promise.all`. I have searched everywhere for how to do this, but the code below still runs synchronously: it processes the iterations one by one instead of starting further iterations while earlier ones are still finishing, as `Promise.all` allows. Any help would be appreciated.

from jwt import scopes                                                   
from googleapiclient.discovery import build
from google.oauth2 import service_account
import json
import asyncio

key = 'file.json'
ID = 'ID'
rg = 'A1'

j2 = service_account.Credentials.from_service_account_file(key, scopes=scopes).with_subject('me@emial.com')

ar = []
cl = build('classroom', 'v1', credentials=j2)

def cour():
    co = []
    result1 = cl.courses().list().execute()
    courses = result1.get('courses', [])
    for cc in courses:
        co.append(cc['id'])
    return co

cco = cour()

async def main():
    async def subs2(i):
        await asyncio.sleep(0)
        result2 = cl.courses().courseWork().list(courseId=i).execute() 
        works = result2.get('courseWork', [])

        for work in works:
            result = cl.courses().courseWork().studentSubmissions().list(courseId=work['courseId'], courseWorkId=work['id']).execute()
            subs = result.get('studentSubmissions', [])

            for sub in subs:
                try:
                    ar.append(sub['assignedGrade'])
                    ar.append(sub['courseId'])
                    ar.append(sub['courseWorkId'])
                    ar.append(sub['userId'])
                except KeyError:
                    # Skip submissions that lack one of these fields (e.g. no grade assigned yet).
                    pass

    coros = [subs2(i) for i in cco]
    await asyncio.gather(*coros)

if __name__ == '__main__':
    cour()
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main())
    loop.close()
dano
Jason Jurotich
  • Which specific call are you trying to make run asynchronously? Your inner coroutine, `subs2(i)` doesn't have any await calls inside, aside from a `sleep(0)`. So it's always going to run synchronously. – dano Jul 31 '19 at 17:32

1 Answer

I think you are misunderstanding the way asyncio provides concurrency. It does not spawn any additional threads or processes. The event loop, and all the coroutines running on it, execute in a single thread. In order to get concurrency, your coroutines need to await a call doing asynchronous I/O, or some other operation which yields control to the asyncio event loop.
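As a rough illustration of the point above, the following sketch runs two coroutines through `asyncio.gather` twice: once with a blocking `time.sleep` and once with an awaited `asyncio.sleep`. The worker names and the `log` list are made up for this example and are not part of the question's code.

```python
import asyncio
import time


async def blocking_worker(name, log):
    log.append(f"{name} start")
    time.sleep(0.05)  # blocking call: the event loop cannot switch to another coroutine
    log.append(f"{name} end")


async def cooperative_worker(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0.05)  # awaiting yields control back to the event loop
    log.append(f"{name} end")


async def run_all(worker):
    log = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


blocking_log = asyncio.run(run_all(blocking_worker))
cooperative_log = asyncio.run(run_all(cooperative_worker))

print(blocking_log)     # "a" starts and ends before "b" starts
print(cooperative_log)  # "a" and "b" both start before either one ends
```

With the blocking worker, `gather` has nothing to interleave, so the coroutines run back to back, which is the behavior described in the question.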

In your example, the coroutine you are trying to run concurrently doesn't actually do any asynchronous I/O. So each time the coroutine executes, it blocks the event loop until it completes, which means the coroutines run one after another. To get concurrency, you need to either use an asyncio-friendly library instead of the one you're currently using (`googleapiclient`), or farm the blocking work off to a background thread using `loop.run_in_executor`.
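A minimal sketch of the `loop.run_in_executor` approach might look like the following. Here `fetch_course_work` is a hypothetical stand-in that simulates a blocking API call with `time.sleep`; it is not a real `googleapiclient` function, and the course IDs are placeholders.

```python
import asyncio
import time


def fetch_course_work(course_id):
    """Hypothetical stand-in for a blocking call such as `...list(courseId=...).execute()`."""
    time.sleep(0.2)  # simulate network latency
    return f"work for {course_id}"


async def fetch_all(course_ids):
    loop = asyncio.get_running_loop()
    # Passing None as the executor uses the loop's default thread pool.
    futures = [
        loop.run_in_executor(None, fetch_course_work, course_id)
        for course_id in course_ids
    ]
    # gather preserves the order of its arguments in the result list.
    return await asyncio.gather(*futures)


start = time.perf_counter()
results = asyncio.run(fetch_all(["c1", "c2", "c3"]))
elapsed = time.perf_counter() - start

print(results)
print(f"{elapsed:.2f}s")  # close to one call's latency rather than the sum of all three
```

One caveat when adapting this to real API calls: the `google-api-python-client` documentation notes that its `httplib2`-based transport is not thread-safe, so each worker thread would likely need its own service object rather than sharing the module-level `cl`.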

dano
  • FWIW, it looks like there isn't an asyncio-compatible replacement for `googleapiclient`: https://github.com/googleapis/google-api-python-client/issues/360 – dano Jul 31 '19 at 17:51
  • Thanks dano, but I (and surely others) would greatly appreciate a more specific response to this. I understood that I was not doing this correctly, but as I mentioned before, even here on Stack Overflow I still haven't found a concrete way to resolve this. There must be a way to do a `Promise.all`-style procedure without using threading or multiprocessing. And multiprocessing does not work here either. That is a no-go. I already tried it, and Google blocks that as well when you call cour() and main(). https://stackoverflow.com/questions/3724900/python-ssl-problem-with-multiprocessing – Jason Jurotich Jul 31 '19 at 19:08
  • I am not trying to do anything in the code itself that would need "awaiting" but rather I am trying to do the process for each course in Classroom, as I can easily do in Node. What it needs to do is run this on the various courses that are in the array. I wish I could explain myself better. All I know is that something similar works fine in Node and it doesn't here. – Jason Jurotich Jul 31 '19 at 19:11
  • I don't know much about the NodeJS concurrency model, so I can't really comment on it. But with `asyncio`, there is no way to get multiple coroutines to run concurrently if you don't do anything asynchronous inside of them. They all run in one thread, and one thread can only do one thing at a time. If you introduce more threads (via `loop.run_in_executor`) you can get concurrency. If you had an asynchronous API to make the calls to the Google Cloud Platform, you could get concurrency that way, too. I suspect what you have seen in Node is using asynchronous I/O, but I can't say for sure. – dano Jul 31 '19 at 19:15
  • According to [this question](https://stackoverflow.com/questions/22844441/is-promise-all-useful-given-that-javascript-is-executed-in-a-single-thread/26669587), in JavaScript, I/O operations run on a different thread, which is probably why you've seen concurrent behavior using `Promise.all`. – dano Jul 31 '19 at 19:21
  • Exactly. What you quoted above just now is what is happening, but I am new to Python, so it baffles me as to why it would be slower if it is used for Machine Learning. I am assuming then that you can only get real speed from Python if you use threading? I am wondering now if the new Node worker threads would work even faster than Python threads then... I'll have to check. Thanks anyway though. – Jason Jurotich Jul 31 '19 at 21:44
  • JavaScript gets concurrency for I/O operations the same way Python does - either you use non-blocking I/O APIs, or you use background threads. JavaScript is not doing anything Python (or most other languages) can't do. Which one has more "speed" depends on the specific use case. Machine learning use cases, for example, are likely CPU-bound, so the APIs used to do I/O have no impact on their performance. You can write a highly performant Python application using only a single thread, but only if it's I/O-bound and you have async APIs to do that I/O. JavaScript is no different. – dano Jul 31 '19 at 21:59
  • ok, understood, so what you are saying is that, for some weird reason, Google may have only offered non-blocking I/O APIs in JS but not in Python, and for that reason I would be forced to use threading here? – Jason Jurotich Jul 31 '19 at 22:09
  • Yes, exactly. If you read through the GitHub issue in the first comment I left (and other issues linked to it), they go into some detail on why they haven't provided a non-blocking API for use with `asyncio`. Basically, it's not feasible to implement it while they still support Python 2.7. They say they hope to add an `asyncio`-friendly API once they drop support for 2.7, in 2020. – dano Jul 31 '19 at 22:11
  • The introduction of `asyncio` into Python has caused some interesting compatibility issues for networking libraries that were around before it existed. Unless you built your library in [a very specific way](https://sans-io.readthedocs.io/), which carefully separates the protocol from the underlying I/O layer, it's typically a lot of work to support both asynchronous I/O libraries (`asyncio`, `twisted`, `tornado`) and clients with applications using traditional, blocking I/O. Many projects still haven't caught up. – dano Jul 31 '19 at 22:15