Google App Engine offers services such as Task Queues and Backends (now Modules) to parallelize request handling and to do concurrent work. Typical fan-out/fan-in (fork-join) techniques are easy to implement with the Pipeline API, Fantasm, etc., for example by enqueuing one task per work item, as sketched below.
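A minimal fan-out sketch using the Task Queue API (the `/worker` URL and the queue name are illustrative placeholders, not part of any standard setup):

```python
from google.appengine.api import taskqueue

def fan_out(work_items):
    # Enqueue one push task per item; App Engine dispatches them to
    # instances in parallel, up to the queue's configured rate.
    for item in work_items:
        taskqueue.add(queue_name='fan-out',   # hypothetical queue
                      url='/worker',          # hypothetical handler
                      params={'item': item})
```

Each task then arrives as an ordinary POST request at `/worker`, so the fan-in side typically aggregates results in the datastore or memcache.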
When configuring the hardware of Backends/Modules you choose between the B1, B2, B4 and B8 instance classes, but the documentation says nothing about the number of CPU cores, so perhaps the core count is simply not relevant here. Backends do support spawning "background threads" from incoming requests, but CPython's GIL (Global Interpreter Lock) prevents threads from executing Python bytecode in parallel, so threads can only overlap I/O, not CPU work.
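For reference, this is roughly how a long-lived background thread is started on a manual-scaling backend; `poll_and_process` is a hypothetical I/O-bound worker:

```python
import time

from google.appengine.api import background_thread

def poll_and_process():
    # Hypothetical worker loop: in a real backend this would do
    # I/O-bound work (URL fetches, datastore writes) between sleeps.
    # Because of the GIL it overlaps I/O with other threads, not CPU.
    while True:
        time.sleep(1)

# Only allowed on manual-scaling (backend) instances; the thread may
# outlive the request that spawned it.
background_thread.start_new_background_thread(poll_and_process, [])
```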
One frontend instance will handle up to 8 concurrent requests by default (configurable up to a maximum of 30 via max_concurrent_requests) before the scheduler fires up a new instance.
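In a module's configuration that knob sits under automatic_scaling; a minimal sketch, with illustrative module name and values:

```yaml
# worker.yaml -- hypothetical module configuration
module: worker
runtime: python27
api_version: 1
threadsafe: true
automatic_scaling:
  max_concurrent_requests: 30   # default is 8
```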
Python 2.7 with the threadsafe: true directive is said to handle incoming requests in parallel on a single instance. Is this correct, or is it only requests spread across independent instances that run with real concurrency?
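One way to probe this yourself, assuming a plain webapp2 app: hit the handler below with several slow, overlapping requests and compare the thread names in the responses.

```python
import threading
import time

import webapp2

class Probe(webapp2.RequestHandler):
    def get(self):
        time.sleep(5)  # keep the request open so others can overlap
        # With threadsafe: true, overlapping requests to the same
        # instance come back from different threads of one process.
        self.response.write('thread: %s\n' % threading.current_thread().name)

app = webapp2.WSGIApplication([('/probe', Probe)])
```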
So, technically: what on Google App Engine actually executes with real concurrency, and, on the other hand, which design pattern is recommended to gain the most real concurrency and scaling?
For example, you could make a manual-scaling Backend/Module with 10-20 resident B8 instances, each spawning 10 long-lived background threads and keeping 10 concurrent asynchronous URL fetches in flight at all times for the I/O work (see the sketch below), or should the work instead be fanned out with dynamically created instances?
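The async-fetch half of that idea is a small sketch with the urlfetch RPC API; since the HTTP requests overlap inside the runtime, even a single thread keeps many fetches in flight despite the GIL:

```python
from google.appengine.api import urlfetch

def fetch_concurrently(urls, deadline=20):
    # Start all fetches first; make_fetch_call returns immediately and
    # the HTTP requests proceed concurrently in the runtime.
    rpcs = []
    for url in urls:
        rpc = urlfetch.create_rpc(deadline=deadline)
        urlfetch.make_fetch_call(rpc, url)
        rpcs.append(rpc)
    # Block only when collecting the results.
    return [rpc.get_result() for rpc in rpcs]
```

So even before choosing between resident backends and dynamic fan-out, the per-instance I/O concurrency can come from async RPCs rather than from threads.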