I'm a Java Spring boot developer and I develop 3-tier crud applications. I talked to a guy who seemed knowledgeable on the subject, but I didn't get his contact details. He was advocating for Python's FastApi, because horizontally it scales better than Spring boot. One of the reasons he mentioned is that FastApi is single-threaded. When the thread encounters a database lookup (or other work the can be done asyncly), it picks up other work to later return to the current work when the database results have come in. In Java, when you have many requests pending, the thread pool may get exhausted.
I don't understand this reasoning a 100%. Let me play the devil's advocate. When the Python program encounters an async call, it must somehow store the program pointer somewhere, to remember where it needs to continue later. I know that that place where the program pointer is stored is not at all a thread, but I have to give it some name, so let's call it a "logical thread". In Python , you can have many logical threads that are waiting. In Java, you can have a thread pool with many real threads that are waiting. To me, the only difference seems to be that Java's threads are managed at the operating system level, whereas Python's "logical threads" are managed by Python or FastApi. Why are real threads that are waiting in a thread pool so much more expensive than logical threads that are waiting? If most of my threads are waiting, why can't I just increase the thread pool size to avoid exhaustion?