4

If I have a thread that is blocked waiting for a lock, can the OS reschedule that thread to do other work until the lock becomes available? From my understanding, it cannot be rescheduled; it just sits idle until it can acquire the lock. But that just seems inefficient. If we have 100 tasks submitted to an ExecutorService, and 10 threads in the pool: if one of the threads holds a lock and the other 9 threads are waiting for that lock, then only the thread holding the lock can make progress. I would have imagined that the blocked threads could be temporarily rescheduled to run some of the other submitted tasks.
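
For concreteness, a minimal sketch of the scenario described above (the class and method names are purely illustrative):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class LockContentionDemo {
        private static final Object LOCK = new Object();

        public static void main(String[] args) throws InterruptedException {
            ExecutorService pool = Executors.newFixedThreadPool(10);
            for (int i = 0; i < 100; i++) {
                pool.submit(() -> {
                    synchronized (LOCK) {
                        // While one task holds LOCK, up to 9 other pool threads
                        // block here instead of picking up queued tasks.
                        doWork();
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }

        private static void doWork() {
            try {
                Thread.sleep(100); // stand-in for work done while holding the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }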

dreamcrash
alfer
  • Java threads and native threads are not the same thing. The OS's scheduler can do whatever it wants with native threads, and it knows nothing about your locks either. I think probably what you actually care about is how Java threads are implemented. – Michael Apr 09 '21 at 17:21
  • You mean that in theory, the native thread could be rescheduled to run the other tasks, but that Java was not designed that way and prevents it from happening? – alfer Apr 09 '21 at 17:30
  • Are you interested in a solution for such use-cases or are you interested in what Java can do? – akuzminykh Apr 09 '21 at 17:57
  • I would just like to understand how Java works, I'm not specifically looking for a solution – alfer Apr 09 '21 at 17:59
  • @alfer hehe you really got stuck with this in mind :P, basically you are getting the same answers again – dreamcrash Apr 09 '21 at 18:44
  • @dreamcrash indeed, you gave me the answer first, thanks for that. But I've also learned other interesting things by reading the answers here. – alfer Apr 09 '21 at 18:56
  • @alfer yep that is true :), you're curious and that is a good thing in a programmer – dreamcrash Apr 09 '21 at 18:58

4 Answers

5

You said:

I would have imagined that the blocked threads could be temporarily rescheduled to run some of the other submitted tasks.

Project Loom

You happen to be describing the virtual threads (fibers) being developed as part of Project Loom for future versions of Java.

Currently the OpenJDK implementation of Java uses threads from the host OS as Java threads. So the scheduling of those threads is actually controlled by the OS rather than the JVM. And yes, as you describe, on all common OSes when Java code blocks, the code’s thread sits idle.

Project Loom layers virtual threads on top of the “real” platform/kernel threads. Many virtual threads may be mapped to each real thread. Running millions of threads is possible, on common hardware.

With Loom technology, the JVM detects blocking code. That blocked code’s virtual thread is “parked” (set aside), and another virtual thread is assigned to that real thread so it gets some execution time while the parked thread awaits a response. This parking-and-switching is quite rapid, with little overhead. Blocking becomes extremely “cheap” under Loom technology.

Blocking is quite common in most run-of-the-mill business-oriented apps. Blocking occurs with file I/O, network I/O, database access, logging, console interactions, GUIs, and so on. Such apps using virtual threads are seeing huge performance gains with the experimental builds of Project Loom. These builds are available now, based on early-access Java 17. The Project Loom team seeks feedback.

Using virtual threads is utterly simple: Switch your choice of executor service.

ExecutorService executorService = Executors.newVirtualThreadExecutor();
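
For a fuller picture, here is a rough sketch of submitting many blocking tasks to such an executor. This assumes a Loom early-access build where Executors.newVirtualThreadExecutor() exists (the released API later named this Executors.newVirtualThreadPerTaskExecutor()); blockingCall() is just a hypothetical stand-in for any blocking I/O.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class VirtualThreadSketch {
        public static void main(String[] args) {
            // In Loom builds, ExecutorService is AutoCloseable; close() waits
            // for the submitted tasks to finish.
            try (ExecutorService executorService = Executors.newVirtualThreadExecutor()) {
                for (int i = 0; i < 100; i++) {
                    executorService.submit(() -> blockingCall());
                }
            }
        }

        private static void blockingCall() {
            // Hypothetical stand-in for blocking I/O. Under Loom the virtual
            // thread is parked here, freeing its carrier (platform) thread to
            // run another virtual thread.
            try {
                Thread.sleep(1_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }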

Caveat: As commented by Michael, the virtual threads managed by the JVM depend on the platform/kernel threads managed by the host OS. Ultimately, execution is scheduled by the OS even under Loom. Virtual threads are useful for those times when a blocked Java thread would otherwise be sitting idle on a CPU core. If the host computer were overburdened, the Java threads might see little execution time, with or without virtual threads.

Virtual threads are not appropriate for tasks that rarely block and are truly CPU-bound, such as encoding video. Such tasks should continue using conventional threads.

For more info, see the enlightening presentations and interviews with Ron Pressler of Oracle, or with other members of the Loom team. Look for the most recent, as Loom has evolved.

Basil Bourque
  • thank you for your reply. Why are virtual threads not appropriate for tasks that are truly CPU-bound? – alfer Apr 09 '21 at 17:56
  • Project Loom does not change how the JVM interacts with native operating system threads WRT scheduling. Loom reserves several native threads to act as the basis for virtual threads. The OS could still decide that those native threads should get no CPU time, if it decides something else is more important. OP asked whether the OS could decide how to schedule some task based on knowledge of which tasks are blocking. The answer to that is no, and under Loom will still be no. – Michael Apr 09 '21 at 18:03
  • @alfer While parking and switching virtual threads is quite cheap, it is not free of cost. There is some overhead. If the task needs constant work on the CPU without pauses, then there is no point in handing the thread over to another task. A CPU-bound task should be allowed a good chunk of time to execute before suspending. The platform/kernel threads handle this quite well today. – Basil Bourque Apr 09 '21 at 18:04
  • Java already has a tool for "rescheduling" threads; it's called the "fork/join framework". – akuzminykh Apr 09 '21 at 18:11
2

I would have imagined that the blocked threads could be temporarily rescheduled to run some of the other submitted tasks.

That's what the other threads are for. If you create X threads and Y are blocked, you have the remaining X-Y threads to do other submitted tasks. Presumably, the number X was chosen specifically to get the number of concurrent tasks that the implementation and/or programmer thought was best.

In effect, you are asking why the implementation doesn't ignore this decision. The answer is that it makes more sense to choose the number of threads sensibly than to have the implementation ignore that choice.
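
As a rough, hypothetical illustration of choosing X deliberately (the multiplier is arbitrary, not a recommendation):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PoolSizing {
        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();

            // Roughly one thread per core when tasks are CPU-bound...
            ExecutorService cpuBoundPool = Executors.newFixedThreadPool(cores);

            // ...and some larger multiple when tasks spend most of their time blocked.
            ExecutorService mostlyBlockedPool = Executors.newFixedThreadPool(cores * 10);

            cpuBoundPool.shutdown();
            mostlyBlockedPool.shutdown();
        }
    }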

David Schwartz
2

You are partially right.

In the executor-service scenario that you described, yes: nine of the threads will be blocked and only the one holding the lock will make progress. That part is true.

The part where you are not quite right is in expecting the OS and Java to behave as one. The concept of a thread exists both at the OS level and at the Java level, but they are two different things. There are Java-Threads and there are OS-Threads, and Java-Threads are implemented through OS-Threads.

Imagine it this way: the JVM has (say) 10 Java-Threads in it, some running, some not. Java borrows OS-Threads to run the Java-Threads that are runnable. Now, when a Java-Thread gets blocked (for whatever reason), what we know for sure is that the Java-Thread is blocked. We cannot easily observe what happened to the underlying OS-Thread.

The OS could reclaim the OS-Thread and use it for something else, or the OS-Thread may stay blocked; it depends. But even if the OS-Thread is reused, the Java-Thread will remain blocked. In your thread-pool scenario, the nine Java-Threads will still be blocked, and only one Java-Thread will be working.
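
As a small, hypothetical demo of that observability point: Java will report the waiting thread's state as BLOCKED, but it tells us nothing about what the OS did with the underlying native thread.

    public class BlockedStateDemo {
        private static final Object LOCK = new Object();

        public static void main(String[] args) throws InterruptedException {
            Thread holder = new Thread(() -> {
                synchronized (LOCK) {
                    sleepQuietly(5_000); // hold the lock for a while
                }
            });
            Thread waiter = new Thread(() -> {
                synchronized (LOCK) {
                    // only reached once holder releases the lock
                }
            });

            holder.start();
            Thread.sleep(100); // let holder acquire the lock first
            waiter.start();
            Thread.sleep(100); // give waiter time to reach the synchronized block

            System.out.println(waiter.getState()); // prints BLOCKED
        }

        private static void sleepQuietly(long millis) {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }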

inquisitive
  • I think the most helpful mental model, at least for me, is to not assume any specific mapping between Java threads and OS threads. Each Java thread could be backed by its own native thread, or all Java threads could be backed by the same native thread (and thus there would be no true parallelism at all), or anything in between. It's an implementation detail of the JVM. – Michael Apr 09 '21 at 18:29
  • @Michael That is a common mistake that I see a lot around here – dreamcrash Apr 09 '21 at 18:43
  • @Michael What implementations of Java do not map each Java thread to an OS thread? – Basil Bourque Apr 09 '21 at 19:22
  • @BasilBourque I never said there was one, although with dozens of JVMs out there, including for purposes like embedded systems, I certainly wouldn't rule it out. I said it's usually not helpful to think of it in those terms, the same way it's usually not helpful to think about what native instructions bytecode will be mapped to. The JVM provides an abstraction, and most of the time it's convenient to think of it as a black box. That's the point of an abstraction. – Michael Apr 09 '21 at 20:37
  • @inquisitive the 9 Java threads will remain blocked but not the OS threads, so if I define a second thread pool and submit tasks to it, the OS will be able to start them? Is this also the reason why we define custom ForkJoinPools when using parallel streams, to avoid having too many parallel streams using the common ForkJoinPool and blocking? – alfer Apr 19 '21 at 18:10
  • **(1)** > _if I define a second thread pool_ :: Yes OS will be able to start them - but only if they do not use the same lock. If they block on the same lock that threads on the first pool are waiting on, then the threads in the new pool will also block. | **(2)** > _Java threads will remain blocked but not the OS theads_ :: We cannot say for sure. We cannot easily observe what happened to the underlying OS-Thread. They _may_ get blocked or _may not_, there is no easy way to tell. – inquisitive Apr 20 '21 at 04:07
1

If I have a thread that is blocked waiting for a lock, can the OS reschedule that thread to do other work until the lock becomes available? From my understanding, it cannot be rescheduled; it just sits idle until it can acquire the lock. But that just seems inefficient.

I think you are thinking about this entirely wrong. Just because 10 of your 20 threads are "idle" doesn't mean that the operating system (or the JVM) is somehow consuming resources trying to manage those idle threads. Although in general we work on our applications to make sure that our threads are as unblocked as possible so we can achieve the highest throughput, there are plenty of times when we write threads that we expect to be idle most of the time.

If we have 100 tasks submitted to an ExecutorService, and 10 threads in the pool: if one of the threads holds a lock and the other 9 threads are waiting for that lock, then only the thread holding the lock can make progress. I would have imagined that the blocked threads could be temporarily rescheduled to run some of the other submitted tasks.

It is not the threads which are rescheduled; it is the CPU resources of the system. If 9 of your 10 threads are blocked in your thread-pool, then other threads in your application (the garbage collector, for example) or other processes can be given more CPU resources on your server. This switching between jobs is what modern operating systems are really good at, and it happens many, many times a second. This is all quite normal.

Now, if your question is really "how do I improve the throughput of my application?" then you are asking the right question. First off, you should make your locks as fine-grained as possible, so that the thread holding a lock does so for a minimal amount of time. If blocking happens too often, then you should consider increasing the number of threads in the pool so that there is a higher likelihood that some of the jobs will run concurrently. This optimizing of the number of threads in a thread-pool is very application-specific. See my post here for some more details: Concept behind putting wait(),notify() methods in Object class
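
To illustrate the fine-grained-locking point, here is a rough sketch (all names are illustrative): do the expensive part of the work outside the lock and hold the lock only while touching shared state.

    import java.util.ArrayList;
    import java.util.List;

    class FineGrainedLocking {
        private final Object sharedLock = new Object();
        private final List<Long> sharedResults = new ArrayList<>();

        // Coarse: the lock is held for the whole task, so other pool threads
        // serialize even on work that touches no shared state.
        void coarse(long input) {
            synchronized (sharedLock) {
                long r = expensiveComputation(input);
                sharedResults.add(r);
            }
        }

        // Finer-grained: compute outside the lock; hold it only for the shared update.
        void finer(long input) {
            long r = expensiveComputation(input);
            synchronized (sharedLock) {
                sharedResults.add(r);
            }
        }

        private long expensiveComputation(long input) {
            return input * input; // stand-in for real CPU-heavy work
        }
    }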

Another thing you might consider is breaking your jobs up into pieces to separate the pieces that can run concurrently from the ones that need to be synchronized. You could have a pool of 10 threads doing the concurrent work and then a single thread doing the operations that require the locks. This is why the ExecutorCompletionService was written: so that something downstream can take the results of a thread-pool and act on them as they complete. This will make your program more complicated, and you'll need to worry about your queues if you are talking about a large number of jobs or a large number of results, but you can dramatically improve throughput if you do this right.

A good example of such refactoring is a situation where you have a processing job that has to write its results to a database. If, at the end of each job, each thread in the pool needs to get a lock on the database connection, then there will be a lot of contention for the lock and less concurrency. If, instead, the processing were done in a thread-pool and there was a single database-update thread, that thread could turn off auto-commit and make updates from multiple jobs in a row between commits, which could dramatically increase throughput. Then again, using multiple database connections managed by a connection pool might be a fine solution.
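
Here is a rough sketch of that shape using ExecutorCompletionService; process() and saveToDatabase() are hypothetical stand-ins for the concurrent work and the single-threaded write/commit step.

    import java.util.concurrent.CompletionService;
    import java.util.concurrent.ExecutorCompletionService;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PipelineSketch {
        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(10);
            CompletionService<String> completion = new ExecutorCompletionService<>(pool);

            int jobs = 100;
            for (int i = 0; i < jobs; i++) {
                final int id = i;
                completion.submit(() -> process(id)); // concurrent, lock-free processing
            }

            // Single downstream consumer: takes results as they complete and writes
            // them serially, so the worker threads never contend over the writer.
            for (int i = 0; i < jobs; i++) {
                String result = completion.take().get();
                saveToDatabase(result);
            }
            pool.shutdown();
        }

        private static String process(int id) {
            return "result-" + id; // stand-in for the real processing work
        }

        private static void saveToDatabase(String result) {
            // stand-in: e.g. batch several results on one connection between commits
        }
    }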

Gray