tl;dr
I too see strange results where running your recursive tasks concurrently takes much more time, not less. I have no idea why.
Details
At a glance, you code looks correct, though I did not study the recursive calls. (Minor issue: My IDE says your FakeMiniMax#getResult
method is never called.)
Executors framework
But… your code is obsolete. In modern Java, we rarely need to address the Thread
class directly. Instead, we use the Executors framework added to Java 5. That framework relieves us of the chore of juggling threads and managing thread pools.
As commented by markspace, use System.nanoTime
for more accurate benchmark timing. If you really want to get serious with benchmarking, use Java Microbenchmark Harness (JMH). Also, use Duration
to track elapsed time rather than doing the math (/1_000
).
Callable
You want to gather a result. So we use Callable
rather than Runnable
.
Double
return object
Lets use an object Double
as the return value from our recursive method, rather than the primitive double
. We must do this to parameterize the type of our Callable
.
Future
objects
When submitting a task to an executor service, we get back a Future
object. This object can be used to track the completion status. We will retrieve the result of the task from this Future
object: a Double
object. As we submit our multiple tasks to the executor service, we collect each Future
object returned to us.
We can submit multiple tasks to an executor service, then wait for them to finish. The current way to do so is to first call ExecutorService#shutdown
, followed by a call to awaitTermination
with a time-out. When the submitted tasks are done or cancelled, flow-of-control moves beyond the awaitTermination
call.
Example code
Here is my revised version of your FakeMiniMax
class.
package work.basil.demo.threadmark;
import java.time.Instant;
import java.util.concurrent.Callable;
public class FakeMiniMax implements Callable < Double >
{
private double result;
// `Callable` interface
@Override
public Double call ( ) throws Exception
{
System.out.println( "INFO - The `call` method of `FakeMiniMax` is starting to run in thread ID " + Thread.currentThread().getId() + " at " + Instant.now() + "." );
return fakeMiniMax( 0 );
}
public double fakeMiniMax ( int iterationNumber )
{
int total = 0;
if ( iterationNumber > 6 )
{
this.result = 100;
return total;
} else
{
this.result = - 1;
return this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) / 20 + this.fakeMiniMax( iterationNumber + 1 ) - 45 - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) - this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 ) + this.fakeMiniMax( iterationNumber + 1 );
}
}
double getResult ( )
{
return result;
}
}
And here is the App
class that acts as a test harness.
package work.basil.demo.threadmark;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.*;
public class App
{
public static void main ( String[] args )
{
System.out.println( "INFO - The `main` method of `App` is starting to run in thread ID " + Thread.currentThread().getId() + " at " + Instant.now() + "." );
long start = System.nanoTime();
ExecutorService executorService = Executors.newFixedThreadPool( 4 ); // Executors.newSingleThreadExecutor() or Executors.newFixedThreadPool( 4 )
// Populate a list of tasks to be executed on background threads.
int countTasks = 4;
List < FakeMiniMax > tasks = new ArrayList <>( countTasks );
for ( int nthTask = 0 ; nthTask < countTasks ; nthTask++ )
{
tasks.add( new FakeMiniMax() );
}
// Submit list of tasks to executor service. Collect each task’s `Future` object.
List < Future < Double > > futures = null;
try
{
futures = executorService.invokeAll( tasks );
executorService.shutdown(); // Prevents adding any more tasks.
try { executorService.awaitTermination( Duration.ofSeconds( 10 ).toMillis() , TimeUnit.MILLISECONDS ); } catch ( InterruptedException e ) { e.printStackTrace(); }
// Review the list of `Future` objects collected for each of all the tasks.
for ( Future < Double > future : futures )
{
try { System.out.println( "future.get() = " + future.get() ); } catch ( InterruptedException e ) { e.printStackTrace(); } catch ( ExecutionException e ) { e.printStackTrace(); }
}
}
catch ( InterruptedException e )
{
e.printStackTrace();
}
finally
{
if ( Objects.nonNull( executorService ) && ! executorService.isShutdown() )
{
executorService.shutdownNow(); // Prevents adding any more tasks.
}
}
Duration elapsed = Duration.ofNanos( System.nanoTime() - start );
System.out.println( "Our " + countTasks + " tasks took " + elapsed + "." );
System.out.println( "INFO - The `main` method of `App` is ending its run in thread ID " + Thread.currentThread().getId() + " at " + Instant.now() + "." );
}
}
Notice that we can swap out our chosen implementation of ExecutorService
. We can use a single-threaded executor service to run our tasks sequentially using only a single core. Or we can switch to a multi-threaded executor service backed by a fixed pool of threads. Just change that one line of code to run this test yourself in both modes.
Test results
When I run this on a fairly busy Mac mini (way too many web browser windows & tabs open!) with an Intel chip containing six cores but no Hyper-Threading, 32 gigs RAM, and Java 16, I get the following results.
Single-threaded
With a single thread, elapsed time is about 2.5 minutes.
Notice how the timestamps show each call
method running sequentially over time. The second does not start until the first is done, and so on.
Console output:
INFO - The `main` method of `App` is starting to run in thread ID 1 at 2021-05-09T06:05:39.348246Z.
INFO - The `call` method of `FakeMiniMax` is starting to run in thread ID 14 at 2021-05-09T06:05:39.417155Z.
INFO - The `call` method of `FakeMiniMax` is starting to run in thread ID 14 at 2021-05-09T06:06:30.348844Z.
INFO - The `call` method of `FakeMiniMax` is starting to run in thread ID 14 at 2021-05-09T06:07:07.385369Z.
INFO - The `call` method of `FakeMiniMax` is starting to run in thread ID 14 at 2021-05-09T06:07:43.701939Z.
future.get() = -53874.172111640604
future.get() = -53874.172111640604
future.get() = -53874.172111640604
future.get() = -53874.172111640604
Our 4 tasks took PT2M38.344593758S.
INFO - The `main` method of `App` is ending its run in thread ID 1 at 2021-05-09T06:08:17.737893Z.
Four threads
With a thread-pool of four threads, elapsed time is 7 minutes!
I have no idea why this is taking more time rather than the single-thread time, 7 minutes versus 2.5 minutes.
Notice how the timestamps show all four tasks starting at the same moment, within the same millisecond. Then time goes by as all four do their computational work simultaneously.
Console output:
INFO - The `main` method of `App` is starting to run in thread ID 1 at 2021-05-09T05:52:53.511070Z.
INFO - The `call` method of `FakeMiniMax` is starting to run in thread ID 15 at 2021-05-09T05:52:53.540791Z.
INFO - The `call` method of `FakeMiniMax` is starting to run in thread ID 14 at 2021-05-09T05:52:53.540791Z.
INFO - The `call` method of `FakeMiniMax` is starting to run in thread ID 16 at 2021-05-09T05:52:53.541806Z.
INFO - The `call` method of `FakeMiniMax` is starting to run in thread ID 17 at 2021-05-09T05:52:53.541910Z.
future.get() = -53874.172111640604
future.get() = -53874.172111640604
future.get() = -53874.172111640604
future.get() = -53874.172111640604
Our 4 tasks took PT7M6.872664491S.
INFO - The `main` method of `App` is ending its run in thread ID 1 at 2021-05-09T06:00:00.406725Z.
Conclusion
In my experiments running a revised version of your CPU-intensive code, like you I get crazy results. Rather than saving time, concurrency takes more time.
I am left befuddled by these results. I expected using four threads with six cores would cut the time by 3/4, but instead I see the time more than double. I ran this test at least a dozen times. For the single-threaded mode, I can see one of my six cores jumps to nearly 100% utilization for the whole time. In multi-threaded mode, I clearly see all 6 cores (not 4!) jump from 10-20% utilization to about 70-90% utilization. So the threads certainly seem to be running simultaneously across cores.
I hope that some one can study this code to offer an explanation.
My only uneducated wild thought: Is there only a single Floating-Point Unit (FPU) on my 3 GHz Intel Core i5 chip being shared across cores and therefore acting as a bottleneck for all your recursive double
math operations? Even if that were the case, why so much slower?
Tip: Notice how we include a call to Instant.now
on every call to System.out.println
. Study these timestamps! When accessing System.out
across threads, the output may be interleaved in unexpected ways. Never assume the sequential lines of console happened chronologically.