I'm trying to simulate some processes in order to gather statistics. Since each test run is independent, I decided to write the simulation program using multiple threads.
This means that if I need to perform, say, 1000 test runs, I should be able to use 4 threads, each doing 250 test runs.
While doing this, I found that adding more threads does not decrease the simulation time.
I have a Windows 10 laptop with 4 physical cores.
I wrote a simple program that demonstrates the behaviour I'm talking about.

    from concurrent.futures import ThreadPoolExecutor
    import time
    import psutil
    import random

    def runScenario():
        d = dict()
        for i in range(0, 10000):
            rval = random.random()
            if rval in d:
                d[rval] += 1
            else:
                d[rval] = 1
        return len(d)

    def runScenarioMultipleTimesSingleThread(taskId, numOfCycles):
        print('starting thread {}, numOfCycles is {}'.format(taskId, numOfCycles))
        sum = 0
        for i in range(numOfCycles):
            sum += runScenario()
        print('thread {} finished'.format(taskId))
        return sum

    def modelAvg(numOfCycles, numThreads):
        pool = ThreadPoolExecutor(max_workers=numThreads)
        cyclesPerThread = int(numOfCycles / numThreads)
        numOfCycles = cyclesPerThread * numThreads
        futures = list()
        for i in range(numThreads):
            future = pool.submit(runScenarioMultipleTimesSingleThread, i, cyclesPerThread)
            futures.append(future)
        sum = 0
        for future in futures:
            sum += future.result()
        return sum / numOfCycles

    def main():
        p = psutil.Process()
        print('cpus:{}, affinity{}'.format(psutil.cpu_count(), p.cpu_affinity()))
        start = time.time()
        modelAvg(numOfCycles=10000, numThreads=4)
        end = time.time()
        print('simulation took {}'.format(end - start))

    if __name__ == '__main__':
        main()

These are the results:

One thread:

    cpus:8, affinity[0, 1, 2, 3, 4, 5, 6, 7]
    starting thread 0, numOfCycles is 10000
    thread 0 finished
    simulation took 23.542529582977295

Four threads:

    cpus:8, affinity[0, 1, 2, 3, 4, 5, 6, 7]
    starting thread 0, numOfCycles is 2500
    starting thread 1, numOfCycles is 2500
    starting thread 2, numOfCycles is 2500
    starting thread 3, numOfCycles is 2500
    thread 1 finished
    thread 2 finished
    thread 0 finished
    thread 3 finished
    simulation took 23.508538484573364

I expected that with 4 threads the simulation time would ideally be about 4 times smaller, and of course it should not be the same as with one thread.
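
To put numbers on that expectation, here is a rough calculation based on the two timings reported above. It is only a sketch: the variable names and the `speedup` helper are made up for illustration, and the "ideal" figure assumes perfectly linear scaling with no overhead, which is an assumption rather than something I measured.

```python
# Timings copied from the two runs reported above (seconds).
single_thread_time = 23.542529582977295
four_thread_time = 23.508538484573364
num_threads = 4


def speedup(baseline, measured):
    """Return how many times faster `measured` is than `baseline`."""
    return baseline / measured


# Assumption: ideal scaling divides the single-thread time evenly by the thread count.
ideal_time = single_thread_time / num_threads
actual_speedup = speedup(single_thread_time, four_thread_time)

print('ideal time with {} threads: {:.2f} s'.format(num_threads, ideal_time))
print('measured speedup: {:.3f}x (ideal would be {}x)'.format(actual_speedup, num_threads))
```

In other words, I was hoping for a run time of roughly a quarter of the single-thread figure, but the measured speedup is essentially 1x.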