Reading and updating the dictionary is not thread-safe, because the read-modify-write is not atomic: another thread can update the value between the read and the write. Here's an example you can run locally to see the effect in practice:
from threading import Thread

def add_to_counter(ctr):
    for i in range(100000):
        ctr['ctr'] = ctr.get('ctr', 0) + 1

ctr = {}
t1 = Thread(target=add_to_counter, args=(ctr,))
t2 = Thread(target=add_to_counter, args=(ctr,))
t1.start()
t2.start()
t1.join()
t2.join()
print(ctr['ctr'])
The results obviously depend on the scheduling and other system/timing-dependent details, but on my system I consistently get different numbers under 200000.
Solution 1: Locks
You could require the threads to acquire a lock every time before they modify the dictionary. This slows down execution somewhat, since the threads spend part of their time waiting for the lock.
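Here's a minimal sketch of what that could look like, using threading.Lock; the extra lock argument is an addition for illustration, not part of the original example. Holding the lock across the whole read-modify-write makes the increment effectively atomic:

from threading import Thread, Lock

def add_to_counter(ctr, lock):
    for i in range(100000):
        # Hold the lock for the whole read-modify-write, so no other
        # thread can slip in between the get and the assignment.
        with lock:
            ctr['ctr'] = ctr.get('ctr', 0) + 1

ctr = {}
lock = Lock()
t1 = Thread(target=add_to_counter, args=(ctr, lock))
t2 = Thread(target=add_to_counter, args=(ctr, lock))
t1.start()
t2.start()
t1.join()
t2.join()
print(ctr['ctr'])  # 200000 every time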
Solution 2: Sum the counters at the end
Depending on your exact use case, you might be able to assign a separate counter to each thread, and sum the counts together after the threads have finished counting. The dictionary-like collections.Counter allows you to easily add two counters together (here's the above example modified to use Counters):
from collections import Counter
from threading import Thread

def add_to_counter(counter):
    for i in range(100000):
        counter['ctr'] = counter.get('ctr', 0) + 1

ctr1 = Counter()
ctr2 = Counter()
t1 = Thread(target=add_to_counter, args=(ctr1,))
t2 = Thread(target=add_to_counter, args=(ctr2,))
t1.start()
t2.start()
t1.join()
t2.join()
ctr = ctr1 + ctr2
print(ctr['ctr'])