After running the code below, I expect the output to be:

[50000000.0, 50000001.0, 50000002.0, ..., 50000019.0]

but the actual result is:
[50000000.0, 50000000.0, 50000000.0, 50000004.0, 50000004.0, 50000004.0, 50000008.0, 50000008.0, 50000008.0, 50000008.0, 50000008.0, 50000012.0, 50000012.0, 50000012.0, 50000016.0, 50000016.0, 50000016.0, 50000016.0, 50000016.0, 50000020.0]

Is there anything wrong with my code? Python 3.7.4, operating system: Windows 10.

from multiprocessing import Pool, Lock, Array, Queue
import os, time

array = Array('f', 20)
lock = Lock()


def long_time_task(i):
    print('Run task %s (%s)...' % (i, os.getpid()))
    start = time.time()

    total_count = 0
    for k in range(5*10**7): total_count += 1
    total_count += i
    lock.acquire()
    array[i] = total_count
    lock.release()

    end = time.time()
    print('Task %s runs %0.2f seconds.' % (i, (end - start)))


def init(l,a):
    global lock
    global array
    lock = l
    array = a


def mainFunc():
    print('Parent process %s.' % os.getpid())

    p = Pool(initializer=init, initargs=(lock,array))

    for i in range(20): p.apply_async(long_time_task, args=(i,))

    print('Waiting for all subprocesses done...')
    p.close()
    p.join()
    print('All subprocesses done.')

if __name__ == '__main__':

    mainFunc()
    print(array[:])
Jiawei Lu
  • Tried searching? Read https://stackoverflow.com/questions/39122270/multiprocessing-shared-array#39124004 Also please read the multiprocessing documentation about protecting main code on Windows – DisappointedByUnaccountableMod Mar 24 '20 at 08:37
  • Basically the way your code works `array` is defined locally in each process - you need to define it in the __protected__ main code and pass it as a parameter - there are examples https://stackoverflow.com/questions/39122270/multiprocessing-shared-array#39124004 – DisappointedByUnaccountableMod Mar 24 '20 at 08:41
  • @barny that's not correct, `multiprocessing.Array` is a shared synchronised array proxy, it exists to be safely shared between multiple processes. – Masklinn Mar 24 '20 at 08:56

2 Answers

You are using single-precision (32b) floats:

array = Array('f', 20)

32-bit floats have only a 24-bit significand (23 stored bits plus 1 implicit bit), which corresponds to a little over 7 decimal digits (24 * log10(2) ≈ 7.22).

Your numbers have 8 digits, so they cannot all be stored exactly in the array. Between 2^25 and 2^26, adjacent 32-bit floats are 4 apart, so each value gets rounded to the nearest multiple of 4. Use an array of 32-bit integers (typecode 'i') or 64-bit floats (typecode 'd') instead.
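The rounding can be reproduced without any multiprocessing. A minimal sketch, assuming the standard `struct` module's 'f' format behaves like the array's 'f' typecode (both are C single-precision floats); `to_float32` is a helper name made up for this illustration:

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack('f', struct.pack('f', x))[0]


# Same values the tasks in the question try to store: 5 * 10**7 + i.
values = [to_float32(5 * 10**7 + i) for i in range(20)]
print(values)

# Every stored value is a multiple of 4, because that is the spacing
# between adjacent 32-bit floats in the range [2**25, 2**26).
print(all(v % 4 == 0 for v in values))
```

The printed list should match the "weird" output from the question, which points at the typecode rather than at the process pool.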

You could have realised the issue was the array's definition (and typecode) if you'd just tried to store your values in the array directly in sequential code (without even involving multiple processes) as it exhibits the exact same behaviour.
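Such a sequential check might look like the sketch below. It assumes nothing beyond `multiprocessing.Array` itself; the `fill` helper and its parameters are hypothetical names chosen for this example.

```python
from multiprocessing import Array


def fill(typecode, n=20, base=5 * 10**7):
    """Store base + i in a shared array of the given typecode, sequentially."""
    shared = Array(typecode, n)
    for i in range(n):
        shared[i] = base + i
    return shared[:]


# 'f' loses precision; 'i' (C int) and 'd' (C double) keep every value exact.
for typecode in ('f', 'i', 'd'):
    print(typecode, fill(typecode))
```

Only the 'f' row should show the repeated values, with no worker processes involved at all.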

That aside:

  • the array is implicitly locked and you never do a read-update-write: each task just sets its own cell, so the explicit lock is redundant
  • what you're doing is a trivial map, so why aren't you using Pool.map/Pool.imap, possibly even Pool.imap_unordered?
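As a rough sketch of the `Pool.map` suggestion: the version below returns each result from the worker instead of writing into a shared array, so no `Array`, `Lock` or initializer is needed. The `task` and `main` names are made up for this example, and the counting loop from the question is replaced by a direct sum to keep it short.

```python
from multiprocessing import Pool


def task(i):
    # Stand-in for the counting loop in the question:
    # 5 * 10**7 increments, then add the task index.
    return 5 * 10**7 + i


def main():
    # Pool.map returns the results in input order as plain Python ints,
    # so nothing is squeezed through a 32-bit float.
    with Pool() as pool:
        results = pool.map(task, range(20))
    print(results)


if __name__ == '__main__':
    main()
```

`Pool.imap_unordered` would work the same way if the order of the results does not matter.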
Masklinn

After trying many possible fixes, I finally found that the array's type is wrong. You should use array = Array('i', 20), not 'f'.
output:

All subprocesses done.
[50000000, 50000001, 50000002, 50000003, 50000004, 50000005, 50000006, 50000007, 50000008, 50000009, 50000010, 50000011, 50000012, 50000013, 50000014, 50000015, 50000016, 50000017, 50000018, 50000019]

gamesun