
Background: I'm writing a program that reads video frames from multiple cameras concurrently. I'd like to have one process that performs the frame reads and a second process that writes those frames to disk. I've been trying to identify the best way (in Python 3.6) to make the frames in the "read" process available to the "write" process for saving, and I've landed on shared memory as the best option.

Problem: When I allocate shared memory for both processes, changes that the parent process makes in the shared memory space are not seen by the child process.

Previous Effort: I've been trying the method suggested in Is shared readonly data copied to different processes for multiprocessing?. However, pasting that code directly into Atom and running it under Python 2.7 or Python 3.6 does not produce the result the linked answer describes (i.e. the parent process does not see the changes made by my_func in the child process).

Note: here is the code from the linked answer:

import multiprocessing
import ctypes
import numpy as np

# Allocate a flat shared buffer of 100 doubles, then view it as a 10x10 numpy array
shared_array_base = multiprocessing.Array(ctypes.c_double, 10*10)
shared_array = np.ctypeslib.as_array(shared_array_base.get_obj())
shared_array = shared_array.reshape(10, 10)

# Parallel processing
def my_func(i, def_param=shared_array):
    shared_array[i,:] = i

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)
    pool.map(my_func, range(10))

    print(shared_array)
  • Shared memory is the most problematic to manage. You should probably look at pipes or queues. – heemayl Jul 11 '19 at 15:39
  • There's not much to manage, as one process just reads a frame and the other process writes it to disk. It also doesn't make sense to be sending raw video frames down a pipe or into a queue, as the bandwidth required is huge. Writing the frame to memory once sounds a lot more efficient, and that's important for my application (multi-camera, high FPS). – JulianPitney Jul 11 '19 at 16:05
  • How are you reading these frames? Are you OK with limiting yourself to machines with Unix-style fork semantics? – Sam Mason Jul 11 '19 at 16:38
  • I'm reading them using the Spinnaker PySpin API on Windows. I'm not sure how Windows fork semantics compare with the Unix style. – JulianPitney Jul 11 '19 at 16:55
  • Windows doesn't fork, it [spawns a new process](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods). This is why your `Array` sharing code doesn't work: the parent and children are interacting with different array objects in memory. Under Unix they'd do what you're expecting (see the sketch after these comments). I don't see any public API for that camera, so I can't comment on it. – Sam Mason Jul 11 '19 at 18:20
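A minimal sketch of the usual workaround if you want to keep `multiprocessing.Array` under spawn: pass the shared buffer to each worker through the pool's initializer, so every process wraps the same underlying memory (the init_worker name here is just for illustration):

import multiprocessing
import ctypes
import numpy as np

shared_array = None  # populated in each process by init_worker

def init_worker(shared_array_base):
    # Runs once per worker: rebuild the numpy view over the *same*
    # shared buffer that the parent allocated.
    global shared_array
    shared_array = np.ctypeslib.as_array(shared_array_base.get_obj())
    shared_array = shared_array.reshape(10, 10)

def my_func(i):
    shared_array[i, :] = i

if __name__ == '__main__':
    shared_array_base = multiprocessing.Array(ctypes.c_double, 10*10)
    pool = multiprocessing.Pool(processes=4, initializer=init_worker,
                                initargs=(shared_array_base,))
    pool.map(my_func, range(10))
    init_worker(shared_array_base)  # build the parent's own view for printing
    print(shared_array)

This works because the initializer arguments travel to the workers through the process-spawning machinery, which knows how to hand shared ctypes objects to child processes, whereas module-level code is simply re-executed from scratch in each spawned child.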

1 Answer


I've been having a play with something that should be compatible with both Unix and Windows. The first idea I came up with is using memory-mapped files, as this is basically the mechanism behind shared memory anyway.

They're pretty easy to work with in Python. For example, here's a function to create a numpy array this way:

import numpy as np
from ctypes import sizeof, c_double
from mmap import mmap, ACCESS_DEFAULT  # note: ACCESS_DEFAULT needs Python 3.7+

def shared_array(
    shape, path, mode='rb', *,
    dtype=c_double, access=ACCESS_DEFAULT
):
    # build a ctypes array type matching the requested shape,
    # e.g. shape=(5, 10) gives (c_double * 10) * 5
    for n in reversed(shape):
        dtype *= n
    with open(path, mode) as fd:
        if fd.writable():
            # grow the backing file if it's too small to hold the array
            size = fd.seek(0, 2)
            if size < sizeof(dtype):
                fd.truncate(sizeof(dtype))
        # map the file into memory; the mapping outlives the file handle
        buf = mmap(fd.fileno(), sizeof(dtype), access=access)
    # wrap the mapped buffer as a numpy array, without copying
    return np.ctypeslib.as_array(
        dtype.from_buffer(buf)
    )

That is, it creates a "shared" numpy array with a given shape (e.g. `(length,)` or `(rows, cols)`) that appears in the filesystem at the given path, with access controlled by mode (e.g. `rb` is read-only, `r+b` is read-write).

This can be used with code like:

from multiprocessing import Pool, set_start_method

def fn(shape, path):
    # re-open the same backing file in the worker and write through it
    array = shared_array(shape, path, 'r+b')
    np.fill_diagonal(array, 42)


def main():
    shape = (5, 10)
    path = 'tmp.buffer'
    # 'w+b' creates (or truncates) the backing file in the parent
    array = shared_array(shape, path, 'w+b')
    with Pool() as pool:
        pool.apply(fn, (shape, path))
    # the diagonal written by the child is visible here
    print(array)


if __name__ == '__main__':
    set_start_method('spawn')
    main()

I use Linux, so I've explicitly set the Windows-style "spawn" start method to make sure the example behaves the same way it would on Windows.
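To tie this back to the camera use case, here's a hedged sketch of how the two processes might cooperate, assuming the shared_array helper above is in the same module. The camera grab is faked with np.random (the real PySpin calls would go where the comment indicates), and a Queue carries only the tiny "frame ready" indices, never the pixel data:

from multiprocessing import Process, Queue, set_start_method
import numpy as np

FRAME_SHAPE = (480, 640)      # illustrative frame size
BUFFER_PATH = 'frame.buffer'  # backing file for the shared frame

def read_frames(queue):
    frame = shared_array(FRAME_SHAPE, BUFFER_PATH, 'r+b')
    for i in range(5):
        # stand-in for a real camera grab (e.g. via PySpin)
        frame[:] = np.random.rand(*FRAME_SHAPE)
        queue.put(i)   # only the index travels through the queue
    queue.put(None)    # sentinel: no more frames

def write_frames(queue):
    frame = shared_array(FRAME_SHAPE, BUFFER_PATH, 'r+b')
    while True:
        i = queue.get()
        if i is None:
            break
        np.save(f'frame_{i}.npy', frame)  # persist the shared buffer to disk

if __name__ == '__main__':
    set_start_method('spawn')
    shared_array(FRAME_SHAPE, BUFFER_PATH, 'w+b')  # create the backing file
    q = Queue()
    reader = Process(target=read_frames, args=(q,))
    writer = Process(target=write_frames, args=(q,))
    reader.start()
    writer.start()
    reader.join()
    writer.join()

Note there's no flow control here: the reader can overwrite the buffer before the writer has saved it, so a real version would want double buffering (e.g. one backing file per in-flight frame) or acknowledgments flowing back the other way.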

Sam Mason
  • This is working for me on Windows. (I had to cut some stuff out because it wouldn't run with Python 3.6.) I don't fully understand mmap or the detailed differences between fork and spawn, but I'm happy it's working lol. Will read up on both. Thanks for the help. +1 – JulianPitney Jul 12 '19 at 18:23
  • Would you mind explaining why using `mmap` works while `multiprocessing.Array()` does not? – JulianPitney Jul 12 '19 at 18:36
  • My last comment at the top outlines the problem: try printing `id(shared_array)` in the parent and the child process, they'll be different objects (see the sketch below). – Sam Mason Jul 12 '19 at 22:54
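A minimal sketch of that experiment, using the question's code: under spawn the module is re-imported in every worker, so each child allocates and wraps a brand-new, unrelated buffer:

import multiprocessing
import ctypes
import numpy as np

# module level, so this runs again in every spawned worker,
# creating a fresh array each time
shared_array_base = multiprocessing.Array(ctypes.c_double, 10*10)
shared_array = np.ctypeslib.as_array(shared_array_base.get_obj()).reshape(10, 10)

def show_identity(_):
    # ids are per-process addresses, so each child reports its own object
    print('child: ', id(shared_array))

if __name__ == '__main__':
    print('parent:', id(shared_array))
    with multiprocessing.Pool(processes=2) as pool:
        pool.map(show_identity, range(2))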