import json
import os
import subprocess
import sys
import tempfile
from functools import partial
from multiprocessing import Pool


def func(item, protein, ncpu):

    output = None

    item_id = item.id

    output_fname = tempfile.mkstemp(suffix='_output.json', text=True)[1]  # [1] keeps the path, discards the open fd
    input_fname = tempfile.mkstemp(suffix='_input.pdbqt', text=True)[1]   # <-- error occurs here

    try:
        with open(input_fname, 'wt') as f:
            f.write(preprocess(item))   # <- convert item to text format, not important

        python_exec = sys.executable
        cmd = f'{python_exec} script.py -i {input_fname} -p {protein} -o {output_fname} -c {ncpu}'
        subprocess.run(cmd, shell=True)

        with open(output_fname) as f:
            res = f.read()
            if res:
                res = json.loads(res)
                output = {'score': res['score'],
                          'block': res['poses']}

    finally:
        os.unlink(input_fname)
        os.unlink(output_fname)

    return item_id, output


# inside a generator function: fan the items out across worker processes
with Pool(ncpu) as pool:
    for item_id, res in pool.imap_unordered(partial(func, **kwargs), tuple(items), chunksize=1):
        yield item_id, res

I process multiple items using multiprocessing.Pool. For every item I run a Python script in a subprocess shell. Beforehand, I create two temporary files and pass them as arguments to the script. script.py calls a C-extension which processes the item. Afterwards, I parse the output JSON file and return the values, if any. The temporary files are destroyed in a finally section. However, after processing 3880-3920 items I get this error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/pavlop/anaconda3/envs/vina_cache/lib/python3.9/multiprocessing/pool.py", line 125, in worker
  File "/home/pavlop/python/docking-scripts/moldock/vina_dock.py", line 93, in func
OSError: [Errno 24] Too many open files: '/var/tmp/pbs.147815.login/tmpp8tqblfv_input.pdbqt'
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/pavlop/python/docking-scripts/moldock/run_dock.py", line 230, in <module>
    main()
  File "/home/pavlop/python/docking-scripts/moldock/run_dock.py", line 203, in main
    for i, (item_id, res) in enumerate(docking(mols,
  File "/home/pavlop/python/docking-scripts/moldock/run_dock.py", line 74, in docking
    for item_id, res in pool.imap_unordered(partial(func, **kwargs), tuple(items), chunksize=1):
  File "/home/pavlop/anaconda3/envs/vina_cache/lib/python3.9/multiprocessing/pool.py", line 870, in next
    raise value
OSError: [Errno 24] Too many open files: '/var/tmp/pbs.147815.login/tmpp8tqblfv_input.pdbqt'

What am I doing wrong or missing? Why are the file descriptors not released? Could it be that the C-extension does not release them?

I see that the temporary files are created and removed as expected. ulimit (soft and hard) is set to 1000000. I checked all my code and all files are opened using a with statement to avoid leaks.
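
One thing worth verifying: the limit Errno 24 refers to is the per-process soft limit, which can be inspected from inside a worker. A minimal sketch (Unix-only; the resource module is in the standard library):

import resource

# the soft limit is what is enforced per process; under a batch
# scheduler the job may run with a lower limit than the one set
# in an interactive shell
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(soft, hard)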

If I replace multiprocessing.Pool with a dask cluster, everything works as expected, no errors.

UPDATE:

I checked the output of lsof. Indeed, both temporary files remain open for every item and accumulate over time in every running process, with status (deleted). So the issue is in how I manage them. However, since the ulimit is large, I should not observe this error.
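
The lsof observation can be reproduced in isolation: mkstemp opens the file and returns the descriptor, while os.unlink only removes the directory entry, which is exactly what produces the (deleted) entries. A minimal sketch (the /proc inspection is Linux-specific):

import os
import tempfile

def open_fd_count():
    # Linux-only: count this process's open file descriptors
    return len(os.listdir('/proc/self/fd'))

before = open_fd_count()
fd, name = tempfile.mkstemp(suffix='_input.pdbqt', text=True)
os.unlink(name)                  # removes the name, not the descriptor
print(open_fd_count() - before)  # still 1: the fd shows up in lsof as (deleted)
os.close(fd)                     # only this releases the descriptor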

UPDATE2:

It seems that I have to close descriptors manually. It worked on a test run, have to check on a larger run.

fd, name = tempfile.mkstemp()
try:
    ...
finally:
    os.close(fd)
    os.unlink(name)
  • You could comment out the subprocess call (and adjust the remaining code accordingly) to check if the subprocess causes the problem. – Michael Butscher Apr 29 '23 at 19:28
  • Good idea! Unfortunately the error persists, occurred after 3700 processed items. – DrDom Apr 29 '23 at 20:10
  • @MichaelButscher, the problem seems to be in how the temporary files are managed, see the update – DrDom Apr 29 '23 at 20:33
  • `mkstemp` creates a file and also returns a file descriptor to the open file (which isn't closed and accumulates). You can wrap the file descriptor in a file object (instead of creating new file objects) as described here: https://stackoverflow.com/a/1296063/987358 – Michael Butscher Apr 29 '23 at 20:58
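
A minimal sketch of the wrapping the last comment suggests: os.fdopen turns the descriptor mkstemp already opened into a file object, so the with block closes that same descriptor instead of opening a second one and leaking the first:

import os
import tempfile

fd, input_fname = tempfile.mkstemp(suffix='_input.pdbqt', text=True)
try:
    # reuse mkstemp's descriptor; closing the file object closes it
    with os.fdopen(fd, 'wt') as f:
        f.write('...')
finally:
    os.unlink(input_fname)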

1 Answer


The solution was indeed the one proposed in UPDATE2: the file descriptor must be closed explicitly; it is not enough to just remove the file.

fd, name = tempfile.mkstemp()
try:
    ...
finally:
    os.close(fd)
    os.unlink(name)
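
Applied to func from the question, the setup and cleanup would look roughly like this (a sketch reusing the question's names):

out_fd, output_fname = tempfile.mkstemp(suffix='_output.json', text=True)
in_fd, input_fname = tempfile.mkstemp(suffix='_input.pdbqt', text=True)
try:
    ...  # write the input, run the subprocess, parse the output as before
finally:
    os.close(out_fd)   # release the descriptors mkstemp opened
    os.close(in_fd)
    os.unlink(input_fname)
    os.unlink(output_fname)

Alternatively, tempfile.NamedTemporaryFile(delete=False) returns a file object whose close() releases the underlying descriptor, so there is no raw fd to track by hand.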