25

I am facing a problem now: I am unable to run any program on the cluster. Every run fails with the following error:

OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max
OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max
Traceback (most recent call last):
  File "hello-world.py", line 1, in <module>
    from keras.models import Sequential
  File "/home/amalli2s/anaconda3/lib/python3.6/site-packages/keras/__init__.py", line 3, in <module>
    from . import utils
  File "/home/amalli2s/anaconda3/lib/python3.6/site-packages/keras/utils/__init__.py", line 2, in <module>
    from . import np_utils
  File "/home/amalli2s/anaconda3/lib/python3.6/site-packages/keras/utils/np_utils.py", line 6, in <module>
    import numpy as np
  File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/__init__.py", line 142, in <module>
    from . import add_newdocs
  File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/add_newdocs.py", line 13, in <module>
    from numpy.lib import add_newdoc
  File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/lib/__init__.py", line 8, in <module>
    from .type_check import *
  File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/lib/type_check.py", line 11, in <module>
    import numpy.core.numeric as _nx
  File "/home/amalli2s/.local/lib/python3.6/site-packages/numpy/core/__init__.py", line 16, in <module>
    from . import multiarray
SystemError: initialization of multiarray raised unreported exception

I assume this problem is the same as this one: Multiple instances of Python running simultaneously limited to 35

So, following that solution, I set export OPENBLAS_NUM_THREADS=1

but then I end up getting the following error:

terminate called after throwing an instance of 'std::system_error'
  what():  Resource temporarily unavailable
Aborted

Is anybody else facing the same issue, or does anyone have an idea how to solve this? Thank you.

EDIT: OK, it seems this happened because of some configuration restrictions the admins were trying to implement. Now it works again.

Arka Mallick

6 Answers

60

I had this problem running numpy on an Ubuntu server. I got all of the following errors, depending on whether I tried to import numpy in a shell or by running my Django app:

  • PyCapsule_Import could not import module "datetime"
  • from numpy.core._multiarray_umath import (...)
  • OpenBLAS blas_thread_init: pthread_create failed for thread 25 of 32: Resource temporarily unavailable

I'm posting this answer since it drove me crazy. What helped for me was to add:

import os
os.environ['OPENBLAS_NUM_THREADS'] = '1'

before

import numpy as np

I guess the server has some limit on the number of threads it allows(?). Hope it helps someone!
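
A minimal sketch of the ordering this requires (the assertion is just an illustrative guard I'm adding): the environment variable must be set before numpy is first imported anywhere in the process, or OpenBLAS will already have spawned its thread pool.

import os
import sys

# Illustrative guard (my addition): if numpy is already imported, setting the
# variable comes too late, because OpenBLAS has already started its threads.
assert 'numpy' not in sys.modules, "set OPENBLAS_NUM_THREADS before importing numpy"

os.environ['OPENBLAS_NUM_THREADS'] = '1'  # limit OpenBLAS to a single thread

import numpy as np  # OpenBLAS now initializes with one thread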

Ylor
  • Thanks, this helped me a lot – DovaX Mar 27 '20 at 02:47
  • Thank you @Ylor and @Haramoz. For me, this solution worked when adding `import os` and `os.environ['OPENBLAS_NUM_THREADS'] = '1'` before importing pandas, all in the app.py file of my Flask application – Corina Roca Jan 21 '21 at 16:27
  • To add a bit more to this, the error above is commonly encountered when using multithreading in Python, alongside OpenBLAS. In their [FAQ](https://github.com/xianyi/OpenBLAS/wiki/Faq#multi-threaded), OpenBLAS suggests this as a solution. [This](https://github.com/obspy/obspy/wiki/Notes-on-Parallel-Processing-with-Python-and-ObsPy) post also has some useful information. Both of the above were cribbed from [this answer](https://stackoverflow.com/a/56113154/5843188). – jpgard Feb 16 '22 at 00:49
11

This is for others who encounter this error in the future. The cluster setup most likely limits the number of processes that a user can run on an interactive node. The clue is in the second line of the error:

OpenBLAS blas_thread_init: pthread_create: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 64 current, 64 max

Here the limit is set to 64. While this is quite sufficient for normal CLI use, it's probably not sufficient for interactively running Keras jobs (like the OP), or, in my case, for trying to run an interactive Dask cluster.

It may be possible to increase the limit from your shell with, say, ulimit -u 10000, but that's not guaranteed to work. It's best to notify the admins, as the OP did.
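
If you want to check this limit from Python rather than from the shell, here is a minimal sketch (Linux-only; the resource module is in the standard library):

import resource

# soft and hard correspond to the "current" and "max" values
# in the OpenBLAS RLIMIT_NPROC message above.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
print("RLIMIT_NPROC: current=%d, max=%d" % (soft, hard))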

suvayu
  • I was also trying to run a Keras program interactively when I faced the error. The cluster admin then removed the restrictions, and I was able to use it thereafter. – Arka Mallick Feb 18 '19 at 13:02
  • In my case: `OpenBLAS blas_thread_init: pthread_create failed for thread 3 of 24: Resource temporarily unavailable OpenBLAS blas_thread_init: RLIMIT_NPROC 4096 current, 1540106 max` – Jingnan Jia Jan 16 '21 at 03:32
8

Often this issue is related to the limit on the number of processes, which you can inspect with ulimit (on Linux):

→ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127590
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096         # <------------------culprit
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

A temporary solution is to increase this limit:

ulimit -u unlimited

Most servers I've encountered have this value set to the number of pending signals (see ulimit -i). So, in my example above, I did instead:

ulimit -u 127590

Then I added a line to my ~/.bashrc file to set it on login.

For more info on how to permanently fix this, check out: https://serverfault.com/a/485277
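
If you cannot change the shell configuration, the same adjustment can be attempted from inside Python; a sketch of a per-process analogue of ulimit -u, under the assumption that the hard limit is higher than the soft one (an unprivileged process may only raise the soft limit up to the hard limit):

import resource

# Raise the soft process limit to the hard limit for this process
# and any children it spawns; going beyond the hard limit requires root.
soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
resource.setrlimit(resource.RLIMIT_NPROC, (hard, hard))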

mimoralea
3

If you are an administrator, you can:

  1. temporarily change the process limit with the command ulimit -u [number]

  2. permanently change the process limit, i.e. the nproc parameter in /etc/security/limits.conf

If you are a user, you can:

  1. In bash:
    $ export OPENBLAS_NUM_THREADS=2
    $ export GOTO_NUM_THREADS=2
    $ export OMP_NUM_THREADS=2
  2. In Python:
    >>> import os
    >>> os.environ['OPENBLAS_NUM_THREADS'] = '1'

Then the problem caused by multiple threads in Python should be solved. The key is to set the number of threads below the limit that applies to you on the cluster. A consolidated sketch of the Python variant follows.
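
The values below are illustrative; setdefault keeps any value you already exported in the shell, and the lines must run before numpy, scipy, or pandas are first imported:

import os

# Must run before the first import of numpy/scipy/pandas in this process.
for var in ('OPENBLAS_NUM_THREADS', 'GOTO_NUM_THREADS', 'OMP_NUM_THREADS'):
    os.environ.setdefault(var, '2')  # illustrative value; stay below your limit

import numpy as np

In a Jupyter notebook, the same lines have to run in the very first cell, before anything has imported numpy; setting the variables afterwards has no effect on an already-initialized OpenBLAS (which is likely why the approach appears to fail in the comment below).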

willcrack
MrLi
  • This works from a bash terminal, but from a Jupyter notebook it's still failing. I set these environment variables using the following script: import os; os.environ['OPENBLAS_NUM_THREADS']='4'; os.environ['GOTO_NUM_THREADS']='4'; os.environ['OMP_NUM_THREADS']='4' – Parvez Khan Aug 24 '23 at 05:05
1

Building on Ylor's answer, rather than limiting yourself to a single thread, read through the error outputs (here are the first few lines of mine):

OpenBLAS blas_thread_init: pthread_create failed for thread 13 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 2048 current, 384066 max
OpenBLAS blas_thread_init: pthread_create failed for thread 58 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC 2048 current, 384066 max
...

and find the lowest thread number that failed; then set the thread count to one fewer (12 for me here):

>>> import os
>>> os.environ['OPENBLAS_NUM_THREADS'] = '12'

This will maximize your code's ability to use threads while still staying within the current system limits (if you are unable to change them).
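
A hypothetical helper along these lines (the function name and the idea of capturing stderr text are my own additions): it scans the captured OpenBLAS error output for the failed thread numbers and returns one fewer than the lowest.

import re

def safe_thread_count(stderr_text, default=1):
    # Extract N from lines like "pthread_create failed for thread N of M".
    failed = [int(m.group(1)) for m in re.finditer(
        r'pthread_create failed for thread (\d+) of \d+', stderr_text)]
    # One fewer than the lowest failing thread is the largest safe count.
    return min(failed) - 1 if failed else default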

1

I had this exact problem in Docker; it seems to be an issue with older versions.

In case anyone has this problem and cannot upgrade their Docker version, the workaround I found is to add the following option to the docker run command:

--security-opt seccomp=unconfined
vlizana