I am trying to train a MultinomialNB classifier on a huge data set (features as well as targets, about 75k x 130k). I am aware that this classifier generates a distinct model for each target class, so I expect memory usage to explode.
However, the process won't allocate more than about 20GB of RAM even though the machine has about 640GB.
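For context, this is roughly what the training call looks like (the names train_mb and mb_classifier match the traceback below; the data here is randomly generated stand-in data, and the number of distinct labels is a guess, since I only know the overall dimensions):

import numpy as np
from scipy import sparse
from sklearn.naive_bayes import MultinomialNB

def train_mb():
    # stand-in for the real data: a sparse count matrix of roughly
    # 75k samples x 130k features, plus one label per sample
    rng = np.random.RandomState(0)
    X = sparse.random(75000, 130000, density=1e-4, format="csr",
                      random_state=rng,
                      data_rvs=lambda n: rng.randint(1, 10, n))
    y = rng.randint(0, 50000, size=X.shape[0])  # label count is an assumption

    mb_classifier = MultinomialNB()
    # first (and only) partial_fit call; this is where the MemoryError occurs
    mb_classifier.partial_fit(X, y, classes=list(set(y)))

if __name__ == "__main__":
    train_mb()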
I have tried to raise the memory-lock limit and to run the script as root (which I have to do to adjust these limits), but it doesn't help. This is the traceback I get:
Traceback (most recent call last):
  File "test_classifiers.py", line 202, in <module>
    train_mb()
  File "test_classifiers.py", line 168, in train_mb
    mb_classifier.partial_fit(X, y, list(set(y)))
  File "/usr/local/lib/python3.5/dist-packages/sklearn/naive_bayes.py", line 539, in partial_fit
    Y = label_binarize(y, classes=self.classes_)
  File "/usr/local/lib/python3.5/dist-packages/sklearn/preprocessing/label.py", line 657, in label_binarize
    Y = Y.toarray()
  File "/usr/local/lib/python3.5/dist-packages/scipy/sparse/compressed.py", line 1024, in toarray
    out = self._process_toarray_args(order, out)
  File "/usr/local/lib/python3.5/dist-packages/scipy/sparse/base.py", line 1186, in _process_toarray_args
    return np.zeros(self.shape, dtype=self.dtype, order=order)
MemoryError
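The failing line, np.zeros(self.shape, ...), is allocating a dense version of the binarized label matrix, which has shape (n_samples, n_classes). A rough estimate of that allocation (the class count here is an assumption on my part):

n_samples = 75000        # samples in my data
n_classes = 130000       # assumed upper bound on the number of distinct labels
bytes_per_entry = 8      # np.zeros with a 64-bit dtype

size_gib = n_samples * n_classes * bytes_per_entry / 1024.0 ** 3
print("dense label matrix would need about %.0f GiB" % size_gib)  # ~73 GiB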
These are the two calls I used to lift the limit:

resource.setrlimit(resource.RLIMIT_MEMLOCK, (-1, -1))

and

resource.setrlimit(resource.RLIMIT_MEMLOCK, (resource.RLIM_INFINITY, resource.RLIM_INFINITY))

Both have been tried, without success. Any ideas? Could this be related to the fact that only one CPU can be used with this classifier?
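In case it matters, this is how I inspect the limits the process actually runs with; RLIMIT_AS and RLIMIT_DATA are only my guess at the limits that could govern ordinary heap allocations, whereas RLIMIT_MEMLOCK (the one set above) restricts locked pages:

import resource

# print soft/hard limits for the limits that might be relevant here;
# -1 corresponds to resource.RLIM_INFINITY (no limit)
for name in ("RLIMIT_MEMLOCK", "RLIMIT_AS", "RLIMIT_DATA"):
    soft, hard = resource.getrlimit(getattr(resource, name))
    print("%s: soft=%s hard=%s" % (name, soft, hard))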