
I just created an artificial neural network with Keras, and I want to pass it to the Scikit-learn function cross_val_score to train it on the X_train and y_train of a data set.

import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score

def build_classifier():
    # Two hidden ReLU layers of 16 units each, sigmoid output for binary classification
    classifier = Sequential()
    classifier.add(Dense(units = 16, kernel_initializer = 'uniform', activation = 'relu', input_dim = 30))
    classifier.add(Dense(units = 16, kernel_initializer = 'uniform', activation = 'relu'))
    classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
    classifier.compile(optimizer = 'rmsprop', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier

classifier = KerasClassifier(build_fn=build_classifier, batch_size=25, epochs=10)

results = cross_val_score(classifier, X_train, y_train, cv=10, n_jobs=-1)

The output I get is Epoch 1/1 repeated 4 times (I have 4 cores) and nothing else, because after that it gets stuck and the computation never finishes. I tested n_jobs = -1 with other Scikit-learn algorithms and it works fine. I'm not using a GPU, only the CPU.
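For comparison, the same call with the parallelism turned off does run to completion, just slowly:

# Sequential baseline: no worker processes are spawned, so nothing hangs
results = cross_val_score(classifier, X_train, y_train, cv=10, n_jobs=1)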

To test the code, just add the following normalized data set:

import pandas as pd
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
df = pd.DataFrame(data['data'])
target = pd.DataFrame(data['target'])

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df, target, test_size = 0.2, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler 
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
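After scaling, X_train is a plain NumPy array whose 30 feature columns match the input_dim = 30 of the first Dense layer:

print(X_train.shape)  # (455, 30): 455 training rows, 30 features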

After playing around with n_jobs (set to 1, 2, 3 or -1) I get some weird results, like Epoch 1/1 repeated only 3 times instead of 4 (even with n_jobs = -1). When I interrupt the kernel, here is what I get:

Process ForkPoolWorker-33:
Traceback (most recent call last):
  File "/home/myname/anaconda3/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/myname/anaconda3/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/myname/anaconda3/lib/python3.6/multiprocessing/pool.py", line 108, in worker
    task = get()
  File "/home/myname/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/pool.py", line 362, in get
    return recv()
  File "/home/myname/anaconda3/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/myname/anaconda3/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/myname/anaconda3/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt

It could be something in multiprocessing, but I don't know how to fix it.
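In the meantime, a workaround that sidesteps joblib's worker processes entirely is to run the cross-validation loop by hand with KFold. A minimal sketch, assuming the build_classifier function and the scaled X_train / y_train from above:

import numpy as np
from sklearn.model_selection import KFold

# Manual 10-fold cross-validation, run sequentially so no worker can hang
kf = KFold(n_splits=10, shuffle=True, random_state=0)
y = np.ravel(y_train)  # flatten the (n, 1) target DataFrame to a 1-D array
scores = []

for train_idx, val_idx in kf.split(X_train):
    model = build_classifier()  # fresh, untrained model for every fold
    model.fit(X_train[train_idx], y[train_idx], batch_size=25, epochs=10, verbose=0)
    _, acc = model.evaluate(X_train[val_idx], y[val_idx], verbose=0)  # returns [loss, accuracy]
    scores.append(acc)

print(np.mean(scores), np.std(scores))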

bruco

2 Answers


The above code works fine for me. Please upgrade your modules.

step 1) pip install --upgrade tensorflow

step 2) pip install --upgrade keras

I tried it and it works using the TensorFlow backend.

I have:

In [7]: sklearn.__version__
Out[7]: '0.19.1'

In [8]: keras.__version__
Out[8]: '2.2.4'

And:

import keras

/anaconda2/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters

Using TensorFlow backend.


seralouk
  • Everything is up to date: tensorflow 1.12.0, keras 2.2.4, scikit-learn 0.19.2. And I still get Epoch 1/1 repeated four times with nothing else. How can I check the health of my cores? – bruco Dec 08 '18 at 18:48
  • Just to say that it's becoming rather weird: after rerunning the same code and playing with n_jobs (= 1, 2, 3 or -1), the output is sometimes Epoch 1/1 only 3 times (even if n_jobs is -1!), and sometimes it splits the training data set before completing an epoch... I'll add to my question the output I get when I interrupt the kernel. – bruco Dec 08 '18 at 19:19

I switched to sklearn version 0.20.1.

Now the n_jobs issue "works", in the sense that the command runs to completion, and in a shorter time than with n_jobs = 1.

Nevertheless:

1) There is no noticeable improvement in computation time for n_jobs = 2 or higher

2) In some cases I get this warning:

[Parallel(n_jobs=-1)]: Using backend LokyBackend with 2 concurrent workers.
/home/my_name/anaconda3/lib/python3.6/site-packages/sklearn/externals/joblib/externals/loky/process_executor.py:706: 
UserWarning: A worker stopped while some jobs were given to the executor. 
This can be caused by a too short worker timeout or by a memory leak.
  "timeout or by a memory leak.", UserWarning

One last remark: for n_jobs != 1, the interactive per-epoch training output no longer shows up in the Jupyter notebook, but in the terminal instead (!?)
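On that last point: the stray progress output can be silenced altogether by passing verbose=0 to the wrapper, since KerasClassifier forwards it to model.fit. A minimal sketch:

# verbose=0 is forwarded to model.fit and silences the Epoch progress lines
classifier = KerasClassifier(build_fn=build_classifier, batch_size=25, epochs=10, verbose=0)
results = cross_val_score(classifier, X_train, y_train, cv=10, n_jobs=-1)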

bruco