In scikit-learn 0.24.0 or above, when you use either GridSearchCV or RandomizedSearchCV with n_jobs=-1, no progress messages are printed, whatever verbose value you set (1, 2, 3, or 100). With scikit-learn 0.23.2 or lower, everything works as expected and joblib prints the progress messages.
Here is sample code you can use to reproduce my experiment in Google Colab or Jupyter Notebook:
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV
iris = datasets.load_iris()
parameters = {'kernel':('linear', 'rbf'), 'C':[0.1, 1, 10]}
svc = svm.SVC()
clf = GridSearchCV(svc, parameters, scoring='accuracy', refit=True, n_jobs=-1, verbose=60)
clf.fit(iris.data, iris.target)
print('Best accuracy score: %.2f' % clf.best_score_)
Results using scikit-learn 0.23.2:
Fitting 5 folds for each of 6 candidates, totalling 30 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 40 concurrent workers.
[Parallel(n_jobs=-1)]: Done 1 tasks | elapsed: 0.0s
[Parallel(n_jobs=-1)]: Batch computation too fast (0.0295s.) Setting batch_size=2.
[Parallel(n_jobs=-1)]: Done 2 out of 30 | elapsed: 0.0s remaining: 0.5s
[Parallel(n_jobs=-1)]: Done 3 out of 30 | elapsed: 0.0s remaining: 0.3s
[Parallel(n_jobs=-1)]: Done 4 out of 30 | elapsed: 0.0s remaining: 0.3s
[Parallel(n_jobs=-1)]: Done 5 out of 30 | elapsed: 0.0s remaining: 0.2s
[Parallel(n_jobs=-1)]: Done 6 out of 30 | elapsed: 0.0s remaining: 0.2s
[Parallel(n_jobs=-1)]: Done 7 out of 30 | elapsed: 0.0s remaining: 0.1s
[Parallel(n_jobs=-1)]: Done 8 out of 30 | elapsed: 0.0s remaining: 0.1s
[Parallel(n_jobs=-1)]: Done 9 out of 30 | elapsed: 0.0s remaining: 0.1s
[Parallel(n_jobs=-1)]: Done 10 out of 30 | elapsed: 0.0s remaining: 0.1s
[Parallel(n_jobs=-1)]: Done 11 out of 30 | elapsed: 0.0s remaining: 0.1s
[Parallel(n_jobs=-1)]: Done 12 out of 30 | elapsed: 0.0s remaining: 0.1s
[Parallel(n_jobs=-1)]: Done 13 out of 30 | elapsed: 0.0s remaining: 0.1s
[Parallel(n_jobs=-1)]: Done 14 out of 30 | elapsed: 0.0s remaining: 0.1s
[Parallel(n_jobs=-1)]: Done 15 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 16 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 17 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 18 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 19 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 20 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 21 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 22 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 23 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 24 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 25 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 26 out of 30 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 27 out of 30 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 28 out of 30 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 30 out of 30 | elapsed: 0.1s remaining: 0.0s
[Parallel(n_jobs=-1)]: Done 30 out of 30 | elapsed: 0.1s finished
Best accuracy score: 0.98
Results using scikit-learn 0.24.0 (tested up to v1.0.2):
Fitting 5 folds for each of 6 candidates, totaling 30 fits
Best accuracy score: 0.98
It appears to me that scikit-learn 0.24.0 and above do not pass the "verbose" value on to joblib,
so no progress is printed when GridSearchCV or RandomizedSearchCV runs on multiple processes with the "loky" backend.
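One rough way to check this guess on whatever version is installed is to look at the source of the search's fit method and see whether the joblib Parallel object it creates receives a verbose argument. This is only a sketch: it assumes that fit builds the Parallel object directly in its body with a call written as Parallel(...), which is an implementation detail and may differ between releases. The names parallel_call and passes_verbose are just illustrative.

```python
import inspect
import re

import sklearn
from sklearn.model_selection import GridSearchCV

# Source of the fit method that GridSearchCV inherits from its base class.
source = inspect.getsource(GridSearchCV.fit)

# Assumption: fit() constructs joblib's Parallel inline as "Parallel(...)".
match = re.search(r"\bParallel\((.*?)\)", source, flags=re.DOTALL)
parallel_call = match.group(0) if match else None

# True only if a Parallel(...) call was found and it mentions "verbose".
passes_verbose = parallel_call is not None and "verbose" in parallel_call

print("scikit-learn version:", sklearn.__version__)
print("Parallel call found in fit():", parallel_call)
print("verbose passed to Parallel:", passes_verbose)
```

If no Parallel(...) call is found, the check is inconclusive rather than proof either way, since the call may simply live somewhere else in that release.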
Any idea how to solve this in Google Colab or Jupyter Notebook and get the progress log printed with scikit-learn 0.24.0 or above?