Questions tagged [joblib]

Joblib is a set of tools to provide lightweight pipelining in Python.

https://joblib.readthedocs.io/en/latest/

715 questions
125 votes · 10 answers

ImportError: cannot import name 'joblib' from 'sklearn.externals'

I am trying to load my saved model from S3 using joblib: import pandas as pd import numpy as np import json import subprocess import sqlalchemy from sklearn.externals import joblib ENV = 'dev' model_d2v = load_d2v('model_d2v_version_002', ENV) def…
Praneeth Sai · 1,421
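The `sklearn.externals.joblib` shim was deprecated in scikit-learn 0.21 and removed in 0.23, so the usual fix is to depend on the standalone `joblib` package and import it directly. A minimal sketch (the filename and the stand-in model object are hypothetical):

```python
import joblib  # pip install joblib; no longer re-exported by sklearn.externals

# Persist and reload an object exactly as sklearn.externals.joblib used to.
model = {"weights": [0.1, 0.2, 0.3]}  # stand-in for a real fitted model
joblib.dump(model, "model_d2v_version_002.joblib")
restored = joblib.load("model_d2v_version_002.joblib")
```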
94 votes · 4 answers

What does the delayed() function do (when used with joblib in Python)?

I've read through the documentation, but I don't understand what is meant by: The delayed function is a simple trick to be able to create a tuple (function, args, kwargs) with a function-call syntax. I'm using it to iterate over the list I want to…
orrymr · 2,264
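In short, `delayed(f)(*args, **kwargs)` does not call `f` at all: it packages the call as a `(function, args, kwargs)` tuple that `Parallel` later executes in a worker. A quick sketch:

```python
from math import sqrt

from joblib import Parallel, delayed

# delayed(sqrt)(16) builds the tuple (sqrt, (16,), {}) instead of calling sqrt
task = delayed(sqrt)(16)

# Parallel consumes a stream of such tuples and runs them across workers,
# returning results in submission order.
results = Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(5))
```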
70 votes · 9 answers

How can we use tqdm in a parallel execution with joblib?

I want to run a function in parallel, and wait until all parallel nodes are done, using joblib. Like in the example: from math import sqrt from joblib import Parallel, delayed Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10)) But, I want…
Dror Hilman · 6,837
64 votes · 5 answers

Joblib UserWarning while trying to cache results

I get the following UserWarning when trying to cache results using joblib: import numpy from tempfile import mkdtemp cachedir = mkdtemp() from joblib import Memory memory = Memory(cachedir=cachedir, verbose=0) @memory.cache def…
user308827 · 21,227
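Note that joblib's API has moved on since this excerpt was written: `Memory(cachedir=...)` became `Memory(location=...)`. A minimal caching sketch with the current signature, using a side-effect list just to demonstrate that the cached body runs only on a miss:

```python
from tempfile import mkdtemp

from joblib import Memory

# The old Memory(cachedir=...) keyword is now Memory(location=...)
memory = Memory(location=mkdtemp(), verbose=0)

calls = []

@memory.cache
def square(x):
    calls.append(x)  # executes only on a cache miss
    return x * x

first = square(4)   # computed and persisted to disk
second = square(4)  # served from the on-disk cache; the body does not run
```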
58 votes · 11 answers

Tracking progress of joblib.Parallel execution

Is there a simple way to track the overall progress of a joblib.Parallel execution? I have a long-running execution composed of thousands of jobs, which I want to track and record in a database. However, to do that, whenever Parallel finishes a…
Cerin · 60,957
43 votes · 1 answer

Out-of-core processing of sparse CSR arrays

How can one apply some function in parallel on chunks of a sparse CSR array saved on disk using Python? Sequentially this could be done e.g. by saving the CSR array with joblib.dump, opening it with joblib.load(.., mmap_mode="r"), and processing the…
rth · 10,680
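A sketch of the dump/memory-map pattern the question describes, assuming scipy is available (the file path is hypothetical): `mmap_mode="r"` memory-maps the dense arrays backing the CSR matrix, so worker processes read pages on demand instead of copying the whole matrix.

```python
import os
from tempfile import mkdtemp

from joblib import Parallel, delayed, dump, load
from scipy import sparse

X = sparse.random(1000, 20, density=0.1, format="csr", random_state=0)
path = os.path.join(mkdtemp(), "X.joblib")
dump(X, path)

# mmap_mode="r" memory-maps X.data / X.indices / X.indptr on load
X_mm = load(path, mmap_mode="r")

def chunk_sum(X, start, stop):
    # any per-chunk function works here; summing keeps the sketch checkable
    return X[start:stop].sum()

chunk_sums = Parallel(n_jobs=2)(
    delayed(chunk_sum)(X_mm, i, i + 250) for i in range(0, 1000, 250)
)
```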
36 votes · 8 answers

How to properly pickle sklearn pipeline when using custom transformer

I am trying to pickle a sklearn machine-learning model and load it in another project. The model is wrapped in a pipeline that does feature encoding, scaling, etc. The problem starts when I want to use self-written transformers in the pipeline for…
spiral · 381
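The usual cause: pickle stores custom classes by import path, so a transformer defined in a notebook or script `__main__` cannot be resolved when another project unpickles the pipeline. Keeping the class in an importable module shared by both projects fixes it; a sketch (the module name and transformer are hypothetical):

```python
import joblib
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# In real use, put this class in e.g. my_transformers.py and import it from
# there in BOTH projects, so unpickling can resolve it by module path.
class Doubler(BaseEstimator, TransformerMixin):
    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [[2 * value for value in row] for row in X]

pipe = Pipeline([("double", Doubler()), ("clf", LogisticRegression())])
pipe.fit([[0], [1], [2], [3]], [0, 0, 1, 1])
joblib.dump(pipe, "pipeline.joblib")
restored = joblib.load("pipeline.joblib")
```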
26 votes · 6 answers

KeyError when loading pickled scikit-learn model using joblib

I have an object that contains within it two scikit-learn models, an IsolationForest and a RandomForestClassifier, that I would like to pickle and later unpickle and use to produce predictions. Apart from the two models, the object contains a couple…
haroba · 2,120
24 votes · 3 answers

Printed output not displayed when using joblib in jupyter notebook

So I am using joblib to parallelize some code and I noticed that I couldn't print things when using it inside a Jupyter notebook. I tried doing the same example in IPython and it worked perfectly. Here is a minimal (not) working example to…
24 votes · 2 answers

How to write to a shared variable in python joblib

The following code parallelizes a for-loop. import networkx as nx; import numpy as np; from joblib import Parallel, delayed; import multiprocessing; def core_func(repeat_index, G, numpy_arrary_2D): for u in G.nodes(): …
user3813057 · 891
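With joblib's default process-based backend, each worker mutates its own copy of the array, so the writes are lost. Requesting a shared-memory (thread-based) backend via `require="sharedmem"` makes the writes visible to the caller, at the cost of running under the GIL; returning results and merging them afterwards is often the better design. A sketch:

```python
import numpy as np
from joblib import Parallel, delayed

shared = np.zeros(8)

def write_cell(i, out):
    # visible to the caller only because the backend shares memory
    out[i] = i * i

Parallel(n_jobs=2, require="sharedmem")(
    delayed(write_cell)(i, shared) for i in range(8)
)
```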
24 votes · 2 answers

Why is it important to protect the main loop when using joblib.Parallel?

The joblib docs contain the following warning: Under Windows, it is important to protect the main loop of code to avoid recursive spawning of subprocesses when using joblib.Parallel. In other words, you should be writing code like this: import…
Joe · 3,831
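The background: with the spawn start method used on Windows, each worker process re-imports the main module; without the guard, that re-import would reach the `Parallel` call and spawn workers recursively. The recommended shape:

```python
from math import sqrt

from joblib import Parallel, delayed

def run():
    return Parallel(n_jobs=2)(delayed(sqrt)(i ** 2) for i in range(10))

# Workers re-import this module on Windows; the guard keeps them from
# re-entering run() and spawning subprocesses of their own.
if __name__ == "__main__":
    results = run()
```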
22 votes · 3 answers

How do I store a TfidfVectorizer for future use in scikit-learn?

I have a TfidfVectorizer that vectorizes a collection of articles, followed by feature selection. vectroizer = TfidfVectorizer() X_train = vectroizer.fit_transform(corpus) selector = SelectKBest(chi2, k = 5000 ) X_train_sel =…
user2161903 · 577
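The key point is to persist the *fitted* vectorizer itself, since refitting later on new text would produce a different vocabulary and feature order. A joblib sketch (corpus and filename hypothetical; persisting a fitted `SelectKBest` works the same way):

```python
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ["the cat sat", "the dog ran", "the cat ran"]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(corpus)
joblib.dump(vectorizer, "tfidf.joblib")  # persist the fitted object

# Later / in another process: reload and reuse the same vocabulary
vectorizer = joblib.load("tfidf.joblib")
X_new = vectorizer.transform(["the dog sat"])
```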
21 votes · 7 answers

spacy with joblib library generates _pickle.PicklingError: Could not pickle the task to send it to the workers

I have a large list of sentences (~7 millions), and I want to extract the nouns from them. I used joblib library to parallelize the extracting process, like in the following: import spacy from tqdm import tqdm from joblib import Parallel,…
Minions · 5,104
21 votes · 2 answers

How to save a scikit-learn pipeline with a Keras regressor inside to disk?

I have a scikit-learn pipeline with a KerasRegressor in it: estimators = [ ('standardize', StandardScaler()), ('mlp', KerasRegressor(build_fn=baseline_model, nb_epoch=5, batch_size=1000, verbose=1)) ] pipeline = Pipeline(estimators) After,…
Dror Hilman · 6,837
21 votes · 3 answers

Python scikit learn n_jobs

This is not a real issue, but I'd like to understand: running sklearn from the Anaconda distribution on a Win7 4-core 8 GB system, fitting a KMeans model on a 200,000 samples × 200 values table. Running with n_jobs = -1: (after adding the if __name__ ==…
Bruno Hanzen · 351