I have a random forest classifier stored in the object clf. In really simplified terms, I did the following:
# Import libraries
import pandas as pd
from sklearn.ensemble import RandomForestClassifier as rfc
# Import data
exog = pd.read_csv('train.csv')
trgt = pd.read_csv('target.csv')
# Declare classifier
clf = rfc(n_estimators=51, bootstrap=True, max_features=3)
# Fit classifier to data
clf.fit(exog, trgt)
I would like to export clf so I can reference it in another script. My goal is to import clf into a Python script that will be running on a remote server. I want to feed out-of-sample data into it and have it return the respective scores using clf.predict_proba(new_data).
My top priority is to avoid retraining the classifier every time I predict probabilities for a new dataset. Is there a way to export the tuned clf object?
This thread pointed me in the right direction, but the solution there uses cPickle, and when I try it I get the following error:
TypeError: write() argument must be str, not bytes
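A minimal sketch of what seems to trigger this error, assuming Python 3 (where cPickle no longer exists and pickle takes its place) and assuming the output file was opened in text mode ('w'). The dict below is only a stand-in for clf, and the file name clf.pkl is hypothetical; any picklable object, including a fitted classifier, should behave the same way with respect to the file mode.

```python
import os
import pickle
import tempfile

# Stand-in for the fitted classifier (assumption: not the real clf).
model = {"n_estimators": 51, "max_features": 3}

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "clf.pkl")

# Text mode ('w'): pickle produces bytes, but the file object expects str,
# so the dump fails with a TypeError.
try:
    with open(path, "w") as f:
        pickle.dump(model, f)
    text_mode_error = None
except TypeError as exc:
    text_mode_error = str(exc)

# Binary mode ('wb' to write, 'rb' to read) round-trips the object.
with open(path, "wb") as f:
    pickle.dump(model, f)

with open(path, "rb") as f:
    restored = pickle.load(f)
```

If the same holds for clf, the second script would only need to open the file with 'rb', call pickle.load, and then call predict_proba on the restored object, without refitting.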