I want to persist my ML model to my local machine. I followed this answer (https://stackoverflow.com/a/29291153/11543494) to store the ML model on my local machine, but when I load the persisted model back I get a KeyError.
I have created an ML model that predicts the category of a URL. Now I want to integrate the model with my web application, which is why I created an API using Flask.
I tested the model in a Jupyter notebook, where I have all my model-related code; now I just want to dump the model and use it in my API. In the notebook I get the expected output, but when I load the persisted file in my API code I get a KeyError. I tried pickle and joblib, but with those I got a MemoryError; I was unable to resolve that issue, so I am trying klepto.
Klepto code
from klepto.archives import dir_archive

# gs_clf is the fitted RandomizedSearchCV from the notebook:
# gs_clf = gs_clf.fit(x_train, y_train)
model = dir_archive('E:/Mayur/Sem 5/Python project/model_klepto',
                    {'result': gs_clf}, serialized=True, cached=False)
model.dump()
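For reference, the dump-then-load pattern I am aiming for can be sketched with only the standard library's pickle on a toy object (the class name and file path below are made up for illustration, not my real model):

```python
import os
import pickle
import tempfile

class ToyModel:
    """Stand-in for the fitted RandomizedSearchCV (illustration only)."""
    def predict(self, urls):
        return ['Computers' for _ in urls]

# dump step (what the notebook does)
path = os.path.join(tempfile.gettempdir(), 'toy_model.pkl')
with open(path, 'wb') as f:
    pickle.dump({'result': ToyModel()}, f)

# load step (what the API does)
with open(path, 'rb') as f:
    restored = pickle.load(f)

print(restored['result'].predict(['http://www.windows.com']))  # ['Computers']
```

With the real model this is where pickle/joblib raised the MemoryError for me, which is why I switched to klepto.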
API code
import numpy as np
from flask import Flask, request, jsonify, render_template
from klepto.archives import dir_archive

app = Flask(__name__)

demo = dir_archive(
    'E:/Mayur/Sem 5/Python project/model_klepto', {}, serialized=True, cached=False)
demo.load()

@app.route('/')
def home():
    return render_template('index.html')

@app.route('/predict', methods=['POST'])
def predict():
    features = request.form.values()  # renamed from 'input', which shadows the built-in
    final_feature = [np.array(list(features))]
    prediction = demo['result'].predict([str(final_feature)])
    return render_template('index.html', prediction_text=prediction)

if __name__ == "__main__":
    app.run(debug=True)
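Since the KeyError suggests the 'result' entry either never reached disk or the API is reading a different directory, one sanity check before involving klepto at all is to list what the archive directory actually contains. A small helper (hypothetical name, not part of klepto) for that:

```python
import os

def archive_entries(path):
    """Return the sorted on-disk entries under an archive directory,
    or None if the directory does not exist (e.g. a typo in the path)."""
    if not os.path.isdir(path):
        return None
    return sorted(os.listdir(path))

# e.g. archive_entries('E:/Mayur/Sem 5/Python project/model_klepto')
```

If this returns None or an empty list when run from the API's working environment, the problem is the path or the dump, not the load.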
When I run the API I get KeyError: 'result'.
If I run the code below in a Jupyter notebook, I get the correct output:
demo = dir_archive(
'E:/Mayur/Sem 5/Python project/model_klepto', {}, serialized=True, cached=False)
demo.load()
demo
Output>
dir_archive('model_klepto', {'result': RandomizedSearchCV(cv='warn', error_score='raise-deprecating',
estimator=Pipeline(memory=None,
steps=[('vect',
CountVectorizer(analyzer='word',
binary=False,
decode_error='strict',
dtype=<class 'numpy.int64'>,
encoding='utf-8',
input='content',
lowercase=True,
max_df=1.0,
max_features=None,
min_df=1,
ngram_range=(1,
1),
preprocessor=None,
stop_words=None,
strip_accen...
sublinear_tf=False,
use_idf=True)),
('clf',
MultinomialNB(alpha=1.0,
class_prior=None,
fit_prior=True))],
verbose=False),
iid='warn', n_iter=5, n_jobs=None,
param_distributions={'clf__alpha': (0.01, 0.001),
'tfidf__use_idf': (True, False),
'vect__ngram_range': [(1, 1), (1, 2)]},
pre_dispatch='2*n_jobs', random_state=None, refit=True,
return_train_score=False, scoring=None, verbose=0)}, cached=False)
demo['result'].predict(['http://www.windows.com'])
Output> array(['Computers'], dtype=
Here is a screenshot of the stack trace: Stack trace