I'm trying to make predictions on new data using a model I trained earlier, but I'm having trouble doing this in Python, and I want to make sure I'm going about it the right way.
My first Python file looks like this:
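import pandas
import joblib
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
# Load the training data and hold out 20% of it as a test split
dataset = pandas.read_csv('/root/Desktop/data.csv', encoding='cp1252')
test_size = 0.2
X_train_raw, X_test_raw, y_train, y_test = train_test_split(dataset['text'], dataset['age'], test_size=test_size)
# Fit the TF-IDF vocabulary on the training text, then train the classifier
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(X_train_raw)
classifier = LogisticRegression()
svm_ = classifier.fit(X_train, y_train)
# Only the fitted classifier is saved here
save = joblib.dump(svm_, 'myfile.pkl')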
My second Python file looks like this:
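import pandas
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
# Load the data to predict on and the saved classifier
datasetforprediction = pandas.read_csv('/root/Desktop/predict.csv', encoding='cp1252')
load = joblib.load('myfile.pkl')
# A new vectorizer is created and fit on the prediction text
vectorizer = TfidfVectorizer()
Test = vectorizer.fit_transform(datasetforprediction['text'])
x = load.predict(Test)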
Running the second file raises this error:
ValueError: X has 505 features per sample; expecting 18063
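The two numbers in the error fit the fact that the second file fits a fresh TfidfVectorizer on the prediction text. That gives a vocabulary of 505 terms, while the classifier was trained on the 18063 terms learned from the training text. One possible way around this, as a rough sketch and not necessarily the only fix, is to save the fitted vectorizer together with the classifier and call transform() (not fit_transform()) at prediction time. The toy texts, labels, and the file name model_and_vectorizer.pkl below are made up for illustration and stand in for data.csv, predict.csv, and myfile.pkl.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data standing in for the 'text' and 'age' columns.
train_texts = [
    "i love cartoons and toys",
    "homework and school again",
    "mortgage rates and taxes",
    "retirement plans and pension",
]
train_ages = ["young", "young", "old", "old"]
new_texts = ["school toys", "pension taxes today"]

# Training side: fit the vectorizer once, on the training text only.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
classifier = LogisticRegression().fit(X_train, train_ages)

# Save the fitted vectorizer together with the classifier
# (hypothetical file name, written to a temporary directory).
model_path = os.path.join(tempfile.mkdtemp(), "model_and_vectorizer.pkl")
joblib.dump({"vectorizer": vectorizer, "classifier": classifier}, model_path)

# Prediction side: load both objects and reuse the training vocabulary.
loaded = joblib.load(model_path)
X_new = loaded["vectorizer"].transform(new_texts)  # transform, not fit_transform
predictions = loaded["classifier"].predict(X_new)

# Both matrices now have the same number of columns.
print(X_train.shape[1], X_new.shape[1])
print(predictions)
```

Words in the new text that were not seen during training are simply ignored by transform(). Another option would be to wrap the vectorizer and classifier in a single sklearn.pipeline.Pipeline and save that one object.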