I have written code based on this site and built several multi-label classifiers.
I would like to evaluate each model by its per-class accuracy and per-class F1 score.
The problem is that I get exactly the same number for accuracy and F1 in every model.
I suspect I have done something wrong, and I would like to know under which circumstances this can happen.
The code is exactly the same as on the site, and I calculate the F1 score like this:
print('Logistic Test accuracy is {} '.format(accuracy_score(test[category], prediction)))
print('Logistic f1 measurement is {} '.format(f1_score(test[category], prediction, average='micro')))
Update 1
This is the whole code:
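import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
# stop_words must already be defined (its definition is not shown here)
df = pd.read_csv("finalupdatedothers.csv")
categories = ['ADR', 'WD', 'EF', 'INF', 'SSI', 'DI', 'others']
train, test = train_test_split(df, random_state=42, test_size=0.3, shuffle=True)
X_train = train.sentences
X_test = test.sentences
NB_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', OneVsRestClassifier(MultinomialNB(fit_prior=True, class_prior=None))),
])
for category in categories:
    print('processing {} '.format(category))
    # fit and evaluate one classifier on this category's 0/1 column
    NB_pipeline.fit(X_train, train[category])
    prediction = NB_pipeline.predict(X_test)
    print('NB test accuracy is {} '.format(accuracy_score(test[category], prediction)))
    print('NB f1 measurement is {} '.format(f1_score(test[category], prediction, average='micro')))
    print("\n")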
And this is the output:
processing ADR
NB test accuracy is 0.821963394343
NB f1 measurement is 0.821963394343
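The same coincidence shows up on a tiny hand-made example, so it may not be specific to my data. This is only a minimal sketch, assuming scikit-learn is installed. The label arrays are invented for illustration and are not taken from my dataset:

```python
from sklearn.metrics import accuracy_score, f1_score

# Made-up binary labels for a single category (illustration only)
y_true = [1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 1]

acc = accuracy_score(y_true, y_pred)
# 'micro' pools true/false positives and false negatives over both classes (0 and 1)
micro_f1 = f1_score(y_true, y_pred, average='micro')
# 'binary' scores only the positive class (label 1)
binary_f1 = f1_score(y_true, y_pred, average='binary')

print('accuracy  : {}'.format(acc))
print('micro F1  : {}'.format(micro_f1))
print('binary F1 : {}'.format(binary_f1))
```

On these toy labels accuracy and micro-averaged F1 both come out as 0.75, while average='binary' gives about 0.667, so the match seems tied to the averaging mode rather than to the classifier.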
And this is what my data looks like:
,sentences,ADR,WD,EF,INF,SSI,DI,others
0,"extreme weight gain, short-term memory loss, hair loss.",1,0,0,0,0,0,0
1,I am detoxing from Lexapro now.,0,0,0,0,0,0,1
2,I slowly cut my dosage over several months and took vitamin supplements to help.,0,0,0,0,0,0,1
3,I am now 10 days completely off and OMG is it rough.,0,0,0,0,0,0,1
4,"I have flu-like symptoms, dizziness, major mood swings, lots of anxiety, tiredness.",0,1,0,0,0,0,0
5,I have no idea when this will end.,1,0,0,0,0,0,1
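To make the layout concrete, here is a minimal sketch that parses the six sample rows above with pandas. The inline string simply copies those rows, and `sample` and `labels_per_row` are names made up for this illustration. It suggests that each category column is a 0/1 target, so every iteration of my loop fits a binary problem, and that a sentence can carry more than one label:

```python
import io

import pandas as pd

# The six sample rows shown above, copied verbatim
sample = ''',sentences,ADR,WD,EF,INF,SSI,DI,others
0,"extreme weight gain, short-term memory loss, hair loss.",1,0,0,0,0,0,0
1,I am detoxing from Lexapro now.,0,0,0,0,0,0,1
2,I slowly cut my dosage over several months and took vitamin supplements to help.,0,0,0,0,0,0,1
3,I am now 10 days completely off and OMG is it rough.,0,0,0,0,0,0,1
4,"I have flu-like symptoms, dizziness, major mood swings, lots of anxiety, tiredness.",0,1,0,0,0,0,0
5,I have no idea when this will end.,1,0,0,0,0,0,1
'''

categories = ['ADR', 'WD', 'EF', 'INF', 'SSI', 'DI', 'others']
df = pd.read_csv(io.StringIO(sample), index_col=0)

# Every label column should contain only 0 or 1
print(df[categories].isin([0, 1]).all())

# Number of positive labels per sentence (more than 1 means multi-label)
labels_per_row = df[categories].sum(axis=1)
print(labels_per_row)
```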
Why am I getting the same number?
Thanks.