I'm new to artificial intelligence. I'm working with the SVM algorithm and ran this Python script, which trains a classifier and predicts whether an email is spam or not. The script works:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn import svm
# dependencies
# pip install pandas
# pip install -U scikit-learn
spam = pd.read_csv('Cartel1.csv')
z = spam['v2']
y = spam["v1"]
# Split the data into training and test sets.
z_train, z_test, y_train, y_test = train_test_split(z, y, test_size=0.2)
# Convert the text into token-count vectors using CountVectorizer.
cv = CountVectorizer()
features = cv.fit_transform(z_train)
# The classifier is named clf so that it does not shadow the imported svm module.
clf = svm.SVC()
clf.fit(features, y_train)
features_test = cv.transform(z_test)
comment = ["Sexy free Call and text messages on 08002986030"]
vect = cv.transform(comment)
print("This comment: ", comment, " is: ", clf.predict(vect))  # spam
comment2 = ["Hi there, I am emailing you today to let you know we have created a new task for you."]
vect2 = cv.transform(comment2)
print("This comment: ", comment2, " is: ", clf.predict(vect2))  # ham (not spam)
#print(clf.score(features_test, y_test))
However, I would like to inspect the trained model to get the most frequent words classified as "spam" and as "ham." I am hoping for a result similar to the one in this question: Determining the most contributing features for SVM classifier in sklearn