I am trying to run a one-class SVM study in Python (Jupyter Notebook), but I cannot get sklearn's GridSearchCV to use my predefined validation set when tuning the parameters of OneClassSVM. There is a solution for a regular SVM here: Using explicit (predefined) validation set for grid search with sklearn, but the hyperopt approach mentioned there does not work for OneClassSVM. As an example, the analysis from https://www.kaggle.com/code/amarnayak/once-class-svm-to-detect-anomaly/notebook could be redone with a validation set added. My code for that example workflow, after adding the validation set, so far:

import pandas as pd
from sklearn import svm
from sklearn.model_selection import GridSearchCV

cc = pd.read_csv("creditcard.csv")
# I observed a conflict with the column name 'Class', so I renamed it to 'Category'
cc = cc.rename(columns={'Class': 'Category'})
cc.Category.value_counts()
# For convenience, split the dataframe cc by its two labels.

nor_obs = cc.loc[cc.Category==0]    #Data frame with normal observation
ano_obs = cc.loc[cc.Category==1]    #Data frame with anomalous observation

# .loc slices by label and includes both endpoints, so the ranges below do not overlap
train_feature = nor_obs.loc[0:99999, :]  # train set
train_feature = train_feature.drop(columns='Category')
validation_feature = nor_obs.loc[100000:199999, :]  # validation set
validation_feature = validation_feature.drop(columns='Category')

X_test_1 = nor_obs.loc[200000:, :].drop(columns='Category')
X_test_2 = ano_obs.drop(columns='Category')
X_test = pd.concat([X_test_1, X_test_2])  # test set
Y_1 = nor_obs.loc[200000:, 'Category']
Y_2 = ano_obs['Category']
Y_test = pd.concat([Y_1, Y_2])  # test labels

# the one-class SVM model
oneclass = svm.OneClassSVM()
parameters = {'kernel': ('linear', 'rbf'), 'gamma': [0.1, 0.2, 0.3, 0.4, 0.5]}
# Need help here: this call neither uses validation_feature nor defines a scoring metric
clf = GridSearchCV(oneclass, parameters)
clf.fit(train_feature)
# I also need some scores for the trained model
predictions = clf.predict(X_test)

Is there a way to use GridSearchCV to optimize the parameters against the validation_feature dataset?
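For reference, here is a rough sketch of the kind of setup I have in mind, using sklearn's PredefinedSplit to hand GridSearchCV a single fixed train/validation split. It runs on synthetic data rather than creditcard.csv, and it rests on assumptions that do not hold for my code above: the validation set is assumed to contain some labelled anomalies (validation_feature has only normal rows, so there would be nothing to score against), and F1 on the outlier class is just one plausible metric. The names X_val, y_val, test_fold and final_model are made up for illustration.

```python
import numpy as np
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV, PredefinedSplit
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): normal points near the origin, anomalies far away.
X_train = rng.normal(0.0, 1.0, size=(200, 2))  # normal rows only, like train_feature
X_val = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),       # normal validation rows
    rng.normal(6.0, 1.0, size=(20, 2)),        # anomalous validation rows (assumed available)
])
# OneClassSVM.predict returns +1 for inliers and -1 for outliers, so label the same way.
y_val = np.concatenate([np.ones(100, dtype=int), -np.ones(20, dtype=int)])

# Stack train and validation rows. The labels of the train rows are placeholders:
# OneClassSVM.fit ignores y, and the train rows are never scored.
X_all = np.vstack([X_train, X_val])
y_all = np.concatenate([np.ones(len(X_train), dtype=int), y_val])

# -1 keeps a row in the training part; 0 assigns it to the single validation fold.
test_fold = np.concatenate([
    -np.ones(len(X_train), dtype=int),
    np.zeros(len(X_val), dtype=int),
])
split = PredefinedSplit(test_fold)

param_grid = {"kernel": ["linear", "rbf"], "gamma": [0.1, 0.3, 0.5]}
scorer = make_scorer(f1_score, pos_label=-1)  # F1 on the outlier class (one possible choice)

# refit=False so the search does not retrain on train + validation (which includes anomalies).
search = GridSearchCV(OneClassSVM(), param_grid, scoring=scorer, cv=split, refit=False)
search.fit(X_all, y_all)

# Refit the selected configuration on the normal training rows only.
final_model = OneClassSVM(**search.best_params_).fit(X_train)
val_predictions = final_model.predict(X_val)
print(search.best_params_, search.best_score_)
```

For the credit card data this would presumably mean moving some of the ano_obs rows into the validation set and mapping Category 0 to +1 and Category 1 to -1 before scoring. I am not sure whether this is the intended way to do it.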

AAA
