
I want to train a PassiveAggressiveClassifier in scikit-learn in an online setting.

I was wondering if the correct way to instantiate this classifier is

PA_I_online = PassiveAggressiveClassifier(warm_start=True)

As per the docs

warm_start : bool, optional
When set to True, reuse the solution of the previous call to fit as initialization, otherwise, just erase the previous solution. 

which is what I need in an online/incremental setting: continue training the model on the next data point.

But in the example, it is instantiated as

'Passive-Aggressive': PassiveAggressiveClassifier()

Likewise in this code too

Note that, per the docs, the default value of warm_start is False.

Am I missing something?

My complete code snippet for online training is:

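import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier

# Given X_train, y_train, X_test, y_test and labels (NumPy arrays)

PA_I_online = PassiveAggressiveClassifier(loss='hinge', warm_start=True)

no_of_samples = len(X_train)
classes = np.unique(labels)  # all class labels; required on the first partial_fit call

for i in range(no_of_samples):
    # get the i-th data point
    X_i = X_train[i]
    y_i = y_train[i]

    # reshape to (1, n_features) and (1,); there are 300 features here
    X_i = X_i.reshape(1, -1)
    y_i = y_i.reshape(1,)

    # consume the data point
    PA_I_online.partial_fit(X_i, y_i, classes=classes)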

Crux: to do online training with PassiveAggressiveClassifier(), isn't it mandatory to set the argument warm_start=True?

Anuj Gupta

1 Answer


When you use partial_fit(), the model is never re-initialized, whatever the value of warm_start. The documentation passage you quote describes the fit() method, which by default resets the model parameters and trains from scratch. warm_start=True is meant for the fit() method, not for partial_fit().
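A minimal sketch of how you could check this yourself, assuming a small synthetic dataset (the data, the 100/100 split, and random_state=0 are made up for illustration). It leaves warm_start at its default of False and inspects t_, the fitted attribute that counts weight updates. If the second partial_fit() call reset the model, the counter would start over rather than keep growing.

```python
import numpy as np
from sklearn.linear_model import PassiveAggressiveClassifier

# Synthetic, linearly separable data (illustrative only).
rng = np.random.RandomState(0)
X = rng.randn(200, 5)
y = (X[:, 0] + X[:, 1] > 0).astype(int)
classes = np.unique(y)

# warm_start is left at its default value (False).
clf = PassiveAggressiveClassifier(random_state=0)

# First call: the full set of class labels must be supplied.
clf.partial_fit(X[:100], y[:100], classes=classes)
t_first = clf.t_

# Second call: training continues from the existing weights.
clf.partial_fit(X[100:], y[100:])
t_second = clf.t_

print("updates counted after first call: ", t_first)
print("updates counted after second call:", t_second)
```

The second value should be larger than the first, which is consistent with partial_fit() continuing from the current state even though warm_start is False.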

You may find this discussion useful for further details.