I'm working with an environment that generates new data at each iteration. I want to retain the model from the previous iteration and update it with the new data.
I want to understand how fit works in scikit-learn. Will it update the existing model with the new data, or will it create a new model from the new data alone?
Option 1: calling fit with only the new data at each iteration:
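from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=100)
for i in customRange:
    # get_data() is my own helper; assume it returns features, labels and test features
    new_train_data, new_train_labels, new_test_data = get_data()
    clf.fit(new_train_data, new_train_labels)  # directly fitting the new train data
    clf.predict(new_test_data)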
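To make the question concrete, here is a minimal sketch of how this could be checked empirically. The data is synthetic and stands in for my environment (the shapes and labels are assumptions): the first batch only contains labels 0 and 1, the second only labels 1 and 2. If the second call to fit kept the earlier model, label 0 should still be known afterwards.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic batches (assumed, for illustration only):
# the first batch has labels {0, 1}, the second has labels {1, 2}.
X_first = rng.normal(size=(40, 3))
y_first = np.array([0, 1] * 20)
X_second = rng.normal(size=(40, 3))
y_second = np.array([1, 2] * 20)

clf = RandomForestClassifier(n_estimators=10, random_state=0)

clf.fit(X_first, y_first)
classes_after_first = clf.classes_.tolist()

clf.fit(X_second, y_second)  # second call to fit, on the new batch only
classes_after_second = clf.classes_.tolist()

print(classes_after_first)   # labels known after the first fit
print(classes_after_second)  # labels known after the second fit
```

With default settings the second list no longer contains label 0, which suggests that a plain second call to fit replaces the earlier forest rather than extending it.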
Option 2: alternatively, is saving the full history of training data and calling fit on all of it the only solution?
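import numpy as np
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=100)
global_train_data, global_train_labels = [], []
for i in customRange:
    # get_data() is my own helper; assume it returns features, labels and test features
    new_train_data, new_train_labels, new_test_data = get_data()
    global_train_data.append(new_train_data)  # appending the new train data
    global_train_labels.append(new_train_labels)
    # fitting on all train data collected so far
    clf.fit(np.concatenate(global_train_data), np.concatenate(global_train_labels))
    clf.predict(new_test_data)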
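A possible middle ground I'm unsure about is the warm_start parameter of RandomForestClassifier. As far as I understand, with warm_start=True a later call to fit keeps the trees that are already fitted and only trains the additional ones requested by raising n_estimators. The sketch below uses synthetic data as a stand-in for get_data() (the shapes and labels are assumptions), and every batch contains the same set of labels.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
clf = RandomForestClassifier(n_estimators=10, warm_start=True, random_state=0)

for i in range(3):
    # Synthetic stand-in for get_data() (assumed shapes and labels).
    X_new = rng.normal(size=(50, 4))
    y_new = np.array([0, 1] * 25)
    clf.fit(X_new, y_new)    # only the trees not fitted yet are trained, on this batch
    clf.n_estimators += 10   # request 10 more trees for the next call

n_trees = len(clf.estimators_)
print(n_trees)  # 10 trees were added in each of the 3 iterations
```

Note that the existing trees are never updated here: each group of trees has only seen the batch it was trained on, so this is not the same as refitting on the full history.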
My goal is to train the model efficiently, so I don't want to waste CPU time re-learning what the model has already learned.
I want to confirm which approach is right, and also whether that approach behaves consistently across all scikit-learn classifiers.
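On the consistency point, one thing I can check myself is which estimators expose a partial_fit method, which scikit-learn provides on some estimators for incremental training. The sketch below is only illustrative: the four estimators are an arbitrary selection, and the data is synthetic (assumed shapes and labels).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.naive_bayes import GaussianNB

# Which of these estimators expose partial_fit?
candidates = [RandomForestClassifier(), LogisticRegression(), SGDClassifier(), GaussianNB()]
supports_incremental = {type(est).__name__: hasattr(est, "partial_fit") for est in candidates}
print(supports_incremental)

# Incremental training with an estimator that does expose partial_fit.
rng = np.random.default_rng(0)
all_classes = np.array([0, 1])  # all labels must be declared on the first call
clf = SGDClassifier(random_state=0)
for i in range(3):
    # Synthetic stand-in for get_data() (assumed shapes and labels).
    X_new = rng.normal(size=(50, 4))
    y_new = np.array([0, 1] * 25)
    clf.partial_fit(X_new, y_new, classes=all_classes)  # updates the existing model

predictions = clf.predict(rng.normal(size=(5, 4)))
```

In this selection SGDClassifier and GaussianNB have partial_fit, while RandomForestClassifier and LogisticRegression do not, so the behaviour does not look uniform across classifiers.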