I have a big file with training data, and I am worried about what happens when I use this code:
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier()
for chunk in reader:  # reader yields the training data in chunks
    clf.fit(chunk, target)
Will clf produce a model based on all of the chunks, or only on the current one? For incremental learning, should I use only classifiers that have a partial_fit() method? And how should I normalize the training data in that case (build the normalizer over the whole dataset rather than over only the current chunk)?
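For example, something like this is what I imagine (just a sketch, assuming SGDClassifier as one classifier that has partial_fit(), and assuming reader yields (chunk, target) pairs and classes holds all label values; those names are placeholders, not my actual code):

from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# Sketch: incrementally update both the scaler and the classifier per chunk.
scaler = StandardScaler()
clf = SGDClassifier()
for chunk, target in reader:
    scaler.partial_fit(chunk)                     # update normalization statistics incrementally
    X = scaler.transform(chunk)                   # scale the chunk with statistics seen so far
    clf.partial_fit(X, target, classes=classes)   # update the model instead of retraining it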