Questions tagged [online-machine-learning]

Use for questions about the online machine learning (aka out-of-core or incremental) technique. Use with the main [machine-learning] tag and with an appropriate language tag (e.g. [python]) where applicable.

Online machine learning updates the predictor incrementally as new data arrives. This differs from batch learning techniques, which generate the best predictor by learning on the entire training data set at once.

It is commonly used in areas of machine learning where training over the entire dataset is computationally infeasible, which calls for out-of-core algorithms.

It is also used when the algorithm must adapt dynamically to new patterns in the data, or when the data itself is generated as a function of time, as in stock price prediction.

In scikit-learn, for example, the SGDClassifier supports online learning through its partial_fit method.
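As a rough illustration of the SGDClassifier case, the sketch below feeds mini-batches to the model one at a time with partial_fit instead of fitting on the whole dataset at once. The synthetic data, the make_batch helper, the batch sizes, and the number of batches are assumptions made up for this example; a real stream would come from a file, a queue, or similar.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)


def make_batch(n=200):
    """Hypothetical data source: 2-D points labelled by the sign of x0 + x1."""
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y


clf = SGDClassifier(random_state=0)

# partial_fit must be told every class that can ever occur on its first call,
# because a single batch is not guaranteed to contain all of them.
classes = np.array([0, 1])

# Simulate a stream: each batch updates the existing model in place.
for _ in range(20):
    X_batch, y_batch = make_batch()
    clf.partial_fit(X_batch, y_batch, classes=classes)

# Evaluate on a fresh batch the model has not seen.
X_test, y_test = make_batch(500)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.3f}")
```

Only estimators that expose partial_fit can be updated this way; calling fit again would discard what was learned and retrain from scratch.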

19 questions
7
votes
3 answers

How to add a new class to an existing classifier in deep learning?

I trained a deep learning model to classify the given images into three classes. Now I want to add one more class to my model. I tried to check out "Online learning", but it seems to train on new data for existing classes. Do I need to train my…
3
votes
1 answer

Incremental learning in keras

I am looking for a keras equivalent of scikit-learn's partial_fit : https://scikit-learn.org/0.15/modules/scaling_strategies.html#incremental-learning for incremental/online learning. I finally found the train_on_batch method but I can't find an…
3
votes
2 answers

Gaussian Process Regression incremental learning

I am using the scikit-learn implementation of Gaussian Process Regression here and I want to fit single points instead of fitting a whole set of points. But the resulting alpha coefficients should remain the same e.g. gpr2 =…
2
votes
2 answers

Stream normalization for online clustering in evolving environments

TL;DR: how to normalize stream data, given that the whole data set is not available and you are dealing with clustering for evolving environments Hi! I'm currently studying dynamic clustering for non-stationary data streams. I need to normalize the…
1
vote
1 answer

Sink for user activity data stream to build Online ML model

I am writing a consumer that consumes (user activity data, (activityid, userid, timestamp, cta, duration) from Google Pub/Sub and I want to create a sink for this such that I can train my ML model in online fashion. Since this sink is the source…
1
vote
1 answer

How can I update a trained IsolationForest model with new datasets/dataframes in Python?

Let's say I fit IsolationForest() algorithm from scikit-learn on time-series based Dataset1 or dataframe1 df1 and save the model using the methods mentioned here & here. Now I want to update my model for new dataset2 or df2. My findings: this…
1
vote
0 answers

Keras Online Learning problem in implementation

I want LSTM to learn with newer data. It needs to update itself depending on the trend in the new data and I wish to save this say in a file. Then I wish to call this pre-fed training file into any other X,Y,Z fresh files where testing is done. So I…
1
vote
1 answer

Incremental learning in facial recognition

I am trying to implement incremental/online learning for a face recognition application. I've trained a model on a dataset and it works perfectly fine, however, I need to capture new faces(classes) over time and add them to the existing dataset. Is…
1
vote
1 answer

Sequential k-means

Can I use cluster_center coordinates from a previous Kmeans fit as an init argument to sequentially update the cluster_center coordinates as new data arrives? Are there any drawbacks to this method? UPDATED Online version of Scikit learns…
1
vote
1 answer

naive bayes classifier dynamic training

Is it possible (and how if it is) to dynamically train sklearn MultinomialNB Classifier? I would like to train(update) my spam classifier every time I feed an email in it. I want this (does not work): x_train, x_test, y_train, y_test = tts(features,…
1
vote
2 answers

Training neural network for updated data

I have a neural network which has been trained over some dataset. Say the dataset had 10k data points initially and another 100 data points are now added. Is there a way for my neural network to learn this entire (updated) dataset without training…
0
votes
0 answers

What optimization method can I use instead of Bayesian Optimization for a non-parametric online learning algorithm?

The problem I am working on is solved using Bayesian optimization for non-parametric online learning. My question is: which other methods can outperform Bayesian optimization? I haven't tried much about this since I'm a…
0
votes
1 answer

Scikit-Multiflow - Cannot take a larger sample than population when 'replace'=False

So I was trying to run the following code, where x is a feature vector with dimensions (2381,) and y is a label with dimension (1,) after being cast to a Numpy array. from skmultiflow.meta import AdaptiveRandomForestClassifier import numpy as…
0
votes
0 answers

online learning for label encoder and random forest classifier

I have a very large dataset that needs to be used for classification, I sampled the data, but that does not guarantee that I will have the whole labels in my output. How can I sample my data to cover all labels? Also, I wanted to save the label…
0
votes
0 answers

Is it possible to update a trained model in sklearn?

Is it possible, for any Python sklearn learning model, to update an already-fitted model? For example, if I have trained my model on a large set of data, is it possible to update the training by introducing new data, or does it have to be retrained…