
I expected scikit-learn's DP-GMM to support online updates of cluster assignments as new data arrives, but sklearn's implementation of DP-GMM only provides a batch fit method.

My understanding of variational inference is still hazy, but I think the inability to update cluster assignments online is particular to sklearn's implementation rather than a limitation of variational inference for the infinite GMM.

I would be very thankful if someone could clarify this and point to an implementation capable of online update of cluster assignments!

http://scikit-learn.org/stable/modules/generated/sklearn.mixture.DPGMM.html

rafaelvalle

1 Answer


Posting Dawen Liang's explanation:

  1. Bayesian nonparametrics is not the same as online learning. It just means determining model complexity from the data, which can happen in a batch learning setting (as in sklearn's implementation of DP-GMM).

  2. Variational inference is essentially an optimization-based method, so you can certainly apply stochastic optimization methods to it, which is what gives you the ability to do online learning. Applying stochastic variational inference to Bayesian nonparametric models is actually still an active research area.
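To make the second point concrete, here is a minimal sketch of the stochastic-update idea. It is not stochastic variational inference and not a DP-GMM: it is stepwise (online) EM on a finite 1-D Gaussian mixture with a fixed number of components, chosen only because it fits in a few lines of standard-library Python. The class `OnlineGMM1D`, its `partial_fit` method, and the step-size parameters `kappa` and `t0` are hypothetical names for illustration and are not part of scikit-learn.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


class OnlineGMM1D:
    """Finite 1-D Gaussian mixture fitted one point at a time.

    Hypothetical illustration class (stepwise EM), not a scikit-learn API.
    """

    def __init__(self, init_means, init_var=1.0, kappa=0.6, t0=10.0):
        k = len(init_means)
        self.weights = [1.0 / k] * k
        self.means = list(init_means)
        self.vars = [init_var] * k
        # Running averages of expected sufficient statistics per component:
        # s0 ~ E[z_k], s1 ~ E[z_k * x], s2 ~ E[z_k * x^2].
        self.s0 = list(self.weights)
        self.s1 = [w * m for w, m in zip(self.weights, self.means)]
        self.s2 = [w * (init_var + m * m) for w, m in zip(self.weights, self.means)]
        self.kappa = kappa  # step-size decay exponent, assumed in (0.5, 1]
        self.t0 = t0        # delay that damps the earliest steps
        self.t = 0

    def responsibilities(self, x):
        """Posterior probability of each component for a single point."""
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(self.weights, self.means, self.vars)]
        total = sum(joint)
        if total == 0.0:  # point is far from every component
            return [1.0 / len(joint)] * len(joint)
        return [j / total for j in joint]

    def partial_fit(self, x):
        """Update the model with one new point and return its cluster index."""
        self.t += 1
        rho = (self.t + self.t0) ** (-self.kappa)  # decaying step size
        resp = self.responsibilities(x)            # E-step for this point only
        for k, r in enumerate(resp):
            # Stochastic-approximation update of the sufficient statistics.
            self.s0[k] += rho * (r - self.s0[k])
            self.s1[k] += rho * (r * x - self.s1[k])
            self.s2[k] += rho * (r * x * x - self.s2[k])
            # M-step: recompute parameters from the running statistics.
            s0 = max(self.s0[k], 1e-12)
            self.weights[k] = self.s0[k]
            self.means[k] = self.s1[k] / s0
            self.vars[k] = max(self.s2[k] / s0 - self.means[k] ** 2, 1e-3)
        return max(range(len(resp)), key=resp.__getitem__)


# Stream synthetic points from two well-separated clusters, one at a time.
random.seed(0)
stream = [random.gauss(-4.0, 1.0) if random.random() < 0.5 else random.gauss(4.0, 1.0)
          for _ in range(2000)]
model = OnlineGMM1D(init_means=[-1.0, 1.0])
for x in stream:
    model.partial_fit(x)
print(model.means, model.weights)
```

Each call to `partial_fit` touches only the newest point, so cluster assignments and parameters are refreshed without refitting on the full data set. A stochastic variational treatment would follow the same pattern, with noisy natural-gradient steps on the variational parameters in place of the running sufficient statistics.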

Emily Fox's sticky HDP-HMM sampler [1]

John Paisley's group: HDP-HMM

Matt Hoffman's infinite HMM (perhaps not HDP)

[1] http://www.stat.washington.edu/~ebfox/software/HDPHMM_HDPSLDS_toolbox.zip

rafaelvalle