I have an unknown continuous probability distribution p(x) that is expensive to sample from, but cheap to evaluate, and I would like to estimate its differential entropy. Some other details that might not matter are that x is 9-dimensional, and that the distribution is likely multi-modal with an unknown number of modes. I would prefer a solution in Python, ideally one that is PyTorch compatible.
Currently, I have a number (~1000) of x samples proposed from some distribution that is cheap to sample and evaluate (e.g. uniform or Gaussian), and I can evaluate each p(x) easily. I roughly know the bounds of where p(x) is "high". My idea for estimating the entropy is either:
- fit a GMM to the weighted samples, then estimate the entropy of the GMM
- duplicate the sampled x according to their probability, then estimate the entropy of the samples using KDE methods
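In either case, I am assuming the weights should be the self-normalized importance ratios w_i ∝ p(x_i)/q(x_i), where q is the proposal density; normalizing them to sum to 1 also means an unknown normalizing constant in p would cancel. A minimal PyTorch sketch of what I mean:

```python
import torch

def importance_weights(log_p, log_q):
    """Self-normalized importance weights for samples x_i drawn from a proposal q.

    log_p, log_q: 1-D tensors holding log p(x_i) and log q(x_i).
    Normalizing the weights to sum to 1 means an unknown normalizing
    constant in p cancels out.
    """
    log_w = log_p - log_q
    log_w = log_w - torch.logsumexp(log_w, dim=0)  # normalize in log space
    return log_w.exp()
```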
For option 1, I would prefer not to have to specify the number of GMM components in advance.
- sklearn has the Dirichlet Process Gaussian Mixture Model, which has the intended behavior, but there is no API for fitting to weighted samples (a resampling workaround is sketched after this list). There is an open pull request for doing so: https://github.com/scikit-learn/scikit-learn/pull/17130
- This standalone repository https://github.com/ktrapeznikov/dpgmm may be what I need - I will update this question after testing it (edit: it's out of date and refers to sklearn internals, so it is not usable)
- pomegranate, mentioned in a related question ("python Fitting weighted data with Gaussian mixture model (GMM) with minimum on covariance"), seems to support fitting weighted data, but there were major API changes and missing tutorials since 1.0, and there doesn't seem to be an easy way to avoid setting the number of components
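To make option 1 concrete, here is a rough sketch of what I am considering with sklearn's BayesianGaussianMixture (the Dirichlet process prior should prune unused components, so the component count only needs to be an upper bound). Since fit() takes no sample weights, I resample points in proportion to their weights as a workaround, and since a GMM's entropy has no closed form I estimate it by Monte Carlo; max_components, n_resample and n_mc are arbitrary placeholders:

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

def dpgmm_entropy(X, weights, max_components=30, n_resample=20_000,
                  n_mc=100_000, seed=0):
    """Fit a Dirichlet-process GMM to weighted samples, then estimate its entropy.

    X: (n, d) array of proposal samples; weights: (n,) importance weights.
    fit() has no sample_weight argument, so points are resampled in
    proportion to their weights as a workaround.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()
    X_rep = X[rng.choice(len(X), size=n_resample, replace=True, p=p)]

    gmm = BayesianGaussianMixture(
        n_components=max_components,  # upper bound; the DP prior prunes unused components
        weight_concentration_prior_type="dirichlet_process",
        covariance_type="full",
        max_iter=500,
    ).fit(X_rep)

    # A GMM's entropy has no closed form: use H ≈ -E[log q(X)] with X ~ fitted GMM.
    X_mc, _ = gmm.sample(n_mc)
    return -gmm.score_samples(X_mc).mean()
```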
For option 2, I will have extra hyperparameters to play around with, since some of the sampled x have very low probability. The ratio between the highest and lowest p(x) may be a factor of 10000, so finding a greatest common divisor of the probabilities and using it as the weight of a single copy is likely not feasible. I would need a cutoff on p(x), and even then the size of the sample set would increase significantly.
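One way I can see to do the duplication without hunting for a common divisor is to scale the normalized weights to some target number of copies and round, which also acts as the cutoff (points whose count rounds to zero are simply dropped); target_total below is an arbitrary placeholder:

```python
import numpy as np

def duplicate_by_weight(X, weights, target_total=20_000):
    """Duplicate each row of X roughly in proportion to its weight.

    Scaling the normalized weights to target_total copies avoids searching
    for a common divisor; points whose count rounds down to zero are
    dropped, which acts as the cutoff for very-low-probability samples.
    """
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()
    counts = np.floor(target_total * p).astype(int)
    return np.repeat(X, counts, axis=0)
```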
Other notes on option 2:

- scipy.stats has differential entropy estimation from samples
- manual histogram approaches may not be feasible due to x being 9-dimensional
- lots of options and papers show up in a search, but few implementations
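One caveat I noticed is that scipy.stats.differential_entropy works along a single axis, so as far as I can tell it gives per-dimension estimates rather than the joint 9-dimensional entropy. For the KDE route, scipy.stats.gaussian_kde does accept per-sample weights (SciPy >= 1.2), which might let me skip the duplication step entirely; a rough sketch of what I have in mind (n_mc is an arbitrary placeholder, and I expect a plug-in estimate like this to be biased in 9 dimensions with only ~1000 distinct points):

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_entropy(X, weights, n_mc=50_000, seed=0):
    """Weighted Gaussian KDE on X of shape (n, d), then a Monte Carlo entropy estimate.

    gaussian_kde accepts per-sample weights (SciPy >= 1.2), so the samples
    do not have to be physically duplicated.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    kde = gaussian_kde(X.T, weights=w)    # gaussian_kde expects shape (d, n)
    X_mc = kde.resample(n_mc, seed=seed)  # samples from the KDE, shape (d, n_mc)
    return -kde.logpdf(X_mc).mean()       # H ≈ -E[log kde(X)] with X ~ kde
```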
Are option 1 and option 2 equally valid? My intuition is that a GMM can fit p(x) reasonably well. Do you have any suggestions for implementations for option 1 or 2?