I'm looking for a technique to logarithmically bin some data sets. We've got data with values ranging from _min to _max (floats >= 0) and the user needs to be able to specify a varying number of bins _num_bins (some int n).

I've implemented a solution based on this question, with some help on scaling from here, but it stops working when my data values lie below 1.0.

class Histogram {
    double _min, _max;
    int _num_bins;
    ......
};

double Histogram::logarithmicValueOfBin(double in) const {
    if (in == 0.0)
        return _min;

    double b = std::log(_max / _min) / (_max - _min);
    double a = _max / std::exp(b * _max);

    double in_unscaled = in * (_max - _min) / _num_bins + _min;
    return a * std::exp(b * in_unscaled) ;
}

When the data values are all greater than 1 I get nicely sized bins and can plot properly. When the values are less than 1 the bins come out more or less the same size and we get way too many of them.
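For reference, substituting a and b into the return expression suggests the function is already geometric in the bin index. The short symbols are introduced here only for the derivation: m = _min, M = _max, n = _num_bins, i = in, and u = in_unscaled, assuming m > 0.

```latex
\begin{aligned}
b &= \frac{\ln(M/m)}{M-m}, \qquad a = M e^{-bM}, \qquad u = m + \frac{i}{n}(M-m) \\
a\,e^{bu} &= M\,e^{b(u-M)} = M\,e^{-\ln(M/m)\,(1 - i/n)} = m\left(\frac{M}{m}\right)^{i/n}
\end{aligned}
```

Under that reading the edges depend only on the ratio _max / _min, and the expression is undefined when _min is 0 (the logarithm diverges), which may be worth checking when the data approaches zero.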

Boumbles
  • Don't use [float/double] == [value], because floating-point numbers cannot represent most values exactly. Use an epsilon instead: `if(std::abs(in - x) < 0.00001)` (or some other acceptably small value). – Casey Sep 23 '15 at 18:21
  • http://coliru.stacked-crooked.com/a/2bf7eaa25bcf1b9b ? – Mooing Duck Nov 20 '20 at 01:15
  • If you're willing to sacrifice significant accuracy, frexp and ldexp can make this _really_ fast. http://coliru.stacked-crooked.com/a/5f7477995f36930f – Mooing Duck Nov 20 '20 at 01:30
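The frexp/ldexp idea from the last comment can be sketched as follows. This is not the linked code, only a minimal illustration under the assumption that one bin per power of two is coarse enough; the helper names octaveBin and octaveBinLowerEdge are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: floor(log2(x)) for finite x > 0, read from the binary exponent.
// std::frexp writes x as m * 2^e with m in [0.5, 1), so floor(log2(x)) == e - 1.
int octaveBin(double x) {
    assert(x > 0.0 && std::isfinite(x));
    int e = 0;
    std::frexp(x, &e);
    return e - 1;
}

// Hypothetical helper: lower edge of octave bin k, i.e. 2^k, built exactly with std::ldexp.
double octaveBinLowerEdge(int k) {
    return std::ldexp(1.0, k);
}
```

The accuracy trade-off is that bin boundaries are fixed at powers of two, so the number of bins follows from the data range rather than from _num_bins.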

2 Answers

I found a solution by reimplementing an open-source version of MATLAB's logspace function.

Given a range and a number of bins, you first need to create an evenly spaced numerical sequence:

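// Returns n evenly spaced values from a to b.
// integers() is defined elsewhere in the original source and is not shown here.
module.exports = function linspace(a,b,n) {
  var every = (b-a)/(n-1),
      ranged = integers(a,b,every);

  // Append b if stepping stopped one value short of the end point
  return ranged.length === n ? ranged : ranged.concat(b);
}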

After that, loop through the sequence and raise your base (most likely e, 2, or 10) to each value. The results are your bin edges.

module.exports.logspace = function logspace(a,b,n) {
  return linspace(a,b,n).map(function(x) { return Math.pow(10,x); });
}

I rewrote this in C++, and it supports any range of values greater than 0.
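The C++ rewrite itself is not shown, so the following is only a minimal sketch of what such a port might look like, not the actual code. The helper logBinEdges and the optional base parameter are assumptions added for illustration, and the data range is assumed to satisfy 0 < min < max.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns n evenly spaced values from a to b inclusive (requires n >= 2).
std::vector<double> linspace(double a, double b, std::size_t n) {
    assert(n >= 2);
    std::vector<double> out(n);
    const double step = (b - a) / static_cast<double>(n - 1);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a + step * static_cast<double>(i);
    out.back() = b;  // pin the last value so rounding cannot move the end point
    return out;
}

// Returns n values from base^a to base^b, evenly spaced in the exponent.
std::vector<double> logspace(double a, double b, std::size_t n, double base = 10.0) {
    std::vector<double> out = linspace(a, b, n);
    for (double& x : out)
        x = std::pow(base, x);
    return out;
}

// Hypothetical helper: num_bins + 1 bin edges covering [min, max], with min > 0.
std::vector<double> logBinEdges(double min, double max, std::size_t num_bins) {
    assert(min > 0.0 && max > min && num_bins >= 1);
    return logspace(std::log10(min), std::log10(max), num_bins + 1);
}
```

Because the exponents come from std::log10, a range that lies entirely below 1.0 simply produces negative exponents. For example, logBinEdges(0.001, 1.0, 3) should give edges close to 0.001, 0.01, 0.1 and 1.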

Boumbles

You can do something like the following:

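    #include <cmath>
    #include <vector>

    // Create isolethargic (logarithmically spaced) binning
    const int    T_MIN   = 0;                 // The lower limit as a power of ten, i.e. 1.e0
    const int    T_MAX   = 8;                 // The upper limit as a power of ten, i.e. 1.e8
    const int    ndec    = T_MAX - T_MIN;     // Number of decades
    const int    N_BPDEC = 1000;              // Number of bins per decade
    const int    nbins   = ndec * N_BPDEC;    // Total number of bins
    const double step    = static_cast<double>(ndec) / nbins; // The exponent increment per bin
    std::vector<double> tbins(nbins + 1);     // The bin edges (one more than the number of bins)

    // Edge i sits at 10^(T_MIN + i * step); kept in double to avoid losing precision
    for (int i = 0; i <= nbins; ++i)
        tbins[i] = std::pow(10., step * i + T_MIN);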
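With edges laid out this way, the bin that a value falls into can be computed directly instead of searched for. The function below is a hypothetical helper sketched for illustration, not part of the snippet above; it assumes the same convention of edges at 10^(T_MIN + i / N_BPDEC) and treats anything outside [10^T_MIN, 10^T_MAX) as out of range.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: index of the bin containing t, for bins whose edges are
// 10^(t_min + i / bins_per_decade). Returns -1 when t lies outside [10^t_min, 10^t_max).
int isolethargicBinIndex(double t, int t_min, int t_max, int bins_per_decade) {
    assert(t_max > t_min && bins_per_decade > 0);
    if (!(t > 0.0))
        return -1;  // log10 is undefined for zero and negative values (also rejects NaN)
    const double pos = (std::log10(t) - t_min) * bins_per_decade;
    const int nbins = (t_max - t_min) * bins_per_decade;
    if (pos < 0.0 || pos >= nbins)
        return -1;
    return static_cast<int>(pos);  // truncation equals floor because pos >= 0
}
```

Values that sit exactly on an edge may land in either neighbouring bin because of floating-point rounding in std::log10, so this is best treated as approximate at the boundaries.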
Thanos