How can I calculate the mode of a floating point array in Ruby?

Question

I have an array of floating point data and would like to pick out the most probable value, which is called the "mode" in descriptive statistics. How can I calculate it in Ruby, either directly or with the help of a gem?

Possible duplicate of [Ruby: How to find item in array which has the most occurrences?](http://stackoverflow.com/questions/412169/ruby-how-to-find-item-in-array-which-has-the-most-occurrences) – theTRON Jun 26 '14 at 01:32
Thanks, but I think those algorithms are useless with floating point data. – Konstantin Jun 26 '14 at 01:51
@Konstantin, why do you think so? That answer works perfectly for floats. There is nothing wrong with using a Float as a Hash key in Ruby. – huocp Jun 26 '14 at 02:12
@theTRON is correct, the method in the first answer will work for you. – Anthony Jun 26 '14 at 02:14
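To illustrate both sides of the comments above, here is a minimal sketch. `exact_counts` is a hypothetical helper name, not something from the linked answer. It shows that floats work fine as Hash keys when they are exactly equal, and that values produced by different calculations may land in different buckets.

```ruby
# Hypothetical helper: count occurrences using the floats themselves as Hash keys.
def exact_counts(values)
  values.each_with_object(Hash.new(0)) { |x, counts| counts[x] += 1 }
end

# Identical float values share one key.
exact_counts([0.1, 0.3, 0.1])[0.1]   # => 2

# A computed value can differ from the literal in its last bits,
# so it is counted under a separate key.
(0.1 + 0.2) == 0.3                   # => false
exact_counts([0.1 + 0.2, 0.3]).size  # => 2
```

So counting by exact value is correct for floats, but it only helps if nearly equal values are normalized first.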

score 1 · Accepted Answer · answered Jun 26 '14 at 02:39

# Group equal values, pick the largest group, and return its value.
[0.0, 0.1, 0.2, 0.1, 0.3, 0.3, 0.1]
  .group_by { |e| e }
  .max_by { |_value, group| group.length }
  .first
# => 0.1

answered Jun 26 '14 at 02:39

sawa


score 1 · Answer 2 · answered Sep 02 '14 at 21:29

DescriptiveStatistics adds methods to the Enumerable module to allow easy calculation of basic descriptive statistics of Numeric sample data in collections that have included Enumerable such as Array, Hash, Set, and Range.

> require 'descriptive_statistics'
> [0.0, 0.1, 0.2, 0.1, 0.3, 0.3, 0.1].mode
=> 0.1

score 0 · Answer 3 · answered Jun 26 '14 at 02:15

The following runs on bimodal and multimodal datasets, but it returns only a single value: when several values tie for the highest count, it returns the one that occurs first in the array.

# returns 1.0
a = [1.0, 1.0, 2.0, 2.0, 3.0]
a.max_by { |x| a.count(x) }
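If every tied value is wanted, one possible sketch is below. `modes` is a hypothetical helper name, not part of Ruby or of either gem mentioned here. It counts each value once in a Hash instead of calling `count` for every element, then keeps all values that share the highest count.

```ruby
# Hypothetical helper: return every value tied for the highest count.
def modes(values)
  return [] if values.empty?

  # Single pass: tally occurrences keyed by value.
  counts = values.each_with_object(Hash.new(0)) { |x, h| h[x] += 1 }
  top = counts.values.max

  # Hash preserves insertion order, so results follow first appearance.
  counts.select { |_value, count| count == top }.keys
end

modes([1.0, 1.0, 2.0, 2.0, 3.0])  # => [1.0, 2.0]
modes([1.0, 1.0, 2.0, 3.0])       # => [1.0]
```

Returning an array in both cases keeps the caller from having to distinguish a single mode from a tie.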

You can also try the easystats gem. It adds a .mode method to Arrays (among other methods), but it returns nil for bimodal or multimodal datasets.

require 'easystats'

# returns 1.0
a = [1.0, 1.0, 2.0, 3.0]
a.mode 

# returns nil
a = [1.0, 1.0, 2.0, 2.0, 3.0]
a.mode

answered Jun 26 '14 at 02:15

infused


Your first piece of code will work, but is inefficient. – sawa Jun 26 '14 at 02:40
This is true. The fastest method appears to be `a.group_by {|e| e}.values.max_by{|e| e.size}.first`, which was posted by @Brandon in the duplicate post mentioned above. – infused Jun 26 '14 at 03:50
Thanks, I see, but my floating point numbers fluctuate a bit because they come from different calculations. For example, 1.00001 and 1.00000 should both be treated as 1.0. – Konstantin Jun 26 '14 at 18:16
On top of this, my floating point numbers come in pairs, because they are the parameters of a line in a coordinate system (y=a*x+b). My data is in fact two-dimensional, so a more advanced method is needed. I don't think I can calculate the mode of the "a" values and the mode of the "b" values separately, because they are "attached". – Konstantin Jun 26 '14 at 18:26
In that case, use [Float#round](http://www.ruby-doc.org/core-2.1.2/Float.html#method-i-round) to round each value to a specific precision: `rounded = a.map {|n| n.round(1)}` – infused Jun 26 '14 at 18:26
@Konstantin, it might help if you can update your question with some example data. – infused Jun 26 '14 at 18:29
Okay, here is my sample data: http://pastebin.com/krAh1yUC. Each pair of floating point numbers represents the parameters of a linear transformation, y=a*x+b, so the values in the array are [a,b] pairs, 45 pairs in total. One can see that the mode is a=1.0, b=0.4. – Konstantin Jun 26 '14 at 21:52
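Putting the last few comments together, a rough sketch might round each [a, b] pair with `Float#round` and then take the most frequent pair as a whole, so the two parameters stay attached. `mode_of_pairs`, its `digits` parameter, and the sample values below are all made up for illustration. They are not the pastebin data.

```ruby
# Hypothetical helper: round both members of each [a, b] pair,
# then return the pair that occurs most often (nil for an empty list).
def mode_of_pairs(pairs, digits = 1)
  rounded = pairs.map { |a, b| [a.round(digits), b.round(digits)] }
  # Arrays of floats work as group keys, so each pair is counted as one unit.
  rounded.group_by { |pair| pair }.max_by { |_pair, group| group.size }&.first
end

# Made-up values with small fluctuations around a=1.0, b=0.4.
pairs = [[1.00001, 0.4], [1.0, 0.40002], [0.99998, 0.39999], [2.3, 0.1]]
mode_of_pairs(pairs)  # => [1.0, 0.4]
```

Rounding is a simple form of binning, so two values that straddle a rounding boundary (for example 1.049 and 1.051 at one decimal place) end up in different groups. Whether that matters depends on how large the fluctuations are relative to the chosen precision.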
