9

I am trying to create a histogram data using following query:

SELECT FLOOR(Max_Irrad/10) AS bucket, COUNT(*) AS COUNT
FROM marctest.test_summarynimish
where Lcu_name='Allegro'
  and Lcu_Mode='Standard'
GROUP  BY bucket;

following is the result that i am getting:

bucket count
 0        3
 4        3
 5       12
 7        6
 8        3
10        3

now the bucket field is the range or bin used in the histogram. I want to create a bucket values with consistent range, for eg starting from 0,4,8,12.... and so on.. Is there any way to achieve this in mysql? This is how I am expecting to have as result:

 bucket count
 0        3
 4       21
 8        6
Mosty Mostacho
  • 42,742
  • 16
  • 96
  • 123
dna
  • 483
  • 3
  • 10
  • 32
  • I have provided the result of the query above.. do you need anything extra apart from that? – dna Feb 11 '12 at 08:17
  • It would help to know what the expected result is (the same way as you displayed the result of your query, AND using that very same sample data, in order to understand the logic behind) – Mosty Mostacho Feb 11 '12 at 08:24
  • i m trying to post the expected result in tabulated form but I am unable to do it. I apologize as I am new to the forum – dna Feb 11 '12 at 08:29
  • Shouldn't it be (0,3);(4,21);(8,6)? From 0,1,2,3 - 4,5,6,7 - 8,9,10,11? – Mosty Mostacho Feb 11 '12 at 08:39
  • yes that is how I want. Systematic distribution of the buckets or bin – dna Feb 11 '12 at 09:27
  • Possible duplicate of [MySQL: Getting data for histogram plot?](http://stackoverflow.com/questions/1764881/mysql-getting-data-for-histogram-plot) – Ciro Santilli OurBigBook.com Oct 07 '15 at 10:39

2 Answers2

8

I think we can use the following general form to create a general histogram:

select (x div 4) * 4 as NewX, count(*) as NewY from histogram
group by NewX

Where x is the real x value of the x axis and count(*) is the real y value. The number 4 is the size amount of the x values we want to group. This means we will group all x values in groups of 4 (e.g.: group 1 is 0, 1, 2, 3; group 2 is 4, 5, 6, 7, and so on). The count of each item in the group will become the NewY value

You can play with this here

Applying this logic to your query this would be:

select (floor(Max_Irrad/10) div 4) * 4 as NewX, count(*) as NewY
from marctest.test_summarynimish
where Lcu_name='Allegro' and Lcu_Mode='Standard'
group by NewX

Let me know if you have any trouble or doubt about this.

Mosty Mostacho
  • 42,742
  • 16
  • 96
  • 123
  • 1
    The solution you provide is working thanks a lot.... :) that would be really very helpful – dna Feb 11 '12 at 14:47
  • 2
    Above method is correct except it will not report buckets having zero samples. – LionHeart Dec 06 '12 at 05:58
  • Also, if you're bin size is a non-integer it won't work. You'll need instead floor(value/binSize)*binSize for non-integer bin sizes. – juacala Feb 12 '14 at 15:57
  • @LionHeart That makes sense. If there are no rows that contain data then it is not possible to display those missing rows, right? :) If you want to know how to generate data in MySQL I've written a long explanation with a couple of alternatives in [this other question](http://stackoverflow.com/questions/10034668/how-to-generate-data-in-mysql). Anyway, the missing data can be easily and more efficiently zero-filled when formatting the results in the histogram. – Mosty Mostacho Feb 12 '14 at 18:43
0

Just make your buckets bigger by dividing Max_Irrad by 40 instead of 10.

David Grayson
  • 84,103
  • 24
  • 152
  • 189