Postgres query for 10 steps in a range

Question

We have a RAILS app with Postgres as database and have a use case for drawing a graph over a range of values. Unfortunately the range is a decimal, so I am not able to use the generate_series function of Postgres. Need help in figuring out an optimal way to query this rather than splitting this into 10 different queries. Here's sample data

We have a table with score | students
Given a query I would get a set of score-student tuples, from which I get range(min(score), max(score)). For example range(10.25, 16.80)
We need to break the above range into 10 steps with an interval of 0.655 which is (max-min)10 - 10.25,10.91,11.56,12.22,12.87
For each step above show the number of students between that score and previous value
Result would be an array with [(10.25,11232),(10.91,2434),....]

Any way/thoughts to do this in Postgres in a single query or less than 10+ queries?

Maybe you should try to adapt this: http://stackoverflow.com/questions/232387/in-sql-how-can-you-group-by-in-ranges — Ruby Racer, Jun 23 '14 at 21:42
The question should provide a table definition, your version of Postgres, some sample values and the query you tried (even if it's not working). Ideally an sql fiddle ([random example](http://sqlfiddle.com/#!15/bc7d7/3)). — Erwin Brandstetter, Jun 23 '14 at 22:21
Also, why do you want an array as result instead of a result set? — Erwin Brandstetter, Jun 23 '14 at 22:46
Huh. I never noticed there wasn't a `generate_series` for `numeric`. I'll point it out on pgsql-hackers; it'd be a good newbie developer task. — Craig Ringer, Jun 24 '14 at 02:43

Erwin Brandstetter · Accepted Answer · 2014-06-24T00:51:27.957

Your result as set (makes more sense to me):

WITH base AS (
   SELECT student, score
   FROM   tbl
   WHERE  <some_condition>
   )
, border AS (
   SELECT min(score) AS min_score, max(score) AS max_score
   FROM   base
   )
SELECT lower_bound, ct
FROM  (
   SELECT step
        , min_score + ((max_score - min_score) * (step-1)) / 10 AS lower_bound
   FROM   border, generate_series(1,10) step
   ) x
LEFT   JOIN (
   SELECT width_bucket(b.score, x.min_score, x.max_score, 10) AS step
        , count(*)::int AS ct
   FROM   border x, base b
   GROUP  BY step
   ) y USING (step)
ORDER  BY step;

Featuring two CTEs, generate_series() (still useful) and the often overlooked function width_bucket().

To produce an array of composite types, like outlined in the question, first create a matching type (once):

CREATE TYPE my_type AS (bound numeric, ct int);

Assuming numeric values for lack of information.
Then feed above query to an array constructor:

SELECT ARRAY (
   <query from above>
   SELECT (lower_bound, ct::int)::my_type   -- only difference
   <query from above>
   );

@erwin-brandsetter, thanks for the detailed answer and apologies for missing postgres version and sqlfiddle (wasn't aware). I am on 9.3 so have access to the functions you mentioned. width_bucket() seems the right tool for my use case. With your answer I am able to get what I want. I also ran into http://tapoueh.org/blog/2014/02/21-PostgreSQL-histogram, which helped some more. — sumit, Jun 24 '14 at 03:06

Postgres query for 10 steps in a range

1 Answers1