sum and generate series does't work in postgresql

Question

In classification_indicator_id column I have some digits. I would like to sum this digits in 1 day series. I wrote below query

select
a.data_start::date,
a.segment1::integer as "Segment1"
from (
select
    data as data_start,
    (select sum(classification_indicator_id) from classifications where classification_indicator_id = 3)::integer as segment1
from
    generate_series('2013-03-25'::timestamp without time zone,
    '2013-04-01'::timestamp without time zone,
    '1 day'::interval) data
) a
group by
a.data_start,
a.segment1
ORDER BY data_start

But I always getting something like:

date start|segment1
-------------------
2013-03-25|39
2013-03-26|39
2013-03-27|39
2013-03-28|39
2013-03-29|39
2013-03-30|39
2013-03-31|39
2013-04-01|39

I am sure that should be something like:

date start|segment1
-------------------
2013-03-25|3
2013-03-26|4
2013-03-27|7
2013-03-28|9
2013-03-29|15
2013-03-30|22
2013-03-31|19
2013-04-01|5

SQL Fiddle

Can there be days with no classifications at all? And would you want to list those days with a count of `0`? You would also have to provide the table definition (`CREATE TABLE` script) for a precise answer. It depends on whether columns can be `NULL` and whether you want to count those or not. — Erwin Brandstetter, Apr 04 '13 at 12:00
Yes, some day has no classifications. And I want list those days. I modified a bit answer that @Clodoaldo Neto sent. I post it as a new answer. [link to table definition](http://sqlfiddle.com/#!1/f1bec/1) — ssuperczynski, Apr 04 '13 at 12:31

score 1 · Accepted Answer · edited Apr 04 '13 at 13:09

1

select
    data.d::date,
    coalesce(sum(classification_indicator_id), 0)::integer as "Segment1"
from 
    classifications c
    right join
    generate_series(
        '2013-03-25'::timestamp without time zone,
        '2013-04-01'::timestamp without time zone,
        '1 day'::interval
    ) data(d) on data.d::date = c.data_start::date
where classification_indicator_id = 3
group by 1
ORDER BY 1

edited Apr 04 '13 at 13:09

Erwin Brandstetter

605,456
145
1,078
1,228

answered Apr 04 '13 at 09:46

Clodoaldo Neto

118,695
26
233
260

Thank you for your answer it's wokring great:) – ssuperczynski Apr 04 '13 at 09:52
But how to modify this query to add additional column for example `where classification_indicator_id = 4` `where classification_indicator_id = 6` etc. – ssuperczynski Apr 04 '13 at 09:52
@infaustus You can do it by adding columns or by doing group level totalizator. Think about what exactly you want and open another question. – Clodoaldo Neto Apr 04 '13 at 11:09
`COALESCE` should not be necessary, `sum()` ignores `NULL` values. – Erwin Brandstetter Apr 04 '13 at 12:52
@Erwin If all the summed values are null the result will be null. Check `select sum(a) is null from (values (null::integer)) s(a)`. I guess he wants a zero in that case. You can argue I should have "coalesced" the result of the `sum` in instead of each value and you would be right. Be free to edit it if you think it makes a difference. – Clodoaldo Neto Apr 04 '13 at 12:57
Then it should be `coalesce(sum( ...` instead of `sum(coalesce( ...` for a cost of O(1) instead of O(n). I applied it. – Erwin Brandstetter Apr 04 '13 at 13:08

ssuperczynski · Answer 2 · 2013-04-04T12:41:41.317

I need to add some other columns (with another classification_indicator_id I modified a bit your answer:

select
data.d::date as "data",
sum(c.classification_indicator_id)::integer as "Segment1",
sum(c4.classification_indicator_id)::integer as "Segment2",
sum(c5.classification_indicator_id)::integer as "Segment3"
from 
  generate_series(
    '2013-03-25'::timestamp without time zone,
    '2013-04-01'::timestamp without time zone,
    '1 day'::interval
) data(d)
left join classifications c on (data.d::date = c.created::date and c.classification_indicator_id = 3)
left join classifications c4 on (data.d::date = c4.created::date and c4.classification_indicator_id = 4)
left join classifications c5 on (data.d::date = c5.created::date and c5.classification_indicator_id = 5)
group by "data"
ORDER BY "data"

But still not working properly. sum for each row is to big, and growing when I add additional columns

With 4 columns                          With 3 column
data      |Segment1|Segment2|Segment3   data      |Segment1|Segment2
-------------------------------------   ----------------------------
2013-03-25|12      |16      |20         2013-03-25|12      |16      
-------------------------------------   ----------------------------
2013-03-26|108     |144     |180        2013-03-26|18      |24

This may produce a "proxy cross join". If there are multiple matching rows in `c`, `c4`, or `c5`, they form a **`CROSS JOIN`**, multiplying their count, which is most probably not what you want. Consider: http://stackoverflow.com/questions/12464037/two-sql-left-joins-produce-incorrect-result/12464135#12464135 — Erwin Brandstetter, Apr 04 '13 at 12:46

sum and generate series does't work in postgresql

2 Answers2

Linked