0

I have a table that looks something like this when ordered:

Row Value
1 -1
2 -1
3 -2
4 -5
5 -6
6 -10
7 -10
8 -12
... ...

I want to create a count for each value but add a zero Count to Values that dont exist. My result should look like this:

Value Count
-1 2
-2 1
-3 0
-4 0
-5 1
-6 1
-7 0
-8 0
-9 0
-10 2
-11 0
-12 1

How would I accomplish this?

Flitschi
  • 73
  • 5

3 Answers3

1

Try this one:

with mytable as (
    select -1 as value union all 
    select -1 union all
    select -2 union all
    select -5 union all
    select -6 union all
    select -10 union all
    select -10 union all
    select -12
)
select value, ifnull(mycount, 0) as count
from unnest(generate_array(-1, -12, -1)) as value
left join (
    select value, count(*) as mycount
    from mytable
    group by value
) 
using(value)

enter image description here

Sergey Geron
  • 9,098
  • 2
  • 22
  • 29
1

Based on this answer to another question we might create a sequence of all values and unite it with the values that exist, finally select only those values that are of interest for us:

SELECT value, MAX(n) FROM
(
  SELECT value, COUNT(*) AS n FROM theTable GROUP BY value
  UNION
  (
    WITH d as (SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) v(n))
    SELECT (SELECT MIN(value) FROM theTable) - 1 + ROW_NUMBER() OVER (ORDER BY NULL), 0
    FROM d d0, d d1, d d2, d d3, d d4, d d5, d d6, d d7
  )
) AS values
GROUP by value
HAVING value <= (SELECT MAX(value) FROM theTable)
ORDER BY value

Comment from another answer:

value goes up to 18 million

So above sequence was adjusted to select up to 100.000.000.

Note, though, that the query will run for a pretty long time.

Aconcagua
  • 24,880
  • 4
  • 34
  • 59
0

Consider below approach

select r.* from (
  select *, case when gap 
      then generate_array(value + 1, -1 + lag(value) over(order by value desc))
    end values
  from (
    select *, value + 1 != lag(value) over(order by value desc) gap
    from (select value, count(1) `count` from your_table group by value)
  ) 
), unnest([struct(value, `count`)] || 
     array(select as struct value, 0 as `count` from unnest(values) value)
) r

if applied to sample data in your question - output is

enter image description here

As you can see here - the idea is to identify gaps and then generate missing values for each gap separately so no any issues with arrays size. Assuming you dont have gaps with more than 1M values - above solution should work for any number of entries in your table - millions, billions, etc.

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230