0

Something like this: SQL How to create a value for a new column based on the count of an existing column by groups?

But I have more than two distinct values. I have a variable n number of distinct values, so I don't always know have many different counts I have.

And then in the original table, I want each row '3', '4', etc. to have the count i.e. all the rows with the '3' would have the same count, all the rows with '4' would have the same count, etc.

edit: Also how would I split the count via different dates i.e. '2017-07-19' for each distinct values?

edit2: Here is how I did it, but now I need to split it via different dates.

edit3: This is how I split via dates.

#standardSQL
SELECT * FROM
(SELECT * FROM table1) main
LEFT JOIN (SELECT event_date, value, COUNT(value) AS count 
FROM table1
GROUP BY event_date, value) sub ON main.value=sub.value 
AND sub.event_date=SAFE_CAST(main.event_time AS DATE)

edit4: I wish PARTITION BY was documented somewhere better. Nothing seems to be widely written on BigQuery or anything with detailed documentation

#standardSQL
SELECT
  *,
  COUNT(*) OVER (PARTITION BY event_date, value) AS cnt
FROM table1;
Flair
  • 2,609
  • 1
  • 29
  • 41
  • Please give a toy database (in the shape of some lines "create table ..." and "insert into ...") and desired result. – Yunnosch Jul 19 '17 at 16:57
  • I managed to do it via SELECT and JOIN, though I am unsure if it is more efficient than your suggestion, but now I am looking for how to count different via different days and match via the different '3' and '4' via different dates. – Flair Jul 19 '17 at 17:36
  • If you have something which almost works show it, please. If it completely does what you want and you are worried about efficiency, then the code review site is probably what you want. https://codereview.stackexchange.com/ – Yunnosch Jul 19 '17 at 17:39
  • @Yunnosch do you know to split the count where the count is only the same for rows with both the same values and same dates? – Flair Jul 19 '17 at 17:53
  • I am still trying to figure out what you actually are asking. Can you give toy database and desired output? – Yunnosch Jul 19 '17 at 18:07
  • 2
    if you can show example of your data and expected output - you have much more chances to get proper answer! You can read how to show a [Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) of the code and data that you are having problems with, then we can try to help with the specific problem. You can also read [How to Ask](http://stackoverflow.com/help/how-to-ask). – Mikhail Berlyant Jul 19 '17 at 18:07

1 Answers1

2

The query that you give would better be written using window functions:

SELECT t1.*, COUNT(*) OVER (PARTITION BY value) as cnt
FROM table1 t1;

I am not sure if this answers your question.

If you have another column that you want to count as well, you can use conditional aggregation:

SELECT t1.*,
       COUNT(*) OVER (PARTITION BY value) as cnt,
       SUM(CASE WHEN datecol = '2017-07-19' THEN 1 ELSE 0 END) OVER (PARTITION BY value) as cnt_20170719
FROM table1 t1;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • https://stackoverflow.com/questions/13235981/count-number-of-occurrences-for-each-unique-value . This question and the question I linked in my question answers my question, but I am now unsure of their efficiency when combined. That being said, what does PARTITION do? – Flair Jul 19 '17 at 20:11
  • it is definitely shorter, and it works. Thank you, though unsure on performance, maybe you know or I should do some in-depth research. Otherwise, both queries finish about the same time. ~3 seconds – Flair Jul 19 '17 at 20:39