3

I have a simple table with two columns Bin_name (int) and Count_in_this_bin (int)

I want to convert it to the ratio of each bin to the total count in all bins.

I used the following query in Google BigQuery:

SELECT count_in_bin/(SELECT SUM(count_in_bin) FROM [table])
FROM [table]

Then I get

error:Query Failed Error: Subselect not allowed in SELECT clause

Can anyone let me now the correct way to do this kind of simple divide in BigQuery?

Psyduck
  • 637
  • 3
  • 10
  • 22

2 Answers2

5

BigQuery Legacy SQL

#legacySQL
SELECT 
  count_in_bin, 
  RATIO_TO_REPORT(count_in_bin) OVER() AS ratio
FROM [project:dataset.table]

BigQuery Standard SQL

#standardSQL
SELECT 
  count_in_bin, 
  count_in_bin / total AS ratio
FROM `project.dataset.table`, 
(SELECT SUM(count_in_bin) total FROM `project.dataset.table`)
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
3

You are looking for window/analytic functions. In standard SQL this would be:

SELECT count_in_bin / SUM(count_in_bin) OVER () 
FROM [table]

In legacy SQL, you seem to need a subquery:

SELECT count_in_bin, count_in_bin / total_count_in_bin
FROM (SELECT count_in_bin, SUM(count_in_bin) OVER () as total_count_in_bin
      FROM [table]
     ) t
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Hi I tried this, but it shows: Error: Missing function in Analytic Expression at: 1.50 - 1.57. Is this legacy sql syntax or standard? – Psyduck Apr 01 '18 at 03:06
  • I just tried again using StandardSQL dialect, it works, but it won't work in legacy syntax. – Psyduck Apr 01 '18 at 03:11
  • @Noob. . . . How strange. The `sum(count_in_bin) over ()` works in legacy SQL. It is the division that throws off the expression. – Gordon Linoff Apr 01 '18 at 03:13
  • Hi man, I was using the Legacy SQL, the query itself worked, but not when I include it in the division. Also, after I checked off the Legacy Syntax option in BigQueyr, I found the standard sql in bigquery actually support subquery. While to me the best solution is just to stick with the standard sql option in bigquery. Thanks – Psyduck Apr 01 '18 at 03:15
  • And yes the second version you provided worked, while the 1st one with SUM in division failed. BigQuery might expect the query to be more clear I guess? – Psyduck Apr 01 '18 at 03:17