0

My query is supposedly return the COLUMN NAME based on the maximum values of the columns. I tried using CASE WHEN to solve this case. but somehow this error occur, probably because I return the name of column instead of the value of the column itself:

No matching signature for operator CASE; all THEN/ELSE arguments must be coercible to a common type but found: INT64, STRING; actual argument types (WHEN THEN) ELSE: (BOOL STRING) (BOOL INT64) INT64 at [3:5]

My code is:

SELECT
ID,
    CASE
        WHEN col1 >= col2 AND col1 >= col3 AND col1 >= col4 AND col1 >= col5 THEN 'col1 '
        WHEN col2 >= col1 AND col2 >= col3 AND col2 >= col4 AND col2 >= col5 THEN 'col2 '
        ELSE 'col1'                                
    END AS Max_Column_Name
FROM table

Sample input is: enter image description here

Is there any way to make this query more simple? So no need to repeat the when..case if there are lots of column. Other initiative is to use GREATEST(col1,col2,col3) like in this post but, Im not sure how to use it in standard sql, bigquery.

Modus Tollens
  • 5,083
  • 3
  • 38
  • 46
swm
  • 519
  • 1
  • 4
  • 20

1 Answers1

3

Below is for BigQuery Standard SQL and works for any number of columns and does not require knowing of column names in advance

#standardSQL
SELECT id, 
  ( SELECT TRIM(col_name, '"')
    FROM UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING(t), r'"\w+":[^,}]+')) kv,
    UNNEST([STRUCT(SPLIT(kv,':')[OFFSET(0)] AS col_name, SPLIT(kv,':')[OFFSET(1)] AS col_value)])
    ORDER BY SAFE_CAST(col_value AS INT64) DESC
    LIMIT 1
  ) AS Max_Column_Name
FROM `project.dataset.table` t   

if to apply to sample example from your question (using col naming as in your initial script) - the result is

Row id  Max_Column_Name  
1   abc col3     
2   def col6     

while if to use exact same code but for data with columns named as in your sample input example - result is

Row id  Max_Column_Name  
1   abc Country_EN   
2   def Country_CA   
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230