2

I have a table data with an array type columns. For each row of the table I want to calculate average and median of those respective column values_*.

Example Table data:

id   values_1   values_2
 a      2
        4
 b      10
        4
        16
 c     NULL
        6
 d     NULL
       NULL

Sample expected output:

id   avg_values_1   median_values_1    avg_values_2   median_values_2
a         3              3 
b        7.5            10
c         6              6
d        NULL           NULL
loadbox
  • 646
  • 14
  • 34

1 Answers1

2

Below is for BigQuery Standard SQL

#standardSQL
WITH temp AS (
  SELECT id, ARRAY(SELECT * FROM UNNEST(values_1) i WHERE NOT i IS NULL) AS values_1
  FROM `project.dataset.table`
)
SELECT id,  
  (SELECT AVG(i) FROM UNNEST(values_1) AS i) AS avg_values_1,
  (SELECT DISTINCT PERCENTILE_CONT(i, 0.5) OVER() AS median FROM UNNEST(values_1) AS i) AS median_values_1,
FROM temp   

You can test, play with above using sample data from your question as in below example

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'a' id, [2, 4] values_1 UNION ALL
  SELECT 'b', [10, 4, 16] UNION ALL
  SELECT 'c', [6, NULL] UNION ALL
  SELECT 'd', [NULL, NULL] 
), temp AS (
  SELECT id, ARRAY(SELECT * FROM UNNEST(values_1) i WHERE NOT i IS NULL) AS values_1
  FROM `project.dataset.table`
)
SELECT id, 
  (SELECT AVG(i) FROM UNNEST(values_1) AS i) AS avg_values_1,
  (SELECT DISTINCT PERCENTILE_CONT(i, 0.5) OVER() AS median FROM UNNEST(values_1) AS i) AS median_values_1,
FROM temp  

with output

Row id  avg_values_1    median_values_1  
1   a   3.0             3.0  
2   b   10.0            10.0     
3   c   6.0             6.0  
4   d   null            null       

Note that I had to first introduce temp CTE to eliminate NULL elements from arrays

You can repeat this construct for as many columns as you need/have

Or, if you have more columns than just few - you can use approach shown in https://stackoverflow.com/a/63105643/5221944 to dynamically build and execute the query for all columns at once!

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230