2

Is it possible to do the following pivot in one query using BigQuery, or would I need to break it up into multiple queries?

Here is the raw data:

enter image description here

And here is the Pivot Table:

enter image description here

Is there a way to construct an arbitrarily-nested Pivot Table in BigQuery? Or does each level need its own SQL Query?

Note, in the above, it would be simple to do a CASE WHEN statement for each of the 6 column combinations (CA-M, CA-F, FR-M, FR-F, US-M, US-F), but for the sake of the general case, let's suppose there may be hundreds of countries so it's not practical to manually write in each combination as a different CASE statement.

David542
  • 104,438
  • 178
  • 489
  • 842

1 Answers1

4

Below is most optimal in my mind option

Step 1 - prepare query based on your data

#standardSQL
WITH country_sex_list AS (
  SELECT Country, Sex 
  FROM yourTable 
  GROUP BY Country, Sex
),
permutations AS (
SELECT 
  STRING_AGG(CONCAT("SUM(CASE WHEN (Country, Sex) = ('", Country, "', '", Sex, "') THEN Income END) AS ", Country, "_", Sex), ',' ORDER BY Country, Sex) AS text
FROM country_sex_list 
)
SELECT 
  CONCAT(
    "SELECT company, ", text, ", SUM(Income) AS Total FROM yourTable GROUP BY Company UNION ALL ",
    "SELECT 'Total' as company, ", text, ", SUM(Income) AS Total FROM yourTable"
  ) AS query
FROM permutations  

Step 2 - take text of result of Step 1 and run it as a query.
Result will be as you expect (see example below)

company   CA_M    FR_F    FR_M    US_F    US_M    Total  
Acme      null  40,000    null    null  40,000   80,000  
Bravo   50,000    null    null  30,000    null   80,000  
Delta     null    null  40,000    null    null   40,000  
Total   50,000  40,000  40,000  30,000  40,000  200,000   

I think these two steps are generic enough to extend to real use-case

Of course, You can run those two steps manually in Web UI or you can script them in client of your choice

Below is dummy data to test with

WITH yourTable AS (
  SELECT 'M' AS Sex, 'US' AS Country, 40000 AS Income, 'Acme' AS Company UNION ALL
  SELECT 'M', 'CA', 50000, 'Bravo' UNION ALL
  SELECT 'F', 'US', 30000, 'Bravo' UNION ALL
  SELECT 'F', 'FR', 40000, 'Acme' UNION ALL
  SELECT 'M', 'FR', 40000, 'Delta'
)
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • this is great thank you for this comprehensive answer. – David542 Feb 17 '17 at 01:29
  • finally, I know we don't have real data to test with, but do you think the above approach would be the most performant approach to use when building a pivot table? For example, doing it all in one query as opposed to multiple queries per pivot level? – David542 Feb 17 '17 at 01:30
  • 1
    @David542 - hard to say for sure, but i think one shot will be more effective and cost wise and performance wise. meantime, wanted to mention that this type of pivoting is not optimal to do within bigquery and better fit to be made in visualization layer, but still I understand there might be some circumstances when you need it done here. if you haven't tried yet - try BigQuery Mate - https://chrome.google.com/webstore/detail/bigquery-mate/nepgdloeceldecnoaaegljlichnfognh - it has interractive visualization embed in Web UI - exactly for such cases when you need quick pivot or any other chart – Mikhail Berlyant Feb 17 '17 at 01:54
  • thanks, this is really helpful and great Chrome tool! I asked an even more generalized question on BigQuery here, please take a look! -- http://stackoverflow.com/questions/42307033/generalized-pivot-table-with-deep-sort-in-google-bigquery?noredirect=1&lq=1 – David542 Feb 17 '17 at 20:45