1

Given the following minimal example:

CREATE OR REPLACE TEMP TABLE Produce AS (
  SELECT 'Kale' as product, 51 as sales, 'Q1' as quarter, 2020 as year UNION ALL
  SELECT 'Kale', 23, 'Q2', 2020 UNION ALL
  SELECT 'Kale', 45, 'Q3', 2020 UNION ALL
  SELECT 'Kale', 3, 'Q4', 2020 UNION ALL
  SELECT 'Kale', 70, 'Q1', 2021 UNION ALL
  SELECT 'Kale', 85, 'Q2', 2021 UNION ALL
  SELECT 'Apple', 77, 'Q1', 2020 UNION ALL
  SELECT 'Apple', 0, 'Q2', 2020 UNION ALL
  SELECT 'Apple', 1, 'Q1', 2021);

SELECT * FROM
  Produce
  PIVOT(SUM(sales) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4'))

I'd like to have the list of quarter values (('Q1', 'Q2', 'Q3', 'Q4')) not directly in the query itself, but somehow at the top of my script, because (in my actual script) I'm using it in multiple places and don't want to duplicate it every time. The list will not change in my case, except I explicitly change the whole business logic of the app, so I don't need the "variable" to be mutable ("dynamic pivot"), I just want to not duplicate the list (keep things DRY), so that in case I need to change it, I only need to change it in one place.

My current attempt looks as follows:

DECLARE quarters ARRAY <STRING> DEFAULT ['Q1', 'Q2', 'Q3', 'Q4'];

SELECT * FROM
  Produce
  PIVOT(SUM(sales) FOR quarter IN UNNEST(quarters))

But it does not work (Syntax error: Unexpected ")"). How do I do it correctly?

Tobias Hermann
  • 9,936
  • 6
  • 61
  • 134
  • It's not a good practice to have dynamic pivots in SQL. Where do you plan to use the output of this query? It's better if you pivot the data in BI tool or Excel, wherever you use it. – Sabri Karagönen Jun 28 '22 at 09:30
  • @SabriKaragönen Actually, I don't want the pivot to be dynamic. I'd like to have it static. But at the same time, I'd like to do the same (static) pivot in two statements. Using the variable is only a tool for me to make things DRY. If there is some other way, I'd be happy about that too. I'll clarify my question accordingly. :) – Tobias Hermann Jun 28 '22 at 09:39
  • `The list will never change in my case` so what the point? please explain why you really need this? in many cases - understanding real cause/needs - helps in landing with the right approach. Btw, check out [What is the XY problem?](https://meta.stackexchange.com/a/66378/507852) – Mikhail Berlyant Jun 28 '22 at 21:18
  • I have three statements using this list. Having the list (it's quite long in my case) written out explicitly each time is ugly. Yeah "will never change" was a bit strong. I meant "will not change by itself". In one year (when the business logic changes) I might want to change the list too, and I'd prefer to do it in one place instead of three places. I'll adjust my question accordingly. – Tobias Hermann Jun 29 '22 at 06:31
  • got it. missed the part in your question about using in multiple places – Mikhail Berlyant Jun 29 '22 at 13:42
  • Additional help re [the post you just deleted](https://stackoverflow.com/q/73121061/3404097) re comma-where vs join-on: [Explicit vs implicit SQL joins](https://stackoverflow.com/q/44917/3404097). (But hopefully you could now find that and/or duplicates for yourself googling.) PS [CROSS JOIN vs INNER JOIN in SQL](https://stackoverflow.com/a/25957600/3404097) [What is the difference between “INNER JOIN” and “OUTER JOIN”?](https://stackoverflow.com/a/46091641/3404097) (Hope you don't mind this comment on a different post.) – philipxy Jul 26 '22 at 10:29
  • 1
    @philipxy Thanks a lot. Everything is clear now. I deleted the post because I felt it did not make much sense and would only pollute search results. – Tobias Hermann Jul 26 '22 at 10:38
  • PS [INNER JOIN ON vs WHERE clause](https://stackoverflow.com/q/1018822/3404097) (Addresses semantics rather than performance. The question isn't exactly comma-where vs join-on but the answers address it.) – philipxy Jul 26 '22 at 10:40
  • 1
    @philipxy Thanks again, I got it. :) In the meantime, I posted a [question showing my actual problem](https://stackoverflow.com/questions/73123554/performance-of-different-ways-to-implement-multi-column-anti-join-in-bigquery), which had lead me to the (misguided) initial question. – Tobias Hermann Jul 26 '22 at 12:42

1 Answers1

1

As per my understanding, using an array as pivot columns is not supported. But in dynamic sql, you can utilize an array to generate pivot columns like below.

EXECUTE IMMEDIATE FORMAT("""
  SELECT * FROM Produce PIVOT(SUM(sales) AS a FOR quarter IN ('%s'));
""", ARRAY_TO_STRING(quarters, "','"));
Jaytiger
  • 11,626
  • 2
  • 5
  • 15