0

I have a BigQuery table like this:

enter image description here

Required Output is :

enter image description here

Note : The keys in Extended_property_key column are not fixed, it keeps on adding frequently. Hence the columns in Output will also keep on adding.

I need to build a Bigquery which can handle dynamic adding of columns in output query along with pivoting.

codninja0908
  • 497
  • 8
  • 29
  • Can anyone help on above issue. I have tried many options given which makes use of ANY_VALUE() aggregate function but that is something I am not looking for. – codninja0908 Jun 23 '20 at 05:35
  • Have you checked this [thread](https://stackoverflow.com/a/35187840/9928809) explaining different approaches in Biqgquery pivoting techniques? – Nick_Kh Jun 23 '20 at 13:36

2 Answers2

2

Below is for BigQuery Standard SQL

EXECUTE IMMEDIATE '''
SELECT account_id, ''' || (
  SELECT STRING_AGG(DISTINCT "MAX(IF(Extended_property_key = '" || Extended_property_key || "', Extended_property_value, NULL)) AS " || Extended_property_key)
  FROM `project.dataset.table`
) || '''  
FROM `project.dataset.table`
GROUP BY 1
ORDER BY 1
'''   

If applied to sample data in your question - output is

Row account_id  Key1    Key2    Key3     
1   1           Value1  null    null     
2   7           null    Value2  Value3   
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • I am looking to parameterising the above query as the same table exists for many operators . example - dataset1.table1 , dataset2.table1 , dataset3.table1. In this case I would like to make the above query as a template query with pivot key and table name to be passed as parameters. Such that the same query can be used for multiple operators. I tried using Airflow BigQueryOperator but this operator does not provide parameterization. – codninja0908 Jun 29 '20 at 05:38
1

Try:

CALL fhoffa.x.pivot(
  'p.d.t1' # source table
  , 'p.d.t2' # destination table
  , ['ACCOUNT_ID'] # row_ids
  , 'EXTENDED_PROPERTY_KEY' # pivot_col_name
  , 'EXTENDED_PROPERTY_VALUE' # pivot_col_value
  , 30 # max_columns
  , 'ANY_VALUE' # aggregation
  , '' # optional_limit
);

More details on how to pivot:

Felipe Hoffa
  • 54,922
  • 16
  • 151
  • 325
  • Thanks for this link . I understand that in this step , using aggregate function you are fetching distinct values and the outer query is looping over those values. I am trying to create the exact query without string concatenation but I am getting errors due to quotes I am missing. EXECUTE IMMEDIATE ( "SELECT STRING_AGG(' "||aggregation ||"""(IF('||@pivot_col_name||'="'||x.value||'", '||@pivot_col_value||', null)) e_'||fhoffa.x.normalize_col_name(x.value)) FROM UNNEST(( SELECT APPROX_TOP_COUNT("""||pivot_col_name||", @max_columns) FROM `"||table_name||"`)) x" ) – codninja0908 Jun 29 '20 at 14:00
  • Please post a new question so we have enough space to look at the code and fix it :) – Felipe Hoffa Jun 29 '20 at 19:01
  • I have posted a new question here [link]https://stackoverflow.com/questions/62659762/facing-issue-while-calling-bigquery-stored-procedure Please check and provide explaination about the underlying query which runs in stored procedure link of yours – codninja0908 Jun 30 '20 at 14:28