2

I would like to pivot the fields 'unitId', 'firebase_screen_class' so that each one appears in a separate column:

SELECT
  event.name,
  event_param.value.string_value AS ad_unit,
  COUNT(*) AS event_count
FROM
  `app_events_20170510`, 
  UNNEST(event_dim) AS event, 
  UNNEST(event.params) as event_param
WHERE
  event.name in ('Ad_requested', 'Ad_clicked', 'Ad_shown')
  and event_param.key in ('unitId', 'screen_class')
GROUP BY 1,2

I have used the following query using legacy SQL but it is not showing the correct aggregation results:

SELECT event_name, ad_unit, count(*) FROM
(
SELECT
  event_dim.name as event_name,
  MAX(IF(event_dim.params.key = "firebase_screen_class", event_dim.params.value.string_value, NULL)) WITHIN RECORD as firebase_screen_class,
  MAX(IF(event_dim.params.key = "unitId", event_dim.params.value.string_value, NULL)) WITHIN RECORD as ad_unit
FROM
  [app_events_20170510]
WHERE
  event_dim.name in ('Ad_requested','Ad_shown', 'Ad_clicked')
  and event_dim.params.key in ('unitId','screen_class')
)
group by 1,2

I am looking for the following output:

_________________________________________________________________________
| event_dim.name   | unitId         | screen_class         | count_events|
|__________________|________________|______________________|_____________|
| Ad_requested     | hpg            | socialFeed           |    520      |
|__________________|________________|______________________|_____________|
| Ad_shown         | hpg            | chat                 |    950      |
|__________________|________________|______________________|_____________|
| Ad_requested     | hni            | chat                 |    740      |
|__________________|________________|______________________|_____________|

All events Ad_requested, Ad_shown and Ad_clicked have parameters with the same keys(unitId, screen_class) and also the same values for each key (unitId: hpg, hni / screen_class: socialFeed, chat)

Florent
  • 93
  • 2
  • 8
  • Possible duplicate of [Select several event params in a single row for Firebase events stored in Google BigQuery](https://stackoverflow.com/questions/39998428/select-several-event-params-in-a-single-row-for-firebase-events-stored-in-google) – Elliott Brossard Aug 11 '17 at 19:07

1 Answers1

6

Below is for BigQuery Standard SQL

#standardSQL
WITH `aggregation` AS (
  SELECT
    event.name,
    event_param.key,
    COUNT(*) AS event_count
  FROM
    `app_events_20170510`, 
    UNNEST(event_dim) AS event, 
    UNNEST(event.params) AS event_param
  WHERE
    event.name IN ('Ad_requested', 'Ad_clicked', 'Ad_shown')
    AND event_param.key IN ('unitId', 'firebase_screen_class','house')
  GROUP BY 1, 2
)
SELECT 
  name,
  MAX(IF(key = 'unitId', event_count, NULL)) AS unitId,
  MAX(IF(key = 'firebase_screen_class', event_count, NULL)) AS firebase_screen_class,
  MAX(IF(key = 'house', event_count, NULL)) AS house
FROM `aggregation`
GROUP BY name  

Update based on clarifications in comments:

#standardSQL
SELECT
  event.name,
  (SELECT value.string_value FROM UNNEST(event.params) WHERE key = 'unitId') AS unitId,
  (SELECT value.string_value FROM UNNEST(event.params) WHERE key = 'firebase_screen_class') AS firebase_screen_class,
  (SELECT value.string_value FROM UNNEST(event.params) WHERE key = 'house') AS house,
  COUNT(1) AS event_count
FROM `app_events_20170510`, UNNEST(event_dim) AS event
WHERE event.name IN ('Ad_requested', 'Ad_clicked', 'Ad_shown')
GROUP BY 1,2,3,4

... Out of curiosity, I tried to replicate the query using legacy SQL ... -

Added version for BigQuery Legacy SQL (purely for learning purposes and in hope to help those who consider migrating to Standard SQL as two versions of same task is now presented here)

#legacySQL
SELECT name, product_id, source, firebase_event_origin, COUNT(1) AS event_count
FROM (
  SELECT event_dim.name AS name,
    MAX(IF(event_dim.params.key = 'unitId', event_dim.params.value.string_value, NULL)) WITHIN RECORD AS unitId,
    MAX(IF(event_dim.params.key = 'firebase_screen_class', event_dim.params.value.string_value, NULL)) WITHIN RECORD AS firebase_screen_class,
    MAX(IF(event_dim.params.key = 'house', event_dim.params.value.string_value, NULL)) WITHIN RECORD AS house,
  FROM FLATTEN([project:dataset.app_events_20170510], event_dim) AS event
  WHERE event_dim.name IN ('Ad_requested', 'Ad_clicked', 'Ad_shown')
)
GROUP BY 1, 2, 3, 4
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • Thanks for the answer. I ran the query but it is not what I am trying to achieve. I am looking for: `event_name | unitId.values | house.values | firebase_screen_class.values| count_event` – Florent Aug 11 '17 at 19:51
  • i see now what you mean – Mikhail Berlyant Aug 11 '17 at 19:55
  • yes, sorry. I accidentally posted my comment before finishing it :-) – Florent Aug 11 '17 at 19:55
  • i just realized - i don't know what you mean by xxx.values. is it list of string_values for respective key or something else. I think yo should provide more details/example of output! – Mikhail Berlyant Aug 11 '17 at 20:03
  • I have added more details, please let me know if it is not clear. Thanks ! – Florent Aug 11 '17 at 20:24
  • in my mind - this still can be read in quite multiple ways. what is the logic of grouping in one line? for example you have two lines for `Ad_requested` - what was the logic of getting those? – Mikhail Berlyant Aug 11 '17 at 20:27
  • For the `Ad_requested` event, I want to know how many events exist for each combination of the parameters values for these event. There is two lines for `Ad_requested` because each line has different values for the parameters `screen_class` and `unitId` – Florent Aug 11 '17 at 20:31
  • Thanks Mike, just saw your update. There is an issue with it, it does not show the combination of the parameters. See screenshot here: [link](http://screencast.com/t/Ro7JR8BHYQ). On each row, only one of the 3 parameters have a value the other two parameters show null. – Florent Aug 11 '17 at 20:53
  • most likely not - will check one more time when get a min :) – Mikhail Berlyant Aug 11 '17 at 20:55
  • ok thanks, what I mean is that each event have 3 parameters that is recorded per event. I want to see for each event, the event name and the value for each of the 3 parameters on the same row. I have added comments to the screenshot to explain [link](https://www.screencast.com/t/Et5xeGeXr) – Florent Aug 11 '17 at 21:05
  • glad we did it finally :o) – Mikhail Berlyant Aug 11 '17 at 21:20
  • Out of curiosity, I tried to replicate the query using legacy SQL. I used the following construct: `MAX(IF(event_dim.params.key = "unitId", event_dim.params.string_value, NULL)) WITHIN RECORD productID` but it does not provide the correct aggregation result. Is this query also doable with legacy SQL ? – Florent Aug 12 '17 at 07:22
  • see addition in my answer – Mikhail Berlyant Aug 12 '17 at 17:15
  • @MikhailBerlyant thanks for this great answer! How do I use the FROM with `UNNEST(event_dim)` when I have two datasets (`IOS.app_events_20171106` and `ANDROID.app_events_20171106`)? – Dorian Roy Nov 08 '17 at 19:06
  • 1
    @DorianRoy - try replacing \`app_events_20170510` with (SELECT FROM \`IOS.app_events_20171106\` UNION ALL SELECT FROM \`ANDROID.app_events_20171106\`) – Mikhail Berlyant Nov 08 '17 at 20:39