0

I'm trying without luck to do a query to retrieve the union two tables of events using legacySQL, as standardSQL is not yet supported on data studio.

In standardSQL that would be something like:

SELECT
  *
FROM
 `com_myapp_ANDROID.app_events_*`,
 `com_myapp_IOS.app_events_*`

However, in legacySQL I get an error when trying to refer app_events_*. How do I include all the tables of my events, so I can filter it afterwards on data studio if I can't use the wildcard?

I've tried something like:

select * from (TABLE_QUERY(com_myapp_ANDROID, 'table_id CONTAINS "app_events_"'))

But not sure if this is the right approach, I get:

Cannot output multiple independently repeated fields at the same time. Found user_dim_user_properties_value_index and event_dim_date

Edit: in the end this is the result of the query, as you can't use directly FLATTEN with TABLE_QUERY:

select
  *
from
  FLATTEN((SELECT * FROM TABLE_QUERY(com_myapp_ANDROID, 'table_id CONTAINS "app_events"')),user_dim.user_properties),
  FLATTEN((SELECT * FROM TABLE_QUERY(com_myapp_IOS, 'table_id CONTAINS "app_events"')),user_dim.user_properties)
Strnm
  • 1,006
  • 2
  • 9
  • 21

3 Answers3

3

Table wildcards don't work in legacy SQL as you have guessed so you have to use the TABLE_QUERY() function.

Your approach is right but the first parameter in the TABLE_QUERY function should be the dataset name not the first part of the table name. Assuming your dataset name is app_events that would look like this:

TABLE_QUERY(app_events,'table_id CONTAINS "app_events"')
matt_black
  • 1,290
  • 1
  • 12
  • 17
2

In legacySQL the union table operator is comma

select * from [table1],[table2]

For TABLE_QUERY you would include the dataset name as first param, and the expression for the second

select * from (TABLE_QUERY([dataset], 'table_id CONTAINS "event"'))

to read more how to debug TABLE_QUERY read this linked answer

The Web UI automatically flattens you the results, but when there are independent repeated fields you need to flatten with the FLATTEN wrapper.

It takes two params, table, and repeated field eg: FLATTEN(table, tags)

Also if TABLE_QUERY is involved you need to subselect probably like

select
  *
from
  FLATTEN((SELECT * FROM TABLE_QUERY(com_myapp_ANDROID, 'table_id CONTAINS "app_events"')),user_dim.user_properties)
Community
  • 1
  • 1
Pentium10
  • 204,586
  • 122
  • 423
  • 502
  • Yes, but if I do that, I get an error that I don't get on standardSQL, see my edited question – Strnm Feb 17 '17 at 15:30
  • 1
    Yep, that is another concept, it's called FLATTEN. Always try to run a simple query like selectin a field, and not the entire table. – Pentium10 Feb 17 '17 at 17:42
1

That particular issue you are experiencing is not UNION related - you will see same error message even with just one table if the table has multiple independently repeated fields and you are trying to output them at once. This scenario is specific to Legacy SQL and can be resolved with use of FLATTEN clause
At the same time, most likely you don't actually mean to use SELECT * which cause those repeated fields to be in output all at the same time. If you can narrow down your output list - you have slight chance to address it - but if still few independently repeated fields are in output - you can use FLATTEN technique

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
  • Maybe my approach is wrong, but all I'm trying is to get a query with all the data available to be used as a datasource on data studio, just like when you select one table as datasource, but with two tables merged, so I wanted all the values and later when using data studio widget's, select which ones to display on each graph. – Strnm Feb 17 '17 at 16:28
  • Moreover, I believe I can't use FLATTEN with TABLE_QUERY? – Strnm Feb 17 '17 at 16:34