66

Is there a way to Select * except [x,y,z column names] in BigQuery? I see some solutions for MySQL but not sure if it applies to BQ.

Thank you.

wubr2000
  • 855
  • 2
  • 8
  • 10
  • 1
    the only answer I have is kind of extension of approach in http://stackoverflow.com/questions/34010002/how-to-create-dummy-variable-columns-for-thousands-of-categories-in-google-bigqu which looks like not what works for you – Mikhail Berlyant Dec 03 '15 at 02:05

3 Answers3

143

There is nothing in current BigQuery SQL dialect that will allow it. But since this is recurring request, we have added work item to support

SELECT * EXCEPT (a, b, c) FROM ...

Update: This functionality is now available in BigQuery standard SQL. Details at https://cloud.google.com/bigquery/sql-reference/enabling-standard-sql Example using public wikipedia table - select all columns except title and comment:

select * except(title, comment) from publicdata.samples.wikipedia limit 10
Mosha Pasumansky
  • 13,206
  • 5
  • 32
  • 55
  • That's great! Thanks alot for implementing this! @MoshaPasumansky – wubr2000 Dec 03 '15 at 18:58
  • 3
    It took a while, but it is available now. Answer updated. – Mosha Pasumansky Apr 27 '16 at 19:45
  • Hi @MoshaPasumansky Thanks for adding this feature! But I just tried it in BQ and it didn't work. Could you let me know the exact syntax? Thank you. – wubr2000 Apr 28 '16 at 23:19
  • 1
    Here is an example on public wikipedia table which skips title and comment columns: select * except(title, comment) from publicdata.samples.wikipedia limit 10 – Mosha Pasumansky May 04 '16 at 00:23
  • Hi @MoshaPasumansky. The example you gave did not work. And I was using: `SELECT * except(title, comment) FROM bigquery-public-data:samples.wikipedia LIMIT 10` – wubr2000 May 05 '16 at 18:03
  • did you uncheck Use Legacy SQL checkbox in the query options? – Mosha Pasumansky May 05 '16 at 18:04
  • I did - I unchecked the legacy SQL box also - tried it with and without checking that box and neither worked. – wubr2000 May 05 '16 at 18:05
  • Your query differs from mine - it has ":" as project separator instead of ".", and goes against project with dashes in it, which requires quoting with backticks. Did my exact query not work for you ? – Mosha Pasumansky May 06 '16 at 02:33
  • can you give me job id for the failed query, I will look up the reason – Mosha Pasumansky May 06 '16 at 05:28
  • Hi @MoshaPasumansky. You are right. It works if I used the exact query you had posted! And now I can use it for my own tables too. This is a very helpful feature. Thank you for helping to implement it! – wubr2000 May 06 '16 at 17:48
  • Does except work with nested fields? (I seem to get syntax errors when I do except(Something.InnerThing) ) (likewise, with wildcards) – CasualT Jun 27 '16 at 19:36
12

In addition to SELECT * EXCEPT() syntax there is a SELECT * REPLACE() syntax - both supported with Standard SQL introduced
Usage is simple and obvious as per documentation

What is less obvious is that you can use both together in the same SELECT, like in example below

WITH orders AS
  (SELECT 5 as order_id,
  "sprocket" as item_name,
  200 as quantity)
SELECT * EXCEPT (order_id) REPLACE ("widget" AS item_name), "more" as more_fields
FROM orders;
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
-1

Use case: Joining

As pointed out by the previous posts it is now possible to exclude columns from queries using the SELECT * EXCEPT()-syntax.
Anyhow, the feature seems not entirely thought through as one of the crucial use cases to require such functionality is to get rid of duplicate key-columns in joining while keeping one instance of the key-column.

In other words: When joining table_2 on table_1 I want to keep the column used for joining only from table_1.

Especially for large tables this can be tedious if all columns except the desired are needed.
E.g., what currently fails and would be desired:

SELECT *
    EXCEPT(table_2.primary_key_column) -- -> Only exclude key-column from joined table
FROM table_1 AS table_1
LEFT JOIN table_2 AS table_2 USING(primary_key_column)

This fails with a syntax error as the table_2. is not allowed in the EXCEPT()-clause...


Workaround

While EXCEPT() excludes columns with the specified names from the "input"-tables it does not affect expressions within the select statement. In other words: If you specify the required column specifically from the respective table it will not be affected by EXCEPT().

Thus, this allows us to work around our issue like in the example below.

SELECT
    table_1.primary_key_column AS primary_key_column -- -> Not affected
    , * 
    EXCEPT(primary_key_column)
FROM table_1 AS table_1
LEFT JOIN table_2 AS table_2 USING(primary_key_column)

What is still not nice about this solution: Your column order is affected if your key-column is not in first place.

I hope it will get "fixed" in the future as it is not really user friendly and also does not seem very stringent.

Note: If you have more than one join you have to adjust the logic (as EXCEPT() only refers to the * in front of it), which also shows a solution closer to the shown use case issue:

SELECT
    table_1.* --> Includes primary_key_column
    , table_2.* EXCEPT(primary_key_column) --> Specifically excludes primary key from table_2
    , table_3.* EXCEPT(primary_key_column) --> Specifically excludes primary key from table_3
FROM table_1 AS table_1
LEFT JOIN table_2 AS table_2 USING(primary_key_column)
LEFT JOIN table_3 AS table_3 USING(primary_key_column)
Markus
  • 2,265
  • 5
  • 28
  • 54