0

I'm trying to combine 3+ NOAA GSOD data tables to get the data together in one super table. The I am attempting to JOIN the stations table onto the resultant data and then filter by country.

I've been able to do this for just one table but not for more. Below is my attempt at modifying the code to achieve this. I tried several different modifications with no success :(

SELECT * FROM [bigquery-public-data:noaa_gsod.gsod2016] AS gsod2016, 
[bigquery-public-data:noaa_gsod.gsod2015] AS gsod2015 JOIN [bigquery-public-
data:noaa_gsod.stations] AS stations ON gsod2016.stn = stations.USAF AND 
gsod2015.stn = stations.USAF WHERE stations.country = "CB"

NOAA GSOD bigquery data: https://bigquery.cloud.google.com/table/bigquery-public-data:noaa_gsod.gsod2016

Alec Davies
  • 127
  • 11
  • Do you need a join or a union? also don't mix join notations pick a standard either use , notation or inner join not both. and order matters... can't join on it unless it's been defined top down in order. – xQbert Jul 18 '17 at 16:19
  • I'm not entirely sure what you mean by notation or inner join. The code above works as a single line. E.g. SELECT * FROM [bigquery-public-data:noaa_gsod.gsod2016] AS gsod2016 JOIN [bigquery-public-data:noaa_gsod.stations] AS stations ON gsod2016.stn = stations.USAF WHERE stations.country = "CB" – Alec Davies Jul 18 '17 at 17:07
  • gsod2016 has a comma after it. that's non-standard join syntax when used in combination with `join`. – xQbert Jul 18 '17 at 17:08

1 Answers1

1

Use 1 standard inner join or , (preferably inner join syntax) and ensure the tables are in the correct order, you can't join on a table unless its' been defined above the ON.

SELECT * 
FROM [bigquery-public-data:noaa_gsod.stations] AS stations 
INNER JOIN [bigquery-public-data:noaa_gsod.gsod2016] AS gsod2016
  ON gsod2016.stn = stations.USAF 
INNER JOIN [bigquery-public-data:noaa_gsod.gsod2015] AS gsod2015 
  ON gsod2015.stn = stations.USAF 
WHERE stations.country = "CB"

Now all this said did you really mean a join or did you want to UNION ALL the data and add a year

xQbert
  • 34,733
  • 2
  • 41
  • 62
  • Hi xQbert, this seems to run into an infinite query. Yes, I believe I do need to UNION. Although BigQuery doesn't seem to recognise that code. – Alec Davies Jul 18 '17 at 17:05
  • Union say's it's supported... https://chartio.com/resources/tutorials/how-to-union-queries-in-google-bigquery/ Are all the columns of each table of the same data type? did you select each filed/column individually instead of using *? – xQbert Jul 18 '17 at 17:07
  • https://stackoverflow.com/questions/10644993/support-union-function-in-bigquery-sql it looks like it's supported. – xQbert Jul 18 '17 at 17:10
  • Not sure why that would be an infinite query are indexes on stn an usaf in the respective tables? and country in stations? – xQbert Jul 18 '17 at 17:11
  • The columns are the same including the titles, I just gave a go at naming rather then * and didn't get what I wanted either – Alec Davies Jul 18 '17 at 17:37
  • SELECT foo, bar FROM (SELECT stn AS foo, wban AS bar FROM [bigquery-public-data:noaa_gsod.gsod2016]), (SELECT stn AS foo, wban AS bar FROM [bigquery-public-data:noaa_gsod.gsod2015]) – Alec Davies Jul 18 '17 at 17:39
  • That code does work, which is great, but when I try to then include the Station numbers, it collapses with this error. JOIN cannot be applied directly to a table union or to a table wildcard function. Consider wrapping the table union or table wildcard function in a subquery (e.g., SELECT *). – Alec Davies Jul 18 '17 at 17:40
  • GOT IT! Had to make a long query renaming all the fields and creating an encapsulating subquery. I've lost the data from stations but that's not important because I've still managed to only bring up the data from the chosen country. Thanks a lot. You helped me get there. – Alec Davies Jul 18 '17 at 18:10
  • 1
    You just needed to know it was possible and not a technical limit. you did the hard work :P – xQbert Jul 18 '17 at 18:18