0

I have a JSON data from Firebase Backup. Data generated is such that every key is preceded by a hyphen.

Sample data is as follows:

"-GuGCJDEprMKczAMDUj8":{"deviceId":"399a649c6cee6209","dow":"Thursday","downloadFlag":"N","event":"streamStart","halfHourFull":"18h1","liveFlag":"Y","localDate":"2009-01-01","localHalfHour":1,"minutesSinceMidnight":1080,"quarterHourFull":"18q1","stationName":"hit 105","streamListenMethod":"Headphones","timestampLocal":"2009-01-01T18:00:33.679+10:00","timestampUTC":"2009-01-01T08:00:33.679Z"}

When we are trying to load that data into Bigquery then we are encountered with the below mentioned error:

Fields must contain only letters, numbers, and underscores, start with a letter or underscore, and be at most 128 characters long.

Is this a bigquery limitation? If yes, then what's the proposed solution here.

Any help/suggestion is much appreciated.

Carlo Zanocco
  • 1,967
  • 4
  • 18
  • 31
Sains
  • 457
  • 1
  • 7
  • 19

1 Answers1

2

Is this a bigquery limitation? If yes, then what's the proposed solution here.

You need to use different field names instead. One option is to load the data into a single STRING column, e.g. by using 'CSV' for the format with a field delimiter of '|' (or any other character that doesn't appear in your data). Then you can use the JSON_EXTRACT_SCALAR function to extract fields from the JSON, e.g.:

CREATE TABLE dataset.table AS
SELECT
  JSON_EXTRACT_SCALAR(json_string, '$.-GuGCJDEprMKczAMDUj8.deviceId') AS deviceId,
  JSON_EXTRACT_SCALAR(json_string, '$.-GuGCJDEprMKczAMDUj8.dow') AS dow,
  JSON_EXTRACT_SCALAR(json_string, '$.-GuGCJDEprMKczAMDUj8.downloadFlag') AS downloadFlag,
...
FROM dataset.single_column_table
Elliott Brossard
  • 32,095
  • 2
  • 67
  • 99
  • My JSON is a single line and it's very huge thus It looks like the number of columns in the schema causes too much overhead when trying to load the data. isn't there any option in bigquery wherein I can ignore specific keys of JSON while loading the data. – Sains Sep 05 '18 at 02:49
  • Check out the `ignoreUnknownValues` option for load jobs. – Elliott Brossard Sep 05 '18 at 03:47