0

Im trying to load a table from Teradata to BQ. My pipeline is first export the table data to a gcs location and then using bqload, loading the data from gs:// to the bq table.

My bqload looks like this

bq load --autodetect --source_format=CSV --project_id=xx-xx-xx --field_delimiter='^' \
BQTable \
gs:// \
Name:string,Age:INT64,Place:NUMERIC ...

But i get the below error, when im trying to load the data into the bq table using bq load

Error detected while parsing row starting at position: 1830577. Error: Data between
close double quote (") and field separator.

I tried using couple of options in the bqload command, like --quote ="" , when i used this, it gave me a different error like below

Could not parse 'MP' as NUMERIC for field XXXXXX (position 18) starting at
location 1030399 with message 'Invalid NUMERIC value: MP'

Played around with other options as well for long time, but nothing gave me the result. Can someone throw some light?

jahan
  • 103
  • 4
  • 19

2 Answers2

3

I have run into this issue a number of times and found it hard to figure out until I Googled around for some bash help.

So my message, like yours is:

Error while reading data, error message: Error detected while parsing row starting at position: 493727436.
Error: Data between close double quote (") and field separator.

I found that I can isolate that exact byte by calling tail -c with +;

tail -c +493727436 myfile.csv

However my file is really long so I tailed a few bytes before and piped to head to get first 100 bytes

tail -c +493727386 myfile.csv | head -c 100
John Smith
  • 1,027
  • 15
  • 31
AndreJohn
  • 31
  • 5
0

Not sure if still active question but maybe it will help someone in the future.

Looks like have some data incorrectly closed in double quotes (e.g. 123,"some" data,456). It may be caused by escaping special characters - see more about data with double quotes here: Properly escape a double quote in CSV and here: https://gpdb.docs.pivotal.io/43250/admin_guide/load/topics/g-escaping-in-csv-formatted-files.html

Note that NULL value is also escaped by default with " this may cause some extra trouble, like: e.g. 123,"NULL,"some text value in quotes",456 then "NULL," would be interpreted as string value and 'some text...' would throw

Error: Data between close double quote (") and field separator.

You should get rid off any "NULL from your .csv file so it would look like this: 123,,"some text value in quotes",456

b_t_y
  • 1
  • 2