I'm trying to upload the dataset from this link (https://www.kaggle.com/datasets/aaditshukla/flipkart-fasion-products-dataset) to BigQuery, in either of its two formats.

If I try to upload the XLSX file, I get the following error: Error while reading data, error message: The Apache Avro library failed to parse the header with the following error: Invalid data file. Magic does not match: bigstore/bigquery-prod-upload-us/prod-scotty-742294406158-0a961394-8a51-4e20-b60a-73171b3ede27 File: bigstore/bigquery-prod-upload-us/prod-scotty-742294406158-0a961394-8a51-4e20-b60a-73171b3ede27

If I try to upload the JSON file, I get the following error: Error while reading data, error message: Failed to parse JSON: No object found when new array is started.; BeginArray returned false File: prod-scotty-742294406158-0e3c245a-4029-47a0-bb46-796ac00b04b3

I have also tried defining the schema myself instead of using autodetect, but that doesn't work either.

1 Answer

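BigQuery can't load XLSX files directly. The Avro error appears because BigQuery tried to read the upload as an Avro file, and an XLSX file doesn't start with a valid Avro header.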

The easiest way would be to convert the XLSX file to CSV and then load the CSV into BigQuery. You can select the "Schema Autodetect" option or specify the schema yourself.

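The JSON file failed because BigQuery only loads newline-delimited JSON: one complete JSON object per line, with no enclosing array. The error message suggests your file is a single top-level array, so you will need to convert it.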
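If it helps, here is a minimal sketch of that conversion in Python, using only the standard library. It assumes the file holds one top-level JSON array of objects (which is what the error message points to) and that it fits in memory. The function name `json_array_to_ndjson` is just an illustrative name, not part of any library.

```python
import json


def json_array_to_ndjson(src_path, dst_path):
    """Rewrite a JSON file holding one top-level array as newline-delimited JSON.

    Returns the number of records written.
    """
    with open(src_path, encoding="utf-8") as src:
        records = json.load(src)  # reads the whole file into memory

    # Assumption: the source is a single array; anything else is rejected.
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            # json.dumps escapes newlines inside strings, so each record
            # stays on exactly one line of the output file.
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")

    return len(records)
```

You would call it with your own paths, for example `json_array_to_ndjson("input.json", "output.ndjson")` (both file names are placeholders), and then upload the output file with the file format set to JSONL (newline-delimited JSON).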

Alvaro