I have N files of the same type to process, and I pass the input as a wildcard pattern (C:\\users\\*\\*). How do I find the file name and the record that were rejected while uploading to BigQuery in Java?

raj

2 Answers

I guess BigQuery writes to the temp location that you pass to your pipeline rather than to the local machine (honestly, I'm not sure about this).

In my case, with Python, I passed a GCS bucket as the temp location, and when an error occurred, the command-line logs usually showed the name of the log file containing the rejected records.

I then used the gsutil cp command to copy that file to my local computer and read it.

Idhem

BigQuery I/O (Java and Python SDKs) supports the dead-letter pattern: https://beam.apache.org/documentation/patterns/bigqueryio/

Java

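// `result` is the WriteResult returned by applying BigQueryIO.write(); the write
// must be configured with withExtendedErrorInfo() for getFailedInsertsWithErr().
result
      .getFailedInsertsWithErr()
      .apply(
          MapElements.into(TypeDescriptors.strings())
              .via(
                  x -> {
                    // Each element describes one rejected insert.
                    System.out.println(" The table was " + x.getTable());
                    System.out.println(" The row was " + x.getRow());
                    System.out.println(" The error was " + x.getError());
                    return "";
                  }));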

Python

errors = (
  result['FailedRows']
  | 'PrintErrors' >>
  beam.FlatMap(lambda err: print("Error Found {}".format(err))))
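
The failed rows give you the record but not the file it came from. One way to recover the file name, sketched here in plain Python without Beam, is to attach it to each row when the file is read and then group the rejected rows by that field. This assumes each failed element is a (destination, row) tuple with the row as a dict, and that the row carries a hypothetical `source_file` field; neither the field name nor the helper `group_failures_by_file` comes from Beam, and the sample data is made up.

```python
from collections import defaultdict


def group_failures_by_file(failed_rows, file_field="source_file"):
    """Group rejected rows by the file they were read from.

    failed_rows: iterable of (destination, row) tuples (assumed shape).
    file_field: hypothetical row key holding the source file name.
    """
    by_file = defaultdict(list)
    for destination, row in failed_rows:
        # Rows without the field are kept under "<unknown>" instead of dropped.
        by_file[row.get(file_field, "<unknown>")].append((destination, row))
    return dict(by_file)


# Made-up sample of rejected rows, for illustration only.
failed = [
    ("project:dataset.table", {"source_file": "a.csv", "id": 1}),
    ("project:dataset.table", {"source_file": "b.csv", "id": 7}),
    ("project:dataset.table", {"source_file": "a.csv", "id": 3}),
]

report = group_failures_by_file(failed)
for name, rows in sorted(report.items()):
    print(name, [row["id"] for _, row in rows])
```

Note that an extra field like this has to exist in the table schema, or be stripped before the write and carried alongside the row some other way; otherwise it could itself cause rejections.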
ningk