I have N files of the same type to process, and I supply them through a wildcard input pattern (C:\\users\\*\\*).
How do I find the file name and the record that was rejected while uploading to BigQuery in Java?

raj
2 Answers
I think BigQuery writes to the temp location that you pass to your pipeline, not to the local machine (honestly, I am not sure about this).
In my case, with Python, I passed a GCS bucket as the temp location. When an error occurs, the command-line logs usually show the name of the log file that contains the rejection errors.
I then use the gsutil cp command to copy that file to my local computer and read it.

Idhem
BigQuery I/O (Java and Python SDKs) supports the dead-letter pattern: https://beam.apache.org/documentation/patterns/bigqueryio/
Java

    result
        .getFailedInsertsWithErr()
        .apply(
            MapElements.into(TypeDescriptors.strings())
                .via(
                    x -> {
                      System.out.println(" The table was " + x.getTable());
                      System.out.println(" The row was " + x.getRow());
                      System.out.println(" The error was " + x.getError());
                      return "";
                    }));
Python

    errors = (
        result['FailedRows']
        | 'PrintErrors' >>
        beam.FlatMap(lambda err: print("Error Found {}".format(err))))
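
The failed-insert element in the Java snippet above exposes the table, the row, and the error, but not the source file. One way to also recover the file name is to keep it attached to every record before the write, so that a rejected record still says where it came from. The following is a minimal plain-Java sketch of that idea, with no Beam or BigQuery dependency. The class `DeadLetterSketch`, the `Rejected` type, the `validate` rule, and the file names are all hypothetical and only stand in for whatever check actually rejects a row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a dead-letter collection step in plain Java.
class DeadLetterSketch {

    /** A rejected record, tagged with the file it was read from. */
    static final class Rejected {
        final String fileName;
        final String line;
        final String reason;

        Rejected(String fileName, String line, String reason) {
            this.fileName = fileName;
            this.line = line;
            this.reason = reason;
        }

        @Override
        public String toString() {
            return fileName + ": \"" + line + "\" (" + reason + ")";
        }
    }

    /**
     * Assumed validation rule: a line must be "id,name" with a numeric id.
     * Returns null when the line is valid, otherwise a short reason.
     */
    static String validate(String line) {
        String[] parts = line.split(",", -1);
        if (parts.length != 2) {
            return "expected 2 fields, got " + parts.length;
        }
        try {
            Long.parseLong(parts[0].trim());
        } catch (NumberFormatException e) {
            return "id is not a number";
        }
        return null;
    }

    /** Routes every invalid line to a dead-letter list, keeping its file name. */
    static List<Rejected> collectRejected(Map<String, List<String>> linesByFile) {
        List<Rejected> rejected = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : linesByFile.entrySet()) {
            for (String line : entry.getValue()) {
                String reason = validate(line);
                if (reason != null) {
                    rejected.add(new Rejected(entry.getKey(), line, reason));
                }
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the files matched by the wildcard pattern.
        Map<String, List<String>> input = new LinkedHashMap<>();
        input.put("a.csv", Arrays.asList("1,alice", "x,bob"));
        input.put("b.csv", Arrays.asList("2,carol,extra"));

        for (Rejected r : collectRejected(input)) {
            System.out.println(r);
        }
    }
}
```

In a real pipeline the same idea would mean carrying the file name alongside each element (for example as a key or an extra field) up to the BigQuery write, so that it is still available when the failed rows are inspected.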

ningk
- It looks like this works for streaming jobs. Does it also work for batch jobs? – raj Jun 03 '22 at 05:45
- It should, since the examples are in batch. – ningk Jun 03 '22 at 18:35