I'm trying to read a simple Parquet file into my Google Dataflow pipeline using the following code:
Read.Bounded<KV<Void, GenericData>> results = HadoopFileSource.readFrom("/home/avi/tmp/db_demo/simple.parquet", AvroParquetInputFormat.class, Void.class, GenericData.class);
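For context, this is roughly how I wire the source into the pipeline (the readFrom line above is the part in question; the imports, options and package names below are just my setup with the Dataflow 1.x Java SDK, its Hadoop contrib module and parquet-avro, so they may differ for other versions):

import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.Read;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.values.KV;
import com.google.cloud.dataflow.contrib.hadoop.HadoopFileSource;
import org.apache.avro.generic.GenericData;
import org.apache.parquet.avro.AvroParquetInputFormat; // parquet.avro.* in older parquet-avro versions

// Default options (local runner), just to reproduce the problem
Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.create());

// The call from above: read the Parquet file as KV<Void, GenericData> records
Read.Bounded<KV<Void, GenericData>> results = HadoopFileSource.readFrom(
    "/home/avi/tmp/db_demo/simple.parquet",
    AvroParquetInputFormat.class, Void.class, GenericData.class);

// The exception below is thrown when the pipeline runs
pipeline.apply(results);
pipeline.run();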
This always triggers the following exception when the pipeline runs:
IllegalStateException: Cannot find coder for class org.apache.avro.generic.GenericData
It looks like this method inside HadoopFileSource can't resolve a coder for that class:
private <T> Coder<T> getDefaultCoder(Class<T> c) {
  if (Writable.class.isAssignableFrom(c)) {
    Class<? extends Writable> writableClass = (Class<? extends Writable>) c;
    return (Coder<T>) WritableCoder.of(writableClass);
  } else if (Void.class.equals(c)) {
    return (Coder<T>) VoidCoder.of();
  }
  // TODO: how to use registered coders here?
  throw new IllegalStateException("Cannot find coder for " + c);
}
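For what it's worth, the coder I'd expect to end up with for Avro generic records is something like the SDK's AvroCoder, along these lines (a hypothetical sketch only: GenericRecord, the schema file and its path are my own assumptions, and I don't see how to get HadoopFileSource to pick such a coder up, hence this question):

import java.io.File;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import com.google.cloud.dataflow.sdk.coders.AvroCoder;
import com.google.cloud.dataflow.sdk.coders.Coder;
import com.google.cloud.dataflow.sdk.coders.KvCoder;
import com.google.cloud.dataflow.sdk.coders.VoidCoder;
import com.google.cloud.dataflow.sdk.values.KV;

// Sketch: the kind of coder I'd expect for the source's KV<Void, GenericRecord> output
static Coder<KV<Void, GenericRecord>> expectedCoder() throws Exception {
  // Avro schema of the Parquet file, loaded from a separate schema file (path is made up)
  Schema schema = new Schema.Parser().parse(new File("/home/avi/tmp/db_demo/simple.avsc"));
  // Void keys plus Avro generic records; getDefaultCoder above only knows Writable and Void,
  // so it never produces anything like this for GenericData/GenericRecord
  return KvCoder.of(VoidCoder.of(), AvroCoder.of(GenericRecord.class, schema));
}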
Any help would be appreciated.
Avi