I am trying to create an instance of the ParquetWriter class using the constructor that accepts the arguments (OutputFile, Mode, WriteSupport, CompressionCodecName, int, boolean, Configuration, int, ParquetProperties). However, in the API version I use, this constructor has the default (package-private) access modifier, so I cannot access it.

I included the Parquet library from Maven:

compile group: 'org.apache.parquet', name: 'parquet-hadoop', version: '1.10.1'

I even tried to extend the class, but I still get the error that the constructor is not visible:

public class MyParquetWriter  extends ParquetWriter{

    MyParquetWriter(OutputFile file, Mode mode, WriteSupport writeSupport, CompressionCodecName compressionCodecName,
            int rowGroupSize, boolean validating, Configuration conf, int maxPaddingSize,
            ParquetProperties encodingProps) throws IOException {
        super(file, mode, writeSupport, compressionCodecName, rowGroupSize, validating, conf, maxPaddingSize, encodingProps);

    }
}

How can I use this constructor in my project?

UDIT JOSHI
  • And what compilation error do you get? Have a look at this: https://www.programcreek.com/java-api-examples/index.php?api=parquet.hadoop.ParquetWriter It also looks like that constructor will be removed in version 2.0: https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetWriter.java – Jocke May 19 '19 at 06:14
  • Compilation error: the constructor is not visible. The last constructor of ParquetWriter, which takes the path as HadoopOutputFile.fromPath(path, conf), is not deprecated. – UDIT JOSHI May 19 '19 at 06:57
  • Can you suggest any alternative way to write a Parquet file in Hadoop? I also tried ParquetFileWriter, but it doesn't support passing – UDIT JOSHI May 19 '19 at 06:58

1 Answer

I had a look at the implementation of the ParquetWriter class, and its public constructors are all marked as deprecated anyway.
What you should do instead is instantiate it through the intended Builder class, which is provided as a nested class in ParquetWriter.

This way you can also be sure your code will be compatible with future versions.

For more information on how to use builders see this article:
https://dzone.com/articles/design-patterns-the-builder-pattern
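To illustrate why a builder of this kind has to be subclassed and why it asks for a self() method, here is a minimal, self-contained sketch of a self-typed builder. All names in it (SketchWriter, SketchWriterBuilder, withCodec, withPageSize) are hypothetical stand-ins made up for this example and are not part of the Parquet API; the sketch only mirrors the general shape of an abstract nested builder.

```java
// Hypothetical stand-in for a writer that can only be created through its builder.
final class SketchWriter {
    final String target;
    final String codec;
    final int pageSize;

    // Private constructor: outside code cannot call it, only the nested builder can.
    private SketchWriter(String target, String codec, int pageSize) {
        this.target = target;
        this.codec = codec;
        this.pageSize = pageSize;
    }

    // Abstract, self-typed builder. SELF lets the inherited setters return the
    // concrete subclass type, so chained calls keep that type.
    abstract static class Builder<SELF extends Builder<SELF>> {
        private final String target;
        private String codec = "UNCOMPRESSED";
        private int pageSize = 1024;

        protected Builder(String target) {
            this.target = target;
        }

        // Each concrete builder returns itself here.
        protected abstract SELF self();

        SELF withCodec(String codec) {
            this.codec = codec;
            return self();
        }

        SELF withPageSize(int pageSize) {
            this.pageSize = pageSize;
            return self();
        }

        SketchWriter build() {
            return new SketchWriter(target, codec, pageSize);
        }
    }
}

// Concrete builder: the only thing it has to supply is self().
final class SketchWriterBuilder extends SketchWriter.Builder<SketchWriterBuilder> {
    SketchWriterBuilder(String target) {
        super(target);
    }

    @Override
    protected SketchWriterBuilder self() {
        return this;
    }
}

class BuilderDemo {
    public static void main(String[] args) {
        SketchWriter writer = new SketchWriterBuilder("out.parquet")
                .withCodec("SNAPPY")
                .withPageSize(4096)
                .build();
        // prints "out.parquet SNAPPY 4096"
        System.out.println(writer.target + " " + writer.codec + " " + writer.pageSize);
    }
}
```

The point of the design is that the object's constructor can stay hidden while every option remains reachable through the builder's setters.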

Edit: What I have been doing in similar situations is to write a wrapper class, which (in this case) uses a Builder to initialize a private ParquetWriter instance:

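import java.io.Closeable;
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.api.WriteSupport;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;

public class MyParquetWriterWrapper<T> implements Closeable {
    private final ParquetWriter<T> parquetWriter;

    @SuppressWarnings({"rawtypes", "unchecked"})
    public MyParquetWriterWrapper(Path file, WriteSupport<T> writeSupport, CompressionCodecName compressionCodecName, int blockSize, int pageSize) throws IOException {
        // The Builder constructors are protected and require the target file,
        // so an anonymous subclass is created and the file is passed to it.
        ParquetWriter.Builder parquetWriterbuilder = new ParquetWriter.Builder(file) {
            @Override
            protected ParquetWriter.Builder self() {
                return this;
            }

            @Override
            protected WriteSupport getWriteSupport(org.apache.hadoop.conf.Configuration conf) {
                return writeSupport;
            }
        };

        parquetWriterbuilder.withCompressionCodec(compressionCodecName);
        parquetWriterbuilder.withRowGroupSize(blockSize);
        parquetWriterbuilder.withPageSize(pageSize);
        // ... + other properties which you want to be set

        parquetWriter = parquetWriterbuilder.build(); // building the parquetWriter instance
    }

    public ParquetWriter<T> unwrap() {
        return this.parquetWriter;
    }

    @Override
    public void close() throws IOException {
        parquetWriter.close();
    }
}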

Instead of overriding the methods from ParquetWriter, the Wrapper will simply forward calls:

public void write(T object) throws IOException {
    // some code before writing...
    this.parquetWriter.write(object);
    // some code after writing...
}

As is also pointed out in this question, extending a concrete class (especially one that is not under your control) is generally not considered a best practice. It would be better to implement an interface, but ParquetWriter only implements Closeable, which does not get you very far...
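The same forwarding idea can be tried out without any Parquet dependency. The following is only a sketch under that assumption: ForwardingWriterWrapper and recordsWritten are hypothetical names, and a plain java.io.Writer stands in for the wrapped writer.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical wrapper: it holds the delegate privately and forwards calls to it
// instead of extending the delegate's class.
final class ForwardingWriterWrapper implements Closeable {
    private final Writer delegate;
    private int recordsWritten = 0;

    ForwardingWriterWrapper(Writer delegate) {
        this.delegate = delegate;
    }

    void write(String record) throws IOException {
        // code before writing could go here
        delegate.write(record);
        delegate.write('\n');
        // code after writing: here, simple bookkeeping
        recordsWritten++;
    }

    int recordsWritten() {
        return recordsWritten;
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }
}

class WrapperDemo {
    public static void main(String[] args) throws IOException {
        StringWriter sink = new StringWriter();
        try (ForwardingWriterWrapper wrapper = new ForwardingWriterWrapper(sink)) {
            wrapper.write("first");
            wrapper.write("second");
            System.out.println(wrapper.recordsWritten()); // prints 2
        }
    }
}
```

Because the wrapper depends only on the delegate's public methods, a change in the delegate's constructors does not affect the code that uses the wrapper.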

sekky
  • I read about the builder class, but I still can't access the ParquetWriter constructor that accepts an OutputFile for writing a Parquet file in Hadoop – UDIT JOSHI May 30 '19 at 09:21
  • I wish I could do that, but I am unable to extend the concrete class Closeable – UDIT JOSHI May 31 '19 at 09:03
  • My bad: it should be implements Closeable, not extends Closeable... I edited my answer accordingly – sekky May 31 '19 at 10:59