In my Spark code I write my DataFrame as Parquet files on HDFS. I then created an external table over those Parquet files in Big SQL, declaring the columns in a different order than they appear in the Parquet schema. Querying that table fails with the error below.
If I query the same table in Hive it works fine; Hive maps the columns by name and returns the expected output.
Does Big SQL support mapping columns by name when creating an external table over Parquet files?
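For reference, the setup is roughly the following. This is only a minimal sketch to show the shape of the problem: the table name, column names, and HDFS path are placeholders, and the Big SQL DDL in the comments is an approximation of what I run, not the exact statement.

# Spark side: write the DataFrame as Parquet.
# The Parquet schema ends up in this column order: (id, name, amount).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("write-parquet").getOrCreate()

df = spark.createDataFrame(
    [(1, "a", 10.0), (2, "b", 20.0)],
    ["id", "name", "amount"],
)
df.write.mode("overwrite").parquet("hdfs:///data/dummytable")

# Big SQL side (run separately, not through Spark): the external table is
# declared with the columns in a different order than the Parquet schema,
# roughly like this:
#
#   CREATE EXTERNAL HADOOP TABLE schema_name.dummytable (
#     name   VARCHAR(100),
#     id     INT,
#     amount DOUBLE
#   )
#   STORED AS PARQUET
#   LOCATION '/data/dummytable';
#
# Hive returns correct results for SELECT * on this table (mapping columns by
# name); the same SELECT in Big SQL fails with the error below.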
Big SQL error log:
DB2 LOG: The statement failed because a Big SQL component encountered an error. Component receiving the error: "BigSQL IO". Component returning the error: "UNKNOWN". Log entry identifier: "[BSL-3-25371e740]".. SQLCODE=-5105, SQLSTATE=58040, DRIVER=4.22.36
Big SQL log:
2019-04-25 03:42:14,848 ERROR com.ibm.biginsights.bigsql.dfsrw.reader.DfsBaseReader [Master-3-S:5.1001.1.0.0.724] : [BSL-3-25371e740] Exception raised by Reader at node: 3 Scan ID: S:5.1001.1.0.0.724 Table: mc400.hutdnp_ext Spark: false VORC: false VPQ: true VAVRO: false VTEXT: false VRCFILE: false VANALYZE: false
Exception Label: UNMAPPED(java.lang.NullPointerException)
java.lang.NullPointerException
at com.ibm.biginsights.bigsql.dfsrw.reader.parquet.DfsVectorizedPrimitiveColumnReader.decodeDictionaryIds(DfsVectorizedPrimitiveColumnReader.java:506)
at com.ibm.biginsights.bigsql.dfsrw.reader.parquet.DfsVectorizedPrimitiveColumnReader.readBatch(DfsVectorizedPrimitiveColumnReader.java:173)
at com.ibm.biginsights.bigsql.dfsrw.reader.parquet.DfsVectorizedParquetRecordReader.nextBatch(DfsVectorizedParquetRecordReader.java:428)
at com.ibm.biginsights.bigsql.dfsrw.reader.parquet.DfsVectorizedParquetRecordReader.next(DfsVectorizedParquetRecordReader.java:347)
at com.ibm.biginsights.bigsql.dfsrw.reader.parquet.DfsParquetSplit2Batch.split2Batch(DfsParquetSplit2Batch.java:98)
at com.ibm.biginsights.bigsql.dfsrw.jaro.DfsSplitManager$SplitRunnable.run(DfsSplitManager.java:123)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1160)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.lang.Thread.run(Thread.java:785)
at com.ibm.biginsights.bigsql.dfsrw.jaro.DfsSplit2BatchThread.run(DfsSplit2BatchThread.java:58)
SQL statement:
select * from schema_name.dummytable
Execution log:
Run time: 1.275 s
Status: FAILED
Database: HDP-DEV-BIGSQL