I am using Spark version 2.3 to write and save DataFrames using bucketBy.
The table gets created in Hive but not with the correct schema. I am not able to select any data from the Hive table.
(DF.write
.format('orc')
.bucketBy(20, 'col1')
.sortBy("col2")
.mode("overwrite")
.saveAsTable('EMP.bucketed_table1'))
I am getting the message below:

Persisting bucketed data source table emp.bucketed_table1 into Hive metastore in Spark SQL specific format, which is NOT compatible with Hive.
The Hive schema is created as shown below:
hive> desc EMP.bucketed_table1;
OK
col array<string> from deserializer
How can I write and save a DataFrame to a Hive table so that the data can be queried from Hive later?