
How do I store a PySpark DataFrame in a Hive table? Here "primary12345" is the Hive table and masterDataDf is the DataFrame. I am using the code below:

masterDataDf.write.saveAsTable("default.primary12345")

and getting the error below:

: java.lang.RuntimeException: Tables created with SQLContext must be TEMPORARY. Use a HiveContext instead.

desertnaut
Akhil Sudhakaran
  • Possible duplicate of [How to save DataFrame directly to Hive?](https://stackoverflow.com/questions/30664008/how-to-save-dataframe-directly-to-hive) – desertnaut Nov 09 '17 at 12:17

1 Answer

You can create a temporary table:

masterDataDf.createOrReplaceTempView("mytempTable") 

Then you can use a simple Hive statement to create the table and dump the data from your temp table into it:

sqlContext.sql("create table primary12345 as select * from mytempTable")

Or, if you want to use a HiveContext, you need to create one first (in PySpark):

from pyspark.sql import HiveContext

sqlContext = HiveContext(sc)

Then save the DataFrame directly, or select the columns you want to store, as a Hive table:

masterDataDf.write.mode("overwrite").saveAsTable("default.primary12345")
Sahil Desai