
I'm new to Spark, and I'm trying to use PySpark to connect to Hive, run a query, load the results into a DataFrame, and then write that data to Couchbase. Based on the examples I've seen, I would have to create a Spark context for each of the two connections to reach both data sources. However, I can only create one context per script/session. What is the best practice for moving a set of data from one data source to another using Spark?

codeBarer
  • You can essentially create only one SparkContext, and that same one has to be used throughout your code, no matter how many databases you are connecting to (see the sketch below this thread). – toofrellik Aug 28 '19 at 04:08
  • A bit old, so there may be better solutions by now, but it could help you: https://stackoverflow.com/questions/32714396/querying-on-multiple-hive-stores-using-apache-spark – Shaido Aug 28 '19 at 05:33
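Both comments point the same way: a single SparkSession (which owns the single SparkContext) can serve both sides of the transfer. Below is a minimal sketch, assuming the Couchbase Spark connector is attached via --packages; the format string ("couchbase.kv"), option names, and config keys follow the 3.x connector and should be checked against whatever version you use, and the host, credentials, bucket, and table names are placeholders.

    from pyspark.sql import SparkSession

    # One SparkSession serves both data sources. enableHiveSupport()
    # lets spark.sql() query tables registered in the Hive metastore.
    spark = (
        SparkSession.builder
        .appName("hive-to-couchbase")
        .enableHiveSupport()
        # Connector settings are illustrative; the exact config keys
        # depend on the Couchbase connector version you ship.
        .config("spark.couchbase.connectionString", "couchbase://localhost")
        .config("spark.couchbase.username", "user")
        .config("spark.couchbase.password", "password")
        .getOrCreate()
    )

    # Read the Hive query result into a DataFrame.
    df = spark.sql("SELECT id, name, amount FROM mydb.mytable")

    # Write the same DataFrame out through the Couchbase connector.
    # In the 3.x connector, "overwrite" maps to upsert semantics.
    (
        df.write
        .format("couchbase.kv")
        .option("bucket", "my-bucket")
        .option("idFieldName", "id")  # column used as the document key
        .mode("overwrite")
        .save()
    )

The point is that there is no second context: the Hive read and the Couchbase write both hang off the same session, so the data never has to leave Spark between the two steps.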

0 Answers