
I am currently working on Databricks using R notebooks. I would love to combine functionality from the two R interfaces to Spark, namely SparkR and sparklyr. Therefore, I need to make use of SparkR functions on sparklyr Spark DataFrames (SDFs) and vice versa.

I know that generally this is not possible in a straightforward way. However, I also know about a workaround for making use of PySpark SDFs in SparkR, which basically means creating a temp view of the PySpark SDF with

 %py 
 spark_df.createOrReplaceTempView("PySparkSDF")

and subsequently pulling it into SparkR via

 %r
 SparkR::sql("REFRESH TABLE PySparkSDF ")
 sparkR_df <- SparkR::sql("SELECT * FROM PySparkSDF ")

Is it by any means possible to combine sparklyr and SparkR in a similar way? An explanation would also be highly appreciated!
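For the sparklyr direction, what I have in mind is something like the following sketch (untested; "sparklyrSDF" and "sparkRSDF" are placeholder view names, mtcars is just a stand-in dataset, and I am assuming that both interfaces attach to the same Spark session and SQL catalog on Databricks):

 %r
 library(sparklyr)
 # on Databricks, sparklyr attaches to the existing cluster session
 sc <- spark_connect(method = "databricks")

 # a sparklyr SDF to share (mtcars as a stand-in)
 sparklyr_sdf <- copy_to(sc, mtcars, "mtcars_tmp", overwrite = TRUE)

 # register it under a temp-view name visible in the SQL catalog
 sdf_register(sparklyr_sdf, "sparklyrSDF")

 # ...and read it from SparkR exactly as in the PySpark case
 sparkR_df <- SparkR::sql("SELECT * FROM sparklyrSDF")

 # vice versa: expose a SparkR SDF to sparklyr via a temp view
 SparkR::createOrReplaceTempView(sparkR_df, "sparkRSDF")
 sparklyr_back <- dplyr::tbl(sc, "sparkRSDF")

If the two interfaces do not in fact share a catalog, I assume this approach breaks down, which is part of what I would like to have explained.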

K.O.T.
  • Related: https://stackoverflow.com/questions/43551380/convert-sparklyr-data-frame-into-a-sparkr-data-frame and https://stackoverflow.com/questions/40577650/using-sparkr-and-sparklyr-simultaneously – Frank Nov 02 '22 at 18:51

0 Answers