I know how to convert a text file into an RDD with SparkR:

data <- textFile(sc, "data/tsv_wiki")

But I would like to know how to convert an existing SparkR DataFrame object to an RDD.

Any help would be appreciated.

Jaime Caffarel
  • Perhaps ampcamp should provide this link in their exercise: https://spark.apache.org/docs/latest/sparkr.html. – r2evans Aug 05 '16 at 13:04
  • I'm not asking this for an ampcamp exercise. I haven't been able to find how to convert a DataFrame to an RDD in the SparkR (1.6) API (https://spark.apache.org/docs/1.6.0/api/R/) – Jaime Caffarel Aug 05 '16 at 13:10
  • Thanks for your comment @r2evans, but that link is just a general overview of SparkR that doesn't even mention RDD... I'm trying to perform this conversion not as part of any online course exercise, but as a way to perform certain dplyr-style operations directly over the RDD (http://stackoverflow.com/questions/33657974/sparkr-dplyr-style-split-apply-combine-on-dataframe) – Jaime Caffarel Aug 05 '16 at 13:20
  • Sorry, I'm (obviously) not a Spark guru. I was inferring from the docs that since they renamed [SchemaRDD to DataFrame](http://spark.apache.org/docs/latest/sql-programming-guide.html#rename-of-schemardd-to-dataframe) and talk about [converting DataFrames from local data frames](https://spark.apache.org/docs/latest/sparkr.html#from-local-data-frames), it was all equivalent. (I'm almost certainly confused, which may be understandable with all the *Data*s and *Frame*s going around :-) – r2evans Aug 05 '16 at 18:39
  • Don't worry :-), I'm also confused. I knew that in other Spark APIs you can use the `.rdd` method to convert a DataFrame to an RDD. Unfortunately, that method doesn't exist in SparkR for an existing DataFrame (you only get an RDD when you load a text file, as in the example), which makes me wonder why. – Jaime Caffarel Aug 06 '16 at 14:17

1 Answer

It couldn't be easier.

converted.rdd <- SparkR:::toRDD(dataframe)

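Note the triple colon operator (`:::`). It means `toRDD` is an internal function that SparkR does not export, so it is not part of the public API and may change between versions.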
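For context, here is a minimal end-to-end sketch of the conversion. It assumes a local SparkR 1.6 installation and the 1.6-style initialisation (`sparkR.init`, `sparkRSQL.init`). The `faithful` dataset and the variable names are only for illustration. Since the RDD API is internal in this version, the sketch also calls `take` through `:::`, and that may not behave the same way in other releases.

```r
library(SparkR)

# Assumption: SparkR 1.6 running against a local Spark installation.
sc <- sparkR.init(master = "local[2]")
sqlContext <- sparkRSQL.init(sc)

# Build a SparkR DataFrame from a local R data.frame (illustrative data).
df <- createDataFrame(sqlContext, faithful)

# Convert the DataFrame to an RDD with the internal, non-exported function.
rdd <- SparkR:::toRDD(df)

# Fetch the first two elements of the RDD to inspect what came back.
first_rows <- SparkR:::take(rdd, 2)
str(first_rows)

sparkR.stop()
```

Each element of the resulting RDD should correspond to one row of the DataFrame, so inspecting a couple of elements with `str` is a quick way to check the structure before applying further RDD operations.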

Jaime Caffarel