Beginner SparkR and Elasticsearch question here!
How do I write a SparkR DataFrame or RDD to an Elasticsearch cluster with multiple nodes?
There is a dedicated R package, elastic, but its documentation says nothing about Hadoop or distributed DataFrames. When I try to use it, I get the following error:
install.packages("elastic", repos = "http://cran.us.r-project.org")
library(elastic)
df <- read.json('/hadoop/file/location')
connect(es_port = 9200, es_host = 'https://hostname.dev.company.com', es_user = 'username', es_pwd = 'password')
docs_bulk(df)
Error: no 'docs_bulk' method for class SparkDataFrame
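For what it's worth, I can get the elastic package to work on toy data by collecting everything to the driver first, as in the sketch below (the index name is a placeholder), but that obviously defeats the purpose of using Spark:

# Small-data workaround only: pull the distributed data down to a local
# R data.frame, then bulk-index it with the elastic package.
local_df <- collect(df)                  # SparkR's collect() returns a plain data.frame
docs_bulk(local_df, index = "my_index")  # "my_index" is a placeholder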
If this were PySpark, I would use the rdd.saveAsNewAPIHadoopFile()
method as shown here, but I can't find any equivalent for SparkR from googling. Elasticsearch's own documentation is good, but it only covers Scala and Java.
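The closest lead I have is the elasticsearch-hadoop connector, which exposes a Spark SQL data source. My untested guess is that SparkR's write.df() can target it, roughly as sketched below; the jar on the classpath, the node list, and the "index/type" path are all assumptions on my part:

# Untested guess: write the SparkDataFrame through the elasticsearch-hadoop
# connector's Spark SQL data source. Assumes the elasticsearch-spark jar is
# on the classpath (e.g. passed via --jars when launching SparkR).
write.df(df,
         path = "my_index/my_type",               # placeholder index/type resource
         source = "org.elasticsearch.spark.sql",  # data source provided by es-hadoop
         mode = "append",
         es.nodes = "node1.dev.company.com,node2.dev.company.com",  # placeholder node list
         es.port = "9200",
         es.net.http.auth.user = "username",
         es.net.http.auth.pass = "password")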
I'm sure there is something obvious I am missing; any guidance appreciated!