Is it possible to connect sparklyr to a remote Hadoop cluster, or can it only be used locally? And if it is possible, how? :)
In my opinion, connecting R to Hadoop via Spark is very important!
Do you mean a Hadoop cluster or a Spark cluster? If Spark, you can try connecting through Livy; details here: https://github.com/rstudio/sparklyr#connecting-through-livy
Note: Connecting to Spark clusters through Livy is under experimental development in sparklyr.
You could use Livy, which is a REST API service for the Spark cluster.
Once you have set up your HDInsight cluster on Azure, check that the Livy service responds using curl:
#curl test
curl -k --user "admin:mypassword1!" -v -X GET
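If it helps, the same check can be wrapped in a small script. This is only a sketch under a few assumptions: Livy is exposed under the same `/livy/` base URL used in the R code, the `sessions` route of Livy's REST API is queried, and `mycluster`, `LIVY_USER`, `LIVY_PASSWORD` and both function names are placeholders I made up for illustration.

```shell
#!/bin/sh
# Sketch: check that Livy answers on an HDInsight cluster.
# Assumptions: cluster name and credentials are placeholders, and Livy is
# served under /livy/ on the cluster's public endpoint.

# Build the Livy URL for a cluster name and a REST route (e.g. "sessions").
livy_url() {
  printf 'https://%s.azurehdinsight.net/livy/%s\n' "$1" "$2"
}

# Print the HTTP status code returned for GET <route>.
# 200 suggests Livy is reachable and the credentials were accepted.
livy_status() {
  curl -k -s -o /dev/null -w '%{http_code}' \
    --user "$LIVY_USER:$LIVY_PASSWORD" \
    -X GET "$(livy_url "$1" "$2")"
}

# Example call (not run here):
#   LIVY_USER=admin LIVY_PASSWORD='<password>' livy_status mycluster sessions
```

A 401 would point at the credentials, and a status of 000 usually means curl could not reach the host at all.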
# RStudio code
library(sparklyr)
# Connect through Livy; the password is prompted for rather than hard-coded
sc <- spark_connect(master = "https://<yourclustername>.azurehdinsight.net/livy/",
                    method = "livy",
                    config = livy_config(
                      username = "admin",
                      password = rstudioapi::askForPassword("Livy password:")))
A useful reference: https://learn.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-livy-rest-interface