I have a case where I will be running R code on a data that will be downloaded from Hadoop. Then, the output of the R code will be uploaded back to Hadoop as well. Currently, I am doing it manually and I would like to avoid this manual downloading/uploading process.
Is there a way I can do this in R by connecting to hdfs? In other words, in the beginning of the R script, it connects to Hadoop and reads the data, then in the end it uploads the output data to Hadoop again. Are there any packages that can be used? Any changes required in Hadoop server or R?
I forgot to note the important part: R and Hadoop are on different servers.