
I've installed Spark 1.4.1 (with R 3.1.3). I'm currently testing SparkR to run statistical models, and I'm able to run some sample code such as:

Sys.setenv(SPARK_HOME = "C:\\hdp\\spark-1.4.1-bin-hadoop2.6")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
# Load the SparkR library
library(SparkR)
# Create a Spark context and a SQL context
sc <- sparkR.init(master = "local")

sqlContext <- sparkRSQL.init(sc)

# Create a SparkR DataFrame
DF <- createDataFrame(sqlContext, faithful)

sparkR.stop()

Next, I'm trying to install the rJava package into SparkR, but the installation fails with the error below:

> install.packages("rJava")
Installing package into 'C:/hdp/spark-1.4.1-bin-hadoop2.6/R/lib'
(as 'lib' is unspecified)
trying URL 'http://ftp.iitm.ac.in/cran/bin/windows/contrib/3.1/rJava_0.9-7.zip'
Content type 'text/html; charset="utf-8"' length 898 bytes
opened URL
downloaded 898 bytes

Error in read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open the connection
In addition: Warning messages:
1: In unzip(zipname, exdir = dest) : error 1 in extracting from zip file
2: In read.dcf(file.path(pkgname, "DESCRIPTION"), c("Package", "Type")) :
  cannot open compressed file 'rJava/DESCRIPTION', probable reason 'No such file or directory'

Also, when I run the SparkR command from the shell, it starts as a 32-bit application (the version info in the screenshot highlights this).

So, please help me to resolve this issue.

Vijay_Shinde

2 Answers


When you are in the SparkR shell, it changes where R packages are installed. The key line is:

Installing package into 'C:/hdp/spark-1.4.1-bin-hadoop2.6/R/lib'

I suspect that

  • You do not have write permission for `C:/hdp/spark-1.4.1-bin-hadoop2.6/R/lib`.
  • You don't want to put the package there in the first place.

You have two options:

  • Start a vanilla R session and install as usual.
  • Or use the `lib` argument in `install.packages` to specify where you want to install rJava.
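A sketch of the second option (the library directory below is an assumed example, not a required path — pick any location you have write access to):

```r
# Sketch: install rJava into a personal library instead of Spark's R/lib.
# The directory here is a hypothetical example; adjust for your machine.
user_lib <- "C:/Users/me/R/win-library/3.1"
dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)

# Install into that library explicitly via the `lib` argument.
install.packages("rJava", lib = user_lib)

# Put the library on the search path before loading the package.
.libPaths(c(user_lib, .libPaths()))
library(rJava)
```

This keeps third-party packages out of the Spark distribution directory, so they survive a Spark upgrade and don't require write access to `C:/hdp`.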
csgillespie

I solved the issue. It was an R version problem: I was previously using R 3.1.3, and the error said the rJava package is not available for that R version.

To solve it, I followed these steps:
1) Installed a new R version, R 3.2.2.
2) Updated the PATH variable with the new R version's path (Windows -> "Edit environment variables for your account" -> PATH -> edit the value).
3) Restarted the SparkR shell.
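After the restart, a quick sanity check along these lines (a sketch; the exact output depends on your installation) confirms the shell picked up the new 64-bit R from PATH and that rJava loads:

```r
# Check which R version and architecture the shell is running.
R.version.string         # should report R 3.2.2 (or newer)
.Machine$sizeof.pointer  # 8 on a 64-bit build, 4 on 32-bit

# Verify that rJava loads against this R version.
library(rJava)
.jinit()  # starts the JVM; fails if the Java and R architectures mismatch
```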


Thanks all for your support!

Vijay_Shinde