
I am attempting to install Spark using sparklyr and

spark_install 

and I get the following error.

    C:\dsvm\tools\UnxUtils\usr\local\wbin\tar.exe: Cannot use compressed or remote archives
    C:\dsvm\tools\UnxUtils\usr\local\wbin\tar.exe: Error is not recoverable: exiting now
    running command 'tar.exe -zxf "C:\Users\MyPC\AppData\Local\rstudio\spark\Cache/spark-2.0.1-bin-hadoop2.7.tgz" -C "C:/Users/LeviVM/AppData/Local/rstudio/spark/Cache"' had status 2
    'tar.exe -zxf "C:\Users\MyPC\AppData\Local\rstudio\spark\Cache/spark-2.0.1-bin-hadoop2.7.tgz" -C "C:/Users/LeviVM/AppData/Local/rstudio/spark/Cache"' returned error code 2
    Installation complete.
    cannot open file 'C:\Users\MyPc\AppData\Local\rstudio\spark\Cache/spark-2.0.1-bin-hadoop2.7/conf/log4j.properties': No such file or directory
    Failed to set logging settings
    cannot open file 'C:\Users\MyPc\AppData\Local\rstudio\spark\Cache/spark-2.0.1-bin-hadoop2.7/conf/hive-site.xml': No such file or directory
    Failed to apply custom hive-site.xml configuration

Then I downloaded Spark from the web and used

spark_install_tar 

This gives me the same error:

    C:\dsvm\tools\UnxUtils\usr\local\wbin\tar.exe: Cannot use compressed or remote archives
    C:\dsvm\tools\UnxUtils\usr\local\wbin\tar.exe: Error is not recoverable: exiting now

Any advice?

Thanks in advance.

Gopi - MSFT
Levi Brackman
  • I have a vague memory that windows users have specific requirements to get sparklyr to install. You should do some searching on windows installs of spark. I'm not sure you are yet ready to ask any R questions. – IRTFM Oct 14 '16 at 04:11
  • Thanks. I have spent a huge amount of time searching about this but to no avail. – Levi Brackman Oct 14 '16 at 11:42
  • Then you should cite the material that you were following in efforts to install spark. You don't seem to have `log4j`. When I searched I found this immediately on SO with a Google search: How to set Spark log4j path in standalone mode on windows? – IRTFM Oct 14 '16 at 16:22
  • Would you kindly post the links? – Levi Brackman Oct 14 '16 at 20:50
  • We now have pre-installed Spark standalone in the Microsoft Azure Data Science Virtual Machines (DSVM) - both Windows 2016 edition and the Ubuntu Linux edition. If you spin up a new instance of the DSVM on Azure you should be able to leverage the preinstalled Spark standalone instance for your development. – Gopi - MSFT Oct 09 '17 at 20:51

2 Answers


When I upgraded sparklyr using

devtools::install_github("rstudio/sparklyr") 

the issue went away.

Levi Brackman

    spark_install_tar(tarfile = "path/to/spark_hadoop.tar")

If you still get the error, untar the archive manually and set the `SPARK_HOME` environment variable to point to the untarred spark_hadoop path.
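The "Cannot use compressed or remote archives" message suggests the DSVM's bundled UnxUtils `tar.exe` cannot decompress a `.tgz` in one step with `-z`, so the manual untar can be done in two stages: decompress with `gzip` first, then run a plain `tar -xf`. A minimal sketch of the idea (the `demo/` directory and archive names below are illustrative stand-ins, not the actual sparklyr cache paths):

```shell
# Create a stand-in for a downloaded spark-*.tgz, then show the
# two-stage extraction: gzip decompresses, tar extracts without -z.
mkdir -p demo/spark-pkg demo/out
echo "stub" > demo/spark-pkg/README
tar -czf demo/spark.tgz -C demo spark-pkg   # stand-in for the Spark tarball

gzip -d demo/spark.tgz                      # produces demo/spark.tar
tar -xf demo/spark.tar -C demo/out          # plain extraction, no -z needed
```

On the DSVM the same two commands would be run against the cached `spark-2.0.1-bin-hadoop2.7.tgz`, after which the extracted folder can serve as `SPARK_HOME`.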

Then try executing the following in the R console:

    library(sparklyr)
    sc <- spark_connect(master = "local")

DSBLR