Good morning, it maybe sounds like a stupid question, but I would like to access a temporary table in Spark by RStudio. I don't have any Spark cluster, and I only run everything local on my PC. When I start Spark through IntelliJ, the instance is running fine:
17/11/11 10:11:33 INFO Utils: Successfully started service 'sparkDriver' on port 59505.
17/11/11 10:11:33 INFO SparkEnv: Registering MapOutputTracker
17/11/11 10:11:33 INFO SparkEnv: Registering BlockManagerMaster
17/11/11 10:11:33 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/11/11 10:11:33 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/11/11 10:11:33 INFO DiskBlockManager: Created local directory at C:\Users\stephan\AppData\Local\Temp\blockmgr-7ca4e8fb-9456-4063-bc6d-39324d7dad4c
17/11/11 10:11:33 INFO MemoryStore: MemoryStore started with capacity 898.5 MB
17/11/11 10:11:33 INFO SparkEnv: Registering OutputCommitCoordinator
17/11/11 10:11:33 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/11/11 10:11:34 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://172.25.240.1:4040
17/11/11 10:11:34 INFO Executor: Starting executor ID driver on host localhost
17/11/11 10:11:34 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 59516.
17/11/11 10:11:34 INFO NettyBlockTransferService: Server created on 172.25.240.1:59516
But I am not sure about the port, I have to choose in RStudio/sparklyr:
sc <- spark_connect(master = "spark://localhost:7077", spark_home = "C://Users//stephan//Downloads//spark//spark-2.2.0-bin-hadoop2.7", version = "2.2.0")
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") :
cannot open file 'C:\Users\stephan\AppData\Local\Temp\Rtmp61Ejow\file2fa024ce51af_spark.log': Permission denied
I tried different ports, like 59516, 4040, ... but all led to the same result. The permission denied message I guess can be ignored due that the file is written fine:
17/11/11 01:07:30 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master localhost:7077
Can please anyone assist me, how I can establish a connection between a local running Spark and RStudio, but without that RStudio is running another Spark instance?
Thanks Stephan