I want to load a file into PySpark locally, but I can't and this error appears. Do you know what I should do?
- That seems to be an environment issue: your Python and PySpark versions are incompatible. Check this link: https://stackoverflow.com/questions/41840296/pyspark-in-ipython-notebook-raises-py4jjavaerror-when-using-count-and-first – Manoj Kumar G Apr 11 '20 at 08:00
- My Python version is 3.7.6 and PySpark is 2.4.5, which are compatible. – M Mostafavi Apr 11 '20 at 09:05
1 Answer
When you load a file via Spark, it also expects a protocol prefix in the path, and by default it assumes HDFS (the Hadoop Distributed File System).
Since you are trying to read a local file, use the file:// scheme:
sc.textFile("file:///path to the file/")
This has already been answered here.
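For completeness, a minimal self-contained sketch of the same approach (the app name and file path below are placeholders, not taken from the question):

    from pyspark.sql import SparkSession

    # Build a local SparkSession; "local[*]" uses all available cores.
    spark = SparkSession.builder.master("local[*]").appName("read-local-file").getOrCreate()
    sc = spark.sparkContext

    # The file:/// scheme tells Spark to read from the local filesystem
    # instead of defaulting to HDFS. Replace the path with your own file.
    rdd = sc.textFile("file:///tmp/friendship-data.txt")

    # textFile is lazy, so run an action to force the read and surface
    # any path or permission errors right away.
    print(rdd.count())

    spark.stop()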

H Roy
- Thanks, you mean something like this: con = sc.textFile("file:///e:/friendship-data.txt")? I did that, but the error still appears! – M Mostafavi Apr 11 '20 at 09:13
- I just tried it on my Mac and it worked; not sure what is going wrong on your Windows box. 20/04/11 14:25:30 WARN SparkContext: Killing executors is not supported by current scheduler. >>> sc.textFile("file:////Users/test/dc2.txt") file:////Users/hireshroy/dc2.txt MapPartitionsRDD[1] at textFile at NativeMethodAccessorImpl.java:0 – H Roy Apr 11 '20 at 09:46
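If it helps, here is a hedged sketch of how to check this on Windows, reusing the sc from the answer above (the drive letter and file name mirror the comment and are illustrative):

    # On Windows, keep the drive letter in the URI and use forward slashes.
    con = sc.textFile("file:///e:/friendship-data.txt")

    # textFile is lazy: nothing is read until an action runs. first()
    # forces the read, so a missing file or a Windows-specific Hadoop
    # problem (for example a missing winutils.exe) will surface here
    # instead of passing silently.
    print(con.first())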