I want to load a file into PySpark locally, but I can't and this error appears. Do you know what I should do?
- That seems to be an environment issue: your Python and PySpark versions are incompatible. Check this link: https://stackoverflow.com/questions/41840296/pyspark-in-ipython-notebook-raises-py4jjavaerror-when-using-count-and-first – Manoj Kumar G Apr 11 '20 at 08:00
- My Python version is 3.7.6 and PySpark is 2.4.5, which are compatible. – M Mostafavi Apr 11 '20 at 09:05
1 Answer
When you load a file via Spark, it also expects a protocol prefix in the path, and by default it assumes HDFS (the Hadoop Distributed File System).
Since you are trying to read a local file, use the file:// scheme:
sc.textFile("file:///path to the file/")
This has already been answered here.
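For completeness, a minimal self-contained sketch of the same approach (the app name and file path below are placeholders, not taken from the question):

    from pyspark.sql import SparkSession

    # Build a local SparkSession; "local[*]" uses all available cores.
    spark = SparkSession.builder.master("local[*]").appName("read-local-file").getOrCreate()
    sc = spark.sparkContext

    # The file:/// scheme tells Spark to read from the local filesystem
    # instead of defaulting to HDFS. Replace the path with your own file.
    rdd = sc.textFile("file:///tmp/friendship-data.txt")

    # textFile is lazy, so run an action to force the read and surface
    # any path or permission errors right away.
    print(rdd.count())

    spark.stop()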

H Roy
- Thanks, you mean something like this: con = sc.textFile("file:///e:/friendship-data.txt")? I did that, but the error still appears! – M Mostafavi Apr 11 '20 at 09:13
- I just tried it on my Mac and it worked; not sure what is going wrong on your Windows box. 20/04/11 14:25:30 WARN SparkContext: Killing executors is not supported by current scheduler. >>> sc.textFile("file:////Users/test/dc2.txt") file:////Users/hireshroy/dc2.txt MapPartitionsRDD[1] at textFile at NativeMethodAccessorImpl.java:0 – H Roy Apr 11 '20 at 09:46
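If it helps, here is a hedged sketch of how to check this on Windows, reusing the sc from the answer above (the drive letter and file name mirror the comment and are illustrative):

    # On Windows, keep the drive letter in the URI and use forward slashes.
    con = sc.textFile("file:///e:/friendship-data.txt")

    # textFile is lazy: nothing is read until an action runs. first()
    # forces the read, so a missing file or a Windows-specific Hadoop
    # problem (for example a missing winutils.exe) will surface here
    # instead of passing silently.
    print(con.first())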