
Java version 1.8, Spark version 2.11

I want to read some data from the directory ./datas in my Spark project:


val sparkConf = new SparkConf().setMaster("local").setAppName("WordCount")
val sc = new SparkContext(sparkConf)

val lines = sc.textFile("datas")   // datas is a directory that contains some .txt files

It fails with this error:

Error while running command to get file permissions : java.io.IOException: (null) entry in command string

This seems to say that I have no permission to read the files in the directory ./datas. But when I change the code to read a specific file in ./datas:

val lines = sc.textFile("datas/1.txt")

it works.

How can I read all the files in ./datas using sc.textFile("datas")?

  • Does this answer your question? [(null) entry in command string exception in saveAsTextFile() on Pyspark](https://stackoverflow.com/questions/40764807/null-entry-in-command-string-exception-in-saveastextfile-on-pyspark) – mck Feb 06 '21 at 08:49

1 Answer


I found the solution:

Passing the bare directory name to textFile failed in my environment (although textFile is documented to accept directories), so I changed the path to a glob that matches all files in the directory: (your path)/*.

For example:

val lines = sc.textFile("datas/*")

It works.
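
For completeness, here is a minimal, self-contained sketch of the same idea as a full program. It assumes Spark is on the classpath and that a datas directory with .txt files exists next to the working directory. The object name ReadAllFiles and the helper globFor are hypothetical names introduced only for this illustration; they are not part of Spark.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReadAllFiles {
  // Hypothetical helper (not a Spark API): builds a glob that matches
  // every file in `dir`, optionally restricted to one extension.
  def globFor(dir: String, ext: String = ""): String =
    s"${dir.stripSuffix("/")}/*$ext"

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setMaster("local").setAppName("WordCount")
    val sc = new SparkContext(sparkConf)
    try {
      // Assumption: ./datas exists and contains .txt files.
      val lines = sc.textFile(globFor("datas", ".txt"))
      println(s"Read ${lines.count()} lines")
    } finally {
      sc.stop() // release the local Spark context
    }
  }
}
```

Restricting the glob to *.txt is optional; globFor("datas") produces the same datas/* path used above.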

  • Technically, you were telling Spark to read a directory called datas; by adding /* to it, you are now telling it to read all files inside the datas directory, which is the expected behaviour. – Vikas Saxena Feb 06 '21 at 08:57