2

I am unable to read a local CSV file in my Spark program. I am using the PyCharm IDE. Reading works when I pass the file path as a positional command-line argument (`sys.argv[1]`), but not when I hard-code the file location. Can someone please help?

Code:
    # Processing logic here...
    flightTimeCsvDF = spark.read \
        .format("csv") \
        .option("header", "true") \
        .load("data/flight*.csv")
        # .load(sys.argv[1])


Error:
Exception in thread "globPath-ForkJoinPool-1-worker-1" java.lang.UnsatisfiedLinkError: 'boolean org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(java.lang.String, int)'
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
    at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:793)
    at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1218)
    at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1423)
    at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:601)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1972)
    at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2014)
    at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:761)
    at org.apache.hadoop.fs.Globber.listStatus(Globber.java:128)

enter image description here

Piyush
[This answer](https://stackoverflow.com/questions/41851066/exception-in-thread-main-java-lang-unsatisfiedlinkerror-org-apache-hadoop-io) should help you out! – vilalabinot Aug 17 '22 at 07:09

2 Answers

0

Please use the absolute path. From the image attached, I believe using the following will help solve the issue.

.load("C:\\Users\\psultania\\Anaconda3\\envs\\04-SparkSchemaDemo\\data\\flight*.csv")

If your input CSVs are in a different directory, change the path accordingly.
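If you would rather not hard-code a machine-specific path, here is a minimal sketch of building the absolute path at runtime. It assumes the `data` folder sits under the directory the script is run from; `absolute_glob` is a hypothetical helper written for this illustration, not part of the Spark API.

```python
from pathlib import Path


def absolute_glob(pattern: str, base_dir: str = ".") -> str:
    """Return `pattern` anchored at the absolute form of `base_dir`."""
    # Only the base directory is resolved. The wildcard stays untouched in
    # the returned string so that Spark can expand it itself.
    return str(Path(base_dir).resolve() / pattern)


# Assumption: the script runs from the project root that contains `data/`.
csv_glob = absolute_glob("data/flight*.csv")
print(csv_glob)

# Hypothetical usage with the reader from the question:
# flightTimeCsvDF = spark.read.format("csv").option("header", "true").load(csv_glob)
```

In PyCharm the working directory comes from the run configuration, so it may be worth printing `csv_glob` once to confirm it points where you expect.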

Hari Palappetty
-2

Yes, it works when using an absolute path.