
I have this code:

  import org.apache.spark.sql.SparkSession

  def main(args: Array[String]): Unit = {

    // Creation of the Spark session
    implicit val sparkSession = SparkSession.builder().appName("seco_trial").getOrCreate()
    println("test")

    val df = sparkSession.read.format("csv").option("header", "true").load("D:/Userfiles/mir/Downloads/extract_sec_mav2.csv")

    println("test2")
  }

The line where I create `val df` gives me the following error:

  2018-09-17 14:27:24 WARN FileStreamSink:66 - Error while looking for metadata directory.
  Exception in thread "main" java.io.IOException: No FileSystem for scheme: D
      at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
      at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
      [...]

Let's rule out a wrong path as the problem; my file correctly has the .csv extension.

Can you please tell me what the problem is? Why can't I load the file?

kaileena
  • option("header", "true") value should given as boolean value not string value. Use option("header", true) instead – Praveen L Sep 17 '18 at 12:36
  • Don't know if it's just a c&p error, but the main method isn't closed off with a closing `}` in your code example. – James Whiteley Sep 17 '18 at 12:48
  • I will put that, but the error remains :( – kaileena Sep 17 '18 at 12:50
  • Add a master when creating the SparkSession. For a local run you can take all available cores like this: `implicit val sparkSession = SparkSession.builder().appName("seco_trial").master("local[*]").getOrCreate()` (see the sketch after these comments). – Praveen L Sep 17 '18 at 13:36
  • The only exception you have listed is related to the path. Your exception is telling you it cannot find a filesystem for the path `D:/...`. What other problems do you have? – jacks Sep 17 '18 at 13:49
  • Does this help you - https://stackoverflow.com/questions/29704333/spark-load-csv-file-as-dataframe – jacks Sep 17 '18 at 13:52
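
Pulling Praveen L's suggestion out of the comments into a runnable shape, a minimal sketch of the session setup for a local run might look like this (the app name is taken from the question; `local[*]` uses all available cores):

  import org.apache.spark.sql.SparkSession

  // Sketch per Praveen L's comment: an explicit master is needed when the
  // app is run directly (e.g. from an IDE) rather than through spark-submit,
  // which would otherwise supply one.
  implicit val sparkSession = SparkSession.builder()
    .appName("seco_trial")
    .master("local[*]")
    .getOrCreate()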

1 Answer


I am not sure why, but I have to add "file:///" in front of the path.
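
A sketch of the read with the scheme made explicit (the path is the one from the question); `file:///` tells Hadoop to use the local filesystem rather than parsing the drive letter `D` as a filesystem scheme:

  // With an explicit file:/// scheme, Hadoop resolves the path against the
  // local filesystem instead of treating the drive letter "D" as a scheme.
  val df = sparkSession.read
    .format("csv")
    .option("header", "true")
    .load("file:///D:/Userfiles/mir/Downloads/extract_sec_mav2.csv")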

kaileena
  • file:/// lets Spark know to read the data from the local filesystem. If you don't want to specify file://, change the 'fs.defaultFS' value to 'file:///'; then by default Spark will look up data on the local filesystem (see the sketch after these comments). – Lakshman Battini Sep 18 '18 at 00:38
  • Can you please rephrase a bit? I think I kind of understand, but I'm not sure. – kaileena Sep 18 '18 at 08:37
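
A sketch of the alternative Lakshman Battini describes: setting `fs.defaultFS` on the session's Hadoop configuration so that paths without an explicit scheme resolve against the local filesystem (setting it programmatically is one option; it can also go in core-site.xml). Whether this helps with a Windows drive-letter path depends on how your Hadoop version parses `D:/...`:

  // Make the local filesystem the default for paths without a scheme,
  // so Spark looks them up under file:// rather than HDFS.
  sparkSession.sparkContext.hadoopConfiguration.set("fs.defaultFS", "file:///")

  val df = sparkSession.read
    .format("csv")
    .option("header", "true")
    .load("D:/Userfiles/mir/Downloads/extract_sec_mav2.csv")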