I'm using AWS EMR at work. If I launch a Spark shell I can run Scala commands, but I can't read in a local file.

For example:

scala> val citi = spark.read.textFile("CitiGroup2006")
org.apache.spark.sql.AnalysisException: Path does not exist: hdfs://ip-10-99-99-99.ec2.internal:8020/user/hadoop/CitiGroup2006;

I tried entering the full path of the file, but I get the same error. The file is in the same directory where I launched the Spark shell. Loading a Scala file, however, does work:

:load hello.scala

Why does ":load" work but spark.read.textFile does not?

Chuck

1 Answer

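I'm not very strong on Scala, but it looks like spark.read.textFile reads from HDFS, and I guess your file is on the EMR node's local filesystem. The error message points the same way: the relative path was resolved to hdfs://.../user/hadoop/CitiGroup2006. That would also explain why :load works, since it is handled by the Scala REPL, which reads from the local filesystem.
You can list the files on HDFS with the command:
$ hdfs dfs -ls
and copy files with -put. Check out "hadoop copy a local file system folder to HDFS" and the hadoop-common FileSystemShell documentation.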
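
If you would rather read the file straight from the local disk, one option is to give Spark an explicit file:// URI so the path is not resolved against HDFS. The following is only a sketch, not tested on your cluster. It assumes it is typed into spark-shell (where `spark` already exists) and that CitiGroup2006 sits in the directory the shell was started from. A bare full path such as /home/hadoop/CitiGroup2006 (a hypothetical location) would still be looked up on HDFS, which probably explains why the full path did not help.

```scala
// Run inside spark-shell, where `spark` (a SparkSession) is already defined.
import java.nio.file.Paths

// Build a file:// URI for the file in the directory where spark-shell was started.
val localUri = Paths.get("CitiGroup2006").toAbsolutePath.toUri.toString

// The explicit file:// scheme makes Spark use the local filesystem
// instead of the cluster default (HDFS in the error message above).
// Caveat: on a multi-node cluster the file must exist at the same path
// on every node that runs tasks, not only on the master.
val citi = spark.read.textFile(localUri)

// Alternative: after running `hdfs dfs -put CitiGroup2006 /user/hadoop/`
// from the OS shell, the original relative path should resolve on HDFS.
// val citiFromHdfs = spark.read.textFile("CitiGroup2006")
```

For anything beyond a quick test, copying the file to HDFS with -put is likely the safer choice, because every executor can then see it.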

Alone Bashan