This is an old issue, and I have solved it by following the answer in this post: How can I access S3/S3n from a local Hadoop 2.6 installation?
The answer from Kamil Sindi works for me: adding the packages via the --packages option of spark-shell:
spark-shell --packages com.amazonaws:aws-java-sdk:1.11.967,org.apache.hadoop:hadoop-aws:3.2.0
When I run the commands below, it works:
scala> sc.hadoopConfiguration.set("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
scala> sc.textFile("s3://test/testdata.txt").foreach(println)
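If the AWS credentials aren't already picked up from the environment or an instance profile, they can be set the same way; the property names below are the standard S3A ones, and the values are placeholders:
scala> sc.hadoopConfiguration.set("fs.s3a.access.key", "<your-access-key>")
scala> sc.hadoopConfiguration.set("fs.s3a.secret.key", "<your-secret-key>")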
But when I add jars as below:
spark-shell --jars /tmp/hadoop-aws-3.2.0.jar , /tmp/aws-java-sdk-1.11.967.jar
the following error is thrown:
java.lang.NoClassDefFoundError: com/amazonaws/AmazonServiceException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
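(For reference, --jars expects a comma-separated list with no spaces around the comma, so the invocation would normally be written as:
spark-shell --jars /tmp/hadoop-aws-3.2.0.jar,/tmp/aws-java-sdk-1.11.967.jar
I still see the question of why the class isn't found.)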
Can anyone tell me why adding the jars doesn't work? How can I solve this issue by adding jars, as the other answers suggest?
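One guess worth checking (unverified): if I understand correctly, hadoop-aws 3.x is built against the shaded aws-java-sdk-bundle artifact, and the thin top-level aws-java-sdk jar contains few or no classes itself (the classes live in the per-service module jars, which --packages resolves transitively but --jars does not). If that's the cause, something like the following sketch should work, assuming the bundle jar has been downloaded to /tmp (1.11.375 is the version hadoop-aws 3.2.0 declares as its dependency):
spark-shell --jars /tmp/hadoop-aws-3.2.0.jar,/tmp/aws-java-sdk-bundle-1.11.375.jar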