I'm using the spark-worker container, which is based on the spark-base container.
How can I solve this exception:
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/README.md
Main.java
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;
import scala.Tuple2;

SparkContext context = new SparkContext(
        new SparkConf()
                .setAppName("Test App")
                .setMaster("spark://spark-master:7077")
                .set("spark.executor.memory", "1g")
                .setJars(new String[] { "target/spark-docker-1.0-SNAPSHOT.jar" })
);

String path = "file:///README.md";

// EXCEPTION HERE!!!
List<Tuple2<String, Integer>> output = context.textFile(path, 2)
...
My Docker containers do not set up HDFS, so I hoped Spark would read from the local file system of each spark-worker. On each worker I did:
shell> docker exec -it spark-worker-# bash
shell> touch README.md
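As far as I can tell, file:///README.md resolves to the absolute path /README.md, and with a file:// URL Spark expects the file at that same path both on the machine running the driver (where Main.java runs) and on every worker; the InvalidInputException seems to be raised when the driver lists the input paths. Also, touch README.md creates the file in the shell's current working directory, which may not be /. A quick sanity check (a sketch, using the container names from docker-compose.yml below):

shell> docker exec spark-worker-1 ls -l /README.md
shell> docker exec spark-worker-2 ls -l /README.md
shell> ls -l /README.md   # on the machine that runs Main.java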
docker-compose.yml
# No HDFS or file system configurations!
version: '3.3'
services:
  spark-master:
    image: bde2020/spark-master
    container_name: spark-master
    ports: ['8080:8080', '7077:7077', '6066:6066']
  spark-worker-1:
    image: bde2020/spark-worker
    container_name: spark-worker-1
    ports: ['8082:8081']
    depends_on:
      - spark-master
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"
  spark-worker-2:
    image: bde2020/spark-worker
    container_name: spark-worker-2
    ports: ['8083:8081']
    depends_on:
      - spark-master
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"