I have Apache Flink job for parsing csv-files which works fine in from IntelliJ IDEA on Windows. But when I put my job (jar) in docker-container Apache Flink i have problems with permisson to file with class FileSource.forRecordStreamFormat(...)
. Inside the container i have file: /opt/flink/data/test2.csv
. The permissions are ok (I can even changed file from my job). For fileName
I used /opt/flink/data/test2.csv
, //opt/flink/data/test2.csv
, ///opt/flink/data/test2.csv
.
Permissions:
# pwd
/opt/flink/data
# ls -ls
total 16088
1204 -rwxrwxrwx 1 root root 1231979 Jan 24 15:54 test2.csv
14876 -rwxrwxrwx 1 root root 15231523 Jan 22 19:24 test3.csv
8 -rwxrwxrwx 1 root root 6623 Jan 24 14:32 test_Home.xlsx
Docker-compose:
version: "2.2"
services:
jobmanager:
image: flink:1.16-java8
ports:
- "8081:8081"
command: jobmanager
environment:
- |
FLINK_PROPERTIES=
jobmanager.rpc.address: jobmanager
volumes:
- /c/Users/MGubina/Desktop/data:/opt/flink/data
taskmanager:
image: flink:1.16-java8
depends_on:
- jobmanager
command: taskmanager
scale: 1
environment:
- |
FLINK_PROPERTIES=
jobmanager.rpc.address: jobmanager
taskmanager.numberOfTaskSlots: 2
Part of job code:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
CsvReaderFormat<Product> csvFormat = CsvReaderFormat.forPojo(Product.class);
FileSource<Product> csvSource =
// FileSource.forRecordStreamFormat(csvFormat, Path.fromLocalFile(file)).build(); // firsrt version
FileSource.forRecordStreamFormat(csvFormat, new Path(fileName)).build(); // second version
DataStream<Product> csvInputStream = env.fromSource(csvSource, WatermarkStrategy.noWatermarks(), "csv-source");
...
Logs with exception:
Caused by: java.io.FileNotFoundException: File file:/opt/flink/data/test2.csv does not exist or the user running Flink ('flink') has insufficient permissions to access it.
at org.apache.flink.core.fs.local.LocalFileSystem.getFileStatus(LocalFileSystem.java:106)
at org.apache.flink.connector.file.src.impl.StreamFormatAdapter.openStream(StreamFormatAdapter.java:157)
at org.apache.flink.connector.file.src.impl.StreamFormatAdapter.createReader(StreamFormatAdapter.java:70)
at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.checkSplitOrStartNext(FileSourceSplitReader.java:112)
at org.apache.flink.connector.file.src.impl.FileSourceSplitReader.fetch(FileSourceSplitReader.java:65)
at org.apache.flink.connector.base.source.reader.fetcher.FetchTask.run(FetchTask.java:58)
at org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:142)
I tried to use different ways of getting Path, but no luck this way.
As long as I have in exception File file:/opt/flink/data/test2.csv does not exist
I think that problem might be that in local fyle system in Docker (Unix-like)is needed path like file:///
.
What can I do? Maybe I miss something?