I am new to Spark and Scala. I have the following requirement: I need to process all the files under a path that has subdirectories. I guess I need to write for-loop logic to process all of those files.
Below is an example of my directory layout:
src/proj_fldr/dataset1/20170624/file1.txt
src/proj_fldr/dataset1/20170624/file2.txt
src/proj_fldr/dataset1/20170624/file3.txt
src/proj_fldr/dataset1/20170625/file1.txt
src/proj_fldr/dataset1/20170625/file2.txt
src/proj_fldr/dataset1/20170625/file3.txt
src/proj_fldr/dataset1/20170626/file1.txt
src/proj_fldr/dataset1/20170626/file2.txt
src/proj_fldr/dataset1/20170626/file3.txt
src/proj_fldr/dataset2/20170624/file1.txt
src/proj_fldr/dataset2/20170624/file2.txt
src/proj_fldr/dataset2/20170624/file3.txt
src/proj_fldr/dataset2/20170625/file1.txt
src/proj_fldr/dataset2/20170625/file2.txt
src/proj_fldr/dataset2/20170625/file3.txt
src/proj_fldr/dataset2/20170626/file1.txt
src/proj_fldr/dataset2/20170626/file2.txt
src/proj_fldr/dataset2/20170626/file3.txt
I need the code to iterate over the files in nested order, like:

in src
  loop (proj_fldr
    loop (dataset
      loop (datefolder
        loop (file1, then file2....))))
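Something like the rough sketch below is the direction I am thinking: use the Hadoop FileSystem API to recursively list every file under the base path, then sort the paths so they come back in proj_fldr -> dataset -> datefolder -> file order (lexicographic sort happens to match that order for these names). The base path and the per-file processing step are just placeholders for my real values.

import scala.collection.mutable.ArrayBuffer
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

object ProcessAllFiles {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ProcessAllFiles").getOrCreate()

    // placeholder: my real base path goes here
    val basePath = new Path("src/proj_fldr")
    val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)

    // listFiles with recursive = true walks all subdirectories
    val iter = fs.listFiles(basePath, true)
    val files = ArrayBuffer.empty[String]
    while (iter.hasNext) {
      val status = iter.next()
      if (status.isFile) files += status.getPath.toString
    }

    // lexicographic sort gives dataset -> date -> file order
    files.sorted.foreach { path =>
      val df = spark.read.text(path)
      // placeholder: real per-file processing goes here
      println(s"processing $path: ${df.count()} lines")
    }

    spark.stop()
  }
}

Is looping file by file like this the right approach, or is there a better Spark idiom, for example reading everything at once with a wildcard path like spark.read.text("src/proj_fldr/*/*/*.txt")?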