OTC_omega_20210302.csv
CH_delta_20210302.csv
MD_omega_20210310.csv
CD_delta_20210310.csv
import org.apache.hadoop.fs.{FileSystem, Path}

val hdfsPath = "/development/staging/abcd-efgh"
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
// List only the plain files (no directories) in the staging directory
val files = fs.listStatus(new Path(hdfsPath)).filterNot(_.isDirectory).map(_.getPath)
// Current attempt at the pattern
val regX = "OTC_*[0-9].csv|CH_*[0-9].csv".r
val filteredFiles = files.filter(fName => regX.findFirstMatchIn(fName.getName).isDefined)
What regex do I need to match any file name that starts with either OTC_ or CH_ and ends with YYYYMMDD.csv?
For the files listed above, I need two results: OTC_omega_20210302.csv and CH_delta_20210302.csv.
Please help.
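
One possible approach, offered as a sketch rather than a definitive answer: in the attempted pattern, `_*` means "zero or more underscores" and `[0-9]` matches a single digit, so `OTC_*[0-9]` requires a digit directly after the prefix, and none of the listed names satisfy that. An anchored pattern with an explicit prefix group, a wildcard middle, and exactly eight digits before a literal `.csv` should behave as described. The object name `FileNameFilter`, the helper `matches`, and `sampleNames` are hypothetical names introduced here for illustration; the snippet runs on plain strings so it can be tried without Spark or HDFS.

```scala
object FileNameFilter {
  // ^(OTC|CH)_  : name must start with OTC_ or CH_
  // .*          : anything in between
  // \d{8}\.csv$ : exactly eight digits right before a literal ".csv" at the end
  // Note: \d{8} only checks for eight digits; it does not validate the date itself.
  val pattern = """^(OTC|CH)_.*\d{8}\.csv$""".r

  // Hypothetical helper: true when the file name fits the pattern.
  def matches(name: String): Boolean = pattern.findFirstIn(name).isDefined

  // The four file names from the question.
  val sampleNames = Seq(
    "OTC_omega_20210302.csv",
    "CH_delta_20210302.csv",
    "MD_omega_20210310.csv",
    "CD_delta_20210310.csv"
  )

  def main(args: Array[String]): Unit = {
    // Prints OTC_omega_20210302.csv and CH_delta_20210302.csv
    sampleNames.filter(matches).foreach(println)
  }
}
```

Assuming the `files` array of Hadoop `Path` objects from the snippet above, the same check could be applied as `files.filter(p => FileNameFilter.matches(p.getName))`.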