The FTP server has the following files created on a daily basis.

  • FGI_WTYUIO_D_2016_04_16_BS.daily.gzip - BS File
  • FGI_WTYUIO_D_2016_04_16_BV.daily.gzip - BV File
  • FGI_GHJK_D_2016_04_16_SATB3.daily.gzip - B3 File
  • FKI_GHJK_D_2016_04_16_SAT.daily.gzip - BV File
  • FKI_GHJK_D_2016_04_16_SATB3.daily.gzip - B3 File
  • FKI_GHJK_D_2016_04_16_SATBS.daily.gzip - BS File
  • FKI_GHJK_D_2016_04_16_SSD.daily.gzip - Need to Ignore
  • FKI_GHJK_D_2016_04_16_SSDBS.daily.gzip - Need to Ignore

So, basically, there are two file types:

  • FGI
  • FKI

and three report codes for each file type:

  • BS
  • BV
  • B3

I need to ignore the rest of the files (the SSD files).

I need to write a regex pattern in JavaScript to fetch these files, using the following variables:

  • fileDate - the date, e.g. 2016_04_16
  • matchReportCode - the report codes to match, e.g. BV, BS, B3

So, if fileDate = 2016_04_15 and matchReportCode = 'SV' (BS, BV), then I should fetch only the following files:

  • FGI_WTYUIO_D_2016_04_15_BS.daily.gzip - FGI BS File
  • FGI_WTYUIO_D_2016_04_15_BV.daily.gzip - FGI BV File
  • FKI_GHJK_D_2016_04_16_SAT.daily.gzip - FKI BV File
  • FKI_GHJK_D_2016_04_16_SATBS.daily.gzip - FKI BS File

So, if fileDate = 2016_04_19 and matchReportCode = '3S' (B3, BS), then I should fetch only the following files:

  • FGI_WTYUIO_D_2016_04_15_BS.daily.gzip - FGI BS File
  • FGI_GHJK_D_2016_04_16_SATB3.daily.gzip - FGI B3 File
  • FKI_GHJK_D_2016_04_16_SATB3.daily.gzip - FKI B3 File
  • FKI_GHJK_D_2016_04_16_SATBS.daily.gzip - FKI BS File

So far, I could only come up with this:

FileRegex = "F[KG]I_.*_D_" + fileDate + "_[A-z]{0,3}L{0,1}[" + matchReportCode + "]{0,1}.daily.gzip";

Can someone please help? I am new to regex. Thanks.

Laurel
jigarshah
  • Corrected regex: FileRegex = "F[KG]I_.*_D_" + fileDate + "_[A-z]{0,3}B{0,1}[" + matchReportCode + "]{0,1}.daily.gzip"; – jigarshah May 02 '16 at 03:27
  • [`[A-z]` matches more than you think](http://stackoverflow.com/questions/29771901/why-is-this-regex-allowing-a-caret/29771926#29771926). – Wiktor Stribiżew May 02 '16 at 12:50

2 Answers

The following may be a bit better:

FileRegex = "F[KG]I_[^_]+_D_" + fileDate + "_(?!SSD)[a-zA-Z]{0,3}((B[" + matchReportCode + "])|(?<=^FKI.*))\\.daily\\.gzip";

This matches FKI and FGI file names that have the chosen fileDate and up to three letters preceding the chosen report code.

The other changes: [A-z] became [a-zA-Z], because a character class range is evaluated by character code, and the ASCII characters between Z and a ([, \, ], ^, _ and `) are not letters, which appears to be what you intended to match.

Also, .*_ became [^_]+_. This requires at least one character other than an underscore, reduces how much the engine has to backtrack, and makes the regex easier to edit if another segment is added.

I also added a negative lookahead, (?!SSD), at the start of the last segment, which requires that the segment not start with SSD.

The alternation at the end, ((B[" + matchReportCode + "])|(?<=^FKI.*)), requires either that the file match the report code or that the file name start with FKI (followed by any number of characters to get back to the current position). Outside of a character class ([...]), ^ is the start-of-line anchor. Note that the lookbehind (?<=...) needs an engine that supports it; in JavaScript that means ES2018 or later. The dots in .daily.gzip are escaped (\\. inside the string literal) so that they match only a literal dot.
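To check the points above, here is a minimal sketch that runs this kind of pattern over the sample names from the question. The helper name buildFileRegex and the sampleFiles array are hypothetical, introduced only for illustration, and the lookbehind assumes an ES2018-capable engine such as a recent Node.js.

```javascript
// [A-z] covers character codes 65-122, which also includes [ \ ] ^ _ and `.
const looseMatchesUnderscore = /^[A-z]$/.test('_');
const strictMatchesUnderscore = /^[a-zA-Z]$/.test('_');

// Hypothetical helper (not from the question) wrapping the pattern.
// Assumption: the engine supports lookbehind (ES2018 or later).
function buildFileRegex(fileDate, matchReportCode) {
  return new RegExp(
    'F[KG]I_[^_]+_D_' + fileDate +
    '_(?!SSD)[a-zA-Z]{0,3}((B[' + matchReportCode + '])|(?<=^FKI.*))\\.daily\\.gzip'
  );
}

// Sample names copied from the question.
const sampleFiles = [
  'FGI_WTYUIO_D_2016_04_16_BS.daily.gzip',
  'FGI_WTYUIO_D_2016_04_16_BV.daily.gzip',
  'FGI_GHJK_D_2016_04_16_SATB3.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SAT.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SATB3.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SATBS.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SSD.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SSDBS.daily.gzip'
];

// Keep only the names that match for the BS and BV codes.
const svRegex = buildFileRegex('2016_04_16', 'SV');
const fetched = sampleFiles.filter(function (name) {
  return svRegex.test(name);
});

console.log(looseMatchesUnderscore, strictMatchesUnderscore); // true false
console.log(fetched); // the FGI BS, FGI BV, FKI SAT and FKI SATBS files
```

One thing to watch: because the FKI alternative is not tied to the report code, FKI_GHJK_D_2016_04_16_SAT.daily.gzip also matches when matchReportCode is '3S', so that case may need an extra check.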

Emma Talbert
  • How do I ignore the SSD files? The regex will copy those as well, right? Also, I don't have a BV code in the FKI type: FKI_GHJK_D_2016_04_16_SAT.daily.gzip – jigarshah May 02 '16 at 04:03
  • See my updated answer. I wasn't sure if the missing report code was a typo, and I can't comment on other posts to clarify because I only have like 10 rep at the moment. – Emma Talbert May 02 '16 at 04:37

You can use a negative lookahead:

var fileDate = '2016_04_19';
var matchReportCode = 'BS';
var re = new RegExp('F[KG]I_\\w+_D_' + fileDate + 
  '_(?!SSD)[\\w\\d]*' + matchReportCode + '?\\.daily\\.gzip');

// re.test('FKI_GHJK_D_2016_04_19_SSDBS.daily.gzip') === false
// re.test('FKI_GHJK_D_2016_04_19_SATBS.daily.gzip') === true
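If matchReportCode holds several code letters, as in the question ('3S' for B3 and BS), the same lookahead idea can be combined with a character class. This is only a sketch under that assumption: the helper name buildLookaheadRegex and the listing array are hypothetical, the pattern is anchored with ^ and $, and matchReportCode is assumed to contain only letters and digits.

```javascript
// Hypothetical helper: matchReportCode is assumed to hold the second
// character of each wanted code, e.g. '3S' for B3 and BS.
function buildLookaheadRegex(fileDate, matchReportCode) {
  return new RegExp(
    '^F[KG]I_\\w+_D_' + fileDate +
    '_(?!SSD)[A-Za-z]*B[' + matchReportCode + ']\\.daily\\.gzip$'
  );
}

// Sample names copied from the question.
var listing = [
  'FGI_WTYUIO_D_2016_04_16_BS.daily.gzip',
  'FGI_WTYUIO_D_2016_04_16_BV.daily.gzip',
  'FGI_GHJK_D_2016_04_16_SATB3.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SAT.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SATB3.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SATBS.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SSD.daily.gzip',
  'FKI_GHJK_D_2016_04_16_SSDBS.daily.gzip'
];

// Keep the B3 and BS files; the SSD files are rejected by the lookahead.
var b3bsRegex = buildLookaheadRegex('2016_04_16', '3S');
var wanted = listing.filter(function (name) {
  return b3bsRegex.test(name);
});

console.log(wanted);
// the FGI BS, FGI SATB3, FKI SATB3 and FKI SATBS files
```

This sketch does not cover the FKI BV file, whose name ends in _SAT with no code suffix; that case would need its own alternative.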
Ruslan Osmanov