
I'm writing a node.js script to group tons of screenshots.
I have got two different patterns that I want to match:

/(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}_\d{2}_\d{2})(-| - )(?<window>.*?)(-| - )(?<index>\d{6})(?<extension>\.(png|jpg|jpeg))/g
/(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}_\d{2}_\d{2})(-| - )(?<window>.*?)(?<extension>\.(png|jpg|jpeg))/g
  1. '2017-08-31 18_57_42-shouldwork.png' matches 2nd as expected
  2. '2017-08-31 18_57_43-shouldwork.png' does not match either (should match 2nd)
  3. '2017-08-31 18_57_42-shouldwork - Kopie.png' matches 2nd as expected
  4. '2017-08-31 18_57_42-shouldwork2.png' does not match either (should match 2nd)
  5. '2019-03-09 11_11_09 - shouldwork - 000003.png' matches 1st as expected
  6. '2019-03-09 11_11_10 - shouldwork - 000003.png' matches 2nd (should match 1st)
  7. 'should fail.png' does not match either as expected

Here is also a fiddle where you can see it with my code (reduced to the problematic parts): https://jsfiddle.net/sfwr750n/
And here is a link to regex101: https://regex101.com/r/dxGFNN/1

At first I thought it was just Node.js, but Chrome has the same problem. (I didn't try Firefox; last time I checked, it didn't support named groups.) Even more confusing is the fact that regex101 matches everything as expected.

Ekusu
  • `(?`: did you mean to use a non-capturing group `(?:`? Because right now this is just bizarre syntax. – VLAZ Mar 09 '19 at 13:27
  • *"Here is also a fiddle where you can see it with my code (reduced to the problematic parts)"* I recommend using **on-site** Stack Snippets, not off-site resources. [Here's how to do one](https://meta.stackoverflow.com/questions/358992/). – T.J. Crowder Mar 09 '19 at 13:28
  • No, those are named groups. – Ekusu Mar 09 '19 at 13:28
  • @VLAZ - Those are [*named* capture groups](https://github.com/tc39/proposal-regexp-named-groups), a relatively-recent addition to JavaScript's regular expressions (ES2018). – T.J. Crowder Mar 09 '19 at 13:28
  • OK, I definitely didn't know about them. So the next question is, does Node.JS know about them? Are they currently supported in your version? – VLAZ Mar 09 '19 at 13:29
  • @VLAZ - If it didn't, the OP would be getting a syntax error. – T.J. Crowder Mar 09 '19 at 13:30
  • @VLAZ Yes, they work, but matching seems to fail arbitrarily, not only in Node but in Chrome too. – Ekusu Mar 09 '19 at 13:30
  • What's even weirder is that '2017-08-31 18_57_42-shouldwork.png' works but '2017-08-31 18_57_43-shouldwork.png' fails; only one digit changed ('_42' -> '_43'). – Ekusu Mar 09 '19 at 13:32
  • Well, regex101 didn't really understand them and removing the named part made the regex work. So, I have no idea how Node.js would handle them. Light googling seems to imply they aren't concretely officially supported in Chrome, so I suppose they might be buggy. *Should* they work all the time? If you're using an experimental feature, then I wouldn't expect them to work every time. – VLAZ Mar 09 '19 at 13:32
  • @VLAZ - Named capture groups are fully supported in Chrome and have been since *at least* Chrome v64 (and in Node.js v10+). The seemingly chaotic results are because the regexes have state. – T.J. Crowder Mar 09 '19 at 13:38
  • @T.J.Crowder I completely missed the `g` flag. In my defence, I had to scroll sideways to see it... and I simply didn't. – VLAZ Mar 09 '19 at 13:40

1 Answer

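Your regular expressions use the `g` flag, which means that they retain state: after a successful match, the regex's `lastIndex` property is set to the position just past the match, and the next `exec` or `test` call starts searching from there. For instance, you've said your second string doesn't match either of your expressions, but it does, provided the expression starts searching at the beginning of the string: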

const rex = /(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}_\d{2}_\d{2})(-| - )(?<window>.*?)(?<extension>\.(png|jpg|jpeg))/g;
const str = "2017-08-31 18_57_43-shouldwork.png";
console.log(rex.exec(str)); // Works
console.log(rex.exec(str)); // Fails

I'd suggest that you don't use the `g` flag, and do use anchors (`^` and `$`) at the beginning and end so you're matching the entire string. Alternatively, if you're looking for these strings within a larger block of text, just be sure to set `lastIndex = 0` on the regular expression when starting to search a new block of text so it doesn't continue from where it previously left off.
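
As a rough sketch of the first suggestion, here are the two patterns from the question with the `g` flag dropped and `^`/`$` anchors added. The names `withIndex`, `withoutIndex`, and `classify` are made up for illustration and are not from the original code:

```javascript
// No g flag: these regexes never advance lastIndex, so every call
// starts matching at index 0. The ^ and $ anchors require the whole
// filename to match, not just a part of it.
const withIndex = /^(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}_\d{2}_\d{2})(-| - )(?<window>.*?)(-| - )(?<index>\d{6})(?<extension>\.(png|jpg|jpeg))$/;
const withoutIndex = /^(?<date>\d{4}-\d{2}-\d{2}) (?<time>\d{2}_\d{2}_\d{2})(-| - )(?<window>.*?)(?<extension>\.(png|jpg|jpeg))$/;

// Hypothetical helper: try the more specific pattern first, then the
// more general one. Returns null when neither pattern matches.
function classify(name) {
  const first = withIndex.exec(name);
  if (first) {
    return { pattern: 1, ...first.groups };
  }
  const second = withoutIndex.exec(name);
  if (second) {
    return { pattern: 2, ...second.groups };
  }
  return null;
}

const names = [
  "2017-08-31 18_57_42-shouldwork.png",
  "2017-08-31 18_57_43-shouldwork.png",
  "2019-03-09 11_11_10 - shouldwork - 000003.png",
  "should fail.png",
];
for (const name of names) {
  console.log(name, classify(name));
}
```

Because neither regex keeps state, the result for a given filename no longer depends on which filenames were tested before it.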

T.J. Crowder
  • Thank you. Seems like that was the problem, I was sure I tried it without the `g`. In my case the start/end anchors don't matter since the strings are in an array and I'm just looping through them. I guess I will have to re-read stuff about the flags. – Ekusu Mar 09 '19 at 13:39
  • @Ekusu - Just beware that without anchors, `2017-08-31 18_57_43-shouldwork.png flibbery dot` will also match. – T.J. Crowder Mar 09 '19 at 13:40
  • Yeah, I know, but that shouldn't be a problem in my case unless GreenShot decides to go crazy. I had them in at first but removed them. – Ekusu Mar 09 '19 at 13:44