
I have a command-line program whose first argument (argv[1]) is a regex pattern.

./program 's/one-or-more/anything/gi/digit-digit'

So I need a regex to check whether the input entered by the user is well-formed. This would be easy to solve, but I use the standard regex library and std::regex_match, and this function by default behaves as if begin and end assertions (^ and $) were placed around the pattern, so the non-greedy quantifier has no effect.

Let me clarify. To match /anything/ I can use /.*?/, but std::regex_match treats this pattern as ^/.*?/$, so if the user enters /anything/anything/anything/ then std::regex_match still returns true even though the input is not valid. std::regex_match only returns true or false, and the input expected from the user can only be text that follows the pattern. Since the inputs vary, I cannot list every possibility here, but here are some examples.
Should match:

/.//
s/.//
/.//g
/.//i
/././gi
/one-or-more/anything/
/one-or-more/anything/g/3
/one-or-more/anything/i
/one-or-more/anything/gi/99
s/one-or-more/anything/g/4
s/one-or-more/anything/i
s/one-or-more/anything/gi/54

and anything else that looks like these patterns.

Rules:

  1. delimiters are /|@#
  2. the letter s at the beginning, and g, i and up to 2 digits at the end, are optional
  3. std::regex_match returns true only if the entire target character sequence matches; otherwise it returns false
  4. between the first and second delimiter there must be one or more characters (+)
  5. between the second and third delimiter there can be zero or more characters (*)
  6. between the third and fourth delimiter there can be g or i
  7. at least 3 delimiters must be present, as in /.//; fewer is invalid, so /./ should not match
  8. ECMAScript (ECMA-262) syntax is allowed for the pattern

NOTE


If you need more details, please leave a comment and I will update the question.
Thanks.

Community
Shakiba Moshiri

2 Answers

1

You could use this regular expression:

^s?([/|@#])((?!\1).)+\1((?!\1).)*\1((gi?|ig)(\1\d\d?)?|i)?$

See regex101.com

Note how this also rejects these cases:

///anything/
/./anything/gg
/./anything/ii
/./anything/i/12

How it works:

Some explanation of the parts that are different:

  • ((?!\1).): matches any character that is not the delimiter. This lets you keep track of the exact number of delimiters used. It also prevents the first character after the first delimiter from being that delimiter again, which should not be allowed.
  • (gi?|ig): matches any of the valid modifier combinations except a sole i, which is treated separately. This also excludes gg and ii as valid character sequences.
  • (\1\d\d?)?: optionally allows an extra delimiter (after a g modifier, see the previous point) followed by one or two digits.
  • (...|i)?: covers the case where no g modifier is present, only an i or nothing at all: then no digits may follow.
trincot
  • Many thanks, but it should not match `s/.///33`; at least **g** is needed [link](https://regex101.com/r/Zb6DXL/6) – Shakiba Moshiri Feb 25 '17 at 20:23
  • I had already updated my regex and answer to deal with that. Please check again. – trincot Feb 25 '17 at 20:27
  • Okay, I did a quick test and it worked. Please add more detail on **how it works**, and after I fully test it with my program I will accept your answer – Shakiba Moshiri Feb 25 '17 at 20:33
  • Added more detail. – trincot Feb 25 '17 at 20:40
  • Why not? In that case `gi` is interpreted as the replacement string, not as the modifiers. It is essentially one of the patterns you want to accept: `s/one-or-more/anything/33`, where *anything* happens to be `gi`. – trincot Feb 25 '17 at 20:44
  • I know, but here in `s/./gi/33` the **gi** is `anything` and **33** is put in place of the **gi** flags – Shakiba Moshiri Feb 25 '17 at 20:46
  • How does it differ from `s/one-or-more/anything/33` which you want to accept? – trincot Feb 25 '17 at 20:47
  • I am so sorry, but as you see this is not a simple pattern where I could foresee all the cases, so I made a mistake – Shakiba Moshiri Feb 25 '17 at 20:49
  • So now you don't want to have digits at the end, *unless* there is also a modifier? Do I understand correctly? – trincot Feb 25 '17 at 20:56
  • Yes. The digits are for **index** substitution, and they are allowed only if there is at least a **g** flag – Shakiba Moshiri Feb 25 '17 at 20:59
  • So the digits are also not allowed when there is just an `i` modifier and not a `g` modifier? – trincot Feb 25 '17 at 21:00
  • Yes. If you can match them only with **g**, that is better – Shakiba Moshiri Feb 25 '17 at 21:02
  • OK, see the "updated rules" section I added to my answer. If OK, I will integrate it into the first part of my answer. Oops, had to update the regex link as well: https://regex101.com/r/Zb6DXL/9 – trincot Feb 25 '17 at 21:04
  • Okay, it will take me a few minutes to test all the possibilities. Many thanks. I will comment if I need anything, then you can edit your answer one last time – Shakiba Moshiri Feb 25 '17 at 21:18
  • It is better than I thought at first. Thank you so much, and excuse me if I annoyed you. One favor: can you recommend a good book for learning regex, at an intermediate level or above? – Shakiba Moshiri Feb 25 '17 at 21:38
  • I don't really have a book to suggest: I just learned by doing. – trincot Feb 25 '17 at 21:45
  • @trincot this doesn't seem to match `s/one-or-more/anything/33` – Theo Feb 25 '17 at 21:52
  • Indeed, Theo, the rules changed. See the comments in this thread. – trincot Feb 25 '17 at 21:53
  • @k-five, I consolidated the edits into the final answer. For an interactive tutorial on the basics (no look around), have a look at https://regexone.com, for a more comprehensive reference, have a look at this book: https://www.princeton.edu/~mlovett/reference/Regular-Expressions.pdf – trincot Feb 25 '17 at 21:55
  • @trincot ok fair enough, I was busy trying to make it support that exact scenario, I will delete my answer :( – Theo Feb 25 '17 at 21:56
  • @trincot. Hi, I am pinging you in case you can help me with the same **problem**, which I now have on Windows. [Here I asked](http://stackoverflow.com/questions/42627957/the-same-regex-but-different-results-on-linux-and-windows-only-c) – Shakiba Moshiri Mar 06 '17 at 14:25
1

This is a tricky one, but I took the challenge - here is what I have ended up with:

^s?([\/|@#])(?:(?!\1).)+\1(?:(?!\1).)*\1(?:i|(?:gi?|ig)(\1\d{1,2})?)?$

Pattern breakdown:

  • ^ matches start of string
  • s? matches an optional 's' character
  • ([\/|@#]) matches one of the delimiter characters and captures it as group 1
  • (?:(?!\1).)+ matches anything other than the delimiter character one or more times (uses negative lookahead to make sure that the character isn't the delimiter matched in group 1)
  • \1 matches the delimiter character captured in group 1
  • (?:(?!\1).)* matches anything other than the delimiter character zero or more times
  • \1 matches the delimiter character captured in group 1
  • (?: starts a new group
    • i matches the i character
    • | or
    • (?:gi?|ig) matches either g, gi, or ig
    • (\1\d{1,2})? optionally followed by an extra delimiter and one or two digits (0-9), captured as group 2
  • )? closes group and makes it optional
  • $ matches end of string

I have used non-capturing groups wherever a capture is not needed; these are the groups that start with (?:

Shakiba Moshiri
Theo
  • We are here to learn, not just to downvote or upvote. Since I can learn a little from your answer, it is worth it to me – Shakiba Moshiri Feb 25 '17 at 22:01
  • OK - I've added a disclaimer just to be clear it doesn't cope with your current scenario; but glad if it helps you learn. – Theo Feb 25 '17 at 22:06
  • The answer by **trincot** is correct, as I wanted. However, thanks for your attempt. Just to mention, it should not match `s/./g/i/33`, but your pattern matches it. The digits are allowed only if there is a `g` flag. – Shakiba Moshiri Feb 25 '17 at 22:12
  • Could not bear to leave an incorrect answer so have updated – Theo Feb 25 '17 at 22:17
  • Okay, looks good. You did your best. Just remove `/m` at the end of your pattern – Shakiba Moshiri Feb 25 '17 at 22:25
  • @k-five thank you yes I have fixed, I had left in from my testing, but of course you are only testing one item – Theo Feb 25 '17 at 22:31