I am really bad at regex and I am trying to do the following:

How do I get all strings that start and end with `%%`?

If tokens like these appear in a string, I want to be able to grab them: `%%HELLO_WOLD%%`, `%%STUFF%%`

Here's what I came up with so far: `%%[a-zA-Z0-9]\w+`

Kevin Pimentel
  • `%%.*%%` is all you need – emsimpson92 Jul 02 '18 at 18:54
  • Possible duplicate of [Regex, get string value between two characters](https://stackoverflow.com/questions/2034687/regex-get-string-value-between-two-characters) or https://stackoverflow.com/questions/1454913/regular-expression-to-find-a-string-included-between-two-characters-while-exclud and many others – ficuscr Jul 02 '18 at 18:55
  • Possible duplicate of [Regular Expression to find a string included between two characters while EXCLUDING the delimiters](https://stackoverflow.com/questions/1454913/regular-expression-to-find-a-string-included-between-two-characters-while-exclud) – mickmackusa Jul 03 '18 at 01:51
  • What should be the result for these two strings `%%HELLO_WOLD%%%%STUFF%%` and `%%HELLO_WOLD%%STUFF%%`? – Toto Jul 03 '18 at 09:40

1 Answer

You could use anchors to assert the start `^` and the end `$` of the line and match any character zero or more times with `.*`. If there must be at least one character between the delimiters, you might use `.+` instead.

^%%.*%%$
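As a quick check of the anchored pattern, here is a minimal sketch in Python. The language is my assumption, since the question does not name one, and `is_wrapped` is a hypothetical helper name.

```python
import re

# Anchored pattern from the answer: the whole line must start and end with %%.
ANCHORED = re.compile(r"^%%.*%%$")


def is_wrapped(line):
    """Return True when the entire line starts and ends with %% (hypothetical helper)."""
    return ANCHORED.search(line) is not None


print(is_wrapped("%%STUFF%%"))          # True
print(is_wrapped("say %%STUFF%% now"))  # False: the line does not start with %%
```

Because of the anchors, this only answers "is the whole line wrapped in `%%`?" and will not pick tokens out of a longer string.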

Instead of `.*`, you could use your character class `[a-zA-Z0-9]+`, which matches upper- and lowercase ASCII letters and digits, or `\w+`, which matches one or more word characters.

Note that the character class `[a-zA-Z0-9]` does not match an underscore, whereas `\w` does.
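The underscore difference can be seen with a small Python sketch (Python is assumed here, and the constant names are only illustrative):

```python
import re

# Same anchored pattern, once with the explicit class and once with \w.
ALNUM_ONLY = re.compile(r"^%%[a-zA-Z0-9]+%%$")
WORD_CHARS = re.compile(r"^%%\w+%%$")

sample = "%%HELLO_WOLD%%"

# The underscore is not in [a-zA-Z0-9], so the first pattern fails.
alnum_result = ALNUM_ONLY.search(sample) is not None
# \w covers letters, digits and the underscore, so the second one matches.
word_result = WORD_CHARS.search(sample) is not None

print(alnum_result)  # False
print(word_result)   # True
```

In Python 3, `\w` on a `str` pattern also matches non-ASCII letters and digits unless the `re.ASCII` flag is passed, so it is somewhat broader than `[a-zA-Z0-9_]`.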

If you want to find multiple matches in a string, you might use `%%\w+%%`. Note that this will also match `%%HELLO_WOLD%%` inside `%%%%%HELLO_WOLD%%%`.
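For the multiple-match case, a minimal Python sketch (again an assumed language, with `find_tokens` as a hypothetical helper) could look like this:

```python
import re

# Unanchored pattern from the answer: %%, one or more word characters, %%.
TOKEN = re.compile(r"%%\w+%%")


def find_tokens(text):
    """Return every %%...%% token found in text (hypothetical helper)."""
    return TOKEN.findall(text)


print(find_tokens("Say %%HELLO_WOLD%%, then %%STUFF%%"))
# ['%%HELLO_WOLD%%', '%%STUFF%%']

# Extra percent signs around a token do not prevent a match.
print(find_tokens("%%%%%HELLO_WOLD%%%"))
# ['%%HELLO_WOLD%%']
```

Since `\w` cannot match `%`, each match stops at the first closing `%%`, which avoids the over-matching that a greedy `.*` would cause across several tokens.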

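If there should be exactly two percent signs at the beginning and at the end, you could use a positive lookbehind `(?<=` and a positive lookahead `(?=`. The lookbehind asserts that what comes before the opening `%%` is either the start of the string `^` or a character other than a percent sign. The lookahead asserts that what comes after the closing `%%` is either a character other than a percent sign or the end of the string `$`.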

(?<=^|[^%])%%\w+%%(?=[^%]|$)
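Support for a lookbehind with alternatives of different widths, like `(?<=^|[^%])`, depends on the regex engine. Python's built-in `re` module, for instance, rejects it because it requires fixed-width lookbehinds. A sketch of the same idea in Python, assuming negative lookarounds are an acceptable substitute (`find_strict_tokens` is a hypothetical helper):

```python
import re

# Assumption: negative lookarounds stand in for (?<=^|[^%]) and (?=[^%]|$).
# "Not preceded by %" and "not followed by %" express the same constraint
# and keep the lookbehind fixed-width, which Python's re module requires.
STRICT_TOKEN = re.compile(r"(?<!%)%%\w+%%(?!%)")


def find_strict_tokens(text):
    """Return tokens delimited by exactly two percent signs on each side."""
    return STRICT_TOKEN.findall(text)


print(find_strict_tokens("%%HELLO_WOLD%%, %%STUFF%%"))
# ['%%HELLO_WOLD%%', '%%STUFF%%']

print(find_strict_tokens("%%%%%HELLO_WOLD%%%"))
# []
```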

The fourth bird
  • `[a-zA-Z0-9]` will not match an underscore though – emsimpson92 Jul 02 '18 at 19:01
  • Using anchors seems ill-advised because the OP is trying to match multiple occurrences in a string. Also, a greedy quantifier on `.` is going to be bad too. If you are going to answer this mega-duplicate, you should probably do some repairs on this answer for the benefit of the OP and future researchers. – mickmackusa Jul 03 '18 at 01:54
  • @mickmackusa Thanks for your comment. I see your point regarding the multiple occurrences and I have updated my answer. Can you elaborate on the `bad` part of using a greedy quantifier when using anchors? – The fourth bird Jul 03 '18 at 06:39
  • No, you're good to go greedy with the anchors. Just that greedy is bad for multiple matches. – mickmackusa Jul 03 '18 at 06:46
  • I wouldn't mention anchors or lookarounds at all because they are unnecessary. But I'm not your boss and this is not my answer. – mickmackusa Jul 03 '18 at 06:55