I have a set of lines, most of which follow this format:
STARTKEYWORD some text I want to extract ENDKEYWORD\n
I want to find these lines and extract information from them.
Note that the text between the keywords can contain a wide range of characters (Latin and non-Latin letters, numbers, spaces, special characters), except \n. ENDKEYWORD is optional and is sometimes omitted.
My attempts revolve around this regex:
STARTKEYWORD (.+)(?:\n| ENDKEYWORD)
However, the capturing group (.+) consumes as many characters as possible and also captures ENDKEYWORD, which I do not need.
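To make the problem concrete, here is a minimal reproduction in Python. The choice of Python's re module and the sample strings are my own assumptions for illustration; the behaviour should be similar in other backtracking regex engines.

```python
import re

# The pattern from my attempts, unchanged.
pattern = re.compile(r"STARTKEYWORD (.+)(?:\n| ENDKEYWORD)")

# Hypothetical sample lines: one with ENDKEYWORD, one without.
with_end = "STARTKEYWORD some text I want to extract ENDKEYWORD\n"
without_end = "STARTKEYWORD some text I want to extract\n"

# The greedy (.+) runs to the end of the line, so the alternation
# is satisfied by \n and ENDKEYWORD ends up inside the group.
captured_with_end = pattern.search(with_end).group(1)
captured_without_end = pattern.search(without_end).group(1)

print(repr(captured_with_end))     # 'some text I want to extract ENDKEYWORD'
print(repr(captured_without_end))  # 'some text I want to extract'
```

The second line is captured correctly; only lines that actually contain ENDKEYWORD show the unwanted suffix.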
Is there a way to get only some text I want to extract using regular expressions alone?
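One direction that might work is a lazy quantifier combined with an optional ENDKEYWORD and an end-of-line anchor. This is only a sketch in Python under my own assumptions: the sample text is made up, and it assumes that ENDKEYWORD, when present, is the last thing on the line with no trailing whitespace.

```python
import re

# Lazy (.+?) grows one character at a time; the optional group then
# absorbs " ENDKEYWORD" if it sits right before the end of the line.
# re.MULTILINE makes $ match before each \n, not only at the very end.
lazy_pattern = re.compile(r"STARTKEYWORD (.+?)(?: ENDKEYWORD)?$", re.MULTILINE)

# Hypothetical input: with ENDKEYWORD, without it, and an unrelated line.
sample = (
    "STARTKEYWORD some text I want to extract ENDKEYWORD\n"
    "STARTKEYWORD текст 123 !?\n"
    "unrelated line\n"
)

extracted = lazy_pattern.findall(sample)
print(extracted)  # ['some text I want to extract', 'текст 123 !?']
```

If ENDKEYWORD can be followed by spaces or other text on the same line, this pattern would need adjusting.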