Extract the 1st occurrence of a string based on a keyword

Question

I want to capture all ASCII strings within the parentheses before the keyword "end". However, I am only interested in capturing the 1st matching group.

How do I ignore the 2nd matching group?

This is the sample regex which I wrote: \((.+?)\) end

And this is the sample string which I used: "There are some other sentences before (some otherwords which I am not interested in) all these.This is a sample string (something which I am interested in) end. This is another repeated string (with some otherwords) end."

I only want to obtain the output "something which I am interested in", which is the text between the parentheses.

@splash58 sometimes, it may not be at the start of the string. — ilovetolearn, Feb 06 '19 at 07:09
Why not just remove the global flag, like this: https://regex101.com/r/Z5ohnY/1 ? Can you share the code that you are using? — aelor, Feb 06 '19 at 07:11
I only want to extract the words in the parentheses if the word after the parentheses is "end", e.g. This is a sentence (Words which I am interested in) end. — ilovetolearn, Feb 06 '19 at 07:15
If the keyword "end" appears twice, how do I get the regex to match only the 1st matching group? — ilovetolearn, Feb 06 '19 at 07:19
@Gurman, https://regex101.com/r/qrh4zL/5, if there are parentheses within the parentheses, we will have issues with the regex. I am not sure if we can even extract such keywords. — ilovetolearn, Feb 06 '19 at 07:34

Allan · Accepted Answer · 2019-02-06T08:16:07.623

Let me answer your original question first.

I want to capture all ASCII strings within the parentheses before the keyword "end". However, I am only interested in capturing the 1st matching group.

How do I ignore the 2nd matching group?

Input:

There are some other sentences before (some otherwords which I am not interested in) all these.This is a sample string (something which I am interested in) end. This is another repeated string (with some otherwords) end.

Expected capture:

something which I am interested in

Regex to use:

^(?<!\) end).*?\(([^()]+?)\) end

Demo: https://regex101.com/r/dVo9Zi/1
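
The question does not name a language, so here is a minimal sketch of applying the regex above, assuming Python's re module. The variable names (text, pattern, first_group) are only illustrative. The important part is calling a "first match" function such as search() instead of a global one such as findall():

```python
import re

text = (
    "There are some other sentences before (some otherwords which I am not "
    "interested in) all these.This is a sample string (something which I am "
    "interested in) end. This is another repeated string (with some "
    "otherwords) end."
)

# The pattern from this answer, unchanged.
pattern = re.compile(r"^(?<!\) end).*?\(([^()]+?)\) end")

# search() stops at the first match, so the second "(...) end" is never reported.
match = pattern.search(text)
first_group = match.group(1) if match else None
print(first_group)
```

With a first-match call like this, the shorter unanchored pattern \(([^()]+)\) end returns the same group for this sample, which is what the "remove the global flag" comment suggests.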

Additional notes:

In one of your comments you said:

if there are parentheses within the parentheses, we will have issues with the regex. I am not sure if we can even extract such keywords.

If you need to analyze nested structures, you have to forget about regex and use a parser instead, as explained here: Can regular expressions be used to match nested patterns?
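
To give an idea of what "a parser" can mean for this small case, here is a rough sketch of a hand-written scanner in Python. It is not a general parser. The function name first_group_before_keyword and the default keyword " end" are made up for illustration, and it assumes you want the outermost balanced group that is immediately followed by the keyword:

```python
def first_group_before_keyword(text, keyword=" end"):
    """Return the contents of the first outermost (...) group that is
    immediately followed by keyword, or None if there is no such group."""
    stack = []  # positions of the '(' characters that are still open
    for i, ch in enumerate(text):
        if ch == "(":
            stack.append(i)
        elif ch == ")" and stack:
            start = stack.pop()
            # An empty stack after the pop means this ')' closes an outermost group.
            if not stack and text.startswith(keyword, i + 1):
                return text[start + 1:i]
    return None


nested = "a (skip me) b (keep (nested) this) end. (other) end"
print(first_group_before_keyword(nested))
```

Unmatched closing parentheses are simply ignored here. A real input format may need stricter handling.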

If you really mean all ASCII characters in your question, then you will have to replace [^()] in the regex with the hexadecimal ranges that cover the ASCII table, explicitly leaving out ( (0x28) and ) (0x29). This gives you the following character class: [\x00-\x27\x2A-\x7F]. Reference: http://www.asciitable.com/, demo: https://regex101.com/r/dVo9Zi/2
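
As a quick check of that character class, again assuming Python's re module and using a made-up sample string, the sketch below shows that a parenthesized group containing a non-ASCII character is skipped and the next pure-ASCII group before "end" is captured:

```python
import re

# Same pattern as above, with [^()] replaced by the ASCII-only class.
ascii_pattern = re.compile(r"^(?<!\) end).*?\(([\x00-\x27\x2A-\x7F]+?)\) end")

# Hypothetical input: the first group contains a non-ASCII character (U+00E9).
sample = "Note (caf\u00e9) end, then (plain ascii) end."

m = ascii_pattern.search(sample)
ascii_group = m.group(1) if m else None
print(ascii_group)
```

Note that this class also accepts control characters and newlines, because they are part of the ASCII range.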
