I need to match words that meet certain criteria, but only if they are within quotes*. I'd like to do this with a regular expression.

I think there's a "jQuery plugin for that". Google for jQuery basic arithmetic plugin.

Say the text above is the subject of a regex search, and I want to find the word "plugin" when it is inside quotes. I do not need to match the second "plugin" at the end of the sentence, since it is not within quotes. However (edit), I do need to match every occurrence of "plugin" that is enclosed in quotes, even when several occur inside a single quotation.

With a working expression, the following words (highlighted in bold) should be matched:

I think there's a "jQuery **plugin** for that". Google for jQuery basic arithmetic plugin.

I have a theoretical solution that would use a positive lookahead to determine whether the number of quotes after the word is even or odd, and match the word if that number is odd.

What regular expression should I use to accomplish this?

*double quotes only


I use this expression to match words that are not surrounded by quotes, following very similar logic:

\bplugin\b(?=[^"]*(?:"[^"]*"[^"]*)*$)
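A minimal sketch of the odd-count lookahead idea described above. The `outsideQuotes` pattern is the expression from the question. The `insideQuotes` variant is an assumption, not a confirmed answer from this thread: it requires one extra quote before the even-count tail, so the number of quotes after the word is odd. Both assume balanced, unescaped double quotes. The helper name `indexesOf` is made up for illustration.

```javascript
// Sample text from the question.
const text = 'I think there\'s a "jQuery plugin for that". Google for jQuery basic arithmetic plugin.';

// From the question: "plugin" followed by an even number of quotes (outside quotes).
const outsideQuotes = /\bplugin\b(?=[^"]*(?:"[^"]*"[^"]*)*$)/g;

// Assumed variant: one quote first, then an even number (odd total, inside quotes).
const insideQuotes = /\bplugin\b(?=[^"]*"[^"]*(?:"[^"]*"[^"]*)*$)/g;

// Hypothetical helper: start offsets of every match of `re` in `s`.
const indexesOf = (re, s) => Array.from(s.matchAll(re), m => m.index);

console.log(indexesOf(insideQuotes, text));  // offset of the quoted "plugin" only
console.log(indexesOf(outsideQuotes, text)); // offset of the trailing "plugin" only
```

Because `[^"]` also matches line breaks and `$` is used without the `m` flag, a line break inside a quotation is treated like any other non-quote character. The lookahead rescans the rest of the string for each candidate word, so this may be slow on very long inputs.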
John Weisz
  • Which regex engine are you using? – i_am_jorf Apr 29 '15 at 16:01
  • The one coming with JavaScript and PHP. I need code that works on both, that is, no lookbehinds. – John Weisz Apr 29 '15 at 16:02
  • @stribizhev No, that question would like to include the entire quotation if it is a match. I need to return a _certain_ word from within the quotation, **not the entire quotation itself**. EDIT: the referenced question's answer suggests an approach using capturing groups (if I understand it right). I'm interested in doing a positive lookahead, so it can be done in one run instead of a second, to extract. – John Weisz Apr 29 '15 at 16:07
  • What about if there is a line break inside the quotes? – dawg Apr 29 '15 at 16:36
  • @dawg Line break should be treated as simple whitespace. Line breaks inside quotes should not affect the outcome. – John Weisz Apr 29 '15 at 16:37
  • 1
    How do you define "inside"? The posted solutions will match if there is a double quote to the left and another to the right, but "inside" implies you need to pair up quotes, so that *"my" plugin beats "your" plugin* should not match on the first *plugin*, even though it is technically "between" quotes; only *my* and *your* are "inside" quotes. Also, if this is an issue, do we need to worry about the corner case of unpaired quotes? What about escaped quotes? – tripleee Apr 29 '15 at 16:41
  • I am not sure, but what about `"\S[^"]*(plugin)"|"(plugin)[^"]*\S"|"\S[^"]*(plugin)[^"]*\S"|"(plugin)"`? Please have a look at https://regex101.com/r/yQ7oB5/2 – Wiktor Stribiżew Apr 29 '15 at 20:24

2 Answers

You can use a simple regex like this:

".*?(plugin).*?"


Federico Piazza
  • Unfortunately, this matches the entire quotation. I only need to match a given word from within the quotation, not the entire thing. – John Weisz Apr 29 '15 at 16:08
  • 1
    @JánosWeisz the regex matches the entire quotation, but if you want to capture the word just surround plugin with parentheses. You can see it in the working demo – Federico Piazza Apr 29 '15 at 16:09
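The capturing-group approach from this answer can be sketched in JavaScript as follows. The variable names are made up for illustration. The second input is the example from tripleee's comment on the question, included to show that the pattern does not pair quotes.

```javascript
// The answer's pattern: group 1 holds the word, the full match is the quotation.
const quoted = /".*?(plugin).*?"/;

const text = 'I think there\'s a "jQuery plugin for that". Google for jQuery basic arithmetic plugin.';
const hit = text.match(quoted);
console.log(hit[0]); // the whole quotation, quotes included
console.log(hit[1]); // the captured word

// Caveat: the lazy .*? can run past a closing quote, so a word that merely
// sits between two quotations is captured as well.
const stray = '"my" plugin beats "your" plugin'.match(quoted);
console.log(stray[0]); // starts at the quote that opens "my"
```

Note also that `.` does not match line breaks in JavaScript unless the `s` flag is set, so a quotation that spans lines may need that flag or a character class such as `[^"]`.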

This should work:

"[^"]*(plugin)[^"]*"

" : Matches an opening double quote.
[^"]* : Followed by zero or more characters that are not double quotes.
(plugin) : Followed by the text you are looking for, in its own capturing group.
[^"]* : Followed by zero or more characters that are not double quotes.
" : Followed by a closing double quote.
Daniel Hilgarth
  • Very close, but I do not need to match everything inside quotes, only a given word, in this example, "plugin". I have edited my question to include a second blockquote, with the matched words in bold. – John Weisz Apr 29 '15 at 16:04
  • @JánosWeisz You can easily create a sub-group for the matched text. Doesn't that give you what you need? – Daniel Hilgarth Apr 29 '15 at 16:07
  • Now that I'm thinking about it, it would make sense to use a capturing group, right? – John Weisz Apr 29 '15 at 16:08
  • @JánosWeisz Exactly, that's what I am talking about – Daniel Hilgarth Apr 29 '15 at 16:09
  • Although the question did not explicitly mention this, AFAIK this solution would be unable to extract _all_ matches, unless I'm missing something. This was the original reason I tended to prefer a positive lookahead based solution. – John Weisz Apr 29 '15 at 17:02
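The limitation raised in the last comment can be reproduced with a short sketch. The first part runs the answer's pattern with the `g` flag added. The second part is a different technique from the one the asker preferred: a two-pass scan, not a single lookahead expression. The function `wordsInQuotes` is hypothetical, and it assumes `word` contains no regex metacharacters and that quotes are unescaped.

```javascript
const sample = 'Try the "plugin loader plugin" or another plugin.';

// The answer's pattern with the g flag: one match, hence one capture, per quotation.
const perQuotation = /"[^"]*(plugin)[^"]*"/g;
const captured = Array.from(sample.matchAll(perQuotation), m => m[1]);
console.log(captured.length); // 1, although the quotation contains the word twice

// Hypothetical two-pass alternative: isolate each quotation, then scan inside it.
function wordsInQuotes(s, word) {
  const hits = [];
  for (const q of s.matchAll(/"([^"]*)"/g)) {
    const inner = new RegExp('\\b' + word + '\\b', 'g');
    for (const w of q[1].matchAll(inner)) {
      hits.push(q.index + 1 + w.index); // offset of the word in the original string
    }
  }
  return hits;
}

console.log(wordsInQuotes(sample, 'plugin')); // offsets of both quoted occurrences
```

Because the first pass consumes an opening and a closing quote together, quotes are paired from left to right, so a word that only sits between two separate quotations is not reported.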