
I've been trying for a few days now to write a regex that captures sentences that start with a particular string and end at a disallowed character (<). The sentence may contain any punctuation (off the top of my head: []()-,.!?\/) and, most importantly, ' and ", but it always starts with the same string and ends at the same character (<). My regex is as follows:

    "starting string foo (?:[a-zA-z0-9_]|[-,.!?()\[\]\'\"\/]|[\s])+"

This works fine: it gets every sentence starting with "starting string foo" and stops at the < after it. It successfully captures sentences with every piece of punctuation... except double quotes ("). I don't understand why, when it easily handles single quotes (') and other punctuation such as slashes and dashes.

For example, from the string

     starting string foo Hubble revisits the famous "pillars of creation" with a new lens <

it only captures

    starting string foo Hubble revisits the famous

but strings like

     starting string foo Buzz Aldrin's self-portrait during Gemini 12 with the Earth reflecting off his visor, 12 November 1966 [2651x2632] <

with all kinds of punctuation (' - [ ,) it captures everything I want:

    starting string foo Buzz Aldrin's self-portrait during Gemini 12 with the Earth reflecting off his visor, 12 November 1966 [2651x2632]
user3662991

1 Answer


What's wrong with

/starting string foo (.*)\</
jcuenod
  • That works, but it doesn't seem to stop (it just gets the entire rest of the huge string!). – user3662991 Mar 15 '15 at 10:30
  • OK, finally got it to work, thanks heaps. The final code was "starting string foo [^<]+" – user3662991 Mar 15 '15 at 11:03
  • I think you may actually be looking for non-greedy matching. See here and let me know, and I'll modify my answer: http://stackoverflow.com/questions/11898998/regex-match-non-greedy – jcuenod Mar 15 '15 at 14:34
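
The difference between the three patterns discussed in this thread can be sketched as follows. This is only an illustration: the thread never says which language or regex engine is in use, so Python's `re` module is an assumption, and the sample text (with extra `<` characters appended to stand in for "the rest of the huge string") is made up.

```python
import re

# Hypothetical input: the example sentence from the question, followed by
# more text containing further "<" characters.
text = (
    'starting string foo Hubble revisits the famous "pillars of creation" '
    'with a new lens < unrelated text < more text'
)

# Greedy: (.*) grabs as much as possible, so the match runs to the LAST "<".
greedy = re.search(r"starting string foo (.*)<", text)

# Non-greedy: (.*?) grabs as little as possible, so it stops at the FIRST "<".
lazy = re.search(r"starting string foo (.*?)<", text)

# Negated character class: matches any run of characters that are not "<".
negated = re.search(r"starting string foo [^<]+", text)

print(repr(greedy.group(1)))
print(repr(lazy.group(1)))
print(repr(negated.group(0)))
```

The non-greedy pattern and the negated class both stop at the first `<`. The negated class does so without backtracking and, unlike `.`, also matches across newlines by default in Python.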