
I want to extract everything from the string below, except "from" and ", from":

from Old French, from Latin innocentia, from innocent- ‘not harming’ (based on nocere ‘injure’).

This is my regex:

(?:from)(.*)(?:,.from)(.*)

With this regex, I get `Old French, from Latin innocentia` and `innocent- ‘not harming’ (based on nocere ‘injure’).` as the two captures. How can I edit my regex so that it produces the expected result without repeating the non-capturing group `(?:,.from)`?

The result should be:

  • Old French
  • Latin innocentia
  • innocent- ‘not harming’ (based on nocere ‘injure’).
Duy Nguyen

2 Answers

const line = "from Old French, from Latin innocentia, from innocent- ‘not harming’ (based on nocere ‘injure’).";
line.split(/, from|from/);

=>

[ '',
 ' Old French',
 ' Latin innocentia',
 ' innocent- ‘not harming’ (based on nocere ‘injure’).' ]

That might be close enough. Try online: https://repl.it/Chp8
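If the empty first element and the leading spaces matter, a small post-processing step can remove them. The following is a minimal sketch that reuses the same split regex; the variable name `parts` is only illustrative.

```javascript
const line = "from Old French, from Latin innocentia, from innocent- ‘not harming’ (based on nocere ‘injure’).";

const parts = line
  .split(/, from|from/)              // same separator regex as above
  .map((piece) => piece.trim())      // drop the space left after each "from"
  .filter((piece) => piece !== "");  // drop the empty string before the first "from"

console.log(parts);
```

This should leave exactly the three items listed in the question.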

TessellatingHeckler

You can just use a regex to split the string. This returns the same pieces and is likely faster than a pattern built on `.*`, which is prone to heavy backtracking.

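You could use this regex (based on yours) to do so. The group is non-capturing because JavaScript's `split` inserts the text of any capturing group into the result array:

(?:,.)?from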

More information on splitting can be found here.
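One detail worth checking when splitting in JavaScript: if the separator regex contains a capturing group, `split` includes the captured text (or `undefined` when the group did not participate) in the returned array. The sketch below compares a capturing and a non-capturing version of the pattern on a shortened input; the variable names are illustrative only.

```javascript
const text = "from Old French, from Latin innocentia";

// Capturing group: whatever the group captured is spliced into the result,
// and `undefined` appears where the optional group matched nothing.
const withCapture = text.split(/(,.)?from/);

// Non-capturing group: only the text between separators is returned.
const withoutCapture = text.split(/(?:,.)?from/);

console.log(withCapture);
console.log(withoutCapture);
```

The non-capturing form keeps the result limited to the pieces between the separators, which is usually what is wanted here.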

Laurel
  • Thank you. Your answer also helped me. Aside from the `split` function, is there no way to do this using pure regex? – Duy Nguyen Jul 31 '16 at 01:02
  • @DuyNguyen Not with normal JS regex repeated capture groups. See here: http://stackoverflow.com/a/3537914/6083675 – Laurel Jul 31 '16 at 01:16
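A single match cannot return a variable number of captures, but repeated matching with a global regex can get close to a "pure regex" approach. The sketch below assumes an environment with `String.prototype.matchAll` (ES2020, so newer than this thread); the pattern and variable names are illustrative, not taken from either answer.

```javascript
const line = "from Old French, from Latin innocentia, from innocent- ‘not harming’ (based on nocere ‘injure’).";

// Lazily capture the text after each "from " up to the next ", from"
// or the end of the string (checked by the lookahead, not consumed).
const pattern = /from (.*?)(?=, from|$)/g;

// Collect capture group 1 from every match.
const captures = Array.from(line.matchAll(pattern), (m) => m[1]);

console.log(captures);
```

In older environments, a loop over `pattern.exec(line)` with the same global regex should behave equivalently.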