
How would I write a regex that matches just the words (a-z, A-Z, 0-9) that come right after another regex group? For example,

**AAFS142** American Literature 3 I don't want this to be read!

**AAFS209** American Music 3 I don't want this to be read

I have `[A-Z]{4}\d{3}\b` for the bolded part. Now how would I make a regex for American Literature and American Music based strictly on the fact that they come after the other regex?

This is what the regex should catch (in bold).

AAFS142 **American Literature** 3 I don't want this to be read!

AAFS209 **American Music** 3 I don't want this to be read

melpomene
John A
  • Ever heard of a lookbehind? – cs95 Oct 02 '17 at 21:05
  • Please provide more examples, expected inputs and outputs and the programming language used. – Jan Oct 02 '17 at 21:07
  • You can just add `.*` to the end of your pattern to include anything that comes after it. I get the feeling that isn't exactly what you want, but your question was a bit vague. – CAustin Oct 02 '17 at 21:08
  • `([A-Z]{4}\d{3})\s+(\w+\s+\w+)`? Using positive lookbehind `(?<=[A-Z]{4}\d{3}\s)(\w+\s+\w+)` – ctwheels Oct 02 '17 at 21:08
  • @cᴏʟᴅsᴘᴇᴇᴅ Not in JavaScript. – melpomene Oct 02 '17 at 21:19
  • John, please explain what the trailing boundary is for the expected match. I was going to post https://jsfiddle.net/7prrL25k/, but you mentioned there may also be digits in the expected match. – Wiktor Stribiżew Oct 02 '17 at 21:42

1 Answer


One way to work around JS's lack of lookbehind (it was only added to the language in ES2018) is to combine non-capturing groups `(?: )` with capturing groups `( )`. Non-capturing groups take part in the match but are not 'remembered', while capturing groups are stored in the result.

As usual with regex, the following is a little dense, but you can see there are three sets of parentheses: a non-capturing group, a capturing group, and then another non-capturing group:

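let regexString = /(?:[A-Z]{4}[\d]{3}\s)([A-Za-z0-9 ]+)(?:\s\d)/g

The first, non-capturing group `(?:[A-Z]{4}[\d]{3}\s)` matches the alphanumeric course code and the space after it, but doesn't remember them. The second group `([A-Za-z0-9 ]+)` matches and captures one or more (`+`) letters, digits, or spaces, i.e. the title. The range has to be `A-Za-z` rather than `A-z`, because `A-z` would also admit the punctuation characters that sit between `Z` and `a` in ASCII, such as `[` and `_`. The last group `(?:\s\d)` tells it to stop matching at the space + '3'. Because `+` is greedy, the title extends to the last space-plus-digit it can reach through letters, digits, and spaces.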
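As a minimal sketch of the three-group pattern applied to the lines from the question: the helper name `extractTitle` and the constant `coursePattern` are made up for illustration, the character class is written as `A-Za-z`, and the `g` flag is left off so each call starts from the beginning of the string.

```javascript
// Pattern from the explanation above: course code (not captured),
// title (captured), then the space + digit that ends the title (not captured).
// No g flag here, so exec() carries no state between calls.
const coursePattern = /(?:[A-Z]{4}\d{3}\s)([A-Za-z0-9 ]+)(?:\s\d)/;

// Hypothetical helper: returns the captured title, or null when nothing matches.
function extractTitle(line) {
  const result = coursePattern.exec(line);
  return result === null ? null : result[1];
}

const title1 = extractTitle("AAFS142 American Literature 3 I don't want this to be read!");
const title2 = extractTitle("AAFS209 American Music 3 I don't want this to be read");
const title3 = extractTitle("no course code here");

console.log(title1); // American Literature
console.log(title2); // American Music
console.log(title3); // null
```

Checking for `null` before indexing matters because `exec()` returns `null` on a failed match, and `null[1]` throws.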

The second issue with capturing groups in JavaScript is that, when the regex has the `g` flag as this one does, `.match()` returns only the full matched strings and drops the capturing groups. Use `regexString.exec()` instead: it returns an array with the whole matched text at index 0 (not what is wanted here, since it includes the text matched by the non-capturing groups), followed by the capturing groups at index 1 and up.

let match1 = regexString.exec('AAFS209 American Music 3')[1] // 'American Music'

regexString.lastIndex = 0 // with the g flag, exec() resumes from lastIndex, so reset it before testing a different string

let match2 = regexString.exec('AAFS241 3D American Musical Theatre 3')[1] // '3D American Musical Theatre'

(Not sure if that is a course, let alone a thing, but one can hope. Also, I wanted to make sure the regex works with digits in the title.)
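If all the course lines sit in one string, a rough sketch of the difference between `.match()` and the capturing group looks like this. It assumes an environment with `String.prototype.matchAll` (ES2020); the names `text`, `globalPattern`, `fullMatches`, and `titles` are illustrative only.

```javascript
// Both sample lines from the question, joined into one multi-line string.
const text = [
  "AAFS142 American Literature 3 I don't want this to be read!",
  "AAFS209 American Music 3 I don't want this to be read",
].join("\n");

// Same pattern as above, with the g flag so every line is visited.
const globalPattern = /(?:[A-Z]{4}\d{3}\s)([A-Za-z0-9 ]+)(?:\s\d)/g;

// With the g flag, match() returns whole matches only (code and trailing digit included).
const fullMatches = text.match(globalPattern);

// matchAll() yields one result per match, each carrying its capturing groups.
const titles = Array.from(text.matchAll(globalPattern), (m) => m[1]);

console.log(fullMatches); // [ 'AAFS142 American Literature 3', 'AAFS209 American Music 3' ]
console.log(titles);      // [ 'American Literature', 'American Music' ]
```

`matchAll()` works on a copy of the regex, so it avoids the manual `lastIndex` bookkeeping that repeated `exec()` calls need.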