
I have this text:

<pre id="lyricsText">
__Verse 1
At the starting of the week
At summit talks, you'll hear them speak
It's only Monday

Negotiations breaking down
See those leaders start to frown
It's sword and gun day

__Chorus 1
Tomorrow never comes until it's too late

__Verse 2
You could be sitting taking lunch
The news will hit you like a punch
It's only Tuesday

You never thought we'd go to war
After all the things we saw
It's April Fools' day

__Chorus 1 (R:2)
Tomorrow never comes until it's too late

__Verse 3
You hear a whistling overhead
Are you alive or are you dead?
It's only Thursday

You feel a shaking on the ground
A billion candles burn around
Is it your birthday?

__Chorus 1 (R:2)
Tomorrow never comes until it's too late

__Outro
Make tomorrow come, I think it's too late
</pre>

and I'm trying to capture the headers. To do that I use this pattern:

var headers = /__.*/g;

which works fine, but I want to exclude (R:2) or anything similar to that. I'm using another pattern to capture and modify (R:x) parts:

/(\(R.{0,2}\))/g

I couldn't find a way to make them work together.

How do I write a pattern that captures /__.*/ but leaves out /\(R.{0,2}\)/ when it is present?
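To make the problem concrete, here is a minimal sketch of what the two patterns return on their own. The `lyrics` string is a shortened stand-in for the text above, and the variable names `headerMatches` and `repeatMatches` are only illustrative.

```javascript
// Shortened stand-in for the lyrics text (assumption: same header format)
var lyrics = [
  "__Verse 1",
  "At the starting of the week",
  "__Chorus 1 (R:2)",
  "Tomorrow never comes until it's too late"
].join("\n");

var headers = /__.*/g;           // the header pattern from the question
var repeats = /(\(R.{0,2}\))/g;  // the repeat-marker pattern from the question

var headerMatches = lyrics.match(headers);
var repeatMatches = lyrics.match(repeats);

console.log(headerMatches); // [ '__Verse 1', '__Chorus 1 (R:2)' ]
console.log(repeatMatches); // [ '(R:2)' ]
```

The header pattern swallows the `(R:2)` marker because `.*` runs to the end of the line.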

FIDDLE

akinuri

2 Answers

Assuming that for __Chorus 1 (R:2) you wish to match __Chorus 1, this would do it:

/__(.(?!\(R.{0,2}\)))*/g;

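It matches as many characters as it can, but each character is consumed only if it is not immediately followed by a sequence matching \(R.{0,2}\). The match therefore stops just before the space that precedes (R:2).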

Output on your fiddle:

["__Verse 1", "__Chorus 1", "__Verse 2", "__Chorus 1", "__Verse 3", "__Chorus 1", "__Outro"] 
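A minimal sketch for trying the pattern outside the fiddle. The `text` sample is shortened, and the `__Bridge (R)` line is a hypothetical header added only to check a marker without digits.

```javascript
// Shortened sample; "__Bridge (R)" is a made-up header with a bare (R)
var text = [
  "__Verse 1",
  "It's only Monday",
  "__Chorus 1 (R:2)",
  "__Bridge (R)",
  "__Outro"
].join("\n");

// Consume a character only when it is NOT directly followed by (R..)
var headers = /__(.(?!\(R.{0,2}\)))*/g;

var found = text.match(headers);
console.log(found); // [ '__Verse 1', '__Chorus 1', '__Bridge', '__Outro' ]
```

One thing to keep in mind: the pattern drops the single character in front of the marker, which is the space here. If a header had no space before the marker, for example `__Chorus 1(R:2)`, the `1` would be dropped instead.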
OGHaza

var headers = /__[A-Za-z0-9 ]*(?=\(R:\d\))?/g;

JS Fiddle: http://jsfiddle.net/gjmf4/3/
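A small sketch of what this pattern returns, as far as I can tell. Because the lookahead is made optional by the trailing `?`, it never restricts the match, so the result should be the same as with the character class alone, including the trailing space before `(R:2)`. The variable names below are illustrative only.

```javascript
var line = "__Chorus 1 (R:2)";

// Pattern from the answer, and the same pattern without the optional lookahead
var withLookahead = line.match(/__[A-Za-z0-9 ]*(?=\(R:\d\))?/g);
var withoutLookahead = line.match(/__[A-Za-z0-9 ]*/g);

console.log(JSON.stringify(withLookahead));    // ["__Chorus 1 "]
console.log(JSON.stringify(withoutLookahead)); // ["__Chorus 1 "]

// Trimming each match removes the trailing space
var trimmed = withLookahead.map(function (h) { return h.trim(); });
console.log(JSON.stringify(trimmed));          // ["__Chorus 1"]
```

Since the character class stops at `(` anyway, this also handles a bare `(R)` marker, even though the lookahead itself only describes `(R:` followed by one digit.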

Kevin Bowersox
  • Not sure why my answer got so much more attention. I started out pretty much exactly like this except without the final `?`, realised it *only* matched the titles with `(R..)` on the end, so changed approach entirely, oblivious to the fact that `?` after the lookahead would have fixed it. – OGHaza Feb 14 '14 at 21:10
  • @OGHaza Well, what might have caused this is that you cannot put quantifiers on lookarounds. – Jerry Feb 14 '14 at 21:11
  • @Jerry By quantifiers, you're referring to the `?`? I am not a regex genius and this is actually what I was hoping to do: put something out there that works and get some feedback. – Kevin Bowersox Feb 14 '14 at 21:11
  • @KevinBowersox Yup, `?` is indeed a quantifier, just like `+` or `*` or `{1,2}`. I just saw the question myself, but if you want to put `?` on a lookahead, you need to group it first, something like this: `(?:(?=\(R:\d\)))?` – Jerry Feb 14 '14 at 21:12
  • @Jerry Then why does my fiddle work? I'm noticing that the expression does not work in RegExr though. – Kevin Bowersox Feb 14 '14 at 21:14
  • This also solves the problem, but previously it didn't capture the numbering at the end of the headers. Also, I'm using `\(R.{0,2}\)` on purpose because there are some `(R)`s without any digits next to them in some other texts, so I suppose `\(R:\d\)` wouldn't work. – akinuri Feb 14 '14 at 21:16
  • @KevinBowersox I was just thinking the same. Maybe there's something more to it. Well, JS' regex sure has its weirdness at times, one of them being [this](http://stackoverflow.com/q/20921288/1578604). Might be worth digging into. – Jerry Feb 14 '14 at 21:16
  • @Jerry Learned something new about quantifiers so thank you. Found this mentioned: `Many regex flavors, including those used by Perl and Python, only allow fixed-length strings. You can use literal text, character escapes, Unicode escapes other than \X, and character classes. You cannot use quantifiers or backreferences.` – Kevin Bowersox Feb 14 '14 at 21:21
  • @KevinBowersox Ahh, that's yet another issue with another type of lookaround: lookbehinds. It doesn't have anything to do with what we have here, unfortunately ^^; – Jerry Feb 14 '14 at 21:28
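On the open question of why the fiddle works: as far as I understand the ECMAScript specification, a quantifier directly on a lookahead is tolerated in legacy (non-`u`) patterns for web compatibility, but rejected when the `u` flag is set. Stricter tools may reject it for a similar reason, though that part is an assumption. The sketch below checks both modes and Jerry's grouped form; the variable names are illustrative.

```javascript
// Pattern source from the second answer (backslashes doubled for the string literal)
var source = "__[A-Za-z0-9 ]*(?=\\(R:\\d\\))?";

// Legacy mode: the quantified lookahead is accepted
var legacy = new RegExp(source, "g");
var legacyResult = "__Chorus 1 (R:2)".match(legacy);

// Unicode mode: the same source is expected to throw a SyntaxError
var unicodeError = null;
try {
  new RegExp(source, "gu");
} catch (e) {
  unicodeError = e;
}
console.log(unicodeError instanceof SyntaxError); // true

// Grouping the lookahead first, as suggested in the comments, is valid in both modes
var grouped = new RegExp("__[A-Za-z0-9 ]*(?:(?=\\(R:\\d\\)))?", "gu");
var groupedResult = "__Chorus 1 (R:2)".match(grouped);
console.log(JSON.stringify(groupedResult)); // ["__Chorus 1 "]
```

Either way the optional lookahead does not change what is matched, so the grouped form is mainly about portability.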