regex to match section of string between two constants with new line in the center

Question

I have a block of text in the following format:

FARE CALC INDICATOR: 9 
PHL US CUN264.00AA MIA102.00AA PHL88.37NUC454.37END ROE1.00 US 
XT34.40US5.00XA5.00AY 23.20UK28.14XD9.00XFPHL4.5 MIA4.5

               **FARE BREAKDOWN/FOP/TOUR CODE**

Using JavaScript and regex, I need to match this section:

PHL US CUN264.00AA MIA102.00AA PHL88.37NUC454.37END ROE1.00 US XT34.40US5.00XA5.00AY 23.20UK28.14XD9.00XFPHL4.5 MIA4.5

Basically, I need to find the next line break after FARE CALC INDICATOR: and return all the text between that point and **FARE BREAKDOWN/FOP/TOUR CODE**.

I tried .match(/FARE CALC INDICATOR:([\s\S]+)\*\*FARE BREAKDOWN\/FOP\/TOUR CODE\*\*/)

This almost works, but any text between FARE CALC INDICATOR: and the next line break (like the number 9 in this example) is captured too, and it should not be.

The 9 in this example could be any characters, and it is not limited to a single character.

As there seems to be no `dotall` flag in javascript see the workarounds given in http://stackoverflow.com/q/1068280/2870069 — Jakob, Oct 17 '13 at 12:14
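For readers on newer engines: ES2018 added a dotAll flag, `s`, which makes `.` match line breaks as well. The following is only a sketch, assuming an ES2018-capable runtime; the names `dotAllPattern` and `extractWithDotAll` are made up for illustration and do not come from the thread. Because `.` now crosses line breaks, the rest of the start line is skipped with `[^\n]*` rather than `.*`.

```javascript
// Sample input from the question (trailing spaces are part of the data).
const dotAllSample = [
  'FARE CALC INDICATOR: 9 ',
  'PHL US CUN264.00AA MIA102.00AA PHL88.37NUC454.37END ROE1.00 US ',
  'XT34.40US5.00XA5.00AY 23.20UK28.14XD9.00XFPHL4.5 MIA4.5',
  '',
  '               **FARE BREAKDOWN/FOP/TOUR CODE**',
].join('\n');

// [^\n]*\n  skips whatever follows the colon, up to and including the line break.
// (.+?)     lazily captures the block; with the `s` flag, `.` also matches \n.
// \s*       absorbs the blank line and indent before the end marker.
const dotAllPattern =
  /FARE CALC INDICATOR:[^\n]*\n(.+?)\s*\*\*FARE BREAKDOWN\/FOP\/TOUR CODE\*\*/s;

// Hypothetical helper: returns the captured block, or null when nothing matches.
function extractWithDotAll(input) {
  const m = dotAllPattern.exec(input);
  return m === null ? null : m[1];
}

console.log(extractWithDotAll(dotAllSample));
```

On engines without the `s` flag, `[\s\S]` remains the usual substitute for a dot that matches everything.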

score 3 · Accepted Answer · answered Oct 17 '13 at 12:09

As dot . doesn't match newline, you could do:

.match(/FARE CALC INDICATOR:.*([\s\S]+)\*\*FARE BREAKDOWN\/FOP\/TOUR CODE\*\*/)

answered Oct 17 '13 at 12:09

Toto
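To see what the accepted pattern actually captures, here is a minimal sketch that runs it against the sample from the question. The helper name `extractFareCalc` is an assumption made for this example. Note that the raw capture begins with the line break and ends with the indent in front of the end marker, so the sketch trims it.

```javascript
// Sample input from the question (trailing spaces are part of the data).
const acceptedSample = [
  'FARE CALC INDICATOR: 9 ',
  'PHL US CUN264.00AA MIA102.00AA PHL88.37NUC454.37END ROE1.00 US ',
  'XT34.40US5.00XA5.00AY 23.20UK28.14XD9.00XFPHL4.5 MIA4.5',
  '',
  '               **FARE BREAKDOWN/FOP/TOUR CODE**',
].join('\n');

// `.*` consumes the rest of the start line (dot stops at the line break);
// `[\s\S]+` then spans everything, line breaks included, up to the end marker.
const acceptedPattern =
  /FARE CALC INDICATOR:.*([\s\S]+)\*\*FARE BREAKDOWN\/FOP\/TOUR CODE\*\*/;

// Hypothetical helper: returns the trimmed capture, or null when nothing matches.
function extractFareCalc(input) {
  const m = input.match(acceptedPattern);
  return m === null ? null : m[1].trim();
}

console.log(extractFareCalc(acceptedSample));
```

Because `.*` matches zero or more characters, the same pattern also works when nothing follows the colon on the start line.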

score 1 · Answer 2 · answered Oct 17 '13 at 12:06

You can try this:

/FARE CALC INDICATOR:[^\r\n]*\r?\n\s*([\s\S]+?)\s+\*\*FARE BREAKDOWN\/FOP\/TOUR CODE\*\*/

The capture group begins at the first non-whitespace character after the line break that follows FARE CALC INDICATOR:, and it stops before the whitespace that precedes the end marker.

answered Oct 17 '13 at 12:06

Casimir et Hippolyte
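A small sketch of the second pattern in use, under the assumption that the goal is the single-line form shown in the question. The helper name `extractFareCalcLine` and the line-joining `replace` step are additions for illustration; they are not part of the answer above.

```javascript
// Sample input from the question (trailing spaces are part of the data).
const lazySample = [
  'FARE CALC INDICATOR: 9 ',
  'PHL US CUN264.00AA MIA102.00AA PHL88.37NUC454.37END ROE1.00 US ',
  'XT34.40US5.00XA5.00AY 23.20UK28.14XD9.00XFPHL4.5 MIA4.5',
  '',
  '               **FARE BREAKDOWN/FOP/TOUR CODE**',
].join('\n');

// [^\r\n]*\r?\n  skips the rest of the start line, for LF or CRLF line endings.
// ([\s\S]+?)     lazily captures the block.
// \s+            keeps the whitespace before the end marker out of the capture.
const lazyPattern =
  /FARE CALC INDICATOR:[^\r\n]*\r?\n\s*([\s\S]+?)\s+\*\*FARE BREAKDOWN\/FOP\/TOUR CODE\*\*/;

// Hypothetical helper: returns the block joined onto one line, or null.
function extractFareCalcLine(input) {
  const m = lazyPattern.exec(input);
  if (m === null) return null;
  // Collapse each line break, plus the whitespace around it, into one space.
  return m[1].replace(/\s*\r?\n\s*/g, ' ');
}

console.log(extractFareCalcLine(lazySample));
```

The capture needs no trimming here, since the surrounding `\s*` and `\s+` already sit outside the group.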

score 0 · Answer 3 · answered Oct 17 '13 at 14:29

You don't really need any fancy regex for this. Consider this code:

var str = "FARE CALC INDICATOR: 9 \n" +
"PHL US CUN264.00AA MIA102.00AA PHL88.37NUC454.37END ROE1.00 US \n" +
"XT34.40US5.00XA5.00AY 23.20UK28.14XD9.00XFPHL4.5 MIA4.5\n" +
"\n" +
"               **FARE BREAKDOWN/FOP/TOUR CODE**\n";

var p1 = 'FARE CALC INDICATOR: 9'; // start pattern
var p2 = '**FARE BREAKDOWN/FOP/TOUR CODE**'; // end pattern

var i1 = str.indexOf( p1 );
var i2 = str.indexOf( p2 );
// indexOf returns -1 when the pattern is missing (0 is a valid position)
if ( i1 === -1 || i2 === -1 ) throw new Error( 'pattern not found' );

// trim() drops the whitespace and line breaks left on both sides of the block
var substr = str.substring( i1 + p1.length, i2 ).trim();

console.log( substr );
// PHL US CUN264.00AA MIA102.00AA PHL88.37NUC454.37END ROE1.00 US 
// XT34.40US5.00XA5.00AY 23.20UK28.14XD9.00XFPHL4.5 MIA4.5

answered Oct 17 '13 at 14:29

anubhava

With this example, yes, but "The number 9 in this example could potentially be any character and is not limited to one character" is the reason I'm going with regex. – Wesley Smith Oct 18 '13 at 04:40
Actually you could have replaced p1 and p2 with any other string and it would work fine. – anubhava Oct 18 '13 at 05:08
True, but I'm going to be parsing 1,000s of blocks like this, and the characters that come after ":" are generated in many different unpredictable patterns. Well, at least not easily predictable. But thank you; I do see the merit of this solution otherwise. – Wesley Smith Oct 18 '13 at 07:35
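The exchange above turns on whether the start pattern has to contain the text after the colon. One possible middle ground, sketched here under the assumption that the block always begins on the line after the start marker, is to search for the marker without that text and then look for the next line break with a second `indexOf` call. The helper name `sliceFareCalc` is hypothetical.

```javascript
// Hypothetical helper: extracts the block with indexOf only, no regex.
function sliceFareCalc(input) {
  const startMarker = 'FARE CALC INDICATOR:';
  const endMarker = '**FARE BREAKDOWN/FOP/TOUR CODE**';

  const start = input.indexOf(startMarker);
  if (start === -1) return null;

  // First line break after the start marker; whatever sits between the
  // colon and this line break is skipped, however long it is.
  const lineEnd = input.indexOf('\n', start + startMarker.length);
  if (lineEnd === -1) return null;

  // Only look for the end marker after that line break.
  const end = input.indexOf(endMarker, lineEnd);
  if (end === -1) return null;

  // trim() removes the blank line and indent before the end marker.
  return input.substring(lineEnd + 1, end).trim();
}

console.log(sliceFareCalc(
  'FARE CALC INDICATOR: ABC123 \n' +
  'PHL US CUN264.00AA \n' +
  'XT34.40US5.00XA\n' +
  '\n' +
  '               **FARE BREAKDOWN/FOP/TOUR CODE**\n'
));
```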