
I have sentences within text I need to extract. They are formatted as such:

Less word word....etc Number% (<-- eg. 99%)

OR

More word word etc Number%

I would like to create a regular expression that will capture everything between the word Less or the word More and the ending percentage sign.

The challenge I'm having is that I cannot anchor the pattern with ^ or $, because these sentences don't start on a new line.

Is there a way to signify that I'd like to capture each instance of SENTENCES (not lines) beginning with:

(less | more)

and ending with:

%

To clarify - I want to include Less and More and the percent symbol in what I'm capturing.

Here is what I have so far:

(?=Less|More)(.*)((\%|\s\bpercent\b))

The above code captures the first instance of the word 'Less' and everything after it. I would like every sentence to be captured separately.

For example, the two sentences below should be captured separately:

Less than a dollar 99,5%

More than a dollar and less than a cent 95%

EDIT FOR CLARIFICATION:

What I'm after is a solution that won't capture the ENTIRE text below.

More than 55 percent RANDOM STRING. RANDOM STRING. Less than a dollar 99,5% More than a dollar and less than a cent 95%

My aim is to capture three sentences from the above string, preferably all in one group:

More than 55 Percent

Less than a dollar 99,5%

More than a dollar and less than a cent 95%

redwytnblak
  • Duplicate of: [Regex Match all characters between two strings](https://stackoverflow.com/q/6109882/8967612). – 41686d6564 stands w. Palestine Jul 29 '21 at 19:26
  • This is a good start but it's not really a duplicate as this shows me how to capture sentences BETWEEN two strings not including them. – redwytnblak Jul 29 '21 at 19:29
  • **Try writing something yourself** and then if it doesn't work, show us specifically what you did so we can help you along. You start it, and then we help. We don't write it for you. Show us the actual code that you've tried, and then describe what happened and what's not right, and then we can help you from there. Chances are you'll get pretty close to the answer if you just try it yourself first. – Andy Lester Jul 29 '21 at 19:34
  • @Andy Lester I've edited my post with what I have so far. – redwytnblak Jul 29 '21 at 19:38
  • Instead of . you may try "not a %": [^%] – Pierre Jul 29 '21 at 19:53

1 Answer

If you want the delimiters included in the match, make them part of the pattern instead of putting them in a lookaround.
Note that, as you specified it, the . at the end of a sentence is not included. If you want to allow for it, you could add it as an optional character at the end of the pattern, like this: percent)\.?
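
As a minimal sketch of the optional trailing period mentioned above: the pattern below is the regex from this answer with `\.?` appended (and the unnecessary escape on `%` dropped, which does not change its meaning). The sample text and the variable names are made up for illustration.

```javascript
// Hypothetical sample text, not taken from the question.
const sample = "More than 55 percent. Less than a dollar 99,5% today.";

// Same pattern as in this answer, plus an optional literal period at the end.
const withPeriod = /(Less|More).*?(%|\spercent)\.?/g;

const sentences = sample.match(withPeriod);
console.log(sentences);
// → [ 'More than 55 percent.', 'Less than a dollar 99,5%' ]
```

The period is only picked up when it directly follows the `%` or the word percent, so the second match stops at the percent sign.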

The pattern also uses the non-greedy quantifier .*? so that two sentences on the same line are captured separately. With a greedy .* it would capture everything from the first More|Less to the last %|percent on the same line.
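
To illustrate the difference, here is a small sketch that runs a greedy and a non-greedy version of the pattern against the single-line example from the question's edit. The variable names are only for illustration, and `%` is written without the backslash, which is equivalent here.

```javascript
// The single-line example text from the question's edit.
const text = "More than 55 percent RANDOM STRING. RANDOM STRING. Less than a dollar 99,5% More than a dollar and less than a cent 95%";

// Greedy: .* runs as far as it can, then backtracks to the LAST % or " percent".
const greedy = /(Less|More).*(%|\spercent)/g;
// Non-greedy: .*? stops at the FIRST % or " percent" after each Less/More.
const lazy = /(Less|More).*?(%|\spercent)/g;

const greedyMatches = text.match(greedy);
const lazyMatches = text.match(lazy);

console.log(greedyMatches.length); // → 1 (the whole line is one match)
console.log(lazyMatches);
// → [ 'More than 55 percent',
//     'Less than a dollar 99,5%',
//     'More than a dollar and less than a cent 95%' ]
```

Note that the lowercase "less" inside the third sentence does not start a new match, because the pattern is case-sensitive and a match is already in progress at that point.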

let regex = /(Less|More).*?(\%|\spercent)/g

let string = `The above code captures the first instance of the word 'Less' and everything after it. I would like every sentence to be captured separately.

More than 55 percent.

For example, the two sentences below should be captured separately:

Less than a dollar 99,5% but More than the least possible percent.

More than a dollar and less than a cent 95%`

// With the g flag, match() returns an array of every full match, or null if there is none.
let capture = string.match(regex) || [];
console.log("Content of capture as array:");
console.log(capture);

capture.forEach((capturedString, index) => {
  console.log("Sentence number " + (index + 1) + ": " + capturedString);
});

// Join the matches if you want them as a single string.
let everythingAsOneSentence = capture.join(" ");

console.log(everythingAsOneSentence);
Timothy Alexis Vass
  • Thank you. Maybe it's my fault for not explaining. I tried your snippet above and it works; however, what I'm trying to do is separate this out from the rest of the text. When I test your solution, it captures the ENTIRE text below: "More than 55 percent RANDOM STRING. RANDOM STRING. Less than a dollar 99,5% More than a dollar and less than a cent 95%". My aim is to capture three sentences from the above string, preferably all in one group: "More than 55 Percent", "Less than a dollar 99,5%", "More than a dollar and less than a cent 95%". – redwytnblak Jul 29 '21 at 19:50
  • It captures them and puts them into an array. What do you mean is the problem? – Timothy Alexis Vass Jul 29 '21 at 19:51
  • I have updated the answer to make it more clear. – Timothy Alexis Vass Jul 29 '21 at 19:54
  • If it helps you, then please mark it as correct, otherwise I'd like to know what further problem you have and I'll help you. – Timothy Alexis Vass Jul 29 '21 at 19:57