
I want to match a full stop at the end of each sentence in a paragraph of text.

There must be at least 3 words before the full stop (.).

This is to ensure that only full stops at the end of sentences are counted. Periods such as those in "Microsoft v 3.2.1" should be skipped! Please note that the words may not necessarily contain Latin characters. I plan to use this with other languages, so we can't use `[a-zA-Z]` here!

What I tried: `\b.+\b\s+\b.+\b\s+\b.+\b[.]`. But this selects the whole sentence!

Perhaps one can use an if-else construct? If `\b.+\b\s+\b.+\b\s+\b.+\b[.]` is found, select the dot, or else don't. Is that possible?

Shubham Kanodia
  • What have you already tried? – André Snede Jan 29 '14 at 12:23
  • Is there a class of students having the same idea? (Let the people at SO do the heavy lifting... ;)) This was asked half an hour ago... [How to...](http://stackoverflow.com/questions/21430447/how-to-split-paragraphs-into-sentences) – SamWhan Jan 29 '14 at 12:30

1 Answer


Ok. Here's a go at it:

(?:(?:\s|^|\.)[^\s\d.]+){3}(\.)

Explanation: a non-capturing group finds a whitespace character, a full stop, or the start of the line, followed by one or more characters that are not whitespace, a digit, or a full stop. Repeat this 3 times. Then capture a full stop :D Done!
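To see the pattern in action, here is a minimal sketch in Python (my choice of language, the question does not name an engine). The function name `sentence_dot_positions` and the sample text are made up for illustration. It reads the position of capture group 1, which holds only the dot, rather than the whole match.

```python
import re

# The pattern from the answer, unchanged: three "separator + word" units,
# then a full stop captured in group 1.
SENTENCE_DOT = re.compile(r"(?:(?:\s|^|\.)[^\s\d.]+){3}(\.)")


def sentence_dot_positions(text):
    """Return the index of every full stop captured by group 1.

    Hypothetical helper for illustration only.
    """
    return [m.start(1) for m in SENTENCE_DOT.finditer(text)]


sample = "The cat sat down. It was a warm day. Microsoft v 3.2.1 rocks."
print(sentence_dot_positions(sample))  # -> [16, 35]
```

Note that the dots inside `3.2.1` are skipped, but so is the final dot after "rocks": the digits interrupt the run of three words directly before it. Since `[^\s\d.]` does not rely on a Latin letter range, the same pattern should also work on non-Latin text when run on Python 3 strings.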

Check it out here.

Regards

SamWhan
  • Amazing! You almost got it. One small problem though. Right now, this selects the three words along with the dot. I am using a search and replace algorithm, as a result of which the three words before the dot also get replaced. Is there a way to replace only the "dot"? – Shubham Kanodia Jan 29 '14 at 17:25
  • Got it! Used `(?<name>(?:(?:\s|^|\.)[^\s\d.]+){3})(\.)` in the find box and `${name}` in the replace ;) – Shubham Kanodia Jan 29 '14 at 17:46
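The same idea (put the three words in a named group and write them back in the replacement) can be sketched in Python. This assumes Python's `re` module, where a named group is spelled `(?P<name>...)` instead of the syntax used in the comment above. The group name `words` and the helper `replace_sentence_dots` are made up for illustration.

```python
import re

# Same pattern as in the answer, but the three words go into a named group
# and the trailing dot is matched without being captured.
WORDS_THEN_DOT = re.compile(r"(?P<words>(?:(?:\s|^|\.)[^\s\d.]+){3})\.")


def replace_sentence_dots(text, replacement):
    """Replace only the sentence-ending dot, keeping the matched words.

    Hypothetical helper for illustration only. A function is used as the
    replacement so that backslashes in `replacement` are taken literally.
    """
    return WORDS_THEN_DOT.sub(lambda m: m.group("words") + replacement, text)


print(replace_sentence_dots("The cat sat down. It was a warm day.", "!"))
# -> The cat sat down! It was a warm day!
```

In an engine with lookbehind that allows variable-length patterns, the words could instead be moved into a lookbehind so that the match contains nothing but the dot. Python's `re` only supports fixed-width lookbehind, so the named-group approach is used here.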