Equivalent regex for unsupported lookbehind assertion on iOS Safari

Question

This regex:

var text = "Mr. Smith bought cheapsite.com for 1.5 million dollars, i.e. he paid a lot for it. Did he mind? Adam Jones Jr. thinks he didn't. In any case, this isn't true... Well, with a probability of .9 it isn't."
// break the string up into sentences based on punctuation and quotation marks
var tokens = text.match(/(?<=\s+|^)[\"\'\‘\“\'\"\[\(\{\⟨](.*?[.?!])(\s[.?!])*[\"\'\’\”\'\"\]\)\}\⟩](?=\s+|$)|(?<=\s+|^)\S(.*?[.?!])(\s[.?!])*(?=\s+|$)/g);

breaks on iOS Safari because lookbehind assertions ((?<= ) and (?<! )) are not supported there. Is there an equivalent (or similar) regex for sentence tokenization that I can use? Preferably it should not break due to other iOS Safari compatibility issues, as referenced here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#assertions

Instead of links, please post your regex and samples in question — anubhava, Oct 18 '20 at 05:25
It can be any paragraph string with one or more sentences, I'll post an example regardless. — Ariel Frischer, Oct 18 '20 at 05:33
Can you try this regex in Safari: `/(?:\s|^)(?:["'‘“'"\[({⟨].*?[.?!](?:\s[.?!])*["'’”'"\])}⟩]|\S.*?[.?!](?:\s[.?!])*)(?=\s|$)/gm` — anubhava, Oct 18 '20 at 05:45
@Moderators who closed this question: it is not a duplicate, it is only related by category. What I'm asking is quite specific, and the other post does not answer it. — Ariel Frischer, Oct 19 '20 at 00:44
Yes, I agree with the point raised by @Ariel and have reopened this question so that others can also post answers below. — anubhava, Oct 19 '20 at 05:24

score 0 · Answer 1 · answered Oct 18 '20 at 05:55

Here is a version of your regex that breaks the input into sentences without using any lookbehind assertions:

/(?:\s|^)(?:["'‘“'"\[({⟨].*?[.?!](?:\s[.?!])*["'’”'"\])}⟩]|\S.*?[.?!](?:\s[.?!])*)(?=\s|$)/gm

Please keep in mind that your regex may break on sentences that contain words ending with a dot, such as Jr., Sr., Mr., and a few similar cases.
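As a minimal sketch of how the lookbehind-free pattern above might be used: because `(?:\s|^)` consumes the separating whitespace instead of merely asserting it, each match after the first starts with a space, so trimming is needed. The helper name `splitSentences` and the sample strings are assumptions made for illustration, not part of the original answer. The second call shows the abbreviation caveat in action.

```javascript
// Lookbehind-free pattern from the answer above, unchanged.
const SENTENCE_RE = /(?:\s|^)(?:["'‘“'"\[({⟨].*?[.?!](?:\s[.?!])*["'’”'"\])}⟩]|\S.*?[.?!](?:\s[.?!])*)(?=\s|$)/gm;

// splitSentences is a hypothetical helper name used only for this sketch.
function splitSentences(text) {
  // match() returns null when nothing matches, so fall back to an empty array.
  const matches = text.match(SENTENCE_RE) || [];
  // (?:\s|^) consumes the leading whitespace, so trim every match.
  return matches.map((s) => s.trim());
}

console.log(splitSentences("Did he mind? He paid a lot. It is done!"));
// ["Did he mind?", "He paid a lot.", "It is done!"]

// The abbreviation caveat: "Mr." is treated as a sentence of its own.
console.log(splitSentences("Mr. Smith left. He was late."));
// ["Mr.", "Smith left.", "He was late."]
```

Without the `trim()` call, the second and later matches would keep the whitespace character that the original lookbehind only asserted.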

anubhava


Not bad, but I added another example text that is better for this. Can you add support for Mr. Mrs. Jr.? – Ariel Frischer Oct 18 '20 at 06:32
As I said, your original regex itself doesn't work with `Mr.`, `Sr.` etc and unfortunately there is no robust solution without lookbehind. – anubhava Oct 18 '20 at 06:38
Oh ok, I'll look for something a bit more robust then. – Ariel Frischer Oct 18 '20 at 06:43
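One possible way to approach the abbreviation problem discussed in these comments without lookbehind is to post-process the matches: split with the pattern from the answer, then glue a fragment onto the next one whenever it ends with a known abbreviation. This is only a sketch under assumptions. The abbreviation list and the helper name `splitSentencesWithAbbreviations` are hypothetical, and nobody in the thread proposed this approach.

```javascript
// Lookbehind-free pattern from the answer above, unchanged.
const SENTENCE_RE = /(?:\s|^)(?:["'‘“'"\[({⟨].*?[.?!](?:\s[.?!])*["'’”'"\])}⟩]|\S.*?[.?!](?:\s[.?!])*)(?=\s|$)/gm;

// Hypothetical, deliberately short abbreviation list; extend it for real text.
const ABBREVIATIONS = new Set(["Mr.", "Mrs.", "Ms.", "Dr.", "Jr.", "Sr.", "i.e.", "e.g."]);

// Hypothetical helper: merges fragments that end with a known abbreviation.
function splitSentencesWithAbbreviations(text) {
  const fragments = (text.match(SENTENCE_RE) || []).map((s) => s.trim());
  const sentences = [];
  let pending = "";
  for (const fragment of fragments) {
    // Rejoin with a single space (original spacing is not preserved).
    const current = pending ? pending + " " + fragment : fragment;
    const lastWord = current.slice(current.lastIndexOf(" ") + 1);
    if (ABBREVIATIONS.has(lastWord)) {
      pending = current; // probably not a real sentence end; keep accumulating
    } else {
      sentences.push(current);
      pending = "";
    }
  }
  if (pending) sentences.push(pending); // trailing fragment ending in an abbreviation
  return sentences;
}

console.log(splitSentencesWithAbbreviations("Mr. Smith left. He was late."));
// ["Mr. Smith left.", "He was late."]
```

The trade-off is that a sentence that genuinely ends with a listed abbreviation gets merged with the sentence after it, so this is a heuristic and not a robust tokenizer.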

score 0 · Answer 2 · answered May 18 '22 at 10:40

I used ?! instead of ?<=. In my case it worked just fine.
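Note that `(?!...)` is a negative lookahead, so it is not a general replacement for the positive lookbehind `(?<=...)`. It may happen to give the same matches for a particular pattern and input. A more commonly used lookbehind-free substitute is to match the prefix and capture only the part you want. The sketch below illustrates the difference on a made-up sample string. The variable names and the data are assumptions for illustration only.

```javascript
// Made-up sample input for illustration.
const text = "cost: $15, qty: 3, tax: $2";

// Lookbehind version, which is a syntax error on engines without lookbehind:
//   /(?<=\$)\d+/g

// Lookbehind-free alternative: consume the "$" and capture only the digits.
const prefixed = /\$(\d+)/g;
const amounts = [];
let m;
while ((m = prefixed.exec(text)) !== null) {
  amounts.push(m[1]); // group 1 holds the digits without the "$"
}
console.log(amounts); // ["15", "2"]

// Negative lookahead is a different assertion: (?!\$) only checks that the
// next character is not "$", which is always true right before a digit.
const viaLookahead = text.match(/(?!\$)\d+/g);
console.log(viaLookahead); // ["15", "3", "2"]
```

Here the capture group reproduces what the lookbehind would select, while the negative lookahead also picks up the unprefixed `3`.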

Giorgi Gvimradze
