
@thg435 wrote this answer to a JavaScript question:

> a = "foo 1234567890 bbb 123456"
"foo 1234567890 bbb 123456"
> a.replace(/\d(?=\d\d(\d{3})*\b)/g, "[$&]")
"foo 1[2]34[5]67[8]90 bbb [1]23[4]56"

It works well with Hindu-Arabic numerals (1, 2, 3, 4, ...). But when I try to apply the regex to Eastern Arabic numerals, it fails. Here is the regex I use (I've just replaced \d with [\u0660-\u0669]):

/[\u0660-\u0669](?=[\u0660-\u0669][\u0660-\u0669]([\u0660-\u0669]{3})*\b)/g

It actually works if my string is ١٢٣٤foo, but fails when it's ١٢٣٤ foo or even foo١٢٣٤:

> a = "١٢٣٤foo  ١٢٣٤ foo  foo١٢٣٤"
"١٢٣٤foo  ١٢٣٤ foo  foo١٢٣٤"
> a.replace(/[\u0660-\u0669](?=[\u0660-\u0669][\u0660-\u0669]([\u0660-\u0669]{3})*\b)/g, "[$&]")
"١[٢]٣٤foo  ١٢٣٤ foo  foo١٢٣٤"

What actually matters to me are separated numbers (e.g. ١٢٣٤). Why can't it match separated numbers?

Update:

Another requirement is that the regex should only match numbers with 5 or more digits (e.g. ١٢٣٤٥ and not ١٢٣٤). I initially thought that would be as simple as adding {5,} at the end of the expression, but that doesn't work.

Iryn

1 Answer


Oddly, I'm experiencing the opposite behavior from you (the first one doesn't work and the other two do), but how about if you replaced the \b with (?![\u0660-\u0669])? Then it seems to work no matter what's before or after it:

[\u0660-\u0669](?=[\u0660-\u0669][\u0660-\u0669]([\u0660-\u0669]{3})*(?![\u0660-\u0669]))
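A likely explanation for the original failure, offered as a sketch rather than a definitive diagnosis: in JavaScript, \b is defined in terms of \w, and \w covers only ASCII letters, digits and the underscore. Eastern Arabic digits therefore count as non-word characters, so \b exists where such a digit touches an ASCII letter (as in ١٢٣٤foo) but not before a space or at the end of the string. An engine with a Unicode-aware \b would behave the other way round, which may be what I was seeing. The snippet below checks this and then applies the pattern above to the question's string; the names `digit`, `num` and `groupStart` are made up for illustration.

```javascript
// \b only fires between a \w character and a non-\w character.
// In JavaScript, \w is ASCII-only, so an Eastern Arabic digit is "non-word".
const boundaryBeforeLetter = /\u0664\b/.test("\u0664f"); // digit then ASCII letter
const boundaryBeforeSpace = /\u0664\b/.test("\u0664 ");  // digit then space
const boundaryAtEnd = /\u0664\b/.test("\u0664");         // digit then end of input
console.log(boundaryBeforeLetter, boundaryBeforeSpace, boundaryAtEnd); // true false false

// The pattern above, assembled from one digit class to keep it readable.
const digit = "[\\u0660-\\u0669]";
const groupStart = new RegExp(
  `${digit}(?=${digit}${digit}(${digit}{3})*(?!${digit}))`,
  "g"
);

// The question's test string: ١٢٣٤ glued to "foo", standalone, and after "foo".
const num = "\u0661\u0662\u0663\u0664";
const a = `${num}foo  ${num} foo  foo${num}`;

const result = a.replace(groupStart, "[$&]");
console.log(result); // "١[٢]٣٤foo  ١[٢]٣٤ foo  foo١[٢]٣٤"
```

Because the negative lookahead only asks "is the next character not one of these digits?", it does not depend on what the engine considers a word character.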

Edit: This seems to work for the new requirement, which is to only add the brackets if the run of digits is 5 digits long or more:

[\u0660-\u0669](?=[\u0660-\u0669]{2}([\u0660-\u0669]{3})+(?![\u0660-\u0669]))|(?<=[\u0660-\u0669]{2})[\u0660-\u0669](?=[\u0660-\u0669]{2}(?![\u0660-\u0669]))
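Here is a minimal way to try that pattern in JavaScript. One assumption to flag loudly: it uses a lookbehind, `(?<=...)`, which JavaScript only gained in ES2018, so older engines reject the pattern with a syntax error. The names `AR`, `fivePlus` and `mark` are hypothetical helpers, not part of any library.

```javascript
// Requires an engine with lookbehind support (ES2018 or later).
const AR = "[\\u0660-\\u0669]";

// First alternative: a digit followed by 5, 8, 11, ... digits up to the end of the run.
// Second alternative: the third-from-last digit, if at least two digits precede it.
const fivePlus = new RegExp(
  `${AR}(?=${AR}{2}(${AR}{3})+(?!${AR}))|(?<=${AR}{2})${AR}(?=${AR}{2}(?!${AR}))`,
  "g"
);

const mark = (s) => s.replace(fivePlus, "[$&]");

const four = "\u0661\u0662\u0663\u0664";             // ١٢٣٤
const five = "\u0661\u0662\u0663\u0664\u0665";       // ١٢٣٤٥
const six = "\u0661\u0662\u0663\u0664\u0665\u0666";  // ١٢٣٤٥٦

const markedFour = mark(four); // unchanged: the run is shorter than 5 digits
const markedFive = mark(five); // third digit bracketed
const markedSix = mark(six);   // first and fourth digits bracketed
console.log(markedFour, markedFive, markedSix);
```

The second alternative is what carries the length check: a 4-digit run has only one digit before its third-from-last digit, so the lookbehind fails and nothing is bracketed.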

Incidentally, some Regex processors will treat those digits as a match for \d. Here is that second Regex with \d instead of those character ranges, which should be a little easier to read:

\d(?=\d{2}(\d{3})+(?!\d))|(?<=\d{2})\d(?=\d{2}(?!\d))
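Note that JavaScript is not one of those engines: its \d matches only ASCII 0-9, with or without the `u` flag. The quick check below illustrates this. The `\p{Nd}` line is an aside that assumes ES2018 Unicode property escapes are available, and it matches decimal digits from every script, not only Eastern Arabic ones, so it is broader than the explicit range.

```javascript
const easternOne = "\u0661"; // ARABIC-INDIC DIGIT ONE

const plainD = /\d/.test(easternOne);               // \d is ASCII-only in JavaScript
const unicodeD = /\d/u.test(easternOne);            // the u flag does not change that
const rangeClass = /[\u0660-\u0669]/.test(easternOne);
const anyDecimalDigit = /\p{Nd}/u.test(easternOne); // any Unicode decimal digit

console.log(plainD, unicodeD, rangeClass, anyDecimalDigit); // false false true true
```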
JLRishe
  • This works well with nearly all regex engines except JavaScript's; it's a problem with JavaScript's regex. I also have doubts about JavaScript's support for nested lookaheads. – Anirudha Apr 26 '13 at 17:34
  • This solved my problem. Just one more simple question: how can I match only numbers with 5 or more digits (e.g. 12345 and not 1234)? Where should I add {5,}? – Iryn Apr 26 '13 at 19:05
  • The new regex doesn't work. Can you please check the code, or create a jsfiddle? – Iryn Apr 26 '13 at 20:27