
I have two 1-6 digit numbers separated by a slash. I want these split up into groups of at most 3 digits, taking from the right.

For example:

0/1 ->           [,0,,1]
1234/3 ->        [1,234,,3]
12345/1234 ->    [12,345,1,234]
123456/789123 -> [123,456,789,123]

I need to do this with a regular expression because I want to use it in an NGINX location. Application logic could do the same, but for performance reasons that is not what I'm asking about.

A similar question, which solves part of this using a negative lookahead, is here: Regular expression to match last number in a string

What regex can achieve this split?

UPDATE: This regex comes close to what I want (https://regex101.com/r/bQtNdK/3): `(?<prefix1>\d{0,3}?)(?<threes1>\d{0,3})\/(?<prefix2>\d{0,3}?)(?=\d)(?<threes2>\d{0,3})` It fails to match if the second number, after the slash, is more than 3 digits long.

UPDATE2: Now this regex works for most combinations (https://regex101.com/r/bQtNdK/5): `(?<prefix1>\d{0,3}?)(?<threes1>\d{1,3})\/(?<prefix2>\d{0,3})(?<threes2>\d{3})` I don't understand why this starts to fail if I use the same regex for prefix2/threes2 as for prefix1/threes1 (i.e. make prefix2 lazy as well). Any ideas how to solve this? So close...
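A likely explanation for the lazy prefix2 failure: nothing follows threes2 in the pattern, so the engine never has to backtrack. The lazy prefix matches nothing, the last group takes the first one to three digits, and the match ends early. The Python sketch below illustrates this. It assumes Python's `re` module behaves like PCRE for lazy quantifiers, it uses Python's `(?P<name>...)` spelling for named groups, and the `^`/`$` anchors are an addition for illustration, not part of the regex above.

```python
import re

# Both halves lazy, as in the failing variant.
# Python spells named groups (?P<name>...) instead of (?<name>...).
UNANCHORED = re.compile(
    r"(?P<prefix1>\d{0,3}?)(?P<threes1>\d{1,3})/"
    r"(?P<prefix2>\d{0,3}?)(?P<threes2>\d{1,3})"
)

# Same pattern with ^ and $ added (an assumption, not in the original regex).
ANCHORED = re.compile(
    r"^(?P<prefix1>\d{0,3}?)(?P<threes1>\d{1,3})/"
    r"(?P<prefix2>\d{0,3}?)(?P<threes2>\d{1,3})$"
)

sample = "12345/1234"

# Without an end anchor, the lazy prefix2 stays empty and threes2 takes "123".
loose = UNANCHORED.search(sample).groups()

# With $, the engine has to backtrack until threes2 ends at the end of the input.
strict = ANCHORED.search(sample).groups()

print(loose)
print(strict)
```

Here `loose` comes out as `('12', '345', '', '123')` and `strict` as `('12', '345', '1', '234')`. The first half already works without an anchor because the literal slash plays the same role there.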

kdb
  • Any language in particular you're looking at? You can match the numbers themselves by ([0-9]{1,6})/([0-9]{1,6}) using the parens as capture groups. – Iluvatar Dec 05 '16 at 17:44
  • I need it for NGINX, so PCRE compatible. Matching the numbers themselves is no issue. Your example matches all 6 possible digits of each number, but I need to split them into groups of three. So I need to get 4 groups: 0-3 digits, 1-3 digits, 0-3 digits, 1-3 digits. And then prepend zeroes... – kdb Dec 05 '16 at 17:47
  • Adding stuff isn't something you can do with regexes directly, you'll need something else. I'm not familiar with NGINX or what tools you have to work with though. Can I offer you an example in Python? – Iluvatar Dec 05 '16 at 17:52
  • No, sorry, I would know how to do it in Python. I need it for NGINX. – kdb Dec 05 '16 at 17:55
  • You'll require most of the work to be done by application code. Regex may be part of the story, but can't *be* the story. – Bohemian Dec 05 '16 at 17:58
  • @Bohemian, the question is the opposite of "too broad". I'm specifically asking if this solution is possible with pure regex, not with application code which would be easy. If it's not possible with pure regexes (although I think it could at least be narrowed down to a few), then there is no answer. – kdb Dec 05 '16 at 19:58
  • @kdb regex *matches*. It doesn't "do" anything, like format a string. It can be used to match what you want to replace with app code. But there are lots of ways to do that. So your question is either unanswerable because it's nonsensical, or because there are many ways to achieve what you are wanting to do. Either way, closure is probably appropriate. – Bohemian Dec 05 '16 at 20:03
  • @Bohemian, closure might be appropriate but it's incorrect to call it nonsensical. My second question asks specifically how to match a number and split it into groups but starting from the back. See similar answer where it's part of the way: http://stackoverflow.com/questions/5320525/regular-expression-to-match-last-number-in-a-string and same thing with prepending zeroes, see similar answer here: http://stackoverflow.com/questions/3121596/is-it-possible-to-pad-integers-with-zeros-using-regular-expressions None of those questions were called "nonsensical". – kdb Dec 05 '16 at 20:20
  • @kdb questions should definitely be **1** question. You had 2. I've edited your question into an answerable, single question. Once you have the split done, you can then set about adding leading zeroes to each element. Note that it may still be closed for "showing no effort", as you haven't posted an attempt of any kind. – Bohemian Dec 05 '16 at 20:27
  • @Bohemian, thanks I guess. I agree the rewording is easier and I fixed an error in the edit. Not sure why I'm showing no effort as I replied immediately to questions people had and sent links to other answers. Don't know why this question is such a problem. – kdb Dec 05 '16 at 20:52
  • @kdb you have shown no effort to *solve the problem yourself*. You're basically asking people to do your work for you. – Bohemian Dec 05 '16 at 20:56
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/129841/discussion-between-kdb-and-bohemian). – kdb Dec 05 '16 at 21:08

2 Answers


I don't think this is possible unless the regex engine can remember every intermediate match of a group that matched an arbitrary number of times (.NET can do this, not sure about others). PCRE apparently only remembers the last match for each group; otherwise you could use something like this: `(?<prefix1>\d{0,2})(?:(?<threes1>\d{3})*)\/(?<prefix2>\d{0,2})(?<threes2>\d{3})*\s`
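To illustrate the "last match only" behaviour, here is a minimal Python sketch. Python's `re` module is used as a stand-in on the assumption that it treats repeated capture groups the same way PCRE does; the shortened pattern, the group names, and the sample input are made up for the illustration.

```python
import re

# A repeated capturing group: (\d{3})* can match several times within one match.
pattern = re.compile(r"(?P<prefix>\d{0,2})(?P<threes>\d{3})*$")

m = pattern.match("12345678")

# Only the final repetition is kept in the group; the earlier "345" is lost.
prefix = m.group("prefix")
last_threes = m.group("threes")

print(prefix, last_threes)
```

The group ends up holding `678` only, which is why a fixed number of separate groups is needed when every chunk has to be available afterwards.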

Scott Weaver
  • This is getting closer. Your regex doesn't quite produce the right match, but this one, for example, matches correctly if we have at least 3 digits on each side of the slash (https://regex101.com/r/bQtNdK/2): `(?<prefix1>\d{0,3})(?=\d)(?:(?<threes1>\d{3})*)\/(?<prefix2>\d{0,3})(?=\d)(?<threes2>\d{3})*\s` – kdb Dec 06 '16 at 08:52

This regex seems to be correct now (regex101): `(?<prefix1>\d{0,3}?)(?<suffix1>\d{1,3})\/(?<prefix2>\d{0,3}?)(?<suffix2>\d{1,3})\/`
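As a quick check, the pattern can be run against the four examples from the question. This is only a sketch in Python rather than NGINX: it assumes Python's `re` agrees with PCRE for this pattern, it rewrites the named groups as `(?P<name>...)`, and the helper `split_groups` is a made-up name. Note that each input needs a trailing slash, since the pattern ends in `\/`.

```python
import re

# The answer's pattern in Python syntax: (?P<name>...) replaces (?<name>...).
SPLIT = re.compile(
    r"(?P<prefix1>\d{0,3}?)(?P<suffix1>\d{1,3})/"
    r"(?P<prefix2>\d{0,3}?)(?P<suffix2>\d{1,3})/"
)

def split_groups(path):
    """Return the four captured groups, or None if the pattern does not match."""
    m = SPLIT.search(path)
    return list(m.groups()) if m else None

# Inputs from the question, each with the trailing slash the pattern requires.
samples = ["0/1/", "1234/3/", "12345/1234/", "123456/789123/"]
results = {path: split_groups(path) for path in samples}

for path, groups in results.items():
    print(path, groups)
```

The trailing slash is what forces the lazy second prefix to backtrack; without it (for example `12345/1234`) this pattern does not match at all.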

kdb