
I am trying to normalise UK telephone numbers to international format.

The following strings should resolve to: +447834012345

  • 07834012345
  • +447834012345
  • +4407834012345
  • +44 (0) 7834 012345
  • +44 0 7834 012345
  • 004407834012345
  • 0044 (0) 7834012345
  • 00 44 0 7834012345

So far, I have got this:

"+44" + mobile.replaceAll("[^0-9]0*(44)?0*", "")

This doesn't quite cut it, as I am having problems with leading 0s, among other things; see the table below. I'd like to avoid using the global flag if possible.

Mobile              | Normalised         | Result
--------------------+--------------------+------
07834012345         | +4407834012345     | FAIL
+447834012345       | +447834012345      | PASS
+4407834012345      | +447834012345      | PASS
+44 (0) 7834 012345 | +44783412345       | FAIL
+44 0 7834 012345   | +44783412345       | FAIL
004407834012345     | +44004407834012345 | FAIL
0044 (0) 7834012345 | +4400447834012345  | FAIL
00 44 0 7834012345  | +44007834012345    | FAIL
+4407834004445      | +447834004445      | PASS

Thanks

cyorkston
  • Do all of these numbers have the same length? I mean the normalised numbers, not the unformatted ones. – Seth Jun 06 '16 at 13:30
  • I'm not really interested in length. My main goal is to remove non-digits, leading 0s and the UK country code 44 if it is found logically at the beginning of the string. This country code could be preceded and/or followed by 0s. Hope that helps. – cyorkston Jun 06 '16 at 13:35
  • You might want to whack a `^` at the start of that, so you don't strip out `00440` from `07777 700440`. – Andy Turner Jun 06 '16 at 13:41
  • Could you expand on *"This doesn't quite cut it"* with some examples? What are your succeeding and failing cases? – jonrsharpe Jun 06 '16 at 13:57
  • @cyorkston I think you should first remove all spaces, then you could use `^[^0-9]?0*(44)?0*(\\(0\\))?` to replace your unwanted stuff (untested, not sure if I got the escaping of the parentheses correctly) – Sebastian Proske Jun 06 '16 at 14:24
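
To illustrate the point about anchoring with `^` made in the comments above, here is a minimal sketch. The class name, the helper names and the simplified digits-only pattern are assumptions made for illustration; they are not the exact expression from the question.

```java
final class AnchorDemo {
    // Hypothetical helper: removes zeros and an optional "44" wherever they occur.
    static String stripUnanchored(String digits) {
        return digits.replaceAll("0*(44)?0*", "");
    }

    // Same pattern, but ^ restricts the match to the start of the string.
    static String stripAnchored(String digits) {
        return digits.replaceAll("^0*(44)?0*", "");
    }

    public static void main(String[] args) {
        String digits = "07777700440";
        System.out.println(stripUnanchored(digits)); // prints 77777
        System.out.println(stripAnchored(digits));   // prints 7777700440
    }
}
```

Without the anchor, the trailing `00440` is treated like a prefix and removed along with the leading 0; with the anchor, only the leading 0 goes.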

2 Answers

1

If you still want the regex I was able to get it working like this:

// Strip everything that is not a digit, then drop the optional 00/0, 44 and 0 prefixes.
System.out.println("+44" + mobile.replaceAll("[^0-9]", "")
  .replaceAll("^0{0,2}(44){0,2}0{0,1}(\\d{10})", "$2"));

EDIT: Changed the code to reflect failed tests. Removed non-numeric characters before running the regex.

EDIT: Update code based on comments.
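
As a rough sketch of how the two-step approach could be wrapped up and checked against the inputs from the question, something like the following might do. The class name `UkMobileNormaliser` and the method `normalise` are made up for this example; the two expressions are the ones shown above.

```java
final class UkMobileNormaliser {
    // Hypothetical wrapper around the two replaceAll calls from this answer.
    static String normalise(String mobile) {
        // Step 1: keep digits only, so "+", spaces and brackets disappear.
        String digits = mobile.replaceAll("[^0-9]", "");
        // Step 2: drop an optional 00/0, an optional 44 and an optional 0,
        // keeping the 10-digit national number captured in group 2.
        // If the digits do not fit this shape, they are left unchanged.
        String national = digits.replaceAll("^0{0,2}(44){0,2}0{0,1}(\\d{10})", "$2");
        return "+44" + national;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "07834012345", "+447834012345", "+4407834012345",
            "+44 (0) 7834 012345", "+44 0 7834 012345", "004407834012345",
            "0044 (0) 7834012345", "00 44 0 7834012345"
        };
        for (String input : inputs) {
            // Each line is expected to end in +447834012345.
            System.out.println(input + " -> " + normalise(input));
        }
    }
}
```

Note that this assumes a 10-digit national number, as the `\\d{10}` in the expression does; anything shorter is simply prefixed with `+44` unchanged.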

markbernard
  • Many thanks, that covers the cases that I mentioned. However, it does fail in the following cases: `+44 783 401 2345` → `+44+44 783 401 2345`; `0783 401 2345` → `+440783 401 2345`; `078 34 01 23 45` → `+44078 34 01 23 45` – cyorkston Jun 07 '16 at 10:28
  • @cyorkston Can you really dial the country code(+44 +44) twice? – markbernard Jun 07 '16 at 12:40
  • @cyorkston I updated the code. Sometimes it's just easier to deal with the numbers in a regex and remove the clutter before running it. – markbernard Jun 07 '16 at 12:47
  • I guess you could also do `"+44" + mobile.replaceAll("[^0-9]", "").replaceAll("^0{0,2}(44){0,2}0{0,1}(\\d{10})", "$2")`. However, is it really not possible to do it as a single regex? Many thanks for this by the way, you've been a great help. – cyorkston Jun 07 '16 at 13:34
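
On the single-regex question in the last comment, one possible sketch is an alternation: an anchored branch that swallows the prefix (with any punctuation mixed in), and a second branch that removes the remaining non-digits. This is not from the answer above; the class name, method name and pattern are assumptions, and unlike the two-step version it does not check that 10 digits remain.

```java
final class SingleRegexNormaliser {
    // Hypothetical single-expression variant.
    // Branch 1 (anchored): optional non-digits, up to two 0s, optional 44,
    //   optional 0, with non-digits allowed between each part.
    // Branch 2: any remaining run of non-digits.
    private static final String PREFIX_OR_CLUTTER =
        "^\\D*0{0,2}\\D*(?:44)?\\D*0?\\D*|\\D+";

    static String normalise(String mobile) {
        return "+44" + mobile.replaceAll(PREFIX_OR_CLUTTER, "");
    }

    public static void main(String[] args) {
        System.out.println(normalise("0044 (0) 7834012345")); // prints +447834012345
        System.out.println(normalise("078 34 01 23 45"));     // prints +447834012345
    }
}
```

Because there is no length check, a national number that itself begins with 44 after a leading 0 would lose those digits, so this is only reasonable if the input is known to be a UK mobile number.
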
0

Like my answer here, I would also suggest looking at the Google libphonenumber library. I know it is not regex but it does exactly what you want.

An example of how to do it in Java (it is available in other languages) would be the following from the documentation:

Let's say you have a string representing a phone number from Switzerland. This is how you parse/normalize it into a PhoneNumber object:

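import com.google.i18n.phonenumbers.NumberParseException;
import com.google.i18n.phonenumbers.PhoneNumberUtil;
import com.google.i18n.phonenumbers.Phonenumber.PhoneNumber;

String swissNumberStr = "044 668 18 00";
PhoneNumberUtil phoneUtil = PhoneNumberUtil.getInstance();
// Declared outside the try block so the later snippets can still use it.
PhoneNumber swissNumberProto = null;
try {
  swissNumberProto = phoneUtil.parse(swissNumberStr, "CH");
} catch (NumberParseException e) {
  System.err.println("NumberParseException was thrown: " + e.toString());
}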

At this point, swissNumberProto contains:

{
  "country_code": 41,
  "national_number": 446681800
}

PhoneNumber is a class that is auto-generated from the phonenumber.proto with necessary modifications for efficiency. For details on the meaning of each field, refer to https://github.com/googlei18n/libphonenumber/blob/master/resources/phonenumber.proto

Now let us validate whether the number is valid:

boolean isValid = phoneUtil.isValidNumber(swissNumberProto); // returns true

There are a few formats supported by the formatting method, as illustrated below:

// Produces "+41 44 668 18 00"
System.out.println(phoneUtil.format(swissNumberProto, PhoneNumberFormat.INTERNATIONAL));
// Produces "044 668 18 00"
System.out.println(phoneUtil.format(swissNumberProto, PhoneNumberFormat.NATIONAL));
// Produces "+41446681800"
System.out.println(phoneUtil.format(swissNumberProto, PhoneNumberFormat.E164));
Halfwarr