
Possible Duplicate:
Regular Expression for Phone Number

I'm fairly new to regular expressions, so I don't yet know what they can do, and I can't tell whether this problem will be difficult to solve.

I have very liberally formatted phone strings and need to convert them to a fixed format, as far as that is possible. Examples: "899-123-4-45; 22-56-87", "5 99-25-31-71", "577-345-678,274-89-56".

Here's the info I know:

New landline numbers have the form 2-XX-XX-XX: a 2 followed by 6 digits. Two delimiters can appear between the digits, "-" or " ", and I don't know how many of them there will be.

Old landline numbers contain only 6 digits: XX-XX-XX.

Old cellphone numbers have the form 8XX-YY-YY-YY, 9 digits in total. The first digit is 8, and XX is the operator's code (I don't know all of them).

New cellphone numbers have the form 5XX-YY-YY-YY, also 9 digits. The only difference is the first digit.

Some records contain old landline codes, new landline codes, old cellphone codes and new cellphone codes.

I need to store all numbers in the new format, using only two delimiters: "-" within a number and "," between numbers. For example: "599-12-34-56,2-45-61-34", "2-45-65-12", "574-12-34-56".

I just don't know where to start. Should I split each big string into pieces that each contain one number, then keep only the digits of each piece and determine which format it is? Or is there a simpler solution?

How would you parse this string: "574-12-34-56; 2 456 324, 455-566 2 22 40 56"? Would you first split it into 3 parts? Can I split on any of ";", " " or ","? And should I then keep only the digits, determine each number's style and format it properly?

Sandro Dolidze
  • A string like `574-12-34-56; 2 456 324, 455-566 2 22 40 56` might be possible to handle, but if you had a string like `574-12-34-56; 2 456 324, 455-566 2 22 40 56 2 22 40 56 2 22 40 56`, I don't think it would be. – Ben Oct 19 '12 at 06:36
  • I don't think that Regular Expression for Phone Number is closely related to this question – Sandro Dolidze Oct 19 '12 at 06:43

2 Answers


The best solution (in my opinion) would be to use multiple regular expressions, one for each format. A single regular expression that covers every format tends to grow large quickly, which makes it hard to maintain.

I would use something like these:

  • `(2)[ -]+(\d{2})[ -]+(\d{2})[ -]+(\d{2})` to match the first pattern (new landline phones): 2-XX-XX-XX.
  • `(\d{2})[ -]+(\d{2})[ -]+(\d{2})` to match the second pattern (old landline phones): 6 digits.
  • `(8\d{2})[ -]+(\d{2})[ -]+(\d{2})[ -]+(\d{2})` to match the third pattern (old mobile phones): 8XX-YY-YY-YY.
  • `(5\d{2})[ -]+(\d{2})[ -]+(\d{2})[ -]+(\d{2})` to match the fourth pattern (new mobile phones): 5XX-YY-YY-YY.

You will have to try each of the expressions above to see which one matches the format of the number you have been given. Note that these expressions assume that the groups of digits making up the phone number are separated by one or more spaces (` `) or dashes (`-`).

If a pattern matches, the regular expression engine also places the parts of the phone number into capturing groups, denoted by the `(` and `)` brackets. You can then reconstruct the phone number in any way you wish by reading these groups and building a new string in the target format.

To see how you can use regular expression groups in JavaScript, please take a look here.
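As a rough illustration of the idea, the four expressions could be tried in turn and the captured groups joined with "-". This is only a sketch: the names `PHONE_FORMATS` and `matchPhone` are made up for this example, and the `^` and `$` anchors are an addition not present in the patterns above, included so that the 6-digit pattern cannot match in the middle of a longer number.

```javascript
// Hypothetical table: one pattern per format, anchored with ^ and $
// (the anchors are an assumption added for this sketch).
const PHONE_FORMATS = [
  { name: "new landline", re: /^(2)[ -]+(\d{2})[ -]+(\d{2})[ -]+(\d{2})$/ },
  { name: "old landline", re: /^(\d{2})[ -]+(\d{2})[ -]+(\d{2})$/ },
  { name: "old mobile", re: /^(8\d{2})[ -]+(\d{2})[ -]+(\d{2})[ -]+(\d{2})$/ },
  { name: "new mobile", re: /^(5\d{2})[ -]+(\d{2})[ -]+(\d{2})[ -]+(\d{2})$/ },
];

// Try each pattern against one phone string; return the first match.
function matchPhone(text) {
  const trimmed = text.trim();
  for (const format of PHONE_FORMATS) {
    const m = format.re.exec(trimmed);
    if (m) {
      // m[0] is the whole match; m[1], m[2], ... are the captured groups.
      const groups = m.slice(1);
      return { format: format.name, groups: groups, formatted: groups.join("-") };
    }
  }
  return null; // no pattern matched this grouping of digits
}

console.log(matchPhone("2 45-61 34"));
console.log(matchPhone("577-345-678"));
```

Note that a string such as "577-345-678" is not matched, because its digits are grouped differently from what the patterns expect. Inputs like that would need either more patterns or a step that strips the delimiters first.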

npinti

I would suggest a different approach:

First, split the string on the characters that delimit phone numbers:

result = subject.split(/[,;]/);

Second, on each of the substrings, remove all non-digit characters (possibly except for + so you keep information on international numbers):

result[i] = result[i].replace(/[^\d+]+/g, "");

Now you have all the numbers without any delimiters. You can then look at each string, sort it into a category (mobile, landline, international, etc.) and, if you want to, re-introduce your own separators.
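Put together, the steps might look like the sketch below. The function names `normalizeDigits` and `normalizePhones` are made up for this example. The classification rules are assumptions read from the question: a 6-digit number is an old landline and becomes new by prefixing 2, and a 9-digit number starting with 8 is an old mobile and becomes new by replacing the 8 with 5. Anything that fits none of the rules is kept as the bare stripped string.

```javascript
// Classify a delimiter-free string and format it in the new style.
// Returns null when the string does not fit any known format.
function normalizeDigits(digits) {
  if (!/^\d+$/.test(digits)) return null; // e.g. contains "+"
  // Assumption: old landline (6 digits) -> new landline by prefixing 2.
  if (digits.length === 6) digits = "2" + digits;
  // Assumption: old mobile (8XX...) -> new mobile by replacing 8 with 5.
  else if (digits.length === 9 && digits[0] === "8") digits = "5" + digits.slice(1);

  if (digits.length === 7 && digits[0] === "2") {
    // 2-XX-XX-XX
    return "2-" + digits.slice(1).match(/\d{2}/g).join("-");
  }
  if (digits.length === 9 && digits[0] === "5") {
    // 5XX-YY-YY-YY
    return digits.slice(0, 3) + "-" + digits.slice(3).match(/\d{2}/g).join("-");
  }
  return null;
}

function normalizePhones(subject) {
  return subject
    .split(/[,;]/)                                 // step 1: split on number delimiters
    .map((part) => part.replace(/[^\d+]+/g, ""))   // step 2: keep digits and "+"
    .filter((digits) => digits.length > 0)         // drop empty pieces
    .map((digits) => normalizeDigits(digits) || digits) // step 3: classify and format
    .join(",");
}

console.log(normalizePhones("899-123-4-45; 22-56-87"));
console.log(normalizePhones("577-345-678,274-89-56"));
```

One limitation of this sketch: when two numbers are separated only by a space, as in "455-566 2 22 40 56", they end up in the same piece and are merged into one long digit string that matches no format.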

Tim Pietzcker
  • This approach also seems interesting. I'll try both of them and see which one works better. Thank you for the different approach ;) – Sandro Dolidze Oct 19 '12 at 09:48