2

In PHP I'm searching for phone numbers in a given text. I use explode() to divide the text into parts, using the area code of the city I'm searching for as the delimiter. The problem is that phone numbers that contain the same digits as the area code are not returned correctly.

For example:

"foofoo 010-1234567 barbar" splits into "foofoo " and "-1234567 barbar"

but

"foofoo 010-1230107 barbar" splits into "foofoo ", "-123" and "7 barbar" !

I can use the first one to reconstruct the phone number with the area code, but the second goes wrong, of course.
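The two cases above can be reproduced with a short script. This is only an illustration of the behaviour described in the question; the variable names `$first` and `$second` are made up for the example.

```php
<?php
// explode() treats every occurrence of the delimiter the same way,
// including one that happens to sit inside the subscriber number.
$first  = explode('010', 'foofoo 010-1234567 barbar');
$second = explode('010', 'foofoo 010-1230107 barbar');

print_r($first);  // two parts: the area code was the only "010"
print_r($second); // three parts: "1230107" also contains "010"
```

Because explode() has no notion of context, it cannot tell the area code apart from the same digits appearing later in the number.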

I guess I need a regular expression instead of explode(), one that splits the text but has some mechanism to avoid splitting on short strings, but I don't know how to write it.

Any ideas, or a better way to search for phone numbers in a text?

UPDATE: The format is NOT consistent, so looking for the hyphen is not a solution. Some phone numbers have a space between the area code and the number, some have parentheses, some have nothing, etc. Dutch phone numbers have an area code of 2, 3 or 4 digits and are usually 10 digits in total.

salathe
Dylan

3 Answers

10

To find phone numbers like:

  • 010-1234010
  • 010 1234010
  • 010 123 4010
  • 0101234010
  • 010-010-0100

Try this:

$text = 'foofoo 010-1234010 barbar 010 1234010 foofoo ';
$text .= ' 010 123 4010 barbar 0101234010 foofoo 010-010-0100';

$matches = array();

// returns all results in array $matches
preg_match_all('/[0-9]{3}[\-][0-9]{6}|[0-9]{3}[\s][0-9]{6}|[0-9]{3}[\s][0-9]{3}[\s][0-9]{4}|[0-9]{9}|[0-9]{3}[\-][0-9]{3}[\-][0-9]{4}/', $text, $matches);
$matches = $matches[0];

var_dump($matches);
sammoore
  • If you need to find another type of phone number, add more to the regex. The syntax is easy: [0-9]{3} matches 3 digits (replace the 3 as needed), [\s] matches a whitespace character, [\-] matches a dash, and | separates the different "queries" (can't think of the word). Look at my search string and go from there. – sammoore Aug 12 '11 at 18:15
  • Well, it looks promising, but it only returns 9 numbers now. – Dylan Aug 12 '11 at 18:18
  • 9 numbers?? if you mean it only returns one phone number, remove the $matches = $matches[0] line. it was needed on my system. otherwise, not sure what you mean, elaborate?? – sammoore Aug 12 '11 at 18:19
  • 1
    Dutch phone numbers are 10 numbers. I changed the regex and now it's working, so thanks.... /[0-9]{3}[\-][0-9]{7}|[0-9]{3}[\s][0-9]{7}|[0-9]{3}[\s][0-9]{3}[\s][0-9]{4}|[0-9]{10}|[0-9]{3}[\-][0-9]{3}[\-][0-9]{4}/ – Dylan Aug 12 '11 at 18:22
  • 1
    I agree with @salathe: if there are any other formats you need to match, do this yourself. I don't even know regex, and after looking up the regex syntax it took me about 10 minutes to write this code snippet **myself**. Google is a wonderful thing. Doing it yourself is great for your own knowledge and for knowing how your code works. If you've come here to have people write code for you, you've come to the wrong place. – sammoore Aug 12 '11 at 18:24
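For reference, the 10-digit pattern that Dylan posted in the comments can be run against the sample text from the answer. This is a sketch that reuses the pattern exactly as posted; the variable names `$pattern` and `$numbers` are chosen for the example. Note that every alternative assumes a 3-digit area code, so numbers with a 2- or 4-digit area code written with a separator would need further alternatives.

```php
<?php
// The five sample formats from the answer, each 10 digits long.
$text = 'foofoo 010-1234010 barbar 010 1234010 foofoo ';
$text .= ' 010 123 4010 barbar 0101234010 foofoo 010-010-0100';

// Pattern as posted in the comment: 3-digit area code plus 7 more digits.
$pattern = '/[0-9]{3}[\-][0-9]{7}|[0-9]{3}[\s][0-9]{7}|[0-9]{3}[\s][0-9]{3}[\s][0-9]{4}|[0-9]{10}|[0-9]{3}[\-][0-9]{3}[\-][0-9]{4}/';

preg_match_all($pattern, $text, $matches);
$numbers = $matches[0]; // index 0 holds the full matches

print_r($numbers);
```

With this sample text, all five numbers are returned in full, in the order they appear.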
2

You could use a regular expression to match the phone numbers. There are many, many ways to skin this particular cat (and likely many identical questions here on SO); a super-basic example might look like the following.

$subject = "foofoo 010-1230107 barbar 010-1234567";
preg_match_all('/\b010-\d+/', $subject, $matches);
$numbers = $matches[0];
print_r($numbers);

The above would output the contents of the $numbers array.

Array
(
    [0] => 010-1230107
    [1] => 010-1234567
)
salathe
  • 3
    Too bad we're here to answer questions, not read minds. Authoring regular expressions requires a (usually) strict set of rules: "match a phone number" is not enough to work with, please consider all of the phone formats that you wish to match then write a regex accordingly. I would suggest **doing this yourself**, to avoid the inevitable hours/days of to-ing and fro-ing without arriving at a code snippet that you can copy/paste. – salathe Aug 12 '11 at 18:14
  • I was not specifically asking for a regex, I just suggested this could be one of the solutions. Doing it myself was also an option, but since I'm no expert at that and the fact that I considered this to be a useful question, I decided to post it on here... – Dylan Aug 12 '11 at 18:50
  • We thank you for asking, and you have my suggestion. If you need help compiling a list of phone number formats that you might like to support, perhaps another question is in order? – salathe Aug 12 '11 at 21:45
0

If you delete all of the non-numeric characters, you will be left with only the digits of the phone number. You can then take that string and split it into ###-###-#### if you wish.

$phone = preg_replace('/\D/', '', 'Some test with 123-456-7890 that phone number');
// $phone is now 1234567890
echo substr($phone, 0, 3); // 123
echo substr($phone, 3, 3); // 456
echo substr($phone, 6);    // 7890

Not sure if that is what you are looking for or not.
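One way to keep the digit-stripping idea without merging unrelated numbers is to pick out candidates first and strip the non-digits from each one separately. The following is a minimal sketch, assuming (per the question's update) that a number starts with 0 and has 10 digits in total, and that the separators are spaces, hyphens and parentheses. The function name `findPhoneCandidates` and the sample text are hypothetical.

```php
<?php
// Hypothetical helper: returns the 10-digit strings found in $text.
function findPhoneCandidates(string $text): array
{
    // A leading 0 (optionally after "("), then exactly 9 more digits,
    // each of which may be preceded by up to two separator characters.
    // The lookarounds keep the match from starting or ending mid-number.
    $pattern = '/(?<!\d)\(?0(?:[ \-()]{0,2}\d){9}(?!\d)/';
    preg_match_all($pattern, $text, $matches);

    // Normalize each candidate on its own by removing every non-digit.
    return array_map(
        function ($candidate) {
            return preg_replace('/\D/', '', $candidate);
        },
        $matches[0]
    );
}

$found = findPhoneCandidates(
    'Call (010) 123 4567 or 0201234567, postal code 3011 AB, also 010-1230107.'
);
print_r($found);
```

The sketch does not check that the area code is a real one, so any 10-digit run that starts with 0 is accepted. A shorter number such as the 4-digit postal code in the sample is skipped.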

Kyle
  • I'm afraid there's then the danger that postal codes are also converted into phone numbers. – Dylan Aug 12 '11 at 18:02
  • True. I guess I didn't take that into consideration. – Kyle Aug 12 '11 at 18:15
  • 2
    @Kyle, Why would you? There is no mention of postal codes being involved in the question. – salathe Aug 12 '11 at 18:17
  • @salathe, in 'a certain text' that contains phone numbers, it's quite logical that the text also includes other numbers like postal codes, isn't it? – Dylan Aug 12 '11 at 18:47
  • 5
    @Dylan do we also have to take care not to match distances, house numbers, dates of birth, and bra sizes? – salathe Aug 12 '11 at 21:50