80

I'm looking for a way to parse a substring using PHP and have come across preg_match, but I can't work out the pattern I need.

I am parsing a web page and need to grab a numeric value from a string like this:

producturl.php?id=736375493?=tm

I need to be able to obtain this part of the string:

736375493

mickmackusa
MonkeyBlue

4 Answers

106
$matches = array();
preg_match('/id=([0-9]+)\?/', $url, $matches);

This remains safe if the format changes. slandau's answer won't work if the URL ever contains any other numbers.

php.net/preg-match
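As a minimal sketch of how the pattern above might be used defensively, the snippet below wraps it in a helper that returns null when nothing matches. The function name `extractId` is hypothetical and not part of the original answer, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not from the original answer): returns the id as a
// string, or null when the URL contains no "id=<digits>?" segment.
function extractId(string $url): ?string
{
    $matches = [];
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    if (preg_match('/id=([0-9]+)\?/', $url, $matches) === 1) {
        return $matches[1]; // text captured by the first parenthesized group
    }
    return null;
}

var_dump(extractId('producturl.php?id=736375493?=tm')); // string(9) "736375493"
var_dump(extractId('producturl.php?page=2'));           // NULL
```

Checking the return value of preg_match before reading `$matches[1]` avoids an undefined-index notice when a line on the page has no id.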

COil
David Fells
  • 3
    Thanks for the suggestion, I tried this code initially and it did not work so I tweaked it to `preg_match('/id=(.*)\?/', $url, $matches);` and it works perfectly now. Thanks :) – MonkeyBlue May 09 '11 at 20:52
27
<?php
$string = "producturl.php?id=736375493?=tm";
preg_match('~id=(\d+)~', $string, $m );
var_dump($m[1]); // $m[1] is your string
?>
anubhava
  • 5
    It's in $m[1] because (from the docs): "If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on." – bnunamak Jul 09 '17 at 07:51
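To illustrate the quoted documentation, here is a small sketch that prints both entries of the matches array for the string from the question. It reuses the pattern from the answer above; only the explicit initialisation of `$m` and the `if` guard are additions.

```php
<?php
$string = 'producturl.php?id=736375493?=tm';
$m = [];

if (preg_match('~id=(\d+)~', $string, $m)) {
    echo $m[0], PHP_EOL; // full match:          id=736375493
    echo $m[1], PHP_EOL; // first capture group: 736375493
}
```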
6
$string = "producturl.php?id=736375493?=tm";
$number = preg_replace("/[^0-9]/", '', $string);
slandau
  • If the string is like `producturl.php?id=736375493?=tm&page=2`, you're going to end up with an extra 2 in your `$number`. – UnkwnTech May 09 '11 at 20:09
  • Very true, I was under the assumption that all his strings would be in the format he posted. – slandau May 09 '11 at 20:10
  • Yep this is giving me an extra number in the string, I have just tried this which works to an extent. `preg_match('/id(.*)=', $body, $matches);` but it's still giving me =tm at the end of the number on some lines. – MonkeyBlue May 09 '11 at 20:27
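The pitfall described in these comments can be reproduced with a short sketch. The input string is the one from UnkwnTech's comment; the variable names `$stripped` and `$id` are illustrative assumptions, and the anchored pattern is one possible alternative rather than the only fix.

```php
<?php
$string = 'producturl.php?id=736375493?=tm&page=2';

// Stripping every non-digit keeps the digits of *all* numbers in the string.
$stripped = preg_replace('/[^0-9]/', '', $string);

// Anchoring on "id=" captures only the digits that follow it.
$m = [];
$id = preg_match('/id=(\d+)/', $string, $m) ? $m[1] : null;

var_dump($stripped); // string(10) "7363754932"
var_dump($id);       // string(9) "736375493"
```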
3

Unfortunately, you have a malformed URL query string, so a regex technique is most appropriate. See what I mean.

There is no need for capture groups. Just match id=, then forget those characters with \K, then isolate the one or more digit characters that follow.

Code (Demo)

$str = 'producturl.php?id=736375493?=tm';
echo preg_match('~id=\K\d+~', $str, $out) ? $out[0] : 'no match';

Output:

736375493

For completeness, there is another way to scan the formatted string and explicitly return an int-typed value. (Demo)

var_dump(
    sscanf($str, '%*[^?]?id=%d')[0]
);

The %*[^?] means: greedily match one or more non-question-mark characters, but do not capture the substring. The remainder of the format parameter matches the literal sequence ?id=, then greedily captures one or more digits. The returned value will be an integer because of the %d placeholder.
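As a hedged sketch of how the sscanf approach might be guarded against input that does not fit the format, the helper below returns null unless an integer was actually parsed. The name `scanId` is hypothetical, and the check is deliberately loose because the exact return value of sscanf on a failed parse is not assumed here.

```php
<?php
// Hypothetical helper (not from the original answer): returns the id as an
// int, or null when the string does not fit the "<prefix>?id=<digits>" shape.
function scanId(string $str): ?int
{
    // %*[^?] skips the non-question-mark prefix without storing it;
    // %d parses the digits after the literal "?id=" as an integer.
    $result = sscanf($str, '%*[^?]?id=%d');

    // Accept only an array whose first element is an int; anything else
    // is treated as "no id found".
    if (is_array($result) && isset($result[0]) && is_int($result[0])) {
        return $result[0];
    }
    return null;
}

var_dump(scanId('producturl.php?id=736375493?=tm')); // int(736375493)
var_dump(scanId('producturl.php'));                  // NULL
```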

mickmackusa