2
preg_match('#(?:([a-zA-Z-]+):)?(?://(?:([a-zA-Z0-9_!$&\'()*+,;=._~%-]+)(?::([a-zA-Z0-9_!$&\'()*+,;=._~%-]*))?@)?([a-zA-Z0-9-._]+)(?::([0-9]*))?)?([a-zA-Z0-9_!$&\'()*@+,:;=._~/%-]*)(?:\\?([0-9a-zA-Z!$&\'()*@+,:;=._~%-]*))?(?:\\#(.*))?#', $uri, $m);

The regex above is used to match URLs, and the result is supposed to be: $m[1] = scheme, $m[2] = user, $m[3] = pass, $m[4] = host, $m[5] = port, $m[6] = path, $m[7] = queryString, $m[8] = fragment.

It works well except when the query string includes an array, for example: ?ar[k1]=v1&ar[k2]=v2

My questions are: 1. What is the meaning of the sharp (#) in the regex? 2. How can I modify the regex so that it also matches a query string that includes an array?

Jarod
  • With all due respect, you are doing this the wrong way :) There are far better ways to parse this kind of string, and they do not require regexes. Try `parse_str` (http://php.net/manual/en/function.parse-str.php) for example. –  Jul 01 '12 at 14:43
  • It's perfectly fine to use a regex for this. However, if you are clueless as to how they work, you shouldn't try to adapt it, in particular if it's condensed and uncommented like this one. See [Open source RegexBuddy alternatives](http://stackoverflow.com/questions/89718/is-there) and [Online regex testing](http://stackoverflow.com/questions/32282/regex-testing) for some helpful tools, or [RegExp.info](http://regular-expressions.info/) for a nicer tutorial. – mario Jul 01 '12 at 14:55

4 Answers

3

You are better off using parse_url, which also captures the query string; you can then pass that to parse_str to get an array of key => value pairs.
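A minimal sketch of that combination, in case it helps. The URL and the variable names are made up for illustration; only parse_url and parse_str are the built-in functions mentioned above.

```php
<?php
// Hypothetical example URL with an array-style query string.
$uri = 'http://user:pass@example.com:8080/some/path?ar[k1]=v1&ar[k2]=v2#frag';

// parse_url() splits the URL into named components
// (scheme, user, pass, host, port, path, query, fragment).
$parts = parse_url($uri);

// parse_str() decodes the raw query string into a nested array.
$query = [];
if (isset($parts['query'])) {
    parse_str($parts['query'], $query);
}

echo $parts['host'], "\n";      // example.com
echo $parts['query'], "\n";     // ar[k1]=v1&ar[k2]=v2
echo $query['ar']['k2'], "\n";  // v2
```

Note that parse_url returns only the components that are present in the input, so it is worth checking with isset before reading a key.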

Wrikken
2

Use parse_str instead: http://php.net/manual/en/function.parse-str.php

This does exactly what you require and is built in, and most importantly, sans regex (look at that monster) :s.

To directly answer your question by the way, # is just a delimiter of the regex.
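To illustrate the delimiter point, here is a small sketch with made-up patterns and inputs (not the regex from the question). It shows why # is a convenient delimiter for URL patterns, and why a literal # inside such a pattern needs a backslash.

```php
<?php
$uri = 'http://example.com/a/b';

// With "/" as the delimiter, every literal slash must be escaped.
preg_match('/^http:\/\/([^\/]+)/', $uri, $m1);

// With "#" as the delimiter, slashes can be written as-is.
preg_match('#^http://([^/]+)#', $uri, $m2);

// A literal "#" inside a "#"-delimited pattern must be escaped as "\#".
preg_match('#\#(.*)$#', 'page.html#top', $m3);

echo $m1[1], "\n"; // example.com
echo $m2[1], "\n"; // example.com
echo $m3[1], "\n"; // top
```

So in the regex from the question, the first and last # are delimiters, while the escaped one near the end matches the # that starts the fragment.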

Andreas Wong
2

1. The second sharp in the regex (the escaped one near the end) matches the literal # that introduces the fragment part of a URL:

scheme://username:password@domain:port/path?query_string#fragment_id

2. Use parse_url, which parses a URL and returns its components.

HanhNghien
1

This regular expression seems to follow the URI syntax of RFC 3986 quite strictly, and that syntax doesn't allow a plain [ or ] inside the query:

  query       = *( pchar / "/" / "?" )
  pchar       = unreserved / pct-encoded / sub-delims / ":" / "@"
  unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
  pct-encoded = "%" HEXDIG HEXDIG
  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="

Now if you want to allow these characters too, use this for the query part in your existing regular expression:

… (?:\\?([0-9a-zA-Z!$&\'()*@+,:;=._~%[\]-]*))? …
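As a rough check, here is the question's regex with that query part swapped in, split across several string pieces only for readability. The test URL and the variable names are assumptions for illustration.

```php
<?php
// Query character class from above, extended with "[" and "]".
$queryClass = '[0-9a-zA-Z!$&\'()*@+,:;=._~%[\]-]';

// The regex from the question, with only the query part changed.
$pattern = '#(?:([a-zA-Z-]+):)?'
    . '(?://(?:([a-zA-Z0-9_!$&\'()*+,;=._~%-]+)(?::([a-zA-Z0-9_!$&\'()*+,;=._~%-]*))?@)?'
    . '([a-zA-Z0-9-._]+)(?::([0-9]*))?)?'
    . '([a-zA-Z0-9_!$&\'()*@+,:;=._~/%-]*)'
    . '(?:\\?(' . $queryClass . '*))?'
    . '(?:\\#(.*))?#';

// Hypothetical test URL with an array-style query string.
$uri = 'http://example.com/path?ar[k1]=v1&ar[k2]=v2#frag';

preg_match($pattern, $uri, $m);

echo $m[4], "\n"; // example.com
echo $m[6], "\n"; // /path
echo $m[7], "\n"; // ar[k1]=v1&ar[k2]=v2
echo $m[8], "\n"; // frag

// The captured query string can still be handed to parse_str().
parse_str($m[7], $query);
echo $query['ar']['k1'], "\n"; // v1
```

Keep in mind the pattern is not anchored with ^ and $, so it will happily match a prefix of a string that contains other disallowed characters.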
Gumbo