45

Is there a way to reverse the url from a parsed url?

$url = 'http://www.domain.com/dir/index.php?query=blabla#more_bla';
$parse = parse_url($url);
print_r($parse);
/*
array(
 'scheme'=>'http',
 etc....
)
*/
$reverse = reverse_url($parse); // probably does not exist, but you get the point

echo $reverse;
// outputs: "http://www.domain.com/dir/index.php?query=blabla#more_bla"

Or is there a way to validate a URL that is missing some of its recommended parts? For example:

www.mydomain.com

mydomain.com

should both return http://www.mydomain.com, or the URL with the correct subdomain.

Val
  • I had a look at `http_build_url`, but it looks like a hassle if the URLs differ or have other properties not mentioned in the `$url` – Val Dec 04 '10 at 17:52
  • Neither `www.example.com` nor `example.com` is a valid absolute URL; each would be interpreted as a URL path. – Gumbo Dec 04 '10 at 18:07

6 Answers

36

These are the two functions I use for decomposing and rebuilding URLs:

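function http_parse_query($query) {
    $parameters = array();
    if ($query === '') {
        return $parameters;
    }
    foreach (explode('&', $query) as $queryPart) {
        // Split on the first '=' only; a key without '=' gets an empty value.
        $keyValue = explode('=', $queryPart, 2);
        // Decode both sides, because http_build_query() encodes them again later.
        $parameters[urldecode($keyValue[0])] = isset($keyValue[1]) ? urldecode($keyValue[1]) : '';
    }
    return $parameters;
}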

function build_url(array $parts) {
    return (isset($parts['scheme']) ? "{$parts['scheme']}:" : '') . 
        ((isset($parts['user']) || isset($parts['host'])) ? '//' : '') . 
        (isset($parts['user']) ? "{$parts['user']}" : '') . 
        (isset($parts['pass']) ? ":{$parts['pass']}" : '') . 
        (isset($parts['user']) ? '@' : '') . 
        (isset($parts['host']) ? "{$parts['host']}" : '') . 
        (isset($parts['port']) ? ":{$parts['port']}" : '') . 
        (isset($parts['path']) ? "{$parts['path']}" : '') . 
        (isset($parts['query']) ? "?{$parts['query']}" : '') . 
        (isset($parts['fragment']) ? "#{$parts['fragment']}" : '');
}

// Example
$parts = parse_url($url);

if (isset($parts['query'])) {
    $parameters = http_parse_query($parts['query']);
    foreach ($parameters as $key => $value) {
        $parameters[$key] = $value; // do stuff with $value
    }
    $parts['query'] = http_build_query($parameters);
}

$url = build_url($parts);
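
As a concrete illustration of the "do stuff" step, here is a minimal sketch that overrides one query parameter and rebuilds the URL. It is not the code from this answer: it swaps in PHP's built-in parse_str() for http_parse_query(), the helper name set_query_param is hypothetical, and the cut-down rebuild only handles scheme, host, port, path, query and fragment (no user or password).

```php
<?php
// Hypothetical helper: set one query parameter and return the rebuilt URL.
function set_query_param(string $url, string $key, string $value): string {
    $parts = parse_url($url);
    if ($parts === false) {
        return $url; // seriously malformed URL: leave it untouched
    }

    $parameters = [];
    if (isset($parts['query'])) {
        parse_str($parts['query'], $parameters); // built-in query string parser
    }
    $parameters[$key] = $value;
    $parts['query'] = http_build_query($parameters);

    // Reassemble only the parts that parse_url() actually returned.
    return (isset($parts['scheme']) ? "{$parts['scheme']}:" : '') .
        (isset($parts['host']) ? "//{$parts['host']}" : '') .
        (isset($parts['port']) ? ":{$parts['port']}" : '') .
        ($parts['path'] ?? '') .
        "?{$parts['query']}" .
        (isset($parts['fragment']) ? "#{$parts['fragment']}" : '');
}

echo set_query_param('http://www.domain.com/dir/index.php?query=blabla#more_bla', 'query', 'new value');
// http://www.domain.com/dir/index.php?query=new+value#more_bla
```

Note that parse_str() rewrites dots and spaces in parameter names to underscores, which may be one reason to prefer a hand-written parser like the one above.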
NobleUplift
27

You should be able to do

http_build_url($parse)

NOTE: http_build_url is only available by installing pecl_http.

According to the docs it's designed specifically to handle the output from parse_url. Both functions handle anchors, query params, etc so there are no "other properties not mentioned on the $url".

To add http:// when it's missing, use a basic check before parsing it:

// strpos() returns false when there is no match, so compare strictly with !==
if (strpos($url, "http://") !== 0)
    $url = "http://$url";
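
If the input may already carry another scheme such as https://, a slightly more defensive check could look like the sketch below. This is only an illustration under assumptions: the helper name ensure_scheme and the choice of http as the default are hypothetical, and it does not verify that the host actually serves that scheme.

```php
<?php
// Hypothetical helper: prepend a default scheme only when the URL has none.
function ensure_scheme(string $url, string $default = 'http'): string {
    // Protocol-relative URLs ("//host/path") only need the scheme and a colon.
    if (strpos($url, '//') === 0) {
        return $default . ':' . $url;
    }
    // Leave anything that already starts with "scheme://" untouched.
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $url) === 1) {
        return $url;
    }
    return $default . '://' . $url;
}

echo ensure_scheme('www.mydomain.com'), "\n";                   // http://www.mydomain.com
echo ensure_scheme('https://shop.example.com:8443/cart'), "\n"; // unchanged
```

The result can then be passed to parse_url() as usual; existing schemes and ports are left as they are.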
Clemens Tolboom
Brad Mace
  • 1
    My problem (maybe I'm overthinking it) is: what if `http://` should be `https://`? It's not very common, but e-commerce websites use it. I am developing this as a CMS, and users will definitely need https or other protocols, and sometimes ports. Would it handle them correctly? – Val Dec 04 '10 at 18:19
  • The only way to find that out is to try making a request to the website in question. For that you might look at http://php.net/manual/en/function.httprequest-send.php – Brad Mace Dec 04 '10 at 18:21
  • 9
    NOTE: Fatal error: Call to undefined function http_build_url() .... I think you may need to install an extra module – Val Dec 04 '10 at 18:22
  • Would this be available on shared hosting? You know how hosting providers are sometimes :) – Val Dec 04 '10 at 18:28
  • I don't have much experience with shared hosting. This is one of the "official" extensions though so I would think most providers would accommodate it. – Brad Mace Dec 04 '10 at 18:31
  • I am using the WAMP server and it doesn't seem to include the pecl_http extension. Any ideas? – Val Dec 05 '10 at 13:02
  • Never mind. If you are looking for the answer to the comment above, see http://stackoverflow.com/questions/4359075/wamp-server-php-extention-pecl-missing. Hope it helps. – Val Dec 05 '10 at 14:47
  • This is not a standard PHP function. It is a PECL extension. Vote down – Alex Skrypnyk Jan 14 '15 at 01:16
  • @Alex.Designworks You can vote how you like, but the original question never specified standard PHP functions only. Libraries are just a fact of life in programming. – Brad Mace Jan 15 '15 at 14:48
  • I know that the answer is very old, but it still ends up rather high in Google search results, so a note might be useful. The `pecl_http` extension dropped the functional approach in favor of an OOP design in version 2. Now it should be `$httpUrl = new \http\Url($parsed); $url = $httpUrl->toString();` – Tomasz Kapłoński Mar 18 '20 at 12:47
  • 1
    @TomaszKapłoński I think that'd be good to have as a standalone answer that people can upvote – Brad Mace Mar 18 '20 at 19:21
  • If you already use parse_url() you don't really need the strpos part. Also, this adds an http:// if the url starts with a scheme other than http, e.g. https:// or starts with // - not a good idea. If you already use parse_url(), you can just add a scheme, if it is empty: $parts['scheme'] or change the scheme if you want to change from https to http or vice versa. – Sybille Peters Aug 22 '20 at 04:44
14

This function should do the trick:

function unparse_url(array $parsed): string {
    $pass      = $parsed['pass'] ?? null;
    $user      = $parsed['user'] ?? null;
    $userinfo  = $pass !== null ? "$user:$pass" : $user;
    $port      = $parsed['port'] ?? 0;
    $scheme    = $parsed['scheme'] ?? "";
    $query     = $parsed['query'] ?? "";
    $fragment  = $parsed['fragment'] ?? "";
    $authority = (
        ($userinfo !== null ? "$userinfo@" : "") .
        ($parsed['host'] ?? "") .
        ($port ? ":$port" : "")
    );
    return (
        (\strlen($scheme) > 0 ? "$scheme:" : "") .
        (\strlen($authority) > 0 ? "//$authority" : "") .
        ($parsed['path'] ?? "") .
        (\strlen($query) > 0 ? "?$query" : "") .
        (\strlen($fragment) > 0 ? "#$fragment" : "")
    );
}

Here is a short test for it:

function unparse_url_test() {
    foreach ([
        '',
        'foo',
        'http://www.google.com/',
        'http://u:p@foo:1/path/path?q#frag',
        'http://u:p@foo:1/path/path?#',
        'ssh://root@host',
        '://:@:1/?#',
        'http://:@foo:1/path/path?#',
        'http://@foo:1/path/path?#',
    ] as $url) {
        $parsed1 = parse_url($url);
        $parsed2 = parse_url(unparse_url($parsed1));

        if ($parsed1 !== $parsed2) {
            print var_export($parsed1, true) . "\n!==\n" . var_export($parsed2, true) . "\n\n";
        }
    }
}

unparse_url_test();
Jesse
  • 1
    Calling `isset` for each URL part followed by a `strlen` is inefficient because `parse_url` only populates the return array with parts that have a value. The only `strlen` calls that return false are passed null in your code, and even then it would be more efficient to compare to null than call a function. – NobleUplift Feb 04 '16 at 17:03
  • To further my point, parse_url returns `array("path"=>"://:@:1/")` for the last URL in your example. Completely strips out `"query"` and `"fragment"`. – NobleUplift Feb 04 '16 at 17:50
  • 2
    That's true if `unparse_url()` is only being called as `unparse_url(parse_url(...))`, but of course code using `unparse_url()` is going to *do something* to the result of `parse_url()` before giving it to `unparse_url()`. This may involve adding keys with empty values, and so this function needs to handle that correctly. – Jesse Feb 04 '16 at 22:05
  • Fair point if that's your use case, but during my QA testing earlier today I found a test case you missed: `unparse_url(parse_url('http://:@foo:1/path/path?#'))` will result in `http://foo:1/path/path`. Now, to my knowledge no service uses an empty string as the anonymous/guest username, but it's still a mismatch. – NobleUplift Feb 04 '16 at 23:25
  • 1
    It seems `user` and `pass` are the only two things for which `parse_url()` distinguishes between being empty (`""`) and unset. I've updated the code, thanks. – Jesse Feb 05 '16 at 03:42
4

This answer is an appendix to the accepted answer by @BradMace. I originally added it as a comment, but he suggested posting it as a separate answer, so here it is.

The original answer, which uses http_build_url($parse) provided by pecl_http, works for version 1.x of the extension; versions 2.x and later are object-oriented and the syntax has changed.

In newer versions (tested on pecl_http v3.2.3) the implementation should be:

$httpUrl = new \http\Url($parsed);
$url = $httpUrl->toString();
Tomasz Kapłoński
  • Documentation: https://mdref.m6w6.name/http/Url/__construct and https://mdref.m6w6.name/http/Url#Functions: – Ken May 10 '22 at 07:34
4

A decade later using the decades old method :)

Assuming you will always have the scheme and host, and optionally the path, query and fragment keys:

$a = parse_url('https://example.com/whatever-season/?drink=water&sleep=better');

Build back with classy sprintf:

function buildUrl(array $a) {
    // !empty() avoids an "undefined index" warning when query or fragment is absent
    return sprintf('%s://%s%s%s%s',
        $a['scheme'], $a['host'], $a['path'] ?? '',
        !empty($a['query']) ? '?' . $a['query'] : '',
        !empty($a['fragment']) ? '#' . $a['fragment'] : '');
}

The optional path key is handled by the null coalescing operator (PHP 7+).

  • 1
    It's actually not a gimme to have scheme and host--you're assuming a valid and full URL was passed to `parse_url()`. Try looking at it with a relative URL. `parse_url('/abc/def?ghi');` – haz Oct 07 '20 at 02:27
3

Another implementation:

function build_url(array $elements) {
    $e = $elements;
    return
        (isset($e['host']) ? (
            (isset($e['scheme']) ? "$e[scheme]://" : '//') .
            (isset($e['user']) ? $e['user'] . (isset($e['pass']) ? ":$e[pass]" : '') . '@' : '') .
            $e['host'] .
            (isset($e['port']) ? ":$e[port]" : '')
        ) : '') .
        (isset($e['path']) ? $e['path'] : '/') .
        (isset($e['query']) ? '?' . (is_array($e['query']) ? http_build_query($e['query'], '', '&') : $e['query']) : '') .
        (isset($e['fragment']) ? "#$e[fragment]" : '')
    ;
}

The results should be:

{
    "host": "example.com"
}
/* //example.com/ */

{
    "scheme": "https",
    "host": "example.com"
}
/* https://example.com/ */

{
    "scheme": "http",
    "host": "example.com",
    "port": 8080,
    "path": "/x/y/z"
}
/* http://example.com:8080/x/y/z */

{
    "scheme": "http",
    "host": "example.com",
    "port": 8080,
    "user": "anonymous",
    "query": "a=b&c=d",
    "fragment": "xyz"
}
/* http://anonymous@example.com:8080/?a=b&c=d#xyz */

{
    "scheme": "http",
    "host": "example.com",
    "user": "root",
    "pass": "stupid",
    "path": "/x/y/z",
    "query": {
        "a": "b",
        "c": "d"
    }
}
/* http://root:stupid@example.com/x/y/z?a=b&c=d */

{
    "path": "/x/y/z",
    "query": "a=b&c=d"
}
/* /x/y/z?a=b&c=d */
mpyw