1

I've been using the following function to remove a GET parameter from a URL for a long time:

function removeGetParameter($url, $varname) {
    return preg_replace('/([?&])'.$varname.'=[^&]+(&|$)/','$1',$url);
}

(taken from this thread)

But I have noticed that if I remove all parameters, I'm left with a trailing question mark ? at the end of the string. I can't just trim a trailing ? off, as it might be part of the value of another GET parameter (e.g. http://example.com?myquestion=are+you+here?).
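To make the problem concrete, here is a minimal reproduction using the regex function quoted above, unchanged. The example URLs are only illustrative.

```php
<?php
// The regex-based function from the question, copied verbatim.
function removeGetParameter($url, $varname) {
    return preg_replace('/([?&])'.$varname.'=[^&]+(&|$)/','$1',$url);
}

// Removing the only parameter leaves the separator behind.
echo removeGetParameter('http://example.com?a=1', 'a'), "\n";
// http://example.com?

// With a second parameter present the result looks fine.
echo removeGetParameter('http://example.com?a=1&b=2', 'a'), "\n";
// http://example.com?b=2
```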

I came up with this solution, but I doubt its efficiency.

function removeGetParameter($url, $varname) {
    $p = parse_url($url);
    parse_str($p['query'], $vars);
    unset($vars[$varname]);
    $vars_str = (count($vars) ? '?':'').http_build_query($vars);
    return $p['scheme'].'://'.$p['host'].$vars_str;
}

It gets the job done, but I believe it is slower than many other options.

Is there any common, safe method to remove a specific GET parameter from a given URL?

Thanks!


Edit: I added an if condition to be safer (the URL might have no parameters at all):

function removeGetParameter($url, $varname) {

    $p = parse_url($url); // parse the url to parts.
    $final = $p['scheme'].'://'.$p['host']; // build the url from the protocol and host.

    if(!empty($p['query'])) { // if we have any get parameters
        parse_str($p['query'], $vars); // make an array of them ($vars)
        unset($vars[$varname]); // unset the wanted
        $vars_str = (count($vars) ? '?':'').http_build_query($vars); // if has any variables, add a question mark
        return $final.$vars_str; // merge all together
    } else {
        return $final; // no variables needed.
    }
}

Is this the optimal solution?

Itay Ganor
  • Break into parts with `parse_url()`, break the query part with `parse_str()`, unset the arg you don't want, then rebuild the URL with `http_build_query()`. – Alex Howansky Dec 14 '17 at 20:26
  • @AlexHowansky That's what I did in my function. I guess I need to add an `if` condition to make sure there's a query in the URL. But I'm looking for a nicer solution, if one even exists. – Itay Ganor Dec 14 '17 at 20:28
  • Oh ha, I only noticed the first function, sorry. – Alex Howansky Dec 14 '17 at 20:29
  • FYI, for something like `http://example.com?myquestion=are+you+here?`, the second question mark (the one that's part of the parameter value) really should have been encoded by whoever/whatever generated that URL. So the example that you gave really should never occur. – Patrick Q Dec 14 '17 at 20:32
  • A URL isn't only a scheme, a host and a query. Look at the [*thomas at gielfeldt dot com* `unparse_url` function](http://php.net/manual/en/function.parse-url.php) in the notes. And no, there isn't a nicer way to do it (unless you want to build a wrong regex pattern). – Casimir et Hippolyte Dec 14 '17 at 20:41
  • You don't need the `if()` statement -- `parse_str()` works just fine on an empty string. Just call it with a default like `parse_str($p['query'] ?? '', $vars)` – Alex Howansky Dec 14 '17 at 20:43
  • Ditto for the `count()` check -- `http_build_query()` works fine with empty input. – Alex Howansky Dec 14 '17 at 20:43
  • @AlexHowansky I need the `if()` because if the URL has no GET parameters at all, I get an error in the error_log. As for the `count()` check: I need it so that I don't end up with a pointless question mark at the end of the string. Correct me if I'm wrong. – Itay Ganor Dec 14 '17 at 20:49
  • You're not getting an error, you're getting a warning about referring to an array index that doesn't exist -- the `??` operator will take care of that. And a lone question mark on the end is fine, it won't break anything. – Alex Howansky Dec 14 '17 at 20:51
  • @AlexHowansky I need to get rid of that lone question mark because I store these URLs in my database, and my goal is to keep the table small, so I try my best to avoid saving duplicate values. This question mark can lead to such duplicates. Can you give me an example with the `??` operator? Thank you! – Itay Ganor Dec 14 '17 at 20:56
  • `parse_str($p['query'] ?? '', $vars);` – Alex Howansky Dec 14 '17 at 20:57
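The two points made in the comments above (the `??` default and the empty-input behaviour) can be checked with a short sketch. It assumes PHP 7 or later, since that is where the `??` operator was introduced; the URL is just an example.

```php
<?php
// A URL with no query string: parse_url() returns no 'query' key for it.
$p = parse_url('http://example.com/page');

// ?? falls back to '' when the key is missing, so no notice or warning is raised.
parse_str($p['query'] ?? '', $vars);
var_dump($vars === []);   // bool(true): parsing an empty string yields an empty array

// http_build_query() on an empty array returns an empty string.
$query = http_build_query($vars);
var_dump($query === '');  // bool(true)
```

So neither the `if()` nor the `count()` check is needed to avoid warnings. The only remaining decision is whether to prepend the `?` when `$query` is empty.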

2 Answers

2

Alright, my final code:

function removeGetParameter($url, $varname) {
    $p = parse_url($url); // parse the url to parts.

    parse_str($p['query'] ?? '', $vars); // make an array of the parameters if exist ($vars)
    unset($vars[$varname]); // unset the unwanted
    $vars_str = http_build_query($vars); // build the query string

    return unparse_url($p, $vars_str);
}


function unparse_url($p, $custom_query = NULL) {
    // my customization to http://php.net/manual/en/function.parse-url.php thomas at gielfeldt dot com
    $scheme   = isset($p['scheme']) ? $p['scheme'] . '://' : '';
    $host     = isset($p['host']) ? $p['host'] : '';
    $port     = isset($p['port']) ? ':' . $p['port'] : '';
    $user     = isset($p['user']) ? $p['user'] : '';
    $pass     = isset($p['pass']) ? ':' . $p['pass']  : '';
    $pass     = ($user || $pass) ? "$pass@" : '';
    $path     = isset($p['path']) ? $p['path'] : '';

    $toquery = $custom_query ?? ($p['query'] ?? ''); // get query string -> the given one, or the one that already exists.
    $query    = (strlen($toquery) > 0 ? '?'.$toquery : ''); // add the question mark only if has query.

    $fragment = isset($p['fragment']) ? '#' . $p['fragment'] : '';

    return "$scheme$user$pass$host$port$path$query$fragment";
}
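
For readers who want to try this quickly, here is a compact, self-contained sketch of the same parse/unset/rebuild approach with a few sample calls. The name `removeQueryParam` and the sample URLs are made up for illustration. It is not a replacement for the code above, just one way to exercise the idea.

```php
<?php
// Hypothetical helper (name chosen for this sketch): parse the URL, drop one
// query parameter, and rebuild every component that parse_url() returned.
function removeQueryParam(string $url, string $name): string
{
    $p = parse_url($url);
    if ($p === false) {
        return $url; // seriously malformed URL: leave it untouched
    }

    parse_str($p['query'] ?? '', $vars); // '' when there is no query at all
    unset($vars[$name]);
    $query = http_build_query($vars);    // '' when nothing is left

    $auth = $p['user'] ?? '';
    if (isset($p['pass'])) {
        $auth .= ':' . $p['pass'];
    }

    return (isset($p['scheme']) ? $p['scheme'] . '://' : '')
        . ($auth !== '' ? $auth . '@' : '')
        . ($p['host'] ?? '')
        . (isset($p['port']) ? ':' . $p['port'] : '')
        . ($p['path'] ?? '')
        . ($query !== '' ? '?' . $query : '') // no lone "?" when the query is empty
        . (isset($p['fragment']) ? '#' . $p['fragment'] : '');
}

echo removeQueryParam('http://example.com:8080/a/b.php?x=1&y=2#top', 'x'), "\n";
// http://example.com:8080/a/b.php?y=2#top
echo removeQueryParam('http://example.com?x=1', 'x'), "\n";
// http://example.com
echo removeQueryParam('http://example.com?myquestion=are+you+here?', 'x'), "\n";
// http://example.com?myquestion=are+you+here%3F
```

Note the last call: the remaining parameters are decoded by `parse_str()` and re-encoded by `http_build_query()`, so the unencoded `?` in the value comes back as `%3F`. `parse_str()` also converts dots and spaces in parameter names to underscores, so this approach normalizes the query string; it does not preserve it byte for byte.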

Thank you very much for your help :)

Itay Ganor
-1

You're right. The regex version is faster, even with a second wrapper regex to take care of the annoying ? at the end:

<?php
set_time_limit(0);

$url = 'http://google.com/?teste=waldson&name=patricio';

function removeGetParameter($url, $varname) {
    return preg_replace('#\\?$#', '', preg_replace('/([?&])'.$varname.'=[^&]+(&|$)/','$1',$url));
}

function removeGetParameter2($url, $varname) {
    $p = parse_url($url);
    parse_str($p['query'], $vars);
    unset($vars[$varname]);
    $vars_str = (count($vars) ? '?':'').http_build_query($vars);
    return $p['scheme'].'://'.$p['host'].$vars_str;
}


$startTime = microtime(TRUE);
for ($i = 0; $i < 300000; ++$i) {
    removeGetParameter($url, 'teste');
    removeGetParameter($url, 'name');
}
$totalTime = microtime(true) - $startTime;


$startTime2 = microtime(TRUE);
for ($i = 0; $i < 300000; ++$i) {
    removeGetParameter2($url, 'teste');
    removeGetParameter2($url, 'name');
}
$totalTime2 = microtime(true) - $startTime2;


echo 'Time regex: '  . $totalTime . '<br />';
echo 'Time parse_url: '  . $totalTime2;

I got these results:

Time regex: 2.1999499797821
Time parse_url: 2.8799331188202

This modified version of your removeGetParameter should do the work.
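
Before relying on it, it may be worth checking a couple of edge cases. The sketch below copies the regex function from the benchmark above unchanged and runs it on two inputs; the second URL is the example from the question.

```php
<?php
// The regex-based function from the benchmark, copied verbatim.
function removeGetParameter($url, $varname) {
    return preg_replace('#\\?$#', '', preg_replace('/([?&])'.$varname.'=[^&]+(&|$)/','$1',$url));
}

// Removing the LAST parameter leaves a dangling "&" behind.
echo removeGetParameter('http://google.com/?teste=waldson&name=patricio', 'name'), "\n";
// http://google.com/?teste=waldson&

// A trailing "?" that belongs to a parameter value is stripped,
// even though no parameter was removed.
echo removeGetParameter('http://example.com?myquestion=are+you+here?', 'teste'), "\n";
// http://example.com?myquestion=are+you+here
```

In addition, `$varname` is interpolated into the pattern as-is, so a name containing regex metacharacters would presumably need `preg_quote()` first. Whether these cases matter depends on the URLs you actually handle.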

Waldson Patricio
  • This is weird: with a small number of removals (something like 10 instead of 300,000), `parse_url` seems to be faster. – Itay Ganor Dec 14 '17 at 20:52
  • Maybe one of them is system-dependent (I'm not sure). Both solve your problem and run fast enough, so don't worry about it. "Premature optimization is the root of all evil." (Donald Knuth). – Waldson Patricio Dec 14 '17 at 21:01