122

I'm making a search page where you type a search query and the form is submitted to search.php?query=your query. Which PHP function is best for encoding/decoding the search query?

Peter Mortensen
Ali
  • 3
    Do you experience any problems? Browsers and PHP should handle this automatically already (e.g. putting `foo bar` in a text field, creates `foo+bar` in the URL). – Felix Kling Jan 20 '11 at 08:33
  • @Felix I'm going to call the searcher script using `file_get_contents` – Ali Jan 20 '11 at 08:36

6 Answers

227

For the URI query value use urlencode/urldecode; for anything else use rawurlencode/rawurldecode.

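To create an entire query string, use http_build_query().

The difference between urlencode and rawurlencode is that urlencode encodes according to application/x-www-form-urlencoded (a space becomes +), while rawurlencode encodes according to plain percent-encoding (a space becomes %20).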
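A minimal sketch of how the two functions treat the same input, followed by http_build_query() assembling a complete query string. The example.com host and the variable names are assumptions for illustration only:

```php
<?php
$query = 'foo bar & baz';

// Form-style encoding: a space becomes "+"
$formEncoded = urlencode($query);      // foo+bar+%26+baz

// Plain percent-encoding: a space becomes "%20"
$rawEncoded = rawurlencode($query);    // foo%20bar%20%26%20baz

// Let http_build_query() encode the value and assemble the query string
// (example.com is a placeholder host)
$url = 'http://example.com/search.php?' . http_build_query(['query' => $query], '', '&');

echo $formEncoded, "\n", $rawEncoded, "\n", $url, "\n";
```

A URL built this way could then be handed to file_get_contents(), which is the scenario mentioned in the question's comments.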

Your Common Sense
Gumbo
  • *application/x-www-form-urlencoded* is just a special variant of the *Percent-Encoding* and is only applied to encode HTML form data. – Gumbo Jan 20 '11 at 08:44
  • 1
    @Click Upvote: They can’t be compared in that way. The *application/x-www-form-urlencoded* format *is* the *Percent-Encoding* format except that the space is encoded with `+` instead of `%20`. And besides that, *application/x-www-form-urlencoded* is used to encode form data while the *Percent-Encoding* has a more general usage. – Gumbo Jan 20 '11 at 10:25
  • 22
    **rawurlencode()** is compatible with javascript **decodeURI()** function – Clive Paterson Mar 03 '15 at 06:40
  • Is this rule always the empirical one? I mean, when I need to encode a query string I always use `urldecode`. Then, what about the URI path (e.g. `/a/path with spaces/`) and URI fragment (e.g. `#fragment`). Should I always use `rawurldecode` for these two? – tonix Sep 12 '18 at 16:53
  • A good rule of thumb is that for paths (like /my%20folder/) go with `rawurlencode`; but for POST and GET fields go with `urlencode` (like /?folder=my+folder) – Soroush Falahati Jan 29 '19 at 15:45
  • @SoroushFalahati do i need to encode basic parameters (e.g. `"name=b&age=c&location=d"`) sent to a PHP file via AJAX? – oldboy Jul 29 '19 at 01:06
  • @BugWhisperer; well if you are in PHP it means you are on the receiving end of a request. So no; you don't need to encode in PHP; you should rather decode; which is also done automatically with PHP. However, if you are trying to send requests from the PHP script to another URL you should consider encoding your parameters' values and parameters' names depending on the source of those variables with the `urlencode` function. – Soroush Falahati Jul 29 '19 at 02:51
  • @SoroushFalahati `"name=b&age=c&location=d"` is a string that is sent to a JS AJAX function that then posts the string to a PHP page which then parses the data and inserts it into a table. i still shouldnt be encoding this? couldnt an end user make a string up and send it to the JS function which then passes it to PHP to be inserted into a table??? theyre being POSTed, not GETed, so the parameter string isnt passed along via a url, but rather in the background – oldboy Jul 29 '19 at 06:16
  • @BugWhisperer, get arguments and post arguments both should be encoded. So yeah; you probably should. Some Ajax libraries depending on how you use them might do this automatically tho. That being said, this is not a security measure. Escaping here is just used for sending valid values. A user can send any value he/she wants regardless of escaping. Make sure you have validation checks in your Ajax Callback. – Soroush Falahati Jul 30 '19 at 17:52
  • @SoroushFalahati thanks. yeah i encoded the `[type=text]` and `[type=email]` with `encodeURIComponent` just to prevent characters like `&` from messing with the string. i did not encode `[type=number]` -- do you think this is a problem, leaving it unencoded? i am most certainly validating everything in the callback. appreciate it! – oldboy Jul 31 '19 at 00:16
24

The cunningly-named urlencode() and urldecode().

However, you shouldn't need to use urldecode() on variables that appear in $_POST and $_GET.
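To illustrate why a second decode can do harm, here is a small sketch. It uses parse_str(), which the PHP manual describes as parsing a string as if it were the query string passed via a URL, as a stand-in for how $_GET gets populated. The sample value is made up:

```php
<?php
// Simulate an incoming query string: ?query=1%2B1%3D2
$rawQueryString = 'query=' . urlencode('1+1=2');

// parse_str() decodes it once, like PHP does when filling $_GET
parse_str($rawQueryString, $params);
$alreadyDecoded = $params['query'];            // 1+1=2

// Decoding again corrupts the value: the literal "+" is read as a space
$decodedTwice = urldecode($alreadyDecoded);    // 1 1=2

var_dump($alreadyDecoded, $decodedTwice);
```

Values without `+` or `%` survive a redundant urldecode() unchanged, which is probably why the extra call often goes unnoticed.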

Oliver Charlesworth
  • 6
    could you please elaborate why urldecode() should not be used with $_POST. because i've been doing that since ages without any problems. – sid May 21 '19 at 13:18
  • do i need to encode basic parameters (e.g. `"name=b&age=c&location=d"`) sent to a PHP file via AJAX? – oldboy Jul 29 '19 at 01:07
  • Re *"you shouldn't need to"*: Why not? Because they are already encoded? – Peter Mortensen Dec 13 '22 at 21:40
  • @sid I think their point is that you don't need to. If you're encoding some strings for a URL link, and then reading that data on the landing page with `$_GET`, that data will get decoded by default without the need for you to run it through `urldecode()` too. – A Friend May 25 '23 at 20:13
13

Here is my use case, which requires an exceptional amount of encoding. Maybe you think it is contrived, but we run this in production. Coincidentally, this covers every type of encoding, so I'm posting it as a tutorial.

Use case description

Somebody just bought a prepaid gift card ("token") on our website. Tokens have corresponding URLs to redeem them. This customer wants to email the URL to someone else. Our web page includes a mailto link that lets them do that.

PHP code

// The order system generates some opaque token
$token = 'w%a&!e#"^2(^@azW';

// Here is a URL to redeem that token
$redeemUrl = 'https://httpbin.org/get?token=' . urlencode($token);

// Actual contents we want for the email
$subject = 'I just bought this for you';
$body = 'Please enter your shipping details here: ' . $redeemUrl;

// A URI for the email as prescribed
$mailToUri = 'mailto:?subject=' . rawurlencode($subject) . '&body=' . rawurlencode($body);

// Print an HTML element with that mailto link
echo '<a href="' . htmlspecialchars($mailToUri) . '">Email your friend</a>';

Note: the above assumes you are outputting to a text/html document. If your output media type is application/json, then simply use $retval['url'] = $mailToUri; because output encoding is handled by json_encode().
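A sketch of that JSON variant, assuming a response array named $retval as in the note. The token in the body is a shortened placeholder, not the real one:

```php
<?php
$subject = 'I just bought this for you';
// Placeholder redeem URL for illustration
$body = 'Please enter your shipping details here: https://httpbin.org/get?token=abc';

$mailToUri = 'mailto:?subject=' . rawurlencode($subject) . '&body=' . rawurlencode($body);

// No htmlspecialchars() here: json_encode() applies the escaping JSON needs.
// In a real response you would also send a Content-Type: application/json header.
$retval = ['url' => $mailToUri];
$json = json_encode($retval);

echo $json, "\n";
```

The URL-level encoding (urlencode/rawurlencode) stays the same; only the outermost, output-format layer changes.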

Test case

  1. Run the code on a PHP test site (is there a canonical one I should mention here?)
  2. Click the link
  3. Send the email
  4. Get the email
  5. Click that link

You should see:

"args": {
  "token": "w%a&!e#\"^2(^@azW"
}, 

And of course this is the JSON representation of $token above.
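The same round trip can be approximated in PHP alone. The sketch below rebuilds the link and then undoes each encoding layer in reverse order. The manual splitting of the mailto URI is a simplification of what a mail client does, not a full parser:

```php
<?php
$token = 'w%a&!e#"^2(^@azW';
$redeemUrl = 'https://httpbin.org/get?token=' . urlencode($token);
$body = 'Please enter your shipping details here: ' . $redeemUrl;
$mailToUri = 'mailto:?subject=' . rawurlencode('I just bought this for you')
           . '&body=' . rawurlencode($body);
$href = htmlspecialchars($mailToUri);

// Layer 1: the browser undoes the HTML escaping of the href attribute
$uri = htmlspecialchars_decode($href);

// Layer 2: the mail client percent-decodes each mailto field
$fields = [];
foreach (explode('&', substr($uri, strlen('mailto:?'))) as $pair) {
    [$name, $value] = explode('=', $pair, 2);
    $fields[$name] = rawurldecode($value);
}

// Layer 3: the receiving server decodes the redeem URL's query string
$urlInBody = substr($fields['body'], strpos($fields['body'], 'https://'));
parse_str(parse_url($urlInBody, PHP_URL_QUERY), $args);

var_dump($args['token'] === $token);
```

Each layer is decoded exactly once by whichever component consumes it, so the token arrives unchanged.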

Peter Mortensen
William Entriken
  • Equivalently, and less semantically (because `mailto:` is not HTTP), you can use `$mailToUri = 'mailto:?' . http_build_query(['subject'=>$subject, 'body'=>$body], '', '&', PHP_QUERY_RFC3986);`. – William Entriken Aug 21 '19 at 13:34
3

You can use URL encoding functions. PHP has the rawurlencode() function.

ASP.NET has the Server.URLEncode() function.

In JavaScript, you can use the encodeURIComponent() function.
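As a brief sketch of the PHP side (the /files/ path and the sample strings are assumptions for illustration), rawurlencode() suits path segments, and it escapes a few characters, such as the apostrophe, that JavaScript's encodeURIComponent() leaves unescaped:

```php
<?php
// Path segment: a space becomes %20
$folder = 'my folder';
$path = '/files/' . rawurlencode($folder) . '/';   // /files/my%20folder/

// The apostrophe is percent-encoded by rawurlencode()
$encoded = rawurlencode("it's");                   // it%27s

echo $path, "\n", $encoded, "\n";
```

Either form decodes back to the same string, so the two functions interoperate even though their output is not always byte-identical.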

Peter Mortensen
Amranur Rahman
1

Depending on which RFC standard you want to encode to, or if you need to customize the encoding, you might want to create your own class.

/**
 * UrlEncoder makes it easy to encode a URL according to a chosen standard.
 */
class UrlEncoder
{
    public const STANDARD_RFC1738 = 1;
    public const STANDARD_RFC3986 = 2;
    public const STANDARD_CUSTOM_RFC3986_ISH = 3;
    // add more here

    public static function encode(string $string, int $rfc): string
    {
        switch ($rfc) {
            case self::STANDARD_RFC1738:
                // Form-style encoding: a space becomes "+"
                return urlencode($string);
            case self::STANDARD_RFC3986:
                // Plain percent-encoding: a space becomes "%20"
                return rawurlencode($string);
            case self::STANDARD_CUSTOM_RFC3986_ISH:
                // Custom encoding: urlencode(), then restore the reserved characters
                $entities = ['%21', '%2A', '%27', '%28', '%29', '%3B', '%3A', '%40', '%26', '%3D', '%2B', '%24', '%2C', '%2F', '%3F', '%25', '%23', '%5B', '%5D'];
                $replacements = ['!', '*', "'", "(", ")", ";", ":", "@", "&", "=", "+", "$", ",", "/", "?", "%", "#", "[", "]"];
                return str_replace($entities, $replacements, urlencode($string));
            default:
                throw new Exception("Invalid RFC encoder - See class const for reference");
        }
    }
}

Use example:

$dataString = "https://www.google.pl/search?q=PHP is **great**!&id=123&css=#kolo&email=me@liszka.com)";

$dataStringUrlEncodedRFC1738 = UrlEncoder::encode($dataString, UrlEncoder::STANDARD_RFC1738);
$dataStringUrlEncodedRFC3986 = UrlEncoder::encode($dataString, UrlEncoder::STANDARD_RFC3986);
$dataStringUrlEncodedCustom = UrlEncoder::encode($dataString, UrlEncoder::STANDARD_CUSTOM_RFC3986_ISH);

// Dump the three results so they can be compared
var_dump($dataStringUrlEncodedRFC1738, $dataStringUrlEncodedRFC3986, $dataStringUrlEncodedCustom);

Will output:

string(126) "https%3A%2F%2Fwww.google.pl%2Fsearch%3Fq%3DPHP+is+%2A%2Agreat%2A%2A%21%26id%3D123%26css%3D%23kolo%26email%3Dme%40liszka.com%29"
string(130) "https%3A%2F%2Fwww.google.pl%2Fsearch%3Fq%3DPHP%20is%20%2A%2Agreat%2A%2A%21%26id%3D123%26css%3D%23kolo%26email%3Dme%40liszka.com%29"
string(86) "https://www.google.pl/search?q=PHP+is+**great**!&id=123&css=#kolo&email=me@liszka.com)"

* Find out more about RFC standards: https://datatracker.ietf.org/doc/rfc3986/ and urlencode vs rawurlencode?
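For the two standard variants, a custom class may not be strictly necessary when you are building a query string: http_build_query() accepts an encoding-type flag as its fourth argument. A small sketch with made-up parameters:

```php
<?php
$params = ['q' => 'PHP is great', 'id' => 123];

// Default form-style encoding (PHP_QUERY_RFC1738): a space becomes "+"
$rfc1738 = http_build_query($params, '', '&', PHP_QUERY_RFC1738);

// RFC 3986 percent-encoding: a space becomes "%20"
$rfc3986 = http_build_query($params, '', '&', PHP_QUERY_RFC3986);

echo $rfc1738, "\n", $rfc3986, "\n";
```

This only covers key/value query strings. Encoding an arbitrary string, as the class above does, still calls for urlencode() or rawurlencode() directly.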

DevWL
1

You know how people keep saying things like: "Never manually craft a JSON string in PHP -- always call json_encode() for stability/reliability."?

Well, if you are building a query string, then I say: "Never manually craft a URL query string in PHP -- always call http_build_query() for stability/reliability."

Demo:

$array = [
    'query' => 'your query',
    'example' => null,
    'Qbert says:' => '&%=#?/'
];

echo http_build_query($array);

echo "\n---\n";

echo http_build_query($array, '', '&amp;');

Output:

query=your+query&Qbert+says%3A=%26%25%3D%23%3F%2F
---
query=your+query&amp;Qbert+says%3A=%26%25%3D%23%3F%2F

The fine print on this function is that if an element in the input array has a null value, then that element will not be included in the output string.
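A short sketch of that fine print, using hypothetical keys: a null value drops the key entirely, whereas an empty string keeps the key with an empty value.

```php
<?php
$params = [
    'query'   => 'your query',
    'skipped' => null,   // null: the key is omitted from the output
    'empty'   => '',     // empty string: the key is kept with no value
];

$queryString = http_build_query($params, '', '&');

echo $queryString, "\n";
```

If the receiving script needs to see a parameter even when it has no value, passing an empty string instead of null is one way to keep it in the URL.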

Here is an educational answer on the Joomla Stack Exchange site which encourages the use of &amp; as the custom delimiter: Why are Joomla URL query strings commonly delimited with "&amp;" instead of "&"?

Packaging your query string data in array form first offers a compact and readable structure; the call to http_build_query() then does the hard work and can prevent data corruption. I generally opt for this technique even when constructing small query strings.

Peter Mortensen
mickmackusa