This question is about the proper use of rawurlencode, http_build_query & htmlspecialchars.
Until now, my standard way of creating an HTML link in vanilla PHP was this:
$qs = [
'foo' => 'foo~bar',
'bar' => 'bar foo',
];
echo '<a href="?' . http_build_query($qs) . '">Link</a>';
Recently I have learned that this is not 100% correct. Here are a few issues:
http_build_query uses PHP_QUERY_RFC1738 by default instead of PHP_QUERY_RFC3986. RFC 3986 is the current standard and superseded RFC 1738, which PHP keeps only for legacy use.
While the "special" HTML characters in the key and value parts are percent-encoded, the argument separator is a bare ampersand. In most sane situations this would not be a problem, but sometimes your key name might be quot; and then your link becomes invalid:
$qs = [
'a' => 'a',
'quot;' => 'bar',
];
echo '<a href="?' . http_build_query($qs) . '">Link</a>';
The code above emits ?a=a&quot%3B=bar in the href, and the browser reads &quot as a double quote, so the link becomes ?a=a"%3B=bar!
IMO this implies that http_build_query needs to be called context-aware: with &amp; as the third argument when in HTML, and with just & when used in header('Location: ...'). Another option would be to pass its result through htmlspecialchars before displaying it in HTML.
The PHP manual for urlencode (which should have been deprecated long ago IMO) suggests encoding only the value part of the query string and then passing the whole query string through htmlentities before displaying it in HTML. This looks very incorrect to me; the key part could still contain forbidden URL characters.
$query_string = 'foo=' . urlencode($foo) . '&bar=' . urlencode($bar);
echo '<a href="mycgi?' . htmlentities($query_string) . '">';
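To make the first two issues concrete, here is a minimal sketch that prints what http_build_query actually returns. The variable names are my own, and I pass the separator explicitly in the RFC 3986 calls so the result does not depend on the arg_separator.output ini setting.

```php
<?php
$qs = ['foo' => 'foo~bar', 'bar' => 'bar foo'];

// Default encoding type (PHP_QUERY_RFC1738): spaces are encoded as "+".
$legacy = http_build_query($qs);

// PHP_QUERY_RFC3986: spaces are encoded as "%20" and "~" is left alone.
$modern = http_build_query($qs, '', '&', PHP_QUERY_RFC3986);

// The problematic key: the bare "&" separator followed by "quot"
// looks like the start of an HTML character reference.
$raw = http_build_query(['a' => 'a', 'quot;' => 'bar'], '', '&', PHP_QUERY_RFC3986);

echo $legacy, PHP_EOL;
echo $modern, PHP_EOL; // foo=foo~bar&bar=bar%20foo
echo $raw, PHP_EOL;    // a=a&quot%3B=bar
```

The last string is a perfectly valid query string on its own; it only becomes a problem once it is placed unescaped inside an HTML attribute.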
My conclusion is to do something along these lines:
$qs = [
'a' => 'a',
'quot;' => 'bar foo',
];
echo '<a href="?' . http_build_query($qs, '', '&amp;', PHP_QUERY_RFC3986) . '">Link</a>';
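For comparison, the htmlspecialchars option could be wrapped in two small helpers, one per output context. This is only a sketch: query_for_header and query_for_html are hypothetical names I made up, not PHP functions, and I am assuming UTF-8 output.

```php
<?php
// Hypothetical helper: query string for non-HTML contexts such as a Location header.
function query_for_header(array $params): string
{
    return http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

// Hypothetical helper: the same query string, escaped for use inside an HTML attribute.
function query_for_html(array $params): string
{
    return htmlspecialchars(query_for_header($params), ENT_QUOTES, 'UTF-8');
}

$qs = [
    'a' => 'a',
    'quot;' => 'bar foo',
];

// HTML context: the separator comes out as "&amp;".
echo '<a href="?' . query_for_html($qs) . '">Link</a>', PHP_EOL;

// Header context (left commented out so the sketch runs from the CLI):
// header('Location: ?' . query_for_header($qs));
```

For this input both approaches give the same string, because the keys and values are already percent-encoded and the separator is the only character left for htmlspecialchars to escape.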
What is the recommended way to create HTML links in PHP? Is there an easier way than what I came up with? Have I missed any crucial points?