319

Is it safe to pass raw base64 encoded strings via GET parameters?

Alix Axel

11 Answers

334

There are additional base64 specs (see the table here for specifics). Essentially you need 65 characters to encode: 64 for the alphabet plus one for padding. Letters and digits cover 26 lowercase + 26 uppercase + 10 digits = 62 of them.

You need two more ['+', '/'] and a padding char '='. None of them is URL-friendly, so just use different chars for them and you're set. The standard ones from the chart above are ['-', '_'], but you could use other chars as long as you decode them the same way and don't need to share the result with others.

I'd recommend just writing your own helpers, like these from the comments on the PHP manual page for base64_encode:

function base64_url_encode($input) {
 return strtr(base64_encode($input), '+/=', '._-');
}

function base64_url_decode($input) {
 return base64_decode(strtr($input, '._-', '+/='));
}
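
To make the substitution concrete, here is a minimal sketch that runs the two helpers above on a sample value. The two-byte input is only an illustrative choice, picked because its standard encoding happens to contain all three problem characters.

```php
<?php
// Helpers copied from the answer above.
function base64_url_encode($input) {
    return strtr(base64_encode($input), '+/=', '._-');
}

function base64_url_decode($input) {
    return base64_decode(strtr($input, '._-', '+/='));
}

// Illustrative input: these two bytes encode to "+/8=" in standard base64.
$bytes = "\xfb\xff";

$standard = base64_encode($bytes);      // "+/8="
$urlSafe  = base64_url_encode($bytes);  // "._8-"

echo $standard, "\n";
echo $urlSafe, "\n";

// The round trip returns the original bytes.
var_dump(base64_url_decode($urlSafe) === $bytes); // bool(true)
```

The output has the same length as standard base64, since each problem character is swapped one-for-one.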
ArchimedesMP
Joe Flynn
  • 57
    Great solution, except comma is not unreserved in URLs. I recommend using '~' (tilde) or '.' (dot) instead. – kralyk Feb 01 '13 at 14:19
  • 16
    @kralyk: I recommend just using `urlencode`, as suggested in rodrigo-silveira's answer. Creating two new functions to save a few characters of URL length is like entering your house through the window instead of just using the door. – Marco Demaio Feb 26 '14 at 11:27
  • 8
    @MarcoDemaio, without knowing how it will be used, it's impossible to say that it's just a few characters. Every encoded character will have triple the length, and why wouldn't "+++..." be a valid base64 string? URLs have browser limits, and tripling a URL might make you hit those limits. – leewz Sep 04 '15 at 18:44
  • 1
    Ironically, @kralyk suggests tilde, and yet tilde is *not* a URL-safe character! Lots of misinformation floating around. :) – Randal Schwartz Sep 29 '15 at 15:45
  • 11
    @RandalSchwartz tilde _is_ URL-safe. From RFC3986: `unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"` – kralyk Sep 30 '15 at 10:53
  • 1
    Ahh, that's a relatively recent change. I was using legacy information. – Randal Schwartz Oct 01 '15 at 17:03
  • 5
    Since `,` should be urlencoded to `%2C`, I suggest using `._-` instead of `-_,` like the only variant in https://en.wikipedia.org/wiki/Base64#Variants_summary_table that keeps the trailing = – PaulH Jul 03 '16 at 12:26
  • Note that this is not entirely safe: the last character of your URL can end up being a `.`, which some mail clients then do not treat as part of the URL. I still recommend the replacement proposed here, though, because some mail clients collapse `//` to `/` in URLs and also do not accept trailing `=` signs as part of a URL. – MaPePeR Jan 08 '19 at 07:50
  • To clarify: Don't invent your own url-safe 64. Use [base64url](https://tools.ietf.org/html/rfc4648#section-5). That uses `minus` and `underline` (with `equals` for pad), as Joe Flynn's answer describes. – ToolmakerSteve Dec 12 '20 at 00:44
261

No, you would need to URL-encode it, since base64 strings can contain the "+", "=" and "/" characters, which could alter the meaning of your data (a "/", for example, can look like a sub-folder).

Valid base64 characters are below.

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=
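
As a rough illustration of what URL-encoding does to those last three characters, the sketch below encodes a sample string whose base64 form contains "/", "+" and "=". The input value is an arbitrary choice for demonstration.

```php
<?php
// Illustrative input whose standard base64 form contains '/', '+' and '='.
$base64 = base64_encode('<<???>>');   // "PDw/Pz8+Pg=="

// urlencode() percent-encodes the characters that have a meaning in URLs.
$escaped = urlencode($base64);        // "PDw%2FPz8%2BPg%3D%3D"

echo $base64, ' -> ', $escaped, "\n";
echo strlen($base64), ' -> ', strlen($escaped), " characters\n"; // 12 -> 20
```

Letters and digits pass through unchanged; only the three special characters grow from one character to three.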
Thiyagaraj
  • 3,585
  • 1
  • 18
  • 15
  • 6
    URLencoding is a waste of space, especially as base64 itself leaves many characters unused. – Michał Górny Sep 03 '09 at 18:34
  • 35
    I am not sure I understand what you are saying - URL encoding won't alter any of the characters except the last three characters in the list above, and that is to prevent them from being interpreted incorrectly, since they have other meanings in URLs. The same goes for base64: the original data could be binary or anything, but it is encoded in a form that can be transmitted easily using simple protocols. – Thiyagaraj Sep 03 '09 at 19:42
  • 5
    Firstly, you should escape '+' too, as it may be converted into a space. Secondly, there are at least a few characters which are safe for use in URLs and aren't used in the ‘standard’ charset. Your method can even increase the size of the transferred data _three times_ in certain situations, while replacing those characters with others will do the trick and preserve the length. And it's quite a standard solution too. – Michał Górny Sep 03 '09 at 20:58
  • 14
    http://en.wikipedia.org/wiki/Base64#URL_applications — it says clearly that escaping ‘makes the string unnecessarily longer’ and mentions the alternate charset variant. – Michał Górny Sep 03 '09 at 23:02
  • 2
    Because of this answer, I diagnosed my problem as being exactly what it mentioned. Some of the base 64 characters (+,/,=) were being altered because of URL processing. When I URL encoded the base 64 string, the problem was resolved. – Chuck Krutsinger Jan 29 '15 at 21:13
  • 4
    @MichałGórny If you're using JSON as a GET parameter, Base 64 encoding will (depending on your data) likely reduce the size of the request string. (And before you say this is a silly idea, we're using JSON in query strings to facilitate deep linking into our app.) For our app, this approach achieved a reduction of about 30%. (To be fair, an even greater reduction could be achieved by avoiding Base64 entirely and instead writing our own JSON (de)serializers that use URL-encoding-friendly characters, e.g. `(['` instead of `{["`.) – rinogo Oct 26 '15 at 20:28
  • 2
    I'm assuming you mean that reduction was from doing base64 before url-encoding, which makes perfect sense to me, because url-encoding is really inefficient. – Arlen Beiler Sep 13 '19 at 20:46
  • You forgot spaces... If it's a base64 of a binary file, it can include spaces (which are sometimes ignored, but not always)... – spa900 Dec 04 '20 at 16:00
  • @spa900 - the 64-character encoding set "of a binary file" you are referring to must be different than the standard "Base64", which does NOT include the space character. It uses only PRINTABLE characters (table is in the wiki link of comments above). – ToolmakerSteve Dec 12 '20 at 00:39
95

@joeshmo Or instead of writing a helper function, you could just urlencode the base64 encoded string. This would do the exact same thing as your helper function, but without the need of two extra functions.

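$str = 'Some String';

// Escape '+', '/' and '=' before the value goes into the URL.
$encoded = urlencode( base64_encode( $str ) );
// Standalone round trip. A value read from $_GET is already URL-decoded,
// so in that case call base64_decode() directly and skip urldecode().
$decoded = base64_decode( urldecode( $encoded ) );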
rodrigo-silveira
  • 3
    The result is not exactly the same. urlencode uses 3 characters to encode non-valid characters and joeshmo's solution uses 1. It's not a big difference, but it's still a waste. – Josef Borkovec Mar 01 '13 at 12:07
  • 1
    @JosefBorkovec Really? Then this would also mean that the same number of bytes run through base64 and then URL-encoding could produce results of varying length, while the other solution gives a predictable length, right? – humanityANDpeace Mar 29 '14 at 14:33
  • 1
    @humanityANDpeace Yes, urlencode is a shitty solution because it triples the size of certain base64 strings. You also can't reuse the buffer since the output is larger than the input. – Navin Jan 06 '16 at 10:13
  • 10
    Expansion from 1 to 3 chars occurs on 3 out of 64 characters on average, so it is a 9% overhead (2*3/64) – PaulH Jul 03 '16 at 12:32
  • Be careful with `/` character if you pass it not as a GET parameter, but as a path in the URL. It will change your path if you don't replace `/` with something else on both sides. – Tom Raganowicz Sep 01 '17 at 11:15
  • base64 should not be decoded until after url parsing has already taken place, so mistaking a + or / as being part of the url after decoding the base64 should not be an issue. Parse the path of the url first, then if there are base64 segments in the path, decode those individually. Same goes for url parameters. – theferrit32 Mar 22 '18 at 15:30
  • Shouldn't the second line be urldecode(base64_decode($encoded))? It seems backwards to me. – Mfoo Mar 28 '19 at 16:39
  • URL encoding the contents of the query params could be problematic with some rest clients, we faced that exact issue when encoding our Base64 query params before sending them, when using programmatic clients like RestAssured or Groovy http connection it would work well, but when using Postman or Curl the contents of the query param were different. The reason seems to be that some clients perform an extra encoding to the url, so the query params ended up going to the server with double encoding – raspacorp Jul 10 '19 at 21:13
  • My choice. In Java: `url = URLEncoder.encode(Base64.getEncoder().encodeToString(value), StandardCharsets.UTF_8)` `Base64.getDecoder().decode(URLDecoder.decode(code, StandardCharsets.UTF_8))` – Grigory Kislin Apr 30 '20 at 16:09
  • Unfortunately, the "decode" part of this answer is wrong. "urldecode" is done for you automatically when php populates $_GET, and doing "urldecode" again will be wrong for `+`. See [Jeffory Beckers' answer](https://stackoverflow.com/a/12591941/199364) for the details. – ToolmakerSteve Dec 12 '20 at 00:58
49

Introductory Note: I'm inclined to post a few clarifications since some of the answers here were a little misleading (if not incorrect).

The answer is NO, you cannot simply pass a base64 encoded parameter within a URL query string since plus signs are converted to a SPACE inside the $_GET global array. In other words, if you sent test.php?myVar=stringwith+sign to

//test.php
print $_GET['myVar'];

the result would be:
stringwith sign

The easy way to solve this is to simply urlencode() your base64 string before adding it to the query string to escape the +, =, and / characters to %## codes. For instance, urlencode("stringwith+sign") returns stringwith%2Bsign

When you process the action, PHP takes care of decoding the query string automatically when it populates the $_GET global. For example, if I sent test.php?myVar=stringwith%2Bsign to

//test.php
print $_GET['myVar'];

the result would be:
stringwith+sign

You do not want to urldecode() the returned $_GET string as +'s will be converted to spaces.
In other words if I sent the same test.php?myVar=stringwith%2Bsign to

//test.php
$string = urldecode($_GET['myVar']);
print $string;

the result is an unexpected:
stringwith sign

It would be safe to rawurldecode() the input, however, it would be redundant and therefore unnecessary.
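
The three cases above can be reproduced without a web server. This is a minimal sketch in which parse_str() stands in for PHP populating $_GET (an assumption made for the sake of a self-contained example); the variable names are illustrative.

```php
<?php
// What a client would send: the value is escaped before it goes into the URL.
$queryString = 'myVar=' . urlencode('stringwith+sign'); // myVar=stringwith%2Bsign

// parse_str() stands in for PHP filling $_GET here; it decodes the value once.
parse_str($queryString, $get);

echo $get['myVar'], "\n";               // stringwith+sign
echo urldecode($get['myVar']), "\n";    // stringwith sign  (a second decode turns + into a space)
echo rawurldecode($get['myVar']), "\n"; // stringwith+sign  (harmless but redundant)
```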

Community
  • 1
  • 1
  • 1
    Nice answer. You can use PHP code without the starting and ending tags on this site if the question is tagged [tag:php] (also most often it's clear from the context of the question). If you add two spaces at the end of a line you will see the `<br>`, so no need to type much HTML. I hope this helps, I edited your answer a little to even more improve it. – hakre Sep 26 '12 at 08:32
  • 1
    Thank you for mentioning that PHP decodes the URL for you. That saves me from falling inside a rabbit hole. – Cocest Dec 05 '19 at 15:04
  • Great Answer -> You do not want to urldecode() the returned $_GET string as +'s will be converted to spaces. It would be safe to rawurldecode() the input, however, – MarcoZen Jun 04 '20 at 22:22
19

Yes and no.

The basic charset of base64 may in some cases collide with traditional conventions used in URLs. But many base64 implementations allow you to change the charset to match URLs better, or even come with a URL-safe variant (like Python's urlsafe_b64encode()).

Another issue you may face is the limit on URL length, or rather the lack of a defined limit. Because the standards do not specify a maximum length, browsers, servers, libraries and other software working with the HTTP protocol may each define their own limits.
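
To get a feel for how quickly a payload eats into a length budget, here is a small sketch. The function name, the example.com URL and the 2000-character budget are all assumptions for illustration, not limits taken from any standard.

```php
<?php
// Length of standard, padded base64 for $n input bytes: 4 * ceil(n / 3).
// Hypothetical helper name, used only for this illustration.
function base64_encoded_length(int $n): int {
    return 4 * intdiv($n + 2, 3);
}

$payload = random_bytes(1500);       // illustrative payload size
$encoded = base64_encode($payload);

echo strlen($encoded), "\n";                                // 2000
var_dump(strlen($encoded) === base64_encoded_length(1500)); // bool(true)

// Hypothetical budget; use whatever the clients and servers involved accept.
$maxUrlLength = 2000;
$url = 'https://example.com/test.php?myVar=' . urlencode($encoded);
var_dump(strlen($url) <= $maxUrlLength);                    // bool(false)
```

Base64 alone adds roughly a third to the payload size, before any URL-encoding of "+", "/" and "=" is counted.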

Jørgen
Michał Górny
13

Here is a base64url encoder you can try out; it's just an extension of joeshmo's code above.

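function base64url_encode($data) {
    // Swap '+' and '/' for the URL-safe '-' and '_', then drop the '=' padding.
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64url_decode($data) {
    $base64 = strtr($data, '-_', '+/');
    // Restore the padding: str_pad() takes the target total length, not the number of pad characters.
    $paddedLength = strlen($base64) + (4 - strlen($base64) % 4) % 4;
    return base64_decode(str_pad($base64, $paddedLength, '=', STR_PAD_RIGHT));
}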
Andy
5

I don't think that this is safe because, for example, the "=" character is used in raw base64 and is also used to separate parameter names from their values in an HTTP GET query string.

Mischa
2

If you have the sodium extension installed and need to encode binary data, you can use the sodium_bin2base64 function, which lets you select a URL-safe variant.

For example, encoding can be done like this:

$string = sodium_bin2base64($binData, SODIUM_BASE64_VARIANT_URLSAFE);

and decoding:

$result = sodium_base642bin($base64String, SODIUM_BASE64_VARIANT_URLSAFE);

For more info about usage, check out the PHP docs:

https://www.php.net/manual/en/function.sodium-bin2base64.php
https://www.php.net/manual/en/function.sodium-base642bin.php

Biegacz
1

For URL-safe encoding, like base64.urlsafe_b64encode(...) in Python, the code below works for me 100% of the time:

function base64UrlSafeEncode(string $input)
{
   return str_replace(['+', '/'], ['-', '_'], base64_encode($input));
}
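
The answer only shows the encoding side. A matching decoder might look like the sketch below; base64UrlSafeDecode is a hypothetical name and is not part of the answer above, and the sample bytes are an arbitrary illustration.

```php
<?php
// Encoder from the answer above.
function base64UrlSafeEncode(string $input)
{
    return str_replace(['+', '/'], ['-', '_'], base64_encode($input));
}

// Hypothetical counterpart: reverse the substitution, then decode.
function base64UrlSafeDecode(string $input)
{
    // Strict mode makes base64_decode() return false on invalid characters.
    return base64_decode(str_replace(['-', '_'], ['+', '/'], $input), true);
}

$encoded = base64UrlSafeEncode("\xfb\xff");
echo $encoded, "\n";                                    // -_8=
var_dump(base64UrlSafeDecode($encoded) === "\xfb\xff"); // bool(true)
```

Note that this variant keeps the trailing "=", so the padding still needs escaping if it matters in your URLs.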
Igor
0

In theory, yes, as long as you don't exceed the maximum URL and/or query string length for the client or server.

In practice, things can get a bit trickier. For example, it can trigger an HttpRequestValidationException on ASP.NET if the value happens to contain an "on" and you leave in the trailing "==".

Nicole Calinoiu
0

Those using .NET can use the Encode and Decode methods of the Base64UrlEncoder class, which is found in the Microsoft.IdentityModel.Tokens package, v6.31.0.

katrash