86

I have a string and want to test, using PHP, whether it is valid base64-encoded data.

djot
Alias
  • 1
    This is probably a duplicate. – Gumbo Nov 25 '10 at 14:33
  • Beware of the `base64_encode(base64_decode($data, true)) === $data` technique. See the comments under: [Amir's answer @ Detect base64 encoding in PHP?](https://stackoverflow.com/a/17018924/2943403) which bang on about how many ways it fails and why. – mickmackusa Jan 20 '22 at 11:33

20 Answers

144

I realise that this is an old topic, but using the strict parameter isn't necessarily going to help.

Running base64_decode on a string such as "I am not base 64 encoded" will not return false.

If, however, you decode the string in strict mode and re-encode it with base64_encode, you can compare the result with the original data to determine whether it is a valid base64-encoded value:

if ( base64_encode(base64_decode($data, true)) === $data){
    echo '$data is valid';
} else {
    echo '$data is NOT valid';
}
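
As a rough illustration of how this round-trip check behaves, here is a small sketch. The function name `is_canonical_base64` is made up for this example. Passing the check only means the string is well-formed, canonically padded base64; it does not prove that anyone encoded it on purpose, which is why a plain word like "test" passes.

```php
<?php
// Hypothetical helper name; it wraps the round-trip comparison shown above.
function is_canonical_base64(string $data): bool
{
    // Strict mode returns false on characters outside the base64 alphabet.
    $decoded = base64_decode($data, true);
    if ($decoded === false) {
        return false;
    }
    // Re-encoding must reproduce the input exactly (same padding, no whitespace).
    return base64_encode($decoded) === $data;
}

var_dump(is_canonical_base64('bm9kZQ=='));                 // bool(true): the encoded form of 'node'
var_dump(is_canonical_base64('I am not base 64 encoded')); // bool(false): decodes, but does not round-trip
var_dump(is_canonical_base64('test'));                     // bool(true): four alphabet characters
```

The explicit `=== false` test keeps an empty or "0" decode result from being confused with a failed decode.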
PottyBert
  • Why will that not work? the string 'node' encodes to bm9kZQ== (I've tested it) – PottyBert Nov 06 '13 at 15:33
  • 5
    @Sam That is because "test" is a perfectly fine base64 string. It uses only base64 characters (a-z, A-Z, 0-9) and its length is cleanly divisible by four. Those are the only requirements for a valid base64 string. What did you expect to happen when entering "test"? – Kevin Sep 16 '14 at 13:57
  • @PottyBert Do you have any idea how to find what is wrong with it if your snippet returns not valid? I used several online encoders to encode a png image, all return a different base64 string and all are invalid according to this snippet. – Kevin Sep 16 '14 at 14:04
  • Worked like a charm, even with Laravel's crypt() library. – niczak Sep 30 '14 at 14:10
  • @Kevin if you get different results when base64 encoding the image, then something is wrong, the same data should encode the same way each time. – PottyBert Sep 30 '14 at 18:58
  • @PottyBert I know right ^^, I'm not sure what I was working on anymore though – Kevin Sep 30 '14 at 21:24
  • 1
    This will generate a warning if `$data` doesn't have valid characters because the second base64_decode will return **FALSE** and the first one will be encoding `false` as bool. `base64_decode( false ) === $data` so I recommend to put an @ to prevent a warning. – Zerquix18 Oct 15 '15 at 21:27
  • @catbadger assuming you mean "123412341234" as a string then it works, that's a valid base64 string, if you mean the integer, then it's not a string so won't work. – PottyBert Mar 13 '17 at 17:55
  • ok, so it's fine to base64 decode that then run decryption on it? i think not. – catbadger Mar 15 '17 at 13:23
  • @catbadger base64 is not encryption, it's a way of encoding data such that it is transmittable using mechanisms which only support the ASCII charset – PottyBert Mar 15 '17 at 13:29
  • I understand that @PottyBert. The issue is that in my specific application of this code, I need to base64 decode something that is sometimes not encrypted... some is not base64 decoded. There is no "good way" to tell whether something is base64 encoded was my point. – catbadger Mar 16 '17 at 19:20
  • 1
    @catbadger But that's not what this question was about, it was about determining if the data is valid base64, which the string "123412341234" is, just because that's not good enough for your purposes, doesn't mean that it isn't valid base64. In your instance, if you control the encoding of the data, you can mark it in some fashion prior to base64 encoding, that way you CAN determine whether you should pass it through decryption after decoding – PottyBert Mar 16 '17 at 19:25
  • 1
    NOTE: Running base64_encode(base64_decode($data, true)) on a string such as "test" will return true, because its length is a multiple of 4 and it only contains [A-Z, a-z, 0-9, and + /]. If the remaining length is less than 4, the string is padded with '=' characters, so to solve this problem, I run base64_decode twice (with strict mode) and base64_encode twice as well... This will evaluate all types of non-base64 to false. if ( base64_encode(base64_encode(base64_decode(base64_decode($data, true)))) === $data) – Tiamiyu Saheed Oluwatosin May 28 '20 at 20:15
  • 1
    talking about "old topic", well.. here we are in 2022, 10 years later is still looking for this :)) – Jacky Supit Jan 25 '22 at 05:07
32

You can use this function:

 function is_base64($s)
{
      return (bool) preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $s);
}
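
The pattern above only checks the character set and up to two trailing "=" signs, so a string such as "and" still matches. A somewhat stricter sketch is shown below. It assumes standard padded base64 without line breaks (so, unlike the pattern above, it rejects \r and \n), and the name `is_padded_base64` is hypothetical.

```php
<?php
// Hypothetical name; a stricter variant of the pattern above.
function is_padded_base64(string $s): bool
{
    // Any number of four-character groups, optionally followed by one final
    // group padded with "==" or "=". The D modifier stops "$" from matching
    // before a trailing newline.
    $pattern = '/^(?:[A-Za-z0-9+\/]{4})*(?:[A-Za-z0-9+\/]{2}==|[A-Za-z0-9+\/]{3}=)?$/D';

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $s) === 1;
}

var_dump(is_padded_base64('aGVsbG8=')); // bool(true)
var_dump(is_padded_base64('and'));      // bool(false): length is not a multiple of 4
```

Note that the empty string matches this pattern, and that very large inputs may run into PCRE limits, so this is probably best kept for short strings.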
xdazz
Dennais
  • 2
    I think this is closest to the best way to detect this. base64_decode( – Thomas Schultz Jan 14 '13 at 20:39
  • 2
    I'll leave a note here: be careful of the regexp subject max size https://secure.php.net/manual/en/pcre.constants.php#118538 In PHP 7 at least you won't be able to check a base64-encoded image that way (with default PHP settings) – Kaktus Aug 04 '17 at 14:50
  • This does not detect an invalid base 64 string it just checks if the character set and formatting is correct. simply add a 1 to the end of any non-padded base64 encoded string and it will decode simply ignoring the appended 1. strict checking does not work in this instance either. – DeveloperChris Nov 20 '19 at 03:07
  • @DeveloperChris Well, if the length of non-padded base64 encoded string is a multiple of 4 then any other character added to the end of the string should be ignored. But this is OK. – Karolis Dec 04 '19 at 01:08
14

Just for strings, you could use this function, that checks several base64 properties before returning true:

function is_base64($s){
    // Check if there are valid base64 characters
    if (!preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $s)) return false;

    // Decode the string in strict mode and check the results
    $decoded = base64_decode($s, true);
    if(false === $decoded) return false;

    // Encode the string again
    if(base64_encode($decoded) != $s) return false;

    return true;
}
merlucin
  • This is very similar to my own implementation. First checks for valid chars then checks for decode/encode string and compare with original one. – Matteo Gaggiano Jan 19 '18 at 09:21
  • 1
    Shortest ver `function is_base64($s) { $decoded = base64_decode($s, true); return preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $s) && false !== $decoded && base64_encode($decoded) == $s; }` – Andrew Rumm Feb 18 '18 at 13:59
  • 2
    For those considering the use of the Andrew's line of code, in backend (code not running in the browser) I recommend legibility (while keeping the performance) vs all code in the same line. And use comments! Don't increase the technical debt!!! – merlucin Apr 05 '18 at 09:07
9

This code should work, as the decode function returns FALSE if the string is not valid:

if (base64_decode($mystring, true) !== false) {
    // is valid
} else {
    // not valid
}

You can read more about the base64_decode function in the documentation.

EdoDodo
  • 29
    downvote because this is not the right way to determine if the string is encoded as base64. It only checks whether the string has characters outside of the base64 alphabet. As Kris said, the string "I am not base 64 encoded" does not return false with this method. – Maurice Nov 07 '12 at 14:33
  • 6
    Maurice is correct here. Please **do not rely on this answer**. It is not correct and will not determine whether a string is base64 encoded. From documentation: `strict: Returns FALSE if input contains character from outside the base64 alphabet.` I don't know why PHP decided to handle it this way, but regardless, it doesn't truly detect base64 encoding. Kris' answer is correct. – Ben D Jan 03 '13 at 23:26
  • This returns "and" as valid. "and" is not valid (base64 encoding should have a number of characters that is divisible by 4). base64_decode will decode invalid strings. – lilHar Oct 16 '19 at 17:01
  • @liljoshu As for "divisible by 4", that would be true for padded base64 strings, but they don't have to be padded. – Karolis Dec 04 '19 at 01:28
6

I think the only way to do that is to do a base64_decode() with the $strict parameter set to true, and see whether it returns false.

Pekka
  • Making CW because this is a double-dupe – Pekka Nov 25 '10 at 14:41
  • 3
    Downvote for the same reasons as another similar answer: `It only checks whether the string has characters outside of the base64 alphabet.` – Marki Aug 16 '18 at 12:51
4

I wrote this method, and it works perfectly in my projects. When you pass a base64 image string to it, it returns true if the string is valid and false otherwise. Try it and let me know if anything is wrong; I will edit the answer and learn from it in the future.

/**
 * @param $str
 * @return bool
 */
private function isValid64base($str){
    // Strict decoding returns false when $str contains characters outside the base64 alphabet
    return base64_decode($str, true) !== false;
}
Niroshan
3

This is a really old question, but I found the following approach to be practically bullet proof. It also takes into account those weird strings with invalid characters that would cause an exception when validating.

public static function isBase64Encoded($str)
{
    try
    {
        $decoded = base64_decode($str, true);

        // Valid only if re-encoding reproduces the original string
        return base64_encode($decoded) === $str;
    }
    catch(Exception $e)
    {
        // If exception is caught, then it is not a base64 encoded string
        return false;
    }
}

I got the idea from this page and adapted it to PHP.

Lucio Mollinedo
  • string like "ciao" will be decoded successfully into something like: "r&�". It's not a bulletproof method. – m47730 Apr 28 '16 at 16:39
3

I tried the following:

  • base64 decode the string with strict parameter set to true.
  • base64 encode the result of previous step. if the result is not same as the original string, then original string is not base64 encoded
  • if the result is same as previous string, then check if the decoded string contains printable characters. I used the php function ctype_print to check for non printable characters. The function returns false if the input string contains one or more non printable characters.

The following code implements the above steps:

public function IsBase64($data) {
    $decoded_data = base64_decode($data, true);
    $encoded_data = base64_encode($decoded_data);
    if ($encoded_data != $data) return false;
    else if (!ctype_print($decoded_data)) return false;

    return true;
}

The above code may return unexpected results. For example, for the string "json" it will return false, even though "json" could be a valid base64-encoded string: its length is a multiple of 4 and all of its characters are in the allowed range for base64. It seems we must know the range of characters allowed in the original string and then check whether the decoded data contains only those characters.

Nadir Latif
3

Alright guys... finally I have found a bullet proof solution for this problem. Use this below function to check if the string is base64 encoded or not -

    private function is_base64_encoded($str) {

       $decoded_str = base64_decode($str);
       $Str1 = preg_replace('/[\x00-\x1F\x7F-\xFF]/', '', $decoded_str);
       if ($Str1!=$decoded_str || $Str1 == '') {
          return false;
       }
       return true;
    }
bilal
3

If you are making API calls from JS to upload an image or file to the back end, this might help:

function is_base64_string($string)  // check for a base64 data URI
{
  // Check that the string starts with a data URI prefix such as "data:image/png".
  // Note: the original character class [data]{4} also matched any four of the letters d, a, t.
  return preg_match('/^data:(text|image|application)\/[a-z]*/', $string) === 1;
}
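
If the goal is to accept base64 data URIs from uploads, a sketch like the following goes one step further than looking at the prefix: it also strict-decodes the payload after the comma. The function name `decode_base64_data_uri` is hypothetical, and the pattern is a simplification that does not handle extra parameters such as `;charset=utf-8`.

```php
<?php
// Hypothetical helper: returns the decoded payload of a base64 data URI,
// or null when the string is not in the expected form.
function decode_base64_data_uri(string $uri): ?string
{
    // Expected shape: data:<type>/<subtype>;base64,<payload>
    if (!preg_match('~^data:[a-z]+/[a-z0-9.+-]+;base64,(.*)$~is', $uri, $m)) {
        return null;
    }
    // Strict mode rejects characters outside the base64 alphabet.
    $decoded = base64_decode($m[1], true);
    return $decoded === false ? null : $decoded;
}

var_dump(decode_base64_data_uri('data:text/plain;base64,aGVsbG8=')); // string(5) "hello"
var_dump(decode_base64_data_uri('aGVsbG8='));                        // NULL: no data URI prefix
```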
1

Old topic, but I found this function and it works:

function checkBase64Encoded($encodedString) {
    $length = strlen($encodedString);

    // Check every character.
    for ($i = 0; $i < $length; ++$i) {
        $c = $encodedString[$i];
        if (
            ($c < '0' || $c > '9')
            && ($c < 'a' || $c > 'z')
            && ($c < 'A' || $c > 'Z')
            && ($c != '+')
            && ($c != '/')
            && ($c != '=')
        ) {
            // Bad character found.
            return false;
        }
    }
    // Only good characters found.
    return true;
}
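
For what it is worth, the same character-set test can probably be written more compactly with PHP's built-in strspn(). This is only a sketch with a made-up function name, and like the loop above it says nothing about padding position or length.

```php
<?php
// Hypothetical name; the same character test as the loop above, using strspn().
function has_only_base64_chars(string $s): bool
{
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=';

    // strspn() returns the length of the leading run of $s that consists
    // only of characters from $alphabet.
    return strspn($s, $alphabet) === strlen($s);
}

var_dump(has_only_base64_chars('aGVsbG8='));    // bool(true)
var_dump(has_only_base64_chars('hello world')); // bool(false): the space is not in the alphabet
var_dump(has_only_base64_chars('=a=b'));        // bool(true): misplaced padding is not detected
```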
Klian
1

I wrote a solution that validates images by checking the syntax of the data URI:

$image = '';
$allowedExtensions = ['png', 'jpg', 'jpeg'];

// check if the data is empty
if (empty($image)) {
    echo "Empty data";
}

// check base64 format
$explode = explode(',', $image);
if(count($explode) !== 2){
    echo "This string doesn't have base64 data URI syntax";
}
//https://stackoverflow.com/a/11154248/4830771
if (!preg_match('%^[a-zA-Z0-9/+]*={0,2}$%', $explode[1])) {
    echo "This string doesn't have base64 syntax";
}

// check if type is allowed
$format = str_replace(
        ['data:image/', ';', 'base64'], 
        ['', '', '',], 
        $explode[0]
);
if (!in_array($format, $allowedExtensions)) {
    echo "Image type isn't allowed";
}
echo "This image is base64";

But a safer way is to use the Intervention Image library:

use Intervention\Image\ImageManagerStatic;
try {
    ImageManagerStatic::make($value);
    return true;
} catch (Exception $e) {
    return false;
}
Ennio Sousa
0

I know I am reviving a very old question. I tried all of the proposed methods and finally ended up with this regex, which covers almost all of my cases:

$decoded = base64_decode($string, true);
if (0 < preg_match('/((?![[:graph:]])(?!\s)(?!\p{L}))./', $decoded, $matched)) return false;

Basically, I check for any character that is not printable (:graph:), is not a space or tab (\s), and is not a Unicode letter (\p{L}, which covers accented letters such as èéùìà).

I still get false positives with characters such as £§°, but I never use them in my strings, so invalidating them is perfectly fine for me. I combined this check with the function proposed by @merlucin,

so the result:

function is_base64($s)
{
  // Check if there are valid base64 characters
  if (!preg_match('/^[a-zA-Z0-9\/\r\n+]*={0,2}$/', $s)) return false;

  // Decode the string in strict mode and check the results
  $decoded = base64_decode($s, true);
  if(false === $decoded) return false;

  // if string returned contains not printable chars
  if (0 < preg_match('/((?![[:graph:]])(?!\s)(?!\p{L}))./', $decoded, $matched)) return false;

  // Encode the string again
  if(base64_encode($decoded) != $s) return false;

  return true;
}
m47730
0

You can just send the string through base64_decode (with $strict set to TRUE), it will return FALSE if the input is invalid.

You can also use, for instance, regular expressions to see whether the string contains any characters outside the base64 alphabet, and to check whether it has the right amount of padding at the end (= characters). But just using base64_decode is much easier, and there shouldn't be a risk of a malformed string causing any harm.

Wim
0

base64_decode() should return false if your base64 encoded data is not valid.

Scoop
0

MOST ANSWERS HERE ARE NOT RELIABLE

In fact, there is no reliable answer, as much non-base64-encoded text will be readable as base64-encoded, so there's no default way to know for sure.

Further, it's worth noting that base64_decode will decode many invalid strings. For example, "and" is not valid base64 encoding, but base64_decode WILL decode it. As "jw" specifically. (I learned this the hard way)

That said, your most reliable method is, if you control the input, to add an identifier to the string after you encode it that is unique and not base64, and include it along with other checks. It's not bullet-proof, but it's a lot more bullet resistant than any other solution I've seen. For example:

function my_base64_encode($string){
  $prefix = 'z64ENCODEDz_';
  $suffix = '_z64ENCODEDz';
  return $prefix . base64_encode($string) . $suffix;
}

function my_base64_decode($string){
  $prefix = 'z64ENCODEDz_';
  $suffix = '_z64ENCODEDz';
  if (substr($string, 0, strlen($prefix)) == $prefix) {
    $string = substr($string, strlen($prefix));
  }
  if (substr($string, (0-(strlen($suffix)))) == $suffix) {
    $string = substr($string, 0, (0-(strlen($suffix))));
  }
  return base64_decode($string);
}

function is_my_base64_encoded($string){
  $prefix = 'z64ENCODEDz_';
  $suffix = '_z64ENCODEDz';
  // substr() (not strpos()) extracts the leading and trailing markers for comparison
  return substr($string, 0, strlen($prefix)) === $prefix
    && substr($string, -strlen($suffix)) === $suffix
    && my_base64_encode(my_base64_decode($string)) === $string
    && strlen($string) % 4 == 0;
}
lilHar
  • 1
    Are you missing `$string` inside function argument at second line? `base64_encode($string)` instead of `base64_encode()`? – Deele Oct 29 '19 at 10:28
  • 1
    And, actual decoding in line 5? `base64_decode(rtrim(ltrim($string, "z64ENCODEDz_"), "_z64ENCODEDz"))` instead of `rtrim(ltrim($string, "z64ENCODEDz_"), "_z64ENCODEDz")`? – Deele Oct 29 '19 at 11:25
  • You're right. Also I should have been using a substr instead of a trim. Fixed that too. – lilHar Oct 29 '19 at 15:51
0

I have found my solution by accident.

Those who check with base64_encode(base64_decode('xxx')) may find that it sometimes fails to reject strings like test or 5555.

If an invalid base 64 string passes through base64_decode() without returning false, json_encode() will still fail on the decoded result, because the decoded string is invalid.
So I use this method to check for a valid base 64 encoded string.

Here is the code.

/**
 * Check if the given string is valid base 64 encoded.
 *
 * @param string $string The string to check.
 * @return bool Return `true` if valid, `false` for otherwise.
 */
function isBase64Encoded($string): bool
{
    if (!is_string($string)) {
        // if check value is not string.
        // base64_decode require this argument to be string, if not then just return `false`.
        // don't use type hint because `false` value will be converted to empty string.
        return false;
    }

    $decoded = base64_decode($string, true);
    if (false === $decoded) {
        return false;
    }

    if (json_encode([$decoded]) === false) {
        return false;
    }

    return true;
}// isBase64Encoded

And here is the test code.

// each tests value must be 'original string' => 'base 64 encoded string'
$testValues = [
    555 => 'NTU1',
    5555 => 'NTU1NQ==',
    'hello' => 'aGVsbG8=',
    'สวัสดี' => '4Liq4Lin4Lix4Liq4LiU4Li1',
    'test' => 'dGVzdA==',
];


foreach ($testValues as $invalid => $valid) {
    if (isBase64Encoded($invalid) === false) {
        echo '<strong>' . $invalid . '</strong> is invalid base 64<br>';
    } else {
        echo '<strong style="color:red;">Error:</strong>';
        echo '<strong>' . $invalid . '</strong> should not be valid base 64<br>';
    }

    if (isBase64Encoded($valid) === true) {
        echo '<strong>' . $valid . '</strong> is valid base 64<br>';
    } else {
        echo '<strong style="color:red;">Error:</strong>';
        echo '<strong>' . $valid . '</strong> should not be invalid base 64<br>';
    }

    echo '<br>';
}

Tests result:

555 is invalid base 64
NTU1 is valid base 64

5555 is invalid base 64
NTU1NQ== is valid base 64

hello is invalid base 64
aGVsbG8= is valid base 64

สวัสดี is invalid base 64
4Liq4Lin4Lix4Liq4LiU4Li1 is valid base 64

test is invalid base 64
dGVzdA== is valid base 64
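
As far as I can tell, the json_encode() step above works because json_encode() returns false by default when the string is not valid UTF-8. If that is the property being relied on, it can be tested more directly. The sketch below assumes the original data was UTF-8 text (it will reject base64-encoded binary data such as images), and the function name is hypothetical.

```php
<?php
// Hypothetical name; assumes the data that was encoded is UTF-8 text.
function is_base64_of_utf8_text(string $s): bool
{
    $decoded = base64_decode($s, true);
    if ($decoded === false) {
        return false;
    }
    // With the u modifier, preg_match() fails (returns false) when the
    // subject is not valid UTF-8, so an empty pattern works as a UTF-8 check.
    return preg_match('//u', $decoded) === 1;
}

var_dump(is_base64_of_utf8_text('aGVsbG8=')); // bool(true): decodes to 'hello'
var_dump(is_base64_of_utf8_text('test'));     // bool(false): decoded bytes are not valid UTF-8
```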

vee
0

To validate, without raising errors, that nobody sends a truncated base64 string or something that is not an image, use this function. It checks the base64 first and then checks whether the data is really an image:

function check_base64_image($base64) {
    try {
        if (base64_encode(base64_decode($base64, true)) === $base64) {
            $img = imagecreatefromstring(base64_decode($base64, true));
            if (!$img) {
                return false;
            }
            imagepng($img, 'tmp.png');
            $info = getimagesize('tmp.png');
            unlink('tmp.png');
            if ($info[0] > 0 && $info[1] > 0 && $info['mime']) {
                return true;
            }
        }
    } catch (Exception $ex) {
        return false;
    }
    // Not valid base64, or the image has no usable dimensions
    return false;
}
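
A possible variation, sketched under the assumption that reading the image header is enough for your purposes: getimagesizefromstring() inspects the decoded bytes in memory, so no tmp.png has to be written and deleted. The function name is made up, and unlike imagecreatefromstring() this does not fully decode the image.

```php
<?php
// Hypothetical name; checks the decoded bytes in memory instead of writing a temp file.
function looks_like_base64_image(string $base64): bool
{
    $bytes = base64_decode($base64, true);
    if ($bytes === false || $bytes === '') {
        return false;
    }
    // getimagesizefromstring() returns false when the data is not in a
    // recognised image format. The @ silences the notice or warning PHP
    // may emit for unreadable data.
    $info = @getimagesizefromstring($bytes);
    return $info !== false && $info[0] > 0 && $info[1] > 0;
}

var_dump(looks_like_base64_image('aGVsbG8=')); // bool(false): decodes to 'hello', not an image
```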
-1

I am using this approach. It expects the last 2 characters to be ==

substr($buff, -2, 1) == '=' && substr($buff, -1, 1) == '='

Update: I ended up adding a second check, base64_decode($buff, true), for when the one above fails.

Svetoslav Marinov
  • 2
    FYI: `substr($buff, -2) === '==')` will be the same and faster. – Sam Feb 13 '14 at 20:23
  • To better say what SangamAngre said, there may be only a single "=" at the end depending on the padding needed, be it 8bit padding or 16bit padding. – David May 21 '15 at 05:30
  • Having `==` at the end is a necessary but not sufficient condition to be a valid Base64 string. – Salvatore Zappalà Nov 02 '16 at 08:58
-3

If data is not valid base64 then function base64_decode($string, true) will return FALSE.

citrin
  • 1
    the statement is incorrect. As documentation says: "if $string is not valid base64 then function base64_decode($string, true) will return FALSE". So some invalid base64 string like "ciao" for example will be decoded as "r&�" – m47730 Apr 28 '16 at 16:37