147

What's the best way to determine whether or not a string is the result of the serialize() function?

https://www.php.net/manual/en/function.serialize

Valerio Bozz
mO_odRing

13 Answers

208

I'd say, try to unserialize it ;-)

Quoting the manual :

In case the passed string is not unserializeable, FALSE is returned and E_NOTICE is issued.

So, you have to check whether the return value is false (with === or !==, so that 0, null, or anything else that loosely equals false is not mistaken for a failure, I'd say).

Just beware the notice : you might want/need to use the @ operator.

For instance :

$str = 'hjkl';
$data = @unserialize($str);
if ($data !== false) {
    echo "ok";
} else {
    echo "not ok";
}

Will get you :

not ok


EDIT : Oh, and like @Peter said (thanks to him!), you might run into trouble if you are trying to unserialize the representation of a boolean false :-(

So, checking that your serialized string is not equal to "b:0;" might be helpful too ; something like this should do the trick, I suppose :

$data = @unserialize($str);
if ($str === 'b:0;' || $data !== false) {
    echo "ok";
} else {
    echo "not ok";
}

Testing for that special case before trying to unserialize would be an optimization -- but probably not that useful if you don't often have a serialized false value.
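A minimal sketch wrapping the two checks above in a reusable function. The name `can_unserialize` is hypothetical, and passing `allowed_classes => false` (available since PHP 7.0) is an extra assumption, added so that no objects are instantiated just to test a string:

```php
<?php
// Hypothetical helper, not part of the original answer.
function can_unserialize($str): bool
{
    if (!is_string($str)) {
        return false;
    }
    // serialize(false) yields 'b:0;', for which unserialize() legitimately returns false.
    if ($str === 'b:0;') {
        return true;
    }
    // Assumption: objects are not needed here, so class instantiation is disabled.
    return @unserialize($str, ['allowed_classes' => false]) !== false;
}

var_dump(can_unserialize('hjkl'));            // bool(false)
var_dump(can_unserialize(serialize(false)));  // bool(true)
var_dump(can_unserialize(serialize([1, 2]))); // bool(true)
```

The early return for 'b:0;' is the optimization mentioned above; it also keeps the function correct for a serialized false.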

Pascal MARTIN
65

From WordPress core functions:

<?php
function is_serialized( $data, $strict = true ) {
    // If it isn't a string, it isn't serialized.
    if ( ! is_string( $data ) ) {
        return false;
    }
    $data = trim( $data );
    if ( 'N;' === $data ) {
        return true;
    }
    if ( strlen( $data ) < 4 ) {
        return false;
    }
    if ( ':' !== $data[1] ) {
        return false;
    }
    if ( $strict ) {
        $lastc = substr( $data, -1 );
        if ( ';' !== $lastc && '}' !== $lastc ) {
            return false;
        }
    } else {
        $semicolon = strpos( $data, ';' );
        $brace     = strpos( $data, '}' );
        // Either ; or } must exist.
        if ( false === $semicolon && false === $brace ) {
            return false;
        }
        // But neither must be in the first X characters.
        if ( false !== $semicolon && $semicolon < 3 ) {
            return false;
        }
        if ( false !== $brace && $brace < 4 ) {
            return false;
        }
    }
    $token = $data[0];
    switch ( $token ) {
        case 's':
            if ( $strict ) {
                if ( '"' !== substr( $data, -2, 1 ) ) {
                    return false;
                }
            } elseif ( false === strpos( $data, '"' ) ) {
                return false;
            }
            // Or else fall through.
        case 'a':
        case 'O':
            return (bool) preg_match( "/^{$token}:[0-9]+:/s", $data );
        case 'b':
        case 'i':
        case 'd':
            $end = $strict ? '$' : '';
            return (bool) preg_match( "/^{$token}:[0-9.E+-]+;$end/", $data );
    }
    return false;
} 
T.Todua
Brandon
  • 1
    I basically needed a regex to do a basic detect, I ended up using: `^([adObis]:|N;)` – farinspace Oct 10 '11 at 23:23
  • 6
    Current WordPress version is somewhat more sophisticated: http://codex.wordpress.org/Function_Reference/is_serialized#Source_File – ChrisV Nov 22 '13 at 11:34
  • 3
    +1 for giving credits. I didn't know WordPress had this built-in. Thanks for the idea -- I'll now go ahead and create an archive of useful functions from the WordPress Core. – Amal Murali Feb 22 '14 at 03:50
  • This is a good function. Unserialize by default throws an error if the target isn't valid...yet this will not only detect if it is serialized but correctly formatted which is huge. – user2662680 May 05 '21 at 14:42
  • This function does not validate arrays. Also be careful: strings with slashes added before '"' are falsely detected as serialized, although unserialize fails on them. – David Vielhuber Jun 01 '21 at 12:09
  • 1
    @CédricFrançoys I've included updated url in answer, so you can delete comment now. – T.Todua Jul 20 '21 at 13:10
27

Optimizing Pascal MARTIN's response

/**
 * Check if a string is serialized
 * @param string $string
 */
public static function is_serial($string) {
    return (@unserialize($string) !== false);
}
SoN9ne
17

If $string is a serialized false value, i.e. $string = 'b:0;', SoN9ne's function returns false, which is wrong.

So the function would be:

/**
 * Check if a string is serialized
 *
 * @param string $string
 *
 * @return bool
 */
function is_serialized_string($string)
{
    return ($string === 'b:0;' || @unserialize($string) !== false);
}
Hazem Noor
13

Despite Pascal MARTIN's excellent answer, I was curious whether you could approach this another way, so I did this just as a mental exercise:

<?php

ini_set( 'display_errors', 1 );
ini_set( 'track_errors', 1 );
error_reporting( E_ALL );

$valueToUnserialize = serialize( false );
//$valueToUnserialize = "a"; # uncomment this for another test

$unserialized = @unserialize( $valueToUnserialize );

if ( FALSE === $unserialized && isset( $php_errormsg ) && strpos( $php_errormsg, 'unserialize' ) !== FALSE )
{
  echo 'Value could not be unserialized<br>';
  echo $valueToUnserialize;
} else {
  echo 'Value was unserialized!<br>';
  var_dump( $unserialized );
}

And it actually works. The only caveat is that it will likely break if you have a registered error handler because of how $php_errormsg works.

Peter Bailey
  • 1
    +1 : This one is fun, I have to admit -- wouldn't have thought about it ! And I don't find a way to make it fail, too ^^ Nice work ! And thanks for the comment on my answer : without it, I would probably not have seen this answer. – Pascal MARTIN Sep 02 '09 at 21:55
  • $a = 'bla'; $b = 'b:0;'; Try to unserialize $a then $b with this, both will fail while $b shouldn't. – bardiir Feb 28 '14 at 13:55
  • Not if there was a failure right before. Because $php_errormsg will still contain the serialization error from before and once you deserialize false then it will fail. – bardiir Mar 06 '14 at 17:03
  • Yeah, but only if you don't error-check in-between deserializing `$a` and deserializing `$b`, which is not practical application design. – Peter Bailey Mar 07 '14 at 02:52
  • Just an FYI - `This feature has been DEPRECATED as of PHP 7.2.0. Relying on this feature is highly discouraged.` - https://www.php.net/manual/en/reserved.variables.phperrormsg.php – waterloomatt Dec 14 '20 at 16:33
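Since $php_errormsg is deprecated, the same idea can be sketched with error_get_last() and error_clear_last() (the latter exists since PHP 7.0). The helper name `unserialize_failed` is hypothetical. Clearing the last error first also avoids the stale-error problem raised in the comments above. This assumes no custom error handler is registered, because such a handler can keep error_get_last() from being populated:

```php
<?php
// Hypothetical helper; assumes no custom error handler swallows the error.
function unserialize_failed(string $value): bool
{
    error_clear_last();              // forget any error left over from earlier calls
    $result = @unserialize($value);  // @ hides the message, but the error is still recorded
    $error  = error_get_last();

    return $result === false
        && $error !== null
        && strpos($error['message'], 'unserialize') !== false;
}

var_dump(unserialize_failed('bla'));   // bool(true)
var_dump(unserialize_failed('b:0;'));  // bool(false), even right after a failure
```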
11
$data = @unserialize($str);
if($data !== false || $str === 'b:0;')
    echo 'ok';
else
    echo "not ok";

Correctly handles the case of serialize(false). :)

chaos
2

Built into a function:

function isSerialized($value)
{
   return preg_match('^([adObis]:|N;)^', $value);
}
RossW
  • 1
    This regex is dangerous, it's returning positive when `a:` (or `b:` etc) is present somewhere inside $value, not in the beginning. And `^` here doesn't mean beginning of a string. It's totally misleading. – Denis Chmel Aug 14 '18 at 03:09
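A sketch of what an anchored version of that check could look like, using `/` as the delimiter so that `^` really means "start of string". The function name `looks_serialized` is hypothetical. It is still only a prefix test, not a validity check:

```php
<?php
// Hypothetical helper: only checks that the string starts like serialized data.
function looks_serialized($value): bool
{
    return is_string($value)
        && preg_match('/^(?:[adObis]:|N;)/', $value) === 1;
}

var_dump(looks_serialized('a:0:{}'));      // bool(true)
var_dump(looks_serialized('N;'));          // bool(true)
var_dump(looks_serialized('foo a:0:{}'));  // bool(false): the token is not at the start
```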
2

There is a WordPress solution (details are here):

    function is_serialized($data, $strict = true)
    {
        // if it isn't a string, it isn't serialized.
        if (!is_string($data)) {
            return false;
        }
        $data = trim($data);
        if ('N;' == $data) {
            return true;
        }
        if (strlen($data) < 4) {
            return false;
        }
        if (':' !== $data[1]) {
            return false;
        }
        if ($strict) {
            $lastc = substr($data, -1);
            if (';' !== $lastc && '}' !== $lastc) {
                return false;
            }
        } else {
            $semicolon = strpos($data, ';');
            $brace = strpos($data, '}');
            // Either ; or } must exist.
            if (false === $semicolon && false === $brace)
                return false;
            // But neither must be in the first X characters.
            if (false !== $semicolon && $semicolon < 3)
                return false;
            if (false !== $brace && $brace < 4)
                return false;
        }
        $token = $data[0];
        switch ($token) {
            case 's' :
                if ($strict) {
                    if ('"' !== substr($data, -2, 1)) {
                        return false;
                    }
                } elseif (false === strpos($data, '"')) {
                    return false;
                }
            // or else fall through
            case 'a' :
            case 'O' :
                return (bool)preg_match("/^{$token}:[0-9]+:/s", $data);
            case 'b' :
            case 'i' :
            case 'd' :
                $end = $strict ? '$' : '';
                return (bool)preg_match("/^{$token}:[0-9.E-]+;$end/", $data);
        }
        return false;
    }
ingenious
1
/**
 * some people will look down on this little puppy
 */
function isSerialized($s){
    if(
        stristr($s, '{' ) != false &&
        stristr($s, '}' ) != false &&
        stristr($s, ';' ) != false &&
        stristr($s, ':' ) != false
    ){
        return true;
    }else{
        return false;
    }
}
Björn3
  • 5
    Well, this would give true for many JSON strings as well, wouldn't it? So it's not a reliable way to determine whether the string can be unserialized. – Gordon Sep 05 '12 at 16:56
  • Might be true, but if the alternative is serialized, or just plain text, as it was for me, it works like a charm. – Björn3 Jul 14 '13 at 12:48
  • 1
    @Björn3 "Well it works for me in this specific case" is a really bad mentality to have when coding. There are a lot of developers who are lazy or not forward-thinking like this and it makes for a nightmare later on down the line when other developers have to work with their code or try to change something and suddenly nothing works properly anymore. – BadHorsie Mar 18 '14 at 11:05
  • Making completely solid code (if that were even possible) is not always the goal or the best practice, not when it comes at the expense of time. That is only true from the programmer's perspective. In real life there are a lot of circumstances where quick and dirty is the preferred way. – Björn3 Mar 22 '14 at 00:02
1

This works fine for me

<?php

function is_serialized($data){
    return (is_string($data) && preg_match("#^((N;)|((a|O|s):[0-9]+:.*[;}])|((b|i|d):[0-9.E-]+;))$#um", $data));
}

?>
  • Please bear in mind this checks if given string is serialize-looking string - it won't actually check the validity of that string. – eithed Aug 08 '16 at 15:42
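To illustrate that last comment, here is a small sketch with a string that has the right shape but an invalid payload. The pattern is copied from the answer above; the name `looks_like_serialized` and the sample string are only for illustration:

```php
<?php
// Pattern copied from the answer above; the function name is hypothetical.
function looks_like_serialized($data): bool
{
    return is_string($data)
        && preg_match('#^((N;)|((a|O|s):[0-9]+:.*[;}])|((b|i|d):[0-9.E-]+;))$#um', $data) === 1;
}

$candidate = 'a:1:{42}';                         // array header, but no valid key/value pair
$looksOk   = looks_like_serialized($candidate);  // the pattern matches
$parsesOk  = @unserialize($candidate) !== false; // unserialize() rejects the payload

var_dump($looksOk);   // bool(true)
var_dump($parsesOk);  // bool(false)
```

So a pattern match is a cheap pre-filter at best; only unserialize() itself confirms that the string is valid.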
1

I would just try to unserialize it. This is how I would solve it:

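public static function is_serialized($string)
{
    try {
        // unserialize() signals invalid input by returning false (with a notice
        // or warning), not by throwing, so the return value must be checked too.
        $result = @unserialize($string);
    } catch (\Throwable $e) {
        return false;
    }

    // 'b:0;' is the valid serialization of false.
    return $result !== false || $string === 'b:0;';
}

Or more like a helper function

function is_serialized($string) {
    try {
        $result = @unserialize($string);
    } catch (\Throwable $e) {
        return false;
    }

    return $result !== false || $string === 'b:0;';
}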
Mikael Dalholm
1
  • The WordPress function mentioned above does not really detect arrays (a:1:{42} is considered to be serialized) and falsely returns true on escaped strings like a:1:{s:3:\"foo\";s:3:\"bar\";} (although unserialize does not work on them).

  • If, on the other hand, you use the @unserialize approach, WordPress for example adds an ugly margin at the top of the backend when define('WP_DEBUG', true) is set.

  • A working solution that solves both problems and circumvents the stfu-operator is:
function __is_serialized($var)
{
    if (!is_string($var) || $var == '') {
        return false;
    }
    set_error_handler(function ($errno, $errstr) {});
    $unserialized = unserialize($var);
    restore_error_handler();
    if ($var !== 'b:0;' && $unserialized === false) {
        return false;
    }
    return true;
}
David Vielhuber
0

See the WordPress function is_serialized:

function is_serialized( $data, $strict = true ) {
    // If it isn't a string, it isn't serialized.
    if ( ! is_string( $data ) ) {
        return false;
    }
    $data = trim( $data );
    if ( 'N;' === $data ) {
        return true;
    }
    if ( strlen( $data ) < 4 ) {
        return false;
    }
    if ( ':' !== $data[1] ) {
        return false;
    }
    if ( $strict ) {
        $lastc = substr( $data, -1 );
        if ( ';' !== $lastc && '}' !== $lastc ) {
            return false;
        }
    } else {
        $semicolon = strpos( $data, ';' );
        $brace     = strpos( $data, '}' );
        // Either ; or } must exist.
        if ( false === $semicolon && false === $brace ) {
            return false;
        }
        // But neither must be in the first X characters.
        if ( false !== $semicolon && $semicolon < 3 ) {
            return false;
        }
        if ( false !== $brace && $brace < 4 ) {
            return false;
        }
    }
    $token = $data[0];
    switch ( $token ) {
        case 's':
            if ( $strict ) {
                if ( '"' !== substr( $data, -2, 1 ) ) {
                    return false;
                }
            } elseif ( false === strpos( $data, '"' ) ) {
                return false;
            }
            // Or else fall through.
        case 'a':
        case 'O':
            return (bool) preg_match( "/^{$token}:[0-9]+:/s", $data );
        case 'b':
        case 'i':
        case 'd':
            $end = $strict ? '$' : '';
            return (bool) preg_match( "/^{$token}:[0-9.E+-]+;$end/", $data );
    }
    return false;
}

Heidar Ammarloo