3

I am getting an error when trying to unserialize data. The following error occurs:

unserialize(): Error at offset 46 of 151 bytes

Here is the serialized data:

s:151:"a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";

The error is being caused by a single quote in the data. How can I work around this problem when the site and database that I am working with are already live?

Unfortunately I cannot rewrite the code that was responsible for serializing the data and inserting it into the database. It is highly likely that there are multiple occurrences of this problem across the database.

Is there a function I can use to escape the data?

mickmackusa
jagershark

4 Answers

8

After doing further research I have found a workaround. According to this blog post:

"It turns out that if there's a ", ', :, or ; in any of the array values the serialization gets corrupted."

If I were working on a site that had not yet gone live, one preventive measure would have been to base64_encode the serialized data before storing it in the database, like so:

base64_encode( serialize( $my_data ) );

And then:

unserialize( base64_decode( $encoded_serialized_string ) );

when retrieving the data.
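As a rough illustration of that round trip, here is a minimal sketch. The sample array is made up for the example; only base64_encode(), base64_decode(), serialize() and unserialize() come from the answer above.

```php
<?php
// Hypothetical sample data containing a single quote.
$my_data = [['name' => "Chloe O'Gorman", 'age' => '3_6']];

// Encode before storing: the stored value contains only base64 characters,
// so later escaping/unescaping steps have nothing to alter.
$encoded = base64_encode(serialize($my_data));

// Decode after retrieving.
$decoded = unserialize(base64_decode($encoded));

var_dump($decoded === $my_data); // bool(true)
```

The trade-off is that the stored value is roughly a third larger and is no longer readable or searchable in the database.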

However, as I cannot change what has already been stored in the database, this very helpful post (the original post is no longer available) provides a solution that works around the problem:

$fixed_serialized_data = preg_replace_callback ( '!s:(\d+):"(.*?)";!', function($match) {
    return ($match[1] == strlen($match[2])) ? $match[0] : 's:' . strlen($match[2]) . ':"' . $match[2] . '";';
}, $my_data );

$result = unserialize( $fixed_serialized_data );
mickmackusa
jagershark
1

From what I see, you have a valid serialized string nested inside of a valid serialized string -- meaning serialize() was called twice in the formation of your posted string.

See how you have s:151: followed by:

"a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";

⮤ that is a valid single string that contains pre-serialized data.

After you unserialize THAT, you get:

a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}
//                         ^^--^^^^^^^^^^^^^^-- uh oh, that string value has 14 bytes/characters not 15

It looks like somewhere in the string processing, an escaping slash was removed, and that corrupted the string.

There is nothing foul about single quotes in serialized data.
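To make that concrete, here is a small sketch of how a removed backslash can produce this kind of corruption. It assumes the backslash was stripped after serialization (simulated here with stripslashes()); how it actually happened on your site is unknown.

```php
<?php
// A single quote on its own is harmless: the round trip succeeds.
var_dump(unserialize(serialize("Chloe O'Gorman")) === "Chloe O'Gorman"); // bool(true)

// Hypothetical scenario: the value was serialized WITH an escaping backslash,
// so the declared length is 15 bytes.
$original = serialize("Chloe O\\'Gorman"); // s:15:"Chloe O\'Gorman";

// Removing the backslash afterwards shrinks the payload to 14 bytes,
// but the declared length still says 15.
$stripped = stripslashes($original);

// unserialize() now fails and returns false (the notice/warning is suppressed here).
$result = @unserialize($stripped);
var_dump($result); // bool(false)
```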

You can choose to either:

  1. execute an escaping call to blindly apply slashes to ALL single quotes in your string (which may cause breakages elsewhere) -- assuming you WANT to escape the single quotes for your project's subsequent processes or
  2. execute my following snippet which will not escape the single quotes, but rather adjust the byte/character count to form a valid serialized string.

Code:

$corrupted_byte_counts = <<<STRING
s:151:"a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";
STRING;

$repaired = preg_replace_callback(
        '/s:\d+:"(.*?)";/s',
        function ($m) {
            return 's:' . strlen($m[1]) . ":\"{$m[1]}\";";
        },
        unserialize($corrupted_byte_counts)  // first unserialize string before repairing
    );

echo "corrupted serialized array:\n$corrupted_byte_counts";
echo "\n---\n";
echo "repaired serialized array:\n$repaired";
echo "\n---\n";
print_r(unserialize($repaired));  // unserialize repaired string
echo "\n---\n";
echo serialize($repaired);

Output:

corrupted serialized array:
s:151:"a:1:{i:0;a:4:{s:4:"name";s:15:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";
---
repaired serialized array:
a:1:{i:0;a:4:{s:4:"name";s:14:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}
---
Array
(
    [0] => Array
        (
            [name] => Chloe O'Gorman
            [gender] => female
            [age] => 3_6
            [present] => Something from Frozen or a jigsaw 
        )

)

---
s:151:"a:1:{i:0;a:4:{s:4:"name";s:14:"Chloe O'Gorman";s:6:"gender";s:6:"female";s:3:"age";s:3:"3_6";s:7:"present";s:34:"Something from Frozen or a jigsaw ";}}";

*keep in mind, if you want to return your data to its original Matryoshka-serialized form, you will need to call serialize() again on $repaired.

**if you have substrings that contain "; in them, you might try this extended version of my snippet.
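Since the linked version is not reproduced here, the following is only a sketch of one possible approach, not necessarily the snippet referred to above. It assumes that a genuine closing "; is always followed by the start of another serialized token, a closing brace, or the end of the string. The function name repair_string_lengths() and the sample data are hypothetical, and a value that itself contains something like ";s: would still defeat it.

```php
<?php
// Hypothetical helper: recompute the byte count of every serialized string.
function repair_string_lengths(string $serialized): string
{
    $repaired = preg_replace_callback(
        // The lookahead only accepts "; as the end of a string when it is followed
        // by another token (a:, b:, d:, i:, O:, s:, N;), a closing brace, or the end.
        '/s:\d+:"(.*?)";(?=[abdiOs]:|N;|\}|$)/s',
        function (array $m): string {
            return 's:' . strlen($m[1]) . ':"' . $m[1] . '";';
        },
        $serialized
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $repaired ?? $serialized;
}

// Made-up corrupted input: wrong count (5) and a "; inside the value.
$broken = 'a:1:{s:4:"note";s:5:"say "hi"; then leave";}';
$fixed  = repair_string_lengths($broken);

print_r(unserialize($fixed));
```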

mickmackusa
0

There's nothing wrong with your serialized text as posted. The quotes inside do NOT need to be escaped, because PHP uses the type length indicators to figure out where things start/stop. e.g.

php > $foo = "This string contains a \" quote and a ' quote";
php > $bar = serialize($foo);
php > $baz = unserialize($bar);
php > echo "$foo\n$bar\n$baz\n";
This string contains a " quote and a ' quote
s:44:"This string contains a " quote and a ' quote";
This string contains a " quote and a ' quote

Note the lack of ANY kind of escaping in the serialized string - the quotes inside the string are there as-is, no quoting, no escaping, no encoding.

As posted, your serialized data properly deserializes into a plain string without issue (that string itself contains serialized data, not JSON).
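A short sketch of what a string like that implies: data that has been serialized twice needs two unserialize() calls to get back to the array. The sample array here is made up and well-formed, so both steps succeed.

```php
<?php
// Hypothetical data, serialized twice.
$inner = serialize(['name' => "Chloe O'Gorman"]); // a:1:{...}
$outer = serialize($inner);                        // s:NN:"a:1:{...}";

// First pass only unwraps the outer string.
$first = unserialize($outer);
var_dump(is_string($first)); // bool(true)

// Second pass yields the actual array.
$second = unserialize($first);
print_r($second);
```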

Marc B
0

Use a PHP nowdoc, which takes the quotes and backslashes in the serialized string literally:

unserialize(<<<'DDDD'
[SERIALIZE_STR]
DDDD
);
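For instance, with a made-up serialized value in place of the placeholder (a sketch for pasting a string into a test script, not a fix for data already in the database):

```php
<?php
// Inside a nowdoc nothing is interpreted: the backslash and both kinds of
// quote reach unserialize() exactly as written, so the 15-byte count is valid.
$value = unserialize(<<<'DATA'
s:15:"Chloe O\'Gorman";
DATA
);

var_dump(strlen($value)); // int(15)
```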
Andrey