Is there a way to use unserialize with a memory/size limit?

Currently we have:

$data = unserialize($_SESSION['visits']);

and we occasionally get:

PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 17645568 bytes) in Unknown on line 0

when a visitor has had a lot of visits in a short period of time (session value stores information about each page visited).

The issue appears once $_SESSION['visits'] is longer than about a million characters, so I can do a simple length check, but is there a better solution than this:

if(strlen($_SESSION['visits']) <= 1000000) {
    $data = unserialize($_SESSION['visits']);
} else {
    $data = array();
}

I thought try catch might behave better but it didn't get caught:

try{
    $data = unserialize($_SESSION['visits']);
} catch(\Exception $exception){
    error_log('Caught memory limit');
}

The answer to this question is not to increase the memory size.

user3783243
  • I think [this is your primary answer](https://stackoverflow.com/a/2318937/231316), basically fatal errors cannot be recovered from. Do you need a full serialization of your data or could you instead use `json_encode()` and specify a subset of fields to store? Also, is [increasing the memory limit](https://stackoverflow.com/a/24752137/231316) an option? You could do it selectively based on the `strlen()`, too. – Chris Haas Jul 01 '20 at 13:36
  • @ChrisHaas I can't increase the memory. I've seen that thread and https://stackoverflow.com/questions/8440439/safely-catch-a-allowed-memory-size-exhausted-error-in-php both are a bit old now. I was presuming either `unserialize` might have a limit on it or there might be a way to catch the exception rather than failing. One of the answers in your thread does mention PHP7's exception catching https://stackoverflow.com/a/2319014/3783243. – user3783243 Jul 01 '20 at 14:33
  • There isn't a limit to `unserialize` except for available memory. Here's the [RFC for PHP 7](https://wiki.php.net/rfc/engine_exceptions_for_php7) that introduced the change from Errors to Exceptions. If you scroll down to _Not all errors converted_ you'll see a note that memory-related errors aren't converted to an exception. _"Some are impossible, like the memory limit"_. Although you can use `register_shutdown_function`, all that will allow you to do is to spit out a quick dying message and maybe log something. Instead of catching the unserialize, can you handle the size limit at serialize? – Chris Haas Jul 01 '20 at 15:18
  • Memory limit error is a fatal error not an exception, that's why you cannot catch it. Also, there is no way to limit the memory for only one function. memory_limit is a global var – Oussama Jul 02 '20 at 10:42
  • Not strictly true: you can catch and post-process "fatal" errors, but not in try/catch block, and also you cannot return to where you were. All you can do is log and... do something else along a new branch. But the out of memory error is actually one of the worst to handle as you can't do anything in your handler that uses more memory. – Robbie Jul 03 '20 at 06:04
  • Side point: this sounds like an ideal function for an in-memory cache such as REDIS. Create a list with the key of the session, RPUSH the visit data and then pull it in when done. Add expiry time (always) so that it expires if you don't, for some reason, handle it. Can be any database, really, but REDIS would be perfect. – Robbie Jul 03 '20 at 06:07
  • `session value stores information about each page visited`. Don't store it directly to session. Instead store it to a temporary db associated with that session. – GetSet Jul 03 '20 at 17:45
  • Session data is already serialized/unserialized behind the scenes when it's stored in the filesystem. Do you have a particular reason for "double-serializing" your session variables? Also, I'm wondering what on earth do you store there (and why), to get 1M characters' worth visit data per user? I can think of some dirty ways of implementing unserialize in chunks (leading to partial data when memory maxes out), but it seems that a change of strategy (and question) is a (much) better way forward. – Markus AO Jul 03 '20 at 20:21
  • So it sounds like: 1. `unserialize` has no limitation options, it will try to unserialize everything it is given. 2. A `fatal memory limit exceeded` error can't be corrected/caught in the application. Once that has occurred all memory has been allocated and only altering the shutdown function can be done. 3. Using `serialize` for the session storage is not needed, just setting `$_SESSION['visits'][]` would be sufficient, then iterate over `$_SESSION['visits']`. – user3783243 Jul 04 '20 at 02:39
  • @GetSet Reading would exceed DB max connections, +1000 if trying to do that way. We had that initially and once got bigger it took DB offline; moved to master/slave system and it took 3 slaves to run with that set up. – user3783243 Jul 04 '20 at 02:42
  • @Robbie The data never expires, which I think is the biggest problem now. Data can go back for 14 years and we need it all. – user3783243 Jul 04 '20 at 02:43
  • @user3783243 if you need this for 14 years, keep the SessionID in the Session object, and use a database for all the data. Do you need PHP to really parse all that data each time the server gets called? Putting in a DB means you get only the data you need, when you need, and don't process more. And you can do additional queries too such as the type of visit. (We do all our session handling like this, unless there is no DB attached to the project. Use a memory cache on top and it's very efficient.) – Robbie Jul 04 '20 at 06:47
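
The `register_shutdown_function` approach mentioned in the comments can be sketched roughly as follows. This is a minimal illustration, not a recovery mechanism: the helper name `describeFatalError` is hypothetical, and all the handler does is log the fatal error after the request has already failed.

```php
<?php
// Hypothetical helper: turn the array returned by error_get_last() into a
// log line, or return null when the last error was not a fatal E_ERROR.
function describeFatalError(?array $error): ?string
{
    if ($error === null || $error['type'] !== E_ERROR) {
        return null;
    }
    return sprintf('Fatal: %s in %s:%d', $error['message'], $error['file'], $error['line']);
}

// Runs at the end of every request, including after a memory-limit fatal error.
// Keep it tiny: once the limit has been hit there is very little memory left.
register_shutdown_function(function (): void {
    $line = describeFatalError(error_get_last());
    if ($line !== null) {
        error_log($line);
    }
});
```

Execution does not return to the code that called `unserialize`, so `$data` is never populated on this path.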

2 Answers

There are two options:

a) unserialize in another PHP process

The child process can hit the memory limit and fail without taking down the main request, and it can return only the data you are interested in.

How?

  1. create a custom PHP script/file that:
    1. calls the autoloader / initializes the app
    2. deserializes the value passed through a file or $argv
  2. call that script using exec()

To address the requirement in the question, you can:

  • either return the memory usage from the custom script (and check whether that much memory will be available in the parent script)
  • OR return only the data you are interested in, through a file or stdout
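
A rough sketch of option a), under several assumptions: it uses `proc_open` with an array command (PHP 7.4+) instead of `exec()` so the payload can be passed on stdin, it assumes `PHP_BINARY` points at a CLI `php` binary (under PHP-FPM it does not, so the path would have to be supplied), and the function name `unserializeTailInChild` is hypothetical. The child returns only the last few entries as JSON.

```php
<?php
// Hypothetical helper: unserialize $payload in a child PHP process with its own
// memory limit and return only the last $keep entries. Returns null if the
// child failed (for example because it hit its memory limit).
function unserializeTailInChild(
    string $payload,
    int $keep,
    string $memoryLimit = '64M',
    string $phpBinary = PHP_BINARY // assumption: this is a CLI binary
): ?array {
    $keep = max(1, $keep);
    // Code run by the child: read stdin, unserialize without creating objects,
    // and print the tail of the array as JSON.
    $childCode = '$d = unserialize(file_get_contents("php://stdin"), ["allowed_classes" => false]);'
        . 'echo json_encode(is_array($d) ? array_slice($d, -' . $keep . ') : []);';

    $cmd = [
        $phpBinary,
        '-d', 'memory_limit=' . $memoryLimit,
        '-d', 'display_errors=stderr', // keep error text out of stdout
        '-r', $childCode,
    ];
    $spec = [0 => ['pipe', 'r'], 1 => ['pipe', 'w'], 2 => ['pipe', 'w']];
    $proc = proc_open($cmd, $spec, $pipes);
    if (!is_resource($proc)) {
        return null;
    }

    fwrite($pipes[0], $payload);
    fclose($pipes[0]);
    $out = stream_get_contents($pipes[1]);
    stream_get_contents($pipes[2]); // drain stderr; a real caller might log it
    fclose($pipes[1]);
    fclose($pipes[2]);
    $exitCode = proc_close($proc);

    if ($exitCode !== 0) {
        return null; // fatal error in the child; the parent keeps running
    }
    $decoded = json_decode((string) $out, true);
    return is_array($decoded) ? $decoded : null;
}

$tail = unserializeTailInChild(serialize(['a', 'b', 'c', 'd']), 2);
```

The parent still has to hold the serialized string itself; what it avoids is the memory needed for the fully unserialized structure.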

b) use custom parser

For example https://github.com/xKerman/restricted-unserialize, which allows:

  • deserializing individual values one at a time and checking beforehand whether enough memory is available (the same problem remains, but it is much easier to check with small pieces of data)
  • traversing the data without unserializing it completely (thus not using extra memory)
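
As a very small illustration of inspecting serialized data without unserializing it, the sketch below reads the element count from the `a:N:{` header that PHP's `serialize()` writes for arrays. The helper `serializedArrayCount` and the threshold of 1000 are hypothetical and not part of the library linked above; the header is only trustworthy for data your own code serialized.

```php
<?php
// Hypothetical helper: return the top-level element count of a serialized
// array by reading its "a:N:{" header, or null if the string is not an array.
function serializedArrayCount(string $serialized): ?int
{
    if (preg_match('/^a:(\d+):\{/', $serialized, $m) !== 1) {
        return null;
    }
    return (int) $m[1];
}

$raw = serialize(['/home', '/pricing', '/contact']);
$count = serializedArrayCount($raw);

// Only unserialize when the element count is below an (assumed) threshold.
$visits = ($count !== null && $count <= 1000)
    ? unserialize($raw, ['allowed_classes' => false])
    : [];
```

This is the same idea as the `strlen()` check in the question, but based on the number of entries rather than the raw string length.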

c) use database

The two options above address your requirement as asked. However, my strong advice is to store the session/visits data in a database and keep only a unique ID for it in the session.
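
A minimal sketch of option c), assuming PDO with the SQLite driver is available. The table name, column names, and in-memory DSN are all assumptions chosen for illustration; a real setup would use whatever database the application already has.

```php
<?php
// Sketch only: the schema and the SQLite in-memory database are assumptions.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec(
    'CREATE TABLE visits (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        visit_key TEXT NOT NULL,
        url TEXT NOT NULL,
        visited_at INTEGER NOT NULL
    )'
);
$pdo->exec('CREATE INDEX visits_key ON visits (visit_key, id)');

// Append one visit; nothing is read back or unserialized.
function recordVisit(PDO $pdo, string $visitKey, string $url): void
{
    $stmt = $pdo->prepare('INSERT INTO visits (visit_key, url, visited_at) VALUES (?, ?, ?)');
    $stmt->execute([$visitKey, $url, time()]);
}

// Fetch only the newest $limit visits instead of the whole history.
function recentVisits(PDO $pdo, string $visitKey, int $limit): array
{
    $stmt = $pdo->prepare('SELECT url FROM visits WHERE visit_key = ? ORDER BY id DESC LIMIT ?');
    $stmt->bindValue(1, $visitKey);
    $stmt->bindValue(2, $limit, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// The session would then hold only the key, e.g. $_SESSION['visit_key'] = $key;
$key = bin2hex(random_bytes(16));
recordVisit($pdo, $key, '/home');
recordVisit($pdo, $key, '/pricing');
recordVisit($pdo, $key, '/contact');
$latest = recentVisits($pdo, $key, 2);
```

Each request then touches only the rows it needs, so the size of a visitor's full history no longer affects PHP's memory use.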

mvorisek

The most memory-efficient approach is probably to store everything in a single binary-packed string and index into it manually. You can use the pack() function for this.

Memory Usage Differences

$a = memory_get_usage();
$data = serialize(array(1 => 1, 0 => 2, 3 => 3));
$b = memory_get_usage();
$c = $b - $a;
echo $c; //Outputs 296

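And when the same data is packed into a binary string (pack() takes the values as separate arguments, so the array is spread with ...; the array keys are not stored):

$a = memory_get_usage();
$data = pack("C*", ...array(1 => 1, 0 => 2, 3 => 3));
$b = memory_get_usage();
$c = $b - $a;
echo $c; // 72 in the original measurement; the exact figure depends on the PHP version

The packed string takes considerably less memory than the serialized one, at the cost of dropping the array keys and structure.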
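
The "manual indexing" part can be sketched like this, assuming every value fits in an unsigned 32-bit integer (for example a page ID or a timestamp) and a 64-bit PHP build. The helper names are hypothetical; the `offset` argument of `unpack()` requires PHP 7.1+.

```php
<?php
// Hypothetical helpers for a fixed-width binary list of unsigned 32-bit ints.

// Pack a list of integers into one string, 4 bytes per value (big-endian).
function packInts(array $values): string
{
    if ($values === []) {
        return '';
    }
    return pack('N*', ...array_values($values));
}

// Number of integers stored in the packed string.
function packedCount(string $packed): int
{
    return intdiv(strlen($packed), 4);
}

// Read the integer at position $index without unpacking the whole string.
function intAt(string $packed, int $index): int
{
    return unpack('N', $packed, $index * 4)[1];
}

$packed = packInts([10, 20, 300000]);
$third = intAt($packed, 2);
```

Because every entry has the same width, a single value can be read by offset and new values can be appended with `$packed .= pack('N', $value)`, without ever building a PHP array of the full history.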

Kunal Raut