18

I have a daily cron job which takes about 5 minutes to run (it does some data gathering and then various database updates). It works fine, but the problem is that, during those 5 minutes, the site is completely unresponsive to any requests, HTTP or otherwise.

It would appear that the cron job script takes up all the resources while it runs. I couldn't find anything in the PHP docs to help me out here - how can I make the script know to only use up, say, 50% of available resources? I'd much rather have it run for 10 minutes and have the site available to users during that time, than have it run for 5 minutes and have user complaints about downtime every single day.

I'm sure I could come up with a way to configure the server itself to make this happen, but I would much prefer if there was a built-in approach in PHP to resolving this issue. Is there?

Alternatively, as plan B, we could redirect all user requests to a static downtime page while the script is running (as opposed to what's happening now, which is the page loading indefinitely or eventually timing out).
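
Something like this is what I have in mind for plan B, just to illustrate (the flag file path and page name are made up): the cron script would create a flag file while it runs, and the front controller would serve a static page whenever the flag exists.

// In the front controller: if the cron's flag file exists, serve the downtime page.
// The cron script would create this file on start and delete it when finished.
$flag = __DIR__ . '/storage/cron-running.flag';
if (file_exists($flag)) {
    http_response_code(503);               // signal a temporary outage
    header('Retry-After: 600');            // hint: try again in ~10 minutes
    readfile(__DIR__ . '/downtime.html');  // the static downtime page
    exit;
}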

sveti petar
  • 3,637
  • 13
  • 67
  • 144
  • Using google brings up `ini_set('memory_limit', '');` – Tobias F. Mar 27 '18 at 11:35
  • @TobiasF. Yes I'm aware of this option, however I'm not trying to limit the total amount of memory used by PHP, I'm trying to limit the percentage of the memory limit that this particular script uses. – sveti petar Mar 27 '18 at 11:37
  • 2
    Take a look at the "nice level" unixoid systems offer to control resource usage of processes. – arkascha Mar 27 '18 at 11:37
  • 6
    About the "complete unresponsiveness"... this sounds as if that script locks the database tables. If that is the case, then your issue is not overall resource usage, but a single bottleneck. – arkascha Mar 27 '18 at 11:38
  • @arkascha Hm. Just to be clear, if it locks the database tables, then only those requests that require those tables would be unresponsive? Or all requests? – sveti petar Mar 27 '18 at 12:51
  • @arkascha If that is the case, then I think it's safe to just turn off table locking for the script, as users aren't allowed to update values in those tables, they can only read the existing values. Am I correct or is there some danger I'm not seeing? – sveti petar Mar 27 '18 at 12:53
  • Also I don't think I'm locking the tables in question, unless this option is enabled by default in Laravel somehow. I certainly didn't put it in there myself. – sveti petar Mar 27 '18 at 13:01
  • 1
    Have you witnessed the resource consumption? Is it really the cron job that eats all available memory? Is it CPU load? Is it wait I/O? DB locks as mentioned by @arkascha? Depending on your setup, table-level locks may happen implicitly; without performance data from the machines in question it's a guessing game... – Tom Regner Apr 04 '18 at 08:33

7 Answers

6

A normal script can't hog 100% of the resources; resources get split over the processes. It could slow everything down intensely, but not lock up all resources (without doing some funky stuff). You could get a hint by running top on the command line and seeing which process takes up a lot.

That leads to the conclusion that something locks all further processes. As arkascha comments, there is a fair chance that your database gets locked. This answer explains which table type you should use; if you do not have it set to InnoDB, you probably want that, at least for the tables that get locked.
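
A quick way to check, assuming MySQL and a PDO connection (database name and credentials are placeholders):

// List the storage engine per table; MyISAM uses table-level locks,
// while InnoDB locks per row. Database name/credentials are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=yourdb', 'user', 'pass');
$stmt = $pdo->query(
    "SELECT TABLE_NAME, ENGINE
       FROM information_schema.TABLES
      WHERE TABLE_SCHEMA = 'yourdb'"
);
foreach ($stmt as $row) {
    echo $row['TABLE_NAME'] . ' => ' . $row['ENGINE'] . PHP_EOL;
}
// Converting a table is a one-off query, e.g.: ALTER TABLE some_table ENGINE=InnoDB;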

It could also be disk I/O if you write huge files. Try to split the work into smaller reads/writes, or move some of the info (e.g. if it is files with lists) into your database (assuming it has room to spare).
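
For instance, a small sketch of streaming a large file line by line instead of reading it in one go (the file name is made up):

// Process a big file one line at a time instead of loading it all into memory.
$handle = fopen('huge-export.csv', 'r');
if ($handle === false) {
    exit("Could not open file\n");
}
while (($line = fgets($handle)) !== false) {
    // handle a single line here, e.g. update one database row
}
fclose($handle);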

It could also be CPU. To fix that, you need to make your code more efficient. Recheck your code, see where you do heavy operations and try to make those smaller. Normally you want code to be as fast as possible; now you want it as lightweight as possible, which changes the way you write code.

If it still locks up, it's time to debug. Turn off a large part of your code and check if the locking still happens. Keep turning code back on until you notice the locking, then fix that part: try to figure out what is costing you so much. Only a few scripts require intense resources, so it is now time to optimize. One option might be splitting the job into two (or more) steps: run one cron that prepares/sanitizes the data and one that processes it. They don't have to run at the same time; there can be a few minutes between them.

If that is not an option, benchmark your code and improve it as much as you can. If you have a heavy query, it might help to select only IDs in the heavy query and use a second query just to fetch the data. If you can, use your database to filter, sort and manage data; don't do that in PHP.
What I have also implemented once is a sleep every N actions.
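
A rough sketch of that combination, assuming PDO and made-up table/column names:

// Heavy query selects only IDs, the data is then fetched and processed in
// small batches, with a short sleep every batch. Names are illustrative.
$pdo = new PDO('mysql:host=localhost;dbname=yourdb', 'user', 'pass');

$ids = $pdo->query("SELECT id FROM items WHERE needs_update = 1")
           ->fetchAll(PDO::FETCH_COLUMN);

$batchSize = 500;
foreach (array_chunk($ids, $batchSize) as $chunk) {
    $in   = implode(',', array_map('intval', $chunk));
    $rows = $pdo->query("SELECT * FROM items WHERE id IN ($in)")->fetchAll();

    foreach ($rows as $row) {
        // ...do the actual update work for one row...
    }

    usleep(250000); // breathe for 0.25s so other requests get a turn
}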

If your script really is that extreme, another solution could be moving it to a time when there are few or no visitors on your site. Even if you remove the bottleneck, nobody likes a slow website.

And there is always the option of increasing your hardware.

Martijn
  • 15,791
  • 4
  • 36
  • 68
  • yeah, it probably has nothing to do with cpu/ram usage, it's either a db lock, or `max processes on this account` block, or `max concurrent php processes` block, as is not uncommon in the `shared webhost` world. – hanshenrik Apr 11 '18 at 07:19
  • @hanshenrik I do not think the "max processes" setting is the issue. The db probably is, and more specifically the disk I/O it does. As the db probably runs in a separate process, e.g. the "mysql service", setting the priority of your script will not affect that (correct me if I am wrong on this or if it could somehow be done). There are some "heavy" queries in your script. – Jannes Botis Apr 11 '18 at 08:00
  • 1
    @Martijn "Turn of a large part of your code" you meant "Turn **off** ..." I guess) – Tarasovych Apr 11 '18 at 21:12
  • Yes I did :) If you have a few edits, feel free to edit a post. That keeps improving SO :) – Martijn Apr 12 '18 at 07:13
  • @Martijn Actually I found the issue is not DB locking, it is a file_get_contents() function that runs during the execution of this script. While file_get_contents() is doing its thing, nothing else works. As soon as it's finished, everything goes back to normal. Any ideas? – sveti petar Jun 01 '18 at 11:16
  • It must be a giant file then, If so, google how to open large files, there are a lot of solutions which are better explained than I can in a comment :) – Martijn Jun 01 '18 at 11:55
6

You don't mention which resource is your bottleneck: CPU, memory or disk I/O.

However, if it is CPU or memory you can do something like this in your script: http://php.net/manual/en/function.sys-getloadavg.php http://php.net/manual/en/function.memory-get-usage.php

$yourlimit = 100000000;   // memory cap for this script, in bytes (~100 MB)
$load = sys_getloadavg(); // system load averages over the last 1, 5 and 15 minutes
if ($load[0] > 0.80 || memory_get_usage() > $yourlimit) {
    sleep(5);             // system is busy or we use too much memory: back off for a bit
}

Another thing to try would be to set your process priority in your script. Lowering the priority (a positive increment) does not require superuser rights; only raising it does, so this should be fine for a cron job. http://php.net/manual/en/function.proc-nice.php

proc_nice(19); // 19 is the lowest priority; niceness only ranges from -20 to 19

I did a quick test of both and it works like a charm. Thanks for asking, I have a cron job like that as well and will implement it. It looks like proc_nice alone will do the job.

My test code:

proc_nice(19);            // lowest priority; niceness ranges from -20 to 19
$yourlimit = 100000000;   // ~100 MB memory cap for this test
$x = 0;
while (1) {
    $x = $x + 1;
    $load = sys_getloadavg();
    if ($load[0] > 0.80 || memory_get_usage() > $yourlimit) {
        sleep(5);         // back off while the system is busy
    }
    echo $x."\n";
}
PaulV
  • 129
  • 3
  • While this might work, this won't fix the actual problem. If the DB or the amount of data to process increases, this solution will take longer and longer. – Martijn Apr 11 '18 at 07:11
  • "memory_get_usage" gets the memory used by your script, not the system memory usage. Putting the script to sleep will not reduce that, it may give other processes time to reduce theirs, but not yours. See the 3rd note: http://php.net/manual/en/function.memory-get-usage.php – Jannes Botis Apr 11 '18 at 07:31
  • Nonetheless, +1 from me. @Martijn, how is this gonna take longer in that case? – Jannes Botis Apr 11 '18 at 07:39
  • If it hits a certain point it sleeps 5 sec. The current code has a bottleneck, which means that if you add N work, it will take N plus the sleep pauses more time; the time increases faster than the amount of work added. – Martijn Apr 11 '18 at 07:41
  • 1
    Yes, but isn't that the point? You caused N work, or the server has been busy in the last minute: stop, sleep, let others do their work. After that, continue working; if you again caused too much work or the server is busy, sleep; if not, continue working. You sleep to give the others CPU time. – Jannes Botis Apr 11 '18 at 08:22
5

It really depends on your environment.

If you are using a Unix-based system, there are built-in tools to limit the CPU usage and priority of a given process.

You can limit the whole server or PHP alone, which is probably not what you are looking for.

What you can do first is separate your task into its own process.

There is popen() for that, but I found it much easier to make the task a small executable script of its own. Let's name it hugetask for the example.

#!/usr/bin/php
<?php
// Huge task here 

Then, to call it from the command line (or cron):

nice -n 15 ./hugetask

This adjusts the scheduling: it lowers the priority of the task relative to others, and the system does the rest.

You can also call it from your PHP directly:

exec("nice -n 15 ./hugetask &");

Usage: nice [OPTION] [COMMAND [ARG]...] Run COMMAND with an adjusted niceness, which affects process scheduling. With no COMMAND, print the current niceness. Niceness values range from -20 (most favorable to the process) to 19 (least favorable to the process).

To impose an actual CPU limit, see the tool cpulimit, which has more options.
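
For example, depending on the cpulimit version installed (some builds only target an already running process via -p or -e), something like:

cpulimit -l 50 ./hugetask

where -l 50 caps the process at roughly 50% of one CPU core.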

That said, usually I just put some usleep() calls in my scripts to slow them down and avoid creating a funnel of data. This works well if your script uses loops. If you slow the task down so that it runs in, say, 30 minutes, there won't be many issues.

See also proc_nice http://php.net/manual/en/function.proc-nice.php

proc_nice() changes the priority of the current process by the amount specified in increment. A positive increment will lower the priority of the current process, whereas a negative increment will raise the priority.

And sys_getloadavg() can also help. It returns an array with the system load over the last 1, 5 and 15 minutes. It can be used as a test condition before launching the huge task, or to log the average and find the best time of day to launch it. The results can be surprising!

print_r(sys_getloadavg());

http://php.net/manual/en/function.sys-getloadavg.php
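
A tiny sketch of the logging idea, to be run from cron every few minutes (the log path is just an example):

#!/usr/bin/php
<?php
// Append the 1, 5 and 15 minute load averages with a timestamp,
// so you can later spot the quietest time of day for the huge task.
list($m1, $m5, $m15) = sys_getloadavg();
file_put_contents('/var/log/loadavg.log', date('Y-m-d H:i') . " $m1 $m5 $m15\n", FILE_APPEND);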

NVRM
  • 11,480
  • 1
  • 88
  • 87
2

You could try to delay execution using sleep(). Just make your script pause between the various updates of your database.

sleep(60); // stop execution for 60 seconds

This depends a lot on the kind of processing you are doing in your script, so it may or may not help in your case. It is worth a try though, so you could:

  • split your queries
  • do the updates in steps with sleep in between (a small sketch follows after the references below)

References

Using sleep for cron process

I could not describe it better than the quote in the above answer:

Maybe you're walking the database of 9,000,000 book titles and updating about 10% of them. That process has to run in the middle of the day, but there are so many updates to be done that running your batch program drags the database server down to a crawl for other users.

So modify the batch process to submit, say, 1000 updates, then sleep for 5 seconds to give the database server a chance to finish processing any requests from other users that have backed up.

Sleep and server resources

sleep resources depend on OS

adding sleep to alleviate server resources
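
A minimal sketch of the batching idea from the quote above, assuming PDO and purely illustrative table/column names:

// Update in batches of ~1000 rows, sleeping between batches so the
// database can serve other users. Table/column names are placeholders.
$pdo = new PDO('mysql:host=localhost;dbname=yourdb', 'user', 'pass');

$batchSize = 1000;
do {
    // each pass fixes up to $batchSize rows and marks them as done
    $updated = $pdo->exec(
        "UPDATE books SET title = TRIM(title), needs_fix = 0
          WHERE needs_fix = 1
          LIMIT $batchSize"
    );
    sleep(5); // give the database server a chance to catch up on other requests
} while ($updated === $batchSize);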

Jannes Botis
  • 11,154
  • 3
  • 21
  • 39
1

To minimize your memory usage, you should probably process heavy and lengthy operations in batches. If you query the database using an ORM like Doctrine, you can easily use its existing batch-processing functions:

http://docs.doctrine-project.org/projects/doctrine-orm/en/latest/reference/batch-processing.html
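
Roughly what the linked page shows, sketched here with placeholder entity names ($em is the Doctrine EntityManager; toIterable() is the newer API, older versions use iterate()):

$batchSize = 20;
$i = 1;
$q = $em->createQuery('SELECT u FROM App\Entity\User u');
foreach ($q->toIterable() as $user) {
    $user->setProcessed(true);       // whatever update you need per entity
    if (($i % $batchSize) === 0) {
        $em->flush();                // execute the pending updates
        $em->clear();                // detach entities so memory stays flat
    }
    ++$i;
}
$em->flush();                        // flush the last, partial batch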

vasillis
  • 342
  • 4
  • 9
1

It's hard to tell what exactly the issue may be without having a look at your code (cron script). But to confirm that the issue is caused by the cron job you can run the script manually and check website responsiveness. If you notice the site being down when running the cron job then we would have to have a look at your script in order to come up with a solution.

EmilCataranciuc
  • 1,024
  • 1
  • 11
  • 24
1

Many loops in your cron script might consume a lot of CPU resources. To prevent that and reduce CPU usage, simply put some delays in your script, for example:

while($long_time_condition) { 
    //Do something here
    usleep(100000); 
}

Basically, you are giving the processor some time to do something else. You can also use the proc_nice() function to change the process priority, for example proc_nice(19); // very low priority (19 is the lowest niceness). Look at this question.

If you want to find the bottlenecks in your code you can try to use Xdebug profiler.

Just set it up in your dev environment, start the cron manually and then profile any page. You can also profile your cron script itself: php -d xdebug.profiler_enable=On script.php (look at this question).
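
Note that the exact flag depends on your Xdebug version: xdebug.profiler_enable is the Xdebug 2 setting, while in Xdebug 3 the profiler is enabled via xdebug.mode, roughly:

php -d xdebug.mode=profile -d xdebug.output_dir=/tmp script.php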

If you suspect that the database is your bottleneck, then import a pretty large dataset (or the entire database) into your local database and repeat the steps, logging and inspecting all the queries.

Alternatively, if possible, set up Xdebug on a staging server that is as close as possible to production and profile the page during cron execution.

zstate
  • 1,995
  • 1
  • 18
  • 20