I currently have a PHP website built with CodeIgniter, and I'm having issues with the CLI and cron jobs.

The controller that runs the script is in /application/controllers/scrape on the server (viewed over FTP, this is /public_html/application/controllers/scrape), and the function to run is called all_sites.

I'm hosted with TSOhost and can successfully run the command from the browser via the URL (website.com/index.php/scrape/all_sites); however, the script times out, hence the need to run it from a cron job.

So far I have tried the following raw cron commands in the advanced mode of the TSOhost control panel, trying to get the script to run daily:

The TSOhost technician set this up

03 19 * * * /usr/bin/php-5.3 /var/sites/s/website.com/public_html/application/controllers/scrape.php (didn't work)
0  6  * * * /usr/bin/wget -O /dev/null -o /dev/null http://www.speeddatemate.com/index.php/scrape/all_sites
0  6  * * * /usr/bin/php-5.3 /var/sites/s/speeddatemate.com/public_html/application/controllers/scrape/all_sites
03 19 * * * /usr/bin/php-5.3 /var/sites/s/speeddatemate.com/public_html/application/controllers/scrape.php
03 19 * * * /usr/bin/php-5.3 /var/sites/s/speeddatemate.com/public_html/application/controllers/scrape/all_sites
10 18 * * * /usr/bin/php-5.3 /var/sites/s/speeddatemate.com/public_html/index.php scrape all_sites

TSOhost have stated:

For referencing your site path, use /var/sites/s/website.com

The path to PHP 5.2 is /usr/bin/php and for 5.3 it's /usr/bin/php-5.3

The technician also said:

"To run from CLI you would need to find a way to get those parameters into the script, they can't be appended to the command."

But is that not the point of the command?

I've also tried running it via the "make a http request" option which creates the raw job as:

0 6 * * * /usr/bin/wget -O /dev/null -o /dev/null http://www.speeddatemate.com/index.php/scrape/all_sites

Again, this does not work.

I've searched high and low for a way to get this working, read various posts, and tried various methods, but nothing has worked. Can anyone help?

SmokersCough
  • Without seeing what your code does and what parameters it needs, it is hard to tell you why it does not work on the command line or what command you would need to run it as a cron job. Additionally, the timeout could also be resolved, depending on what is causing it. – Prix Mar 23 '14 at 13:11
  • @SmokersCough Can you share more info about the nature of the script? – Anshul Goyal Mar 25 '14 at 11:02
  • Ok, I can successfully run the command from the terminal with this command: php index.php scrape all_sites, and the script completes. The script essentially scrapes some data from some websites. If someone can confirm the cron job command I should use to run this, I can try again. Running from the URL times out because the script takes a long time to complete (about 1 hour), but this should not be the case when run from a cron job. – SmokersCough Mar 26 '14 at 11:18
  • @SmokersCough Were you able to resolve this? Check the edits to my answer. Comment on my answer if you need clarifications with any specific point. – Anshul Goyal Mar 28 '14 at 10:31
  • Unfortunately this has not worked; the first command did not seem to trigger the script. Is there a way I can write debug output to a log file when the script is attempted to run? Secondly, I'm still a little uncertain about how to go about setting up the path section. – SmokersCough Mar 29 '14 at 11:35
  • @SmokersCough give me the exact command that works for you and we can set up the cron from it. Cron is not some magical system which can work without the right input. Which step exactly does not work? What part are you uncertain about in the path section? – Anshul Goyal Mar 29 '14 at 14:24
  • You can append the output to a log file like this: `php script.php >> /some/location/cron.log` – Anshul Goyal Mar 29 '14 at 14:41
  • @SmokersCough were you able to figure things out? – Anshul Goyal Mar 30 '14 at 13:27
  • @SmokersCough are you still stuck somewhere? the bounty's grace period is ending ... – Anshul Goyal Mar 30 '14 at 20:02
  • When running from my local machine, this command scrapes correctly: "php index.php scrape all_sites". However, your suggested commands have not worked at all. Surely this should be simple, as we have the directory the PHP executable is in and the directory for the index of the site. – SmokersCough Mar 30 '14 at 20:17
  • @SmokersCough does that command `php index.php scrape all_sites` work from the cli on server as well? Your local environment is not the same as that on your TSOhost. Run that command on your server. Also, what do you mean by the PHP exe? Take incremental steps - first try if `php index.php scrape all_sites` works on TSOHost, then try if `cd /var/sites/s/website.com/public_html/application/ && /usr/bin/php-5.3 index.php scrape all_sites` works on TSOhost. I am assuming `/var/sites/s/website.com/public_html/application/index.php` is the path on TSOhost and not your local machine. – Anshul Goyal Mar 31 '14 at 03:32

2 Answers

First off, don't bother setting up a cron job unless you have it working on the command line. You will end up mixing different things and generating a plethora of unwanted possibilities for debugging which will confuse you further.


Now, you are saying that

I can successfully run the command using the browser via URL (website.com/index.php/scrape/all_sites), however the script times out

This could be because the script takes longer than 30 seconds to complete and is thus hitting PHP's max_execution_time limit. Check whether this is the case.

If it is, resolve it by increasing the limit, for example with set_time_limit() in the script or the max_execution_time setting in php.ini.

Once you have a working version of the script, either via the browser or via the command line, you can go ahead and schedule the cron jobs, as already shared by your TSOhost tech support.

03 19 * * * /usr/bin/php-5.3 /var/sites/s/website.com/public_html/application/controllers/scrape.php (didn't work)
0  6  * * * /usr/bin/wget -O /dev/null -o /dev/null http://www.speeddatemate.com/index.php/scrape/all_sites

Once again, set up the crons only after you have the other pieces working.


If it still doesn't work, give more info regarding:

  1. What exactly does the script do? Does it download something or back something up? Update the question with whatever it does.
  2. What parameters does it require, and how were you passing them via the URL?
  3. What do the logs say? Assuming your webserver is apache, you can usually find them at /var/log/apache2/error.log and /var/log/apache2/access.log.

EDIT 1

OP says in comments that php index.php scrape all_sites works for him.

Assuming that works from the root of his app, where the path to index.php can be assumed to be /var/sites/s/website.com/public_html/application/index.php, try this cron job:

03 19 * * * cd /var/sites/s/website.com/public_html/application/ && /usr/bin/php-5.3 index.php scrape all_sites

If possible, schedule it for a time close to the current time rather than a fixed job at 19:03, so you can see the result quickly.
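
To see what actually happens when cron fires the job, its output can be captured in a log file. The following is a minimal sketch rather than part of the original advice: run_logged is a hypothetical helper name, and it is meant to live in a wrapper script that cron calls, not in the crontab line itself (cron treats % specially, which would break the date format).

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper: appends a timestamped start line, the command's
# stdout and stderr, and its exit status to LOGFILE.
run_logged() {
    logfile=$1
    shift
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] start: $*" >> "$logfile"
    # Run the command, sending both output streams to the log.
    "$@" >> "$logfile" 2>&1
    status=$?
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] exit status: $status" >> "$logfile"
    # Pass the command's exit status back to the caller.
    return "$status"
}
```

Inside such a wrapper, after the cd, the call might look like run_logged /some/location/cron.log /usr/bin/php-5.3 index.php scrape all_sites (the log path is a placeholder). For a quick check, appending >> /some/location/cron.log 2>&1 directly to the cron entry captures the same output without the timestamps.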

If this still doesn't work, and assuming max_execution_time has already been taken care of, the problem could be with your environment variables: your CLI shell environment may have variables that are missing from the cron environment.

In my experience, the PATH variable causes the most trouble, so run echo $PATH in your shell, and if the value you get is /some/path:/some/other/path:/more/path/values, run your cron job like this:

03 19 * * * export PATH="/some/path:/some/other/path:/more/path/values" && cd /var/sites/s/website.com/public_html/application/ && /usr/bin/php-5.3 index.php scrape all_sites

If this doesn't work out, check all the environment variables next.

Use printenv > ~/shell_environment.txt from a normal shell to get all the environment variables set in the shell. Then use the cron entry * * * * * printenv > ~/cron_environment.txt to get the variables from the cron environment. That entry runs every minute, so remove it once the file has been written.

Compare the two files, shell_environment.txt and cron_environment.txt, for any unset variable that you need to set in the cron environment.
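
The comparison can be done by eye, but a small helper makes the missing variables stand out. This is a sketch under assumptions: missing_vars is a hypothetical name, and it compares variable names only, so a value that spans several lines in a dump may show up as a stray entry.

```shell
# missing_vars SHELL_DUMP CRON_DUMP
# Hypothetical helper: prints the names of variables that appear in the
# first printenv dump but not in the second.
missing_vars() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    # Keep only the part before the first "=", then sort and de-duplicate.
    cut -d= -f1 "$1" | sort -u > "$tmp_a"
    cut -d= -f1 "$2" | sort -u > "$tmp_b"
    # comm -23 keeps the lines that are unique to the first sorted file.
    comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}
```

With the two files from the step above, it would be called as missing_vars ~/shell_environment.txt ~/cron_environment.txt. Variables that exist in both environments with different values (PATH is the usual one) are not reported, so those still need a direct look.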

Anshul Goyal

First of all, make sure your script runs as expected from your browser. If so, you can then try to run it from the command line. Let's assume the following is your controller:

<?php
class Scrape extends CI_Controller {

    // Reached from the CLI as: php index.php scrape all_sites [some_parameter]
    public function all_sites($some_parameter = 'working')
    {
        echo "Its {$some_parameter}!".PHP_EOL;
    }
}
?>

Based on the information you provided, you can run the command as:

/usr/bin/php-5.3 /var/sites/s/website.com/public_html/index.php scrape all_sites

And set the cron job as:

03 19 * * * /usr/bin/php-5.3 /var/sites/s/website.com/public_html/index.php scrape all_sites

If you need to pass a parameter, pass the parameter like:

/usr/bin/php-5.3 /var/sites/s/website.com/public_html/index.php scrape all_sites "Working fine"

Enjoy!

For reference, see the "Running via the CLI" page in the CodeIgniter user guide.

xiidea