briangroce Posted November 8, 2012

Hi guys,

I have a script that takes quite a long time to process (10 minutes or so) and requires a few loops within loops. (That's a whole separate issue that I am trying to optimize.) In some of the loops I am also scraping information off another website, so the script takes a while to run.

Anyway, I need this script to run once a day at 2 AM or so. I know about LazyCron, but I don't want it to have to be user-initiated. I need the script to run at the same time every day.

The script grabs Pages in ProcessWire, scrapes data, then updates the ProcessWire Page. What do you guys suggest as the best way to have a script like this run efficiently?
netcarver Posted November 8, 2012

These are just quick ideas off the top of my head about running at the same time every day.

If you can add cron jobs to the server box, that's probably the best way to go. Try making the script executable by whatever user the web server uses. From the server's command prompt, set up the cron job via "crontab -e". You will need to add a line to this file, and you can use this cronjob calculator to help you set it up.

However, if you have no access to cron on the server, install the LazyCron module instead, then set up a cron job on another internet-connected box that is always on at 2 AM and uses wget to visit your site and hence trigger your script.
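As a rough sketch of the crontab line described above: the script path below is a hypothetical placeholder, and the entry assumes a `php` command-line binary is on cron's PATH (neither detail comes from the thread). The five leading fields are minute, hour, day of month, month, and day of week, so `0 2 * * *` means 02:00 every day.

```shell
#!/bin/sh
# Hypothetical path; replace with the real location of your script.
SCRIPT="/path/to/your/script.php"

# Print a crontab entry that runs the given script daily at 02:00.
# Fields: minute hour day-of-month month day-of-week command
cron_line() {
    printf '0 2 * * * php %s\n' "$1"
}

cron_line "$SCRIPT"

# To install the entry without opening an editor, something like the line
# below appends it to the current user's crontab (commented out on purpose):
# ( crontab -l 2>/dev/null; cron_line "$SCRIPT" ) | crontab -
```

Pasting the printed line into the file that "crontab -e" opens has the same effect as the commented-out install command.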
Adam Kiss Posted November 8, 2012

Basically, make this available from outside under a URL, and run a cron job on it every day, be it from the box itself or from other hosting you have. No need for LazyCron; that's Netcarver not having had enough coffee.
briangroce Posted November 8, 2012 (Author)

OK, so I found where I can set up a cron job on the server. Since there are lots of ProcessWire classes called in the script, I have to call a particular URL without a .php extension. (The URL that runs the script is http://www.someurl.com/myPage/ )

In this case, is running curl http://www.someurl.com/myPage/ reasonable?
ryan Posted November 9, 2012

Quote: "In this case, is running curl http://www.someurl.com/myPage/ reasonable?"

Keep in mind that ProcessWire can run in shell scripts outside of Apache/http. But so long as you aren't dealing with timeout issues, it should also be fine to trigger it the way you are asking about (whether curl or wget or something else). However, you'll want to make sure you've got some good security through obscurity (obscure URL), and/or a GET/POST variable pass key or something to ensure nobody else can trigger your script except you. This is always a concern with anything http accessible.
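A minimal sketch of the pass-key idea, under stated assumptions: the GET variable name `key` and its value are made up for illustration, and the page's own template would still have to compare the key and refuse requests without it. The 15-minute limit is likewise only a guess based on the 10-minute run time mentioned earlier.

```shell
#!/bin/sh
# Both values are placeholders. The variable name "key" is an assumption,
# and the page template itself must check it before doing any work.
BASE_URL="http://www.someurl.com/myPage/"
PASS_KEY="replace-with-a-long-random-string"

# Build the trigger URL with the pass key as a GET variable.
trigger_url() {
    printf '%s?key=%s\n' "$1" "$2"
}

# Request the URL quietly, treat HTTP errors as failures, and give up
# after 900 seconds so a hung request cannot linger until the next run.
trigger() {
    curl --silent --show-error --fail --max-time 900 "$(trigger_url "$1" "$2")"
}

# A cron job would call: trigger "$BASE_URL" "$PASS_KEY"
# Here only the URL is printed, so nothing is requested.
trigger_url "$BASE_URL" "$PASS_KEY"
```

Note that `--max-time` only bounds the client side. Any time limit enforced by PHP or the web server applies separately, which is the timeout issue mentioned above.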
Soma Posted February 7, 2014

Just wanted to mention that this doesn't seem to be true, at least not for me. I created a crontab and get these errors when the script is run:

PHP Notice: Undefined index: SERVER_NAME in /home/www-data/processwire/wire/core/ProcessWire.php on line 93
Notice: Undefined index: SERVER_NAME in /home/www-data/processwire/wire/core/ProcessWire.php on line 93
PHP Notice: Undefined index: HTTP_HOST in /home/www-data/processwire/wire/core/ProcessWire.php on line 94
Notice: Undefined index: HTTP_HOST in /home/www-data/processwire/wire/core/ProcessWire.php on line 94
PHP Fatal error: Exception: SQLSTATE[28000] [1045] Access denied for user 'www-data'@'localhost' (using password: NO) (in /home/www-data/processwire/wire/core/ProcessWire.php line 143)
#0 /home/www-data/processwire/wire/core/ProcessWire.php(51): ProcessWire->load(Object(Config))
#1 /home/www-data/abc/index.php(183): ProcessWire->__construct(Object(Config))
#2 /home/www-data/abc/import/mitglieder/cron.php(5): include('/home/www-data/...')
#3 {main} in /home/www-data/abc/index.php on line 214
Fatal error: Exception: SQLSTATE[28000] [1045] Access denied for user 'www-data'@'localhost' (using password: NO) (in /home/www-data/processwire/wire/core/ProcessWire.php line 143)
#0 /home/www-data/processwire/wire/core/ProcessWire.php(51): ProcessWire->load(Object(Config))
#1 /home/www-data/abc/index.php(183): ProcessWire->__construct(Object(Config))
#2 /home/www-data/abc/import/mitglieder/cron.php(5): include('/home/www-data/...')
#3 {main} in /home/www-data/abc/index.php on line 214
This error message was shown because site is in debug mode ($config->debug = true; in /site/config.php). Error has been logged. Administrator has been notified.

Any ideas how to get this working? ProcessWire is symlinked from within the webroot from which I call the script.
ryan Posted February 8, 2014

I'm running PW from crontabs all over the place. You can ignore the undefined index notices in this case, as they don't have anything to do with the errors that follow.

The error messages you are seeing seem to indicate that the database settings were not defined. No idea how that could happen, but maybe it has something to do with the symlink. Or you may be hitting up against some server security here, as it appears you've got one account (sev-online) trying to access another (abc). Make sure your cron job is running as the user that owns these files, or one with greater access.
Soma Posted February 10, 2014

No, abc was me trying to take out sev-online.ch, and I forgot 2 of them, but anyway. I'm not sure why it wouldn't be able to connect to the DB, as the config.php is local and clearly working. There seems to be no security or symlink problem; the script works fine when run directly and not from crontab. The crontab is set up for the user that also owns the webs.
Soma Posted February 11, 2014

OK, I tried again today, and after some time I glanced at my config.php... Of course it can't work, because I had made the DB connection settings depend on the host name via some $_SERVER values, which of course aren't available when run from a crontab. Seems to work fine for now.

So the problem was once again between chair and computer.
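One way to reproduce this kind of failure before it reaches cron is to run a command with an emptied environment. The sketch below is only an analogy: it treats HTTP_HOST as an environment variable, which is how a CGI-style web server passes it, and compares that with what a process sees when the environment has been cleared with `env -i`. The host name is a made-up placeholder.

```shell
#!/bin/sh
# Probe that prints HTTP_HOST as a child process sees it, or "unset".
probe='echo "${HTTP_HOST:-unset}"'

# Simulates a CGI-style web request, where the server sets HTTP_HOST.
with_host() {
    HTTP_HOST="www.example.com" /bin/sh -c "$probe"
}

# Simulates a cron-like run: env -i starts the command with an empty
# environment, so nothing that depends on HTTP_HOST can find it.
without_env() {
    env -i /bin/sh -c "$probe"
}

with_host
without_env

# The same trick can be applied to the real script, for example:
# env -i php path/to/your/file.php
```

If the script only fails under `env -i`, its configuration probably depends on something that only the web server provides.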
Martijn Geerts Posted February 11, 2014

Off-topic: for me it's difficult not to read this topic title as "Long Process to be Ryan Once a Day" (referring to all the ProcessWire & beyond that ryan does every day).
Soma Posted February 11, 2014

Processus Longus.
diogo Posted February 11, 2014

This works great for me: http://processwire.com/api/include/

In that example Ryan creates an executable file, but you can just as well create a PHP file anywhere you want on the server and do the same:

    <?php
    include("/path/to/processwire/index.php");
    // Do anything you want with PW. Remember to use wire('pages') and the like instead of $pages

After that you can run it in the terminal like this:

    php path/to/your/file.php

Or for your cron job:

    @hourly php path/to/your/file.php

You can even put some echo statements in the file for debugging purposes and see them in the terminal. That worked great for me with scripts that created hundreds of pages in one go.
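When the same file runs from cron there is no terminal to show those echo statements, so one option is to append the output to a log file. The wrapper below is a sketch only: the `run_logged` helper and the log path are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical log location; pick one the cron user can write to.
LOG_FILE="${LOG_FILE:-/tmp/pw-cron.log}"

# Run a command, appending a timestamped header line plus the command's
# stdout and stderr to the log file.
run_logged() {
    {
        printf '=== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@"
    } >> "$LOG_FILE" 2>&1
}

# In a cron-driven wrapper script this would be:
# run_logged php path/to/your/file.php
run_logged echo "debug output from the script"
```

Each run then leaves a header line followed by whatever the script printed, which makes it easier to see when a nightly job ran and what it reported.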
SteveB Posted February 16, 2014

Re. "However, you'll want to make sure you've got some good security through obscurity (obscure URL), and/or a GET/POST variable pass key or something to ensure nobody else can trigger your script except you."

Assuming your intended way to run this is via a cron job on the same server, have your PHP check that REMOTE_HOST equals LOCALHOST.
horst Posted February 16, 2014

Quote: "OK I tried again today, and after some time I glimpsed at my config.php... of course it can't work cause I had DB connection infos dynamic on the host name using some $_SERVER which of course doesn't work in a crontab."

@Soma: I use the same approach in site/config.php, but like this:

    $config->dbHost = 'MyComputersName'==getenv('COMPUTERNAME') ? 'example.com' : 'localhost';

The system environment variable "COMPUTERNAME" is set on (every) Windows machine, and if it is set on a *nix system it will have a different value. This way it works in the terminal too.