Jump to content

Module: ScheduleCloudBackups (back up site to S3)


djr
 Share

Recommended Posts

Hello

I've written a little module that backs up your ProcessWire site to Amazon S3 (might add support for other storage providers later, hence the generic name).

Pete's ScheduleBackups was used as a starting point but has been overhauled somewhat. Still, it's far from perfect at the moment, but I guess you might find it useful.

Essentially, you set up a cron job to load a page every day, and then the script creates a .tar.gz containing all of the site files and a dump of the database, then uploads it to an S3 bucket.

Currently, only linux-based hosts are supported currently (hopefully, most of you).

The module is available on github: https://github.com/DavidJRobertson/ProcessWire-ScheduleCloudBackups

Zip download: https://github.com/DavidJRobertson/ProcessWire-ScheduleCloudBackups/archive/master.zip

Let me know what you think

EDIT: now available on the module directory @ http://modules.processwire.com/modules/schedule-cloud-backups

  • Like 15
Link to comment
Share on other sites

This looks fantastic djr! I've had a need for something exactly like this and will definitely look forward to using it. I took a quick look through the code and think it looks very well put together. I do have a few minor suggestions:

Rather than backing up the DB to the root path (where it is temporarily web accessible) I'd recommend backing it up to a non web accessible directory, like /site/assets/cache/. Likewise for the tar/gz file. 

Beyond just looking for "runbackup" in the request URI, I recommend designating a page that it will only run on. For instance, if you wanted it to only run on the homepage:

$shouldBackup = $page->path === '/' && 
  (strpos($_SERVER['REQUEST_URI'], self::RUN_BACKUP_PATH) !== FALSE) &&
  $this->wire('input')->get->token &&
  $this->wire('input')->get->token === $this->token;
This might be a good module to experiment with conditional autoloads. In your getModuleInfo, you can do this:
'autoload' => function() {
  return (strpos($_SERVER['REQUEST_URI'], self::RUN_BACKUP_PATH) !== FALSE); 
}
In truth, conditional autoloads are more reliable in PW 2.5 (a few minor issues have been fixed) so this may be a v2 kind of thing as well. In PW 2.5, you can also isolate the entire getModuleInfo() to a separate ModuleName.info.php file. 

Beyond just the token, it might be worthwhile to have an IP address limiter since there's a good chance one's CRON job is always going to be coming from the same IP. Though not sure it's totally necessary. 

In your docs file, I would mention if possible what command you recommend for the CRON job, for instance:

wget quiet no-cache -O - http://www.your-site.com/runbackup?token=abc123 > /dev/null

Lastly, might be good to mention that it requires exec/system access for executing commands (many hosts have these disabled by default, but there's usually a way to enable them). 

Please add to the modules directory when ready! Thanks for putting this together! 

  • Like 5
Link to comment
Share on other sites

Thanks Ryan.

Re your suggestions: (I've made some changes to the code)

  • The data.sql file is already protected from web access by the default PW .htaccess file, and I've added a .htaccess file in the module directory to prevent access to the backup tarball.
  • I've changed the shouldBackup check to be more specific (behaves the same as your suggestion, but simpler logic).
  • I don't know what the issues around conditional autoloading in PW 2.4 are, so I'll leave that for now (?).
  • I'll put IP whitelisting on the todo list, but I don't think it's essential right now, since it's unlikely anybody would be able to guess the secret token in the URL.
  • The `wget` command for the cron job is displayed on the module config page (prefilled with URL). Would it be better to have the cron job run a PHP file directly rather than going through the web server? Not sure.
  • I've added a little mention of the requirements in the readme. I've also adjusted the install method to check it can run tar and mysqldump.

I'll submit it to the module directory shortly :)

  • Like 7
Link to comment
Share on other sites

The data.sql file is already protected from web access by the default PW .htaccess file, and I've added a .htaccess file in the module directory to prevent access to the backup tarball.

You are right–I'd forgotten we had that in the htaccess. 

I don't know what the issues around conditional autoloading in PW 2.4 are, so I'll leave that for now (?).

Yes, I'd leave it for now. I just wanted to point them out because I think these will be beneficial for this module once 2.5 is stable. 

I'll put IP whitelisting on the todo list, but I don't think it's essential right now, since it's unlikely anybody would be able to guess the secret token in the URL.

I agree, you don't need it for now. My default is always to double up on security, but thinking through it more it's probably not necessary here. I mention it as a possible future addition though just because the URLs hitting a website aren't always confidential. The token is only as private as the logs. For most of us, that's a non issue. For some it's a potential ddos entry point, but only if the token gets in the wrong hands. I think what you've got is just right for the majority, and if someone needed something more, like an IP limiter, then probably better to leave it to them to add in rather than making everyone else fuss with it. 

The `wget` command for the cron job is displayed on the module config page (prefilled with URL). Would it be better to have the cron job run a PHP file directly rather than going through the web server? Not sure.

Sorry, I missed that wget was already there. There may be some benefits to having the cron job run the PHP File directly, but it would be more difficult for the user to setup (creating executable PHP shell scripts and such). Also, having initialization of the job URL accessible makes it easier for people to use external CRON services. As a result, I think sticking to the method you are using is better. 

Thanks for adding to the modules directory! 

  • Like 2
Link to comment
Share on other sites

When trying to create a backup from within the CP, I get:

Error: Exception: Failed to create database dump. (in /site/modules/ProcessWire-ScheduleCloudBackups/ScheduleCloudBackups.module line 167)

Is this to do with:

 

tar and mysqldump must be present on your PATH

because I'm not sure I have the ability to do anything about that on the host I'm testing it out on.

Link to comment
Share on other sites

@tyssen, @jacmaes:

Most likely the server doesn't have the mysqldump utility available.

It's possible to add a pure-PHP fallback (Pete's ScheduleBackups did) but it will probably be considerably slower than real mysqldump. I'll see about adding it soon, but I'm a bit busy today.

Link to comment
Share on other sites

Thanks for the update, djr, but now I'm getting this error  :( :

Error: Exception: Failed to create database dump. (in /var/www/.../site/modules/ScheduleCloudBackups/ScheduleCloudBackups.module line 81)
Link to comment
Share on other sites

Oh. That tells me it's using the native mysqldump (not the php implementation), but it's still failing.

Perhaps the file permissions don't allow creating a new file (data.sql) in the root of your site? I should probably add a check for that. 

Link to comment
Share on other sites

  • 4 weeks later...
  • 1 year later...

Hi djr

Great plugin and thanks as I think backups are really important.

I was trying to set up your module and I'm getting an error in the admin section. It installed okay and I was filling out the Amazon information (the admin page worked fine when the information was wrong) and once I got the Amazon information right the admin page started failing with the error:

Fatal error: Call to a member function format() on boolean in /var/www/vhosts/62/500562/webspace/httpdocs/site/modules/ScheduleCloudBackups/ScheduleCloudBackups.module on line 418

the relevant lines in the module are:

                foreach ($objects as $object) {
                    $ts = date_create_from_format(self::TIMESTAMP_FORMAT, basename($object['Key'], '.tar.gz'));
                    $date = $ts->format('Y-m-d H:i:s');


and it's the last line that is line 418.

Any ideas?

 

Thanks very much

Rob

Link to comment
Share on other sites

Hi djr

I solved this myself - I had put a folder in the bucket called the same as the website name (ie website.ie). When I checked the count of $objects in that section of code there was one $objects and the basename($object['Key'], '.tar.gz') resulted in "website.ie" which resulted in $ts === false (ie the boolean) so line 418 couldn't have worked.

I don't know if this is a huge coincidence but an is_bool() check on $ts and if === FALSE followed by some error handling would solve this for any future users.

 

Thanks for the great plugin.

Rob

Link to comment
Share on other sites

Hi David

 

Another problem - when I try to run the backup I get a 404- page not found. The url I am using from the admin page is similar to:

 

http://www.websitename.ie/runbackup?token=XXXX29924700591dd18a9633d17c8ea34c0b2

(changed to protect the highly secret information of the client's website!!!)

I'm using PW version 2.7.2

[I've tried to completely reinstall the plugin but it doidn't make any difference]

Thanks for the help

Rob

 

 

Link to comment
Share on other sites

  • 7 months later...
  • 1 year later...

Any update on this great module? Can't install on PW > 3  :(
I get this error: Cannot declare class ComposerAutoloaderInit700022e1c519b28dbab39fa2456e3e43, because the name is already in use (line 5 of /home/nginx/domains/public/site/assets/cache/FileCompiler/vendor/composer/autoload.php)
 

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
  • Similar Content

    • By monollonom
      (once again I was surprised to see a work of mine pop up in the newsletter, this time without even listing the module on PW modules website 😅. Thx @teppo !)
      FieldtypeQRCode
      Github: https://github.com/romaincazier/FieldtypeQRCode
      Modules directory: https://processwire.com/modules/fieldtype-qrcode/
      A simple fieldtype generating a QR Code from the public URL of the page, and more.
      Using the PHP library QR Code Generator by Kazuhiko Arase.

      Options
      In the field’s Details tab you can change between .gif or .svg formats. If you select .svg you will have the option to directly output the markup instead of a base64 image. SVG is the default.
      You can also change what is used to generate the QR code and even have several sources. The accepted sources (separated by a comma) are: httpUrl, editUrl, or the name of any text/URL/file/image field.
      If LanguageSupport is installed the compatible sources (httpUrl, text field, ...) will return as many QR codes as there are languages. Note however that when outputting on the front-end, only the languages visible to the user will be generated.
      Formatting
      Unformatted value
      When using $page->getUnformatted("qrcode_field") it returns an array with the following structure:
      [ [ "label" => string, // label used in the admin "qr" => string, // the qrcode image "source" => string, // the source, as defined in the configuration "text" => string // and the text used to generate the qrcode ], ... ] Formatted value
      The formatted value is an <img>/<svg> (or several right next to each other). There is no other markup.
      Should you need the same markup as in the admin you could use:
      $field = $fields->get("qrcode_field"); $field->type->markupValue($page, $field, $page->getUnformatted("qrcode_field")); But it’s a bit cumbersome, plus you need to import the FieldtypeQRCode's css/js. Best is to make your own markup using the unformatted value.
      Static QR code generator
      You can call FieldtypeQRCode::generateQRCode to generate any QR code you want. Its arguments are:
      string $text bool $svg Generate the QR code as svg instead of gif ? (default=true) bool $markup If svg, output its markup instead of a base64 ? (default=false) Hooks
      Please have a look at the source code for more details about the hookable functions.
      Examples
      $wire->addHookAfter("FieldtypeQRCode::getQRText", function($event) { $page = $event->arguments("page"); $event->return = $page->title; // or could be: $event->return = "Your custom text"; }) $wire->addHookAfter("FieldtypeQRCode::generateQRCodes", function($event) { $qrcodes = $event->return; // keep everything except the QR codes generated from editUrl foreach($qrcodes as $key => &$qrcode) { if($qrcode["source"] === "editUrl") { unset($qrcodes[$key]); } } unset($qrcode); $event->return = $qrcodes; })
    • By Sebi
      AppApiFile adds the /file endpoint to the AppApi routes definition. Makes it possible to query files via the api. 
      This module relies on the base module AppApi, which must be installed before AppApiFile can do its work.
      Features
      You can access all files that are uploaded at any ProcessWire page. Call api/file/route/in/pagetree?file=test.jpg to access a page via its route in the page tree. Alternatively you can call api/file/4242?file=test.jpg (e.g.,) to access a page by its id. The module will make sure that the page is accessible by the active user.
      The GET-param "file" defines the basename of the file which you want to get.
      The following GET-params (optional) can be used to manipulate an image:
      width height maxwidth maxheight cropX cropY Use GET-Param format=base64 to receive the file in base64 format.
    • By MarkE
      This fieldtype and inputfield bundle was built for storing measurement values within a field, rendering them in a variety of formats and converting them to other units or otherwise modifying them via the API.
      The API consists of a number of predefined functions, some of which include...
      render() for rendering the measurement object, valueAs() for converting the value to another unit value, convertTo() for converting the whole measurement object to different units, and add() and subtract() for for modifying the stored value by the value (converted as required) in another measurement. In the admin the inputfield includes a checkbox (which can be optionally disabled) for converting values on page save. For an example if a value was typed in as centimeters, the unit was changed to metres, and the page saved with this checkbox selected, said value would be automatically converted so that e.g. 170 cm becomes 1.7 m.

      A simple length field using Fieldtype Measurement and Inputfield Measurement.
      Combination units (e.g. feet and inches) are also supported.
      Please note that this module is 'proof of concept' at the moment - there are limited units available and quite a lot of code tidying to do. More units will be added shortly.
      See the GitHub at https://github.com/MetaTunes/FieldtypeMeasurement for full details and updates.
    • By tcnet
      File Manager for ProcessWire is a module to manager files and folders from the CMS backend. It supports creating, deleting, renaming, packing, unpacking, uploading, downloading and editing of files and folders. The integrated code editor ACE supports highlighting of all common programming languages.
      https://github.com/techcnet/ProcessFileManager

      Warning
      This module is probably the most powerful module. You might destroy your processwire installation if you don't exactly know what you doing. Be careful and use it at your own risk!
      ACE code editor
      This module uses ACE code editor available from: https://github.com/ajaxorg/ace

      Dragscroll
      This module uses the JavaScript dragscroll available from: http://github.com/asvd/dragscroll. Dragscroll adds the ability to drag the table horizontally with the mouse pointer.
      PHP File Manager
      This module uses a modified version of PHP File Manager available from: https://github.com/alexantr/filemanager
       
    • By tcnet
      This module implements the website live chat service from tawk.to. Actually the module doesn't have to do much. It just need to inserted a few lines of JavaScript just before the closing body tag </body> on each side. However, the module offers additional options to display the widget only on certain pages.
      Create an account
      Visit https://www.tawk.to and create an account. It's free! At some point you will reach a page where you can copy the required JavaScript-code.

      Open the module settings and paste the JavaScript-code into the field as shown below. Click "Submit" and that's all.

      Open the module settings
      The settings for this module are located int the menu Modules=>Configure=>LiveChatTawkTo.

       
×
×
  • Create New...