Jump to content
djr

Module: ScheduleCloudBackups (back up site to S3)

Recommended Posts

Hello

I've written a little module that backs up your ProcessWire site to Amazon S3 (might add support for other storage providers later, hence the generic name).

Pete's ScheduleBackups was used as a starting point but has been overhauled somewhat. Still, it's far from perfect at the moment, but I guess you might find it useful.

Essentially, you set up a cron job to load a page every day, and then the script creates a .tar.gz containing all of the site files and a dump of the database, then uploads it to an S3 bucket.

Currently, only linux-based hosts are supported currently (hopefully, most of you).

The module is available on github: https://github.com/DavidJRobertson/ProcessWire-ScheduleCloudBackups

Zip download: https://github.com/DavidJRobertson/ProcessWire-ScheduleCloudBackups/archive/master.zip

Let me know what you think

EDIT: now available on the module directory @ http://modules.processwire.com/modules/schedule-cloud-backups

  • Like 15

Share this post


Link to post
Share on other sites

This looks fantastic djr! I've had a need for something exactly like this and will definitely look forward to using it. I took a quick look through the code and think it looks very well put together. I do have a few minor suggestions:

Rather than backing up the DB to the root path (where it is temporarily web accessible) I'd recommend backing it up to a non web accessible directory, like /site/assets/cache/. Likewise for the tar/gz file. 

Beyond just looking for "runbackup" in the request URI, I recommend designating a page that it will only run on. For instance, if you wanted it to only run on the homepage:

$shouldBackup = $page->path === '/' && 
  (strpos($_SERVER['REQUEST_URI'], self::RUN_BACKUP_PATH) !== FALSE) &&
  $this->wire('input')->get->token &&
  $this->wire('input')->get->token === $this->token;
This might be a good module to experiment with conditional autoloads. In your getModuleInfo, you can do this:
'autoload' => function() {
  return (strpos($_SERVER['REQUEST_URI'], self::RUN_BACKUP_PATH) !== FALSE); 
}
In truth, conditional autoloads are more reliable in PW 2.5 (a few minor issues have been fixed) so this may be a v2 kind of thing as well. In PW 2.5, you can also isolate the entire getModuleInfo() to a separate ModuleName.info.php file. 

Beyond just the token, it might be worthwhile to have an IP address limiter since there's a good chance one's CRON job is always going to be coming from the same IP. Though not sure it's totally necessary. 

In your docs file, I would mention if possible what command you recommend for the CRON job, for instance:

wget quiet no-cache -O - http://www.your-site.com/runbackup?token=abc123 > /dev/null

Lastly, might be good to mention that it requires exec/system access for executing commands (many hosts have these disabled by default, but there's usually a way to enable them). 

Please add to the modules directory when ready! Thanks for putting this together! 

  • Like 5

Share this post


Link to post
Share on other sites

Thanks Ryan.

Re your suggestions: (I've made some changes to the code)

  • The data.sql file is already protected from web access by the default PW .htaccess file, and I've added a .htaccess file in the module directory to prevent access to the backup tarball.
  • I've changed the shouldBackup check to be more specific (behaves the same as your suggestion, but simpler logic).
  • I don't know what the issues around conditional autoloading in PW 2.4 are, so I'll leave that for now (?).
  • I'll put IP whitelisting on the todo list, but I don't think it's essential right now, since it's unlikely anybody would be able to guess the secret token in the URL.
  • The `wget` command for the cron job is displayed on the module config page (prefilled with URL). Would it be better to have the cron job run a PHP file directly rather than going through the web server? Not sure.
  • I've added a little mention of the requirements in the readme. I've also adjusted the install method to check it can run tar and mysqldump.

I'll submit it to the module directory shortly :)

  • Like 7

Share this post


Link to post
Share on other sites
The data.sql file is already protected from web access by the default PW .htaccess file, and I've added a .htaccess file in the module directory to prevent access to the backup tarball.

You are right–I'd forgotten we had that in the htaccess. 

I don't know what the issues around conditional autoloading in PW 2.4 are, so I'll leave that for now (?).

Yes, I'd leave it for now. I just wanted to point them out because I think these will be beneficial for this module once 2.5 is stable. 

I'll put IP whitelisting on the todo list, but I don't think it's essential right now, since it's unlikely anybody would be able to guess the secret token in the URL.

I agree, you don't need it for now. My default is always to double up on security, but thinking through it more it's probably not necessary here. I mention it as a possible future addition though just because the URLs hitting a website aren't always confidential. The token is only as private as the logs. For most of us, that's a non issue. For some it's a potential ddos entry point, but only if the token gets in the wrong hands. I think what you've got is just right for the majority, and if someone needed something more, like an IP limiter, then probably better to leave it to them to add in rather than making everyone else fuss with it. 

The `wget` command for the cron job is displayed on the module config page (prefilled with URL). Would it be better to have the cron job run a PHP file directly rather than going through the web server? Not sure.

Sorry, I missed that wget was already there. There may be some benefits to having the cron job run the PHP File directly, but it would be more difficult for the user to setup (creating executable PHP shell scripts and such). Also, having initialization of the job URL accessible makes it easier for people to use external CRON services. As a result, I think sticking to the method you are using is better. 

Thanks for adding to the modules directory! 

  • Like 2

Share this post


Link to post
Share on other sites

When trying to create a backup from within the CP, I get:

Error: Exception: Failed to create database dump. (in /site/modules/ProcessWire-ScheduleCloudBackups/ScheduleCloudBackups.module line 167)

Is this to do with:

 

tar and mysqldump must be present on your PATH

because I'm not sure I have the ability to do anything about that on the host I'm testing it out on.

Share this post


Link to post
Share on other sites

@tyssen, @jacmaes:

Most likely the server doesn't have the mysqldump utility available.

It's possible to add a pure-PHP fallback (Pete's ScheduleBackups did) but it will probably be considerably slower than real mysqldump. I'll see about adding it soon, but I'm a bit busy today.

Share this post


Link to post
Share on other sites

@tyssen, @jacmaes: released 0.0.2 which has a pure-php fallback for mysqldump and tar. Give it a go :)

  • Like 3

Share this post


Link to post
Share on other sites

Thanks for the update, djr, but now I'm getting this error  :( :

Error: Exception: Failed to create database dump. (in /var/www/.../site/modules/ScheduleCloudBackups/ScheduleCloudBackups.module line 81)

Share this post


Link to post
Share on other sites

Oh. That tells me it's using the native mysqldump (not the php implementation), but it's still failing.

Perhaps the file permissions don't allow creating a new file (data.sql) in the root of your site? I should probably add a check for that. 

Share this post


Link to post
Share on other sites

The root folder of my site has permissions of 755, if that's what you're referring to. 

Share this post


Link to post
Share on other sites

Hi djr

Great plugin and thanks as I think backups are really important.

I was trying to set up your module and I'm getting an error in the admin section. It installed okay and I was filling out the Amazon information (the admin page worked fine when the information was wrong) and once I got the Amazon information right the admin page started failing with the error:

Fatal error: Call to a member function format() on boolean in /var/www/vhosts/62/500562/webspace/httpdocs/site/modules/ScheduleCloudBackups/ScheduleCloudBackups.module on line 418

the relevant lines in the module are:

                foreach ($objects as $object) {
                    $ts = date_create_from_format(self::TIMESTAMP_FORMAT, basename($object['Key'], '.tar.gz'));
                    $date = $ts->format('Y-m-d H:i:s');


and it's the last line that is line 418.

Any ideas?

 

Thanks very much

Rob

Share this post


Link to post
Share on other sites

Hi djr

I solved this myself - I had put a folder in the bucket called the same as the website name (ie website.ie). When I checked the count of $objects in that section of code there was one $objects and the basename($object['Key'], '.tar.gz') resulted in "website.ie" which resulted in $ts === false (ie the boolean) so line 418 couldn't have worked.

I don't know if this is a huge coincidence but an is_bool() check on $ts and if === FALSE followed by some error handling would solve this for any future users.

 

Thanks for the great plugin.

Rob

Share this post


Link to post
Share on other sites

Hi David

 

Another problem - when I try to run the backup I get a 404- page not found. The url I am using from the admin page is similar to:

 

http://www.websitename.ie/runbackup?token=XXXX29924700591dd18a9633d17c8ea34c0b2

(changed to protect the highly secret information of the client's website!!!)

I'm using PW version 2.7.2

[I've tried to completely reinstall the plugin but it doidn't make any difference]

Thanks for the help

Rob

 

 

Share this post


Link to post
Share on other sites

I'm getting the same thing. There's an issue on Github with the same problem from June 2015.

So it seems this project is now dead. Is that the case? And if so, are there any other alternatives?

Share this post


Link to post
Share on other sites
9 hours ago, Tyssen said:

And if so, are there any other alternatives?

Something like this?

Soon to be released, I hope :) 

  • Like 2

Share this post


Link to post
Share on other sites

Any update on this great module? Can't install on PW > 3  :(
I get this error: Cannot declare class ComposerAutoloaderInit700022e1c519b28dbab39fa2456e3e43, because the name is already in use (line 5 of /home/nginx/domains/public/site/assets/cache/FileCompiler/vendor/composer/autoload.php)
 

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.


  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By d'Hinnisdaël
      Happy new year, everybody 🥬
      I've been sitting on this Dashboard module I made for a client and finally came around to cleaning it up and releasing it to the wider public. This is how it looks.
      ProcessWire Dashboard

      If anyone is interested in trying this out, please go ahead! I'd love to get some feedback on it. If this proves useful and survives some real-world testing, I'll add this to the module directory.
      Download
      You can find the latest release on Github.
      Documentation
      Check out the documentation to get started. This is where you'll find information about included panel types and configuration options.
      Custom Panels
      My goal was to make it really simple to create custom panels. The easiest way to do that is to use the panel type template and have it render a file in your templates folder. This might be enough for 80% of all use cases. For anything more complex (FormBuilder submissions? Comments? Live chat?), you can add new panel types by creating modules that extend the DashboardPanel base class. Check out the documentation on custom panels or take a look at the HelloWorld panel to get started. I'm happy to merge any user-created modules into the main repo if they might be useful to more than a few people.
       Disclaimer
      This is a pre-release version. Please treat it as such — don't install it on production sites. Just making sure 🍇
      Roadmap
      These are the things I'm looking to implement myself at some point. The wishlist is a lot longer, but those are the 80/20 items that I probably won't regret spending time on.
      Improve documentation & add examples ⚙️ Panel types Google Analytics ⚙️ Add new page  🔥 Drafts 🔥 At a glance / Page counter 404s  Layout options Render multiple tabs per panel panel groups with heading and spacing between ✅ panel wrappers as grid item (e.g. stacked notices) ✅ Admin themes support AdminThemeReno and AdminThemeDefault ✅ Shortcuts panel add a table layout with icon, title & summary ✅ Chart panel add default styles for common chart types ✅ load chart data from JS file (currently passed as PHP array) Collection panel support image columns ✅ add buttons: view all & add new ✅
    • By Gadgetto
      Status update links (inside this thread) for SnipWire development will be always posted here:
      2020-01-14 --> new date range picker, discount editor, order notifiactions, order statuses, and more ... 2019-11-15 --> orders filter, order details, download + resend invoices, refunds 2019-10-18 --> list filters, REST API improvements, new docs platform, and more ... 2019-08-08 --> dashboard interface, currency selector, managing Orders, Customers and Products, Added a WireTabs, refinded caching behavior 2019-06-15 --> taxes provider, shop templates update, multiCURL implementation, and more ... 2019-06-02 --> FieldtypeSnipWireTaxSelector 2019-05-25 --> SnipWire will be free and open source If you are interested, you can test the current state of development:
      https://github.com/gadgetto/SnipWire
      Please note that the software is not yet intended for use in a production system (alpha version).
      If you like, you can also submit feature requests and suggestions for improvement. I also accept pull requests.
      ---- INITIAL POST FROM 2019-05-25 ----
      I wanted to let you know that I am currently working on a new ProcessWire module that fully integrates the Snipcart Shopping Cart System into ProcessWire. (this is a customer project, so I had to postpone the development of my other module GroupMailer).
      The new module SnipWire offers full integration of the Snipcart Shopping Cart System into ProcessWire.
      Here are some highlights:
      simple setup with (optional) pre-installed templates, product fields, sample products (quasi a complete shop system to get started immediately) store dashboard with all data from the snipcart system (no change to the snipcart dashboard itself required) Integrated REST API for controlling and querying snipcart data webhooks to trigger events from Snipcart (new order, new customer, etc.) multi currency support self-defined/configurable tax rates etc. Development is already well advanced and I plan to release the module in the next 2-3 months.
      I'm not sure yet if this will be a "Pro" module or if it will be made available for free.
      I would be grateful for suggestions and hints!
      Please have a look at the screenshots to get an idea what I'm talking about (open spoiler):
      (Please note: these screenshots are from an early development state of SnipWire. To see actual screens please have a look at later posts below!)
       
    • By Robin S
      This module is inspired by and similar to the Template Stubs module. The author of that module has not been active in the PW community for several years now and parts of the code for that module didn't make sense to me, so I decided to create my own module. Auto Template Stubs has only been tested with PhpStorm because that is the IDE that I use.
      Auto Template Stubs
      Automatically creates stub files for templates when fields or fieldgroups are saved.
      Stub files are useful if you are using an IDE (e.g. PhpStorm) that provides code assistance - the stub files let the IDE know what fields exist in each template and what data type each field returns. Depending on your IDE's features you get benefits such as code completion for field names as you type, type inference, inspection, documentation, etc.
      Installation
      Install the Auto Template Stubs module.
      Configuration
      You can change the class name prefix setting in the module config if you like. It's good to use a class name prefix because it reduces the chance that the class name will clash with an existing class name.
      The directory path used to store the stub files is configurable.
      There is a checkbox to manually trigger the regeneration of all stub files if needed.
      Usage
      Add a line near the top of each of your template files to tell your IDE what stub class name to associate with the $page variable within the template file. For example, with the default class name prefix you would add the following line at the top of the home.php template file:
      /** @var tpl_home $page */ Now enjoy code completion, etc, in your IDE.

      Adding data types for non-core Fieldtype modules
      The module includes the data types returned by all the core Fieldtype modules. If you want to add data types returned by one or more non-core Fieldtype modules then you can hook the AutoTemplateStubs::getReturnTypes() method. For example, in /site/ready.php:
      // Add data types for some non-core Fieldtype modules $wire->addHookAfter('AutoTemplateStubs::getReturnTypes', function(HookEvent $event) { $extra_types = [ 'FieldtypeDecimal' => 'string', 'FieldtypeLeafletMapMarker' => 'LeafletMapMarker', 'FieldtypeRepeaterMatrix' => 'RepeaterMatrixPageArray', 'FieldtypeTable' => 'TableRows', ]; $event->return = $event->return + $extra_types; }); Credits
      Inspired by and much credit to the Template Stubs module by mindplay.dk.
       
      https://github.com/Toutouwai/AutoTemplateStubs
      https://modules.processwire.com/modules/auto-template-stubs/
    • By Mike Rockett
      Jumplinks for ProcessWire
      Release: 1.5.60
      Composer: rockett/jumplinks
      ⚠️ NOTICE: 1.5.60 is an important security patch-release for an XSS vulnerability discovered by @phlp. It's HIGHLY RECOMMENDED that all Jumplinks users update to the latest version as soon as possible.
      Jumplinks is an enhanced version of the original ProcessRedirects by Antti Peisa.
      The Process module manages your permanent and temporary redirects (we'll call these "jumplinks" from now on, unless in reference to redirects from another module), useful for when you're migrating over to ProcessWire from another system/platform. Each jumplink supports wildcards, shortening the time needed to create them.
      Unlike similar modules for other platforms, wildcards in Jumplinks are much easier to work with, as Regular Expressions are not fully exposed. Instead, parameters wrapped in curly braces are used - these are described in the documentation.
      Under Development: 2.0, to be powered by FastRoute
      As of version 1.5.0, Jumplinks requires at least ProcessWire 2.6.1 to run.
      View on GitLab
      Download via the Modules Directory
      Read the docs
      Features
      The most prominent features include:
      Basic jumplinks (from one fixed route to another) Parameter-based wildcards with "Smart" equivalents Mapping Collections (for converting ID-based routes to their named-equivalents without the need to create multiple jumplinks) Destination Selectors (for finding and redirecting to pages containing legacy location information) Timed Activation (activate and/or deactivate jumplinks at specific times) 404-Monitor (for creating jumplinks based on 404 hits) Additionally, the following features may come in handy:
      Stale jumplink management Legacy domain support for slow migrations An importer (from CSV or ProcessRedirects) Feedback & Feature Requests
      I’d love to know what you think of this module. Please provide some feedback on the module as a whole, or even regarding smaller things that make it whole. Also, please feel free to submit feature requests and their use-cases.
      Note: Features requested so far have been added to the to-do list, and will be added to 2.0, and not the current dev/master branches.
      Open Source

      Jumplinks is an open-source project, and is free to use. In fact, Jumplinks will always be open-source, and will always remain free to use. Forever. If you would like to support the development of Jumplinks, please consider making a small donation via PayPal.
      Enjoy! 🙂
    • By Robin S
      Add Image URLs
      Allows images/files to be added to Image/File fields by pasting URLs.

      Usage
      Install the Add Image URLs module.
      A "Paste URLs" button will be added to all image and file fields. Use the button to show a textarea where URLs may be pasted, one per line. Images/files are added when the page is saved.
       
      https://github.com/Toutouwai/AddImageUrls
      https://modules.processwire.com/modules/add-image-urls/
×
×
  • Create New...