zyON

Module: Amazon S3 / Cloudfront

Hi, 

After the move to ProcessWire on one of our major websites, I had to build myself a module/plugin so I could use the Amazon S3 / CloudFront infrastructure.

The module/plugin uses the Amazon S3 PHP SDK, and the idea was to provide a clean way to upload/back up files to S3 and distribute the assets via CloudFront.

The current version of the module/plugin copies the page files to Amazon S3 so you can then serve them via CloudFront automatically. Note that the files are still also copied to the server where your PW installation resides.

There's an option to back up deleted files to another folder on S3, because when you delete a file in PW via the admin, the file is also deleted from its folder on S3.

The module also supports Apeisa's Thumbnail module so it also stores the thumbnails on S3. Note that the native size() method to create image thumbnails on S3 is currently not supported.

As it is, the module will only upload new assets, so beware that if you already have pages created with assets you'll definitely have errors, so my advice is to test this with a blank installation of PW.

If anyone wants to test it and contribute to it, I think this provides a good proof of concept for a functionality that is requested by several users (me included).

Please note that I'm not a PHP developer and my skills are (very) limited. I'm aware that the module can be improved, and I'm open to suggestions, feedback and, most of all, collaborators.

The module is hosted on github here and it's available in the ProcessWire modules directory.

Nelson Mendes
Edited by zyON

I've updated the module so it's now possible to set a custom time in seconds to control caching in the browser and in CloudFront. Previously it used the Expires header, but it's advisable to use the Cache-Control directive instead.
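
As a rough illustration of the difference between the two headers (a sketch only, not the module's PHP code; the function names are hypothetical): Cache-Control carries a lifetime in seconds that is counted from when the response is received, while Expires is an absolute date that is fixed once, for example at upload time.

```python
from email.utils import formatdate


def cache_control_header(max_age_seconds: int) -> dict:
    """Relative lifetime: the file may be cached for N seconds after it is fetched."""
    if max_age_seconds < 0:
        raise ValueError("max_age_seconds must be non-negative")
    return {"Cache-Control": f"max-age={max_age_seconds}"}


def expires_header(now_epoch: float, lifetime_seconds: int) -> dict:
    """Absolute expiry: a fixed HTTP date computed once from 'now'."""
    return {"Expires": formatdate(now_epoch + lifetime_seconds, usegmt=True)}


if __name__ == "__main__":
    one_day = 24 * 60 * 60
    print(cache_control_header(one_day))
    # Epoch 0 is used here only to keep the example deterministic.
    print(expires_header(0, one_day))
```

Once the fixed Expires date has passed, a stored header of that kind marks the file as stale on every request, whereas a max-age value keeps working without being updated.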


There's a new version available. Now supporting file versioning to handle the caching issues of CloudFront. 

There's an option in the module configuration that automatically renames uploaded files by inserting a timestamp in the file name. This way all files have unique names, which makes it easier to replace old files already cached by CloudFront (something that can be a huge PITA, as many users of CloudFront or other CDN systems know well).
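
The renaming idea can be sketched like this. This is an illustration under assumptions, not the module's code: the function name is hypothetical, and placing the timestamp just before the extension is my guess at a reasonable format.

```python
import os


def versioned_filename(filename: str, timestamp: int) -> str:
    """Insert a timestamp before the file extension so each upload gets a unique name."""
    stem, ext = os.path.splitext(filename)
    return f"{stem}.{timestamp}{ext}"


if __name__ == "__main__":
    # A re-uploaded file gets a new name, so the CDN treats it as a new object.
    print(versioned_filename("photo.jpg", 1404126000))
    print(versioned_filename("photo.jpg", 1404129600))
```

Because the URL changes with every upload, the old cached copy is simply never requested again and no CloudFront invalidation is needed.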


New version available

The module now has the ability to update the filename when a new thumbnail is created via the thumbnails plugin. 

This is important if you use the Amazon CloudFront option to serve the files and want to show the changes you've made to the thumbnails immediately (without waiting for the cache to expire).

Because the filename doesn't change when you create a new image crop, I had to force this change in both the thumbnail files and the source file.


Is it possible to serve files from S3 and delete them from the assets folder?

Great module zyON.


Manol, the plugin doesn't handle the deletion of the files in the assets folder. If the plugin is active, the files are delivered through S3 without relying on the local ones, so theoretically you could manually delete them from your local assets folder. I think of them as a backup that you can quickly start using just by deactivating the plugin.

EDIT: Sorry, I forgot to mention (too much work) that the local files are used in the admin. So it's definitely not a good idea!


Great module. Is it possible to use the module with just the API? I mean, creating pages with images and files via code rather than from the admin.


Thanks. In the current version of the module it will only work with assets uploaded via the admin. 

That's something in the roadmap for me but I don't have a date I can share right now because I'm really lacking the time at this moment.


To use amazon cloud front, does the file have to be in s3?  Or is it possible to have cloud front work with local processwire files?

Great plugin!  I am bookmarking it.  Was just curious how amazon works!

Thanks!


Great module!

I'm creating a film database site with private content that authenticated users can download.

I've used your plugin and configured it to work with signed URLs so users can't distribute the download link.

Are signed URLs the best method to ensure that downloads are only available to authenticated users?

Is it possible to have a setting within a file upload field to flag whether to upload the file to S3/CloudFront or store it locally on the PW server?

Jonathan


Just today I tested the module on a PW 2.5.22 site and, apart from figuring out how to set things up with the S3 bucket, policy and CloudFront distribution, it all worked like a charm.

Thanks lots zyON!


A few things have changed regarding amazon s3 and cloudfront, I have made a checklist that takes you step by step through setting it all up, works like a charm, posting here in case I forget the steps myself for next site... :undecided:

1 - Set up a bucket for this site on amazon s3

2 - make sure it is in the US Standard region

3 - go to Account => security credentials => get started with IAM users

4 - create new user

5 - download user security credentials

6 - select user and go to Inline Policy

7 - Create User Policy

8 - enter the following policy, replacing name-of-bucket with the name of your bucket:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1425901261000",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::name-of-bucket/*"
            ]
        }
    ]
}

9 - go to PW admin and install the amazons3cloudfront module

10 - enter the AWS access key and secret key for the user you have created

11 - enter the name of the bucket you have created

12 - test the module to see if images are uploaded to the bucket

13 - go to a page in PW and drag & drop an image

14 - refresh the page on the amazon s3 console and you will see the image there in the file and page folder

15 - go to amazon aws console and go to cloudfront

16 - Create Distribution

17 - Web distribution => Get Started

18 - Origin Domain Name => from dropdown choose the bucket you want

19 - Restrict Bucket Access - NO

20 - Create Distribution

21 - Allow the distribution time to deploy; once its status is Deployed, it can be used

22 - copy domain name of distribution for entry in PW Admin of module
 

That's it, test and enjoy!
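
If you set up several sites, the policy from step 8 can be generated per bucket instead of edited by hand. The following is a minimal sketch with a hypothetical function name. It leaves out the "Sid" field, which is optional, and otherwise mirrors the policy above.

```python
import json


def build_user_policy(bucket_name: str) -> dict:
    """Build the IAM user policy from step 8 for the given bucket."""
    if not bucket_name:
        raise ValueError("bucket_name must not be empty")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:DeleteObject",
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                ],
                # The trailing /* limits the permissions to objects inside the bucket.
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # "name-of-bucket" is the placeholder used in the checklist.
    print(json.dumps(build_user_policy("name-of-bucket"), indent=4))
```

Paste the printed JSON into the inline policy editor in step 8.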


One question regarding hotlinking:

How can I prevent hotlinking to my s3/cloudfront images by someone else from another domain?

thanks!


Hotlinking Cloudfront solution:

S3 has bucket policies that you can set up to allow GET requests for images only from certain domains and/or IPs. But CloudFront does not have this, so it only protects against direct linking to your S3 files. You can still apply some defensive tactics to limit hotlinking to your CloudFront distributions.

Here's what I figured out:

1 - On cloudfront create a second distribution that you connect to the same bucket that you are already using, wait for it to fully deploy, then:

2 - Check that the files can be served from the new CloudFront distribution. Then, in the PW Amazon S3/CloudFront module, change the CloudFront distribution domain name to the new distribution, check that everything works, and disable the old distribution. It can be deleted if you like (I would if I knew it was being used for hotlinking).

This does not stop hotlinking right away, but allows you to stop images from being served from your cloudfront distribution as soon as you switch.

The hotlinking sites will now no longer have images showing up and will have to go to your site and get the new locations.

At least it is an easy defense technique to protect against paying for traffic for images not served via your site. Not foolproof, but it slows things down and makes it harder for leeches...
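
For the S3 side mentioned at the start of this post, a bucket policy that only allows GET requests carrying a Referer from your own site could look roughly like the sketch below. The function name, Sid and example domain are placeholders. The Referer header can be spoofed, so this only deters casual hotlinking, and it does not apply to requests that go through CloudFront.

```python
import json


def build_referer_policy(bucket_name: str, allowed_sites: list) -> dict:
    """Bucket policy allowing s3:GetObject only when the Referer matches one of our sites."""
    if not allowed_sites:
        raise ValueError("allowed_sites must contain at least one site URL")
    # Turn "https://www.example.com" into the pattern "https://www.example.com/*".
    patterns = [site.rstrip("/") + "/*" for site in allowed_sites]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGetFromOwnSites",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {"StringLike": {"aws:Referer": patterns}},
            }
        ],
    }


if __name__ == "__main__":
    policy = build_referer_policy("name-of-bucket", ["https://www.example.com"])
    print(json.dumps(policy, indent=4))
```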

BTW, it is wise to apply a cache-control http header to the s3 bucket and get the distribution to use this, so your images are served with cache control headers, saves you from unnecessary traffic costs on your s3 account.

Would be nice to see this module working together with the minimize solution module. That would really make things easy!


To use amazon cloud front, does the file have to be in s3?  Or is it possible to have cloud front work with local processwire files?

Great plugin!  I am bookmarking it.  Was just curious how amazon works!

Thanks!

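With this module, CloudFront serves the files from S3. CloudFront in general can also use a custom origin (for example your own web server), but that is outside what the module does.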


I've not tried yet to block other sites from hotlinking my CloudFront content. 

You can also set a Cache-Control HTTP header in the module itself. It is applied to every file uploaded via the module.


To work on PW 3+, you need to change two lines on AmazonS3Cloudfront.module:

Line 1 - Add PW namespace:

<?php namespace ProcessWire;

Line 21 - Add a backslash before the class name. This references it in the global namespace, not in PW's:

use \Aws\S3\S3Client;

Hi @zyON, I just submitted a pull request:

---

Now, when an image is resized via the API, the size variation is uploaded to S3 too.
I also changed line 21 to make it run on PW 3+

---

Could you kindly review it? :)


Hi nmendes, awesome module. I don't know if you're still maintaining it, but I think a couple of improvements could be made:

  1. The possibility to mimic the PW structure under a folder within a bucket, so that many different websites could use the same bucket (because the number of buckets per user is limited).
  2. This one would be the best, if you could do it: rather than mimicking the PW structure, use S3 as the place where the files are saved, because the price per GB on a server is far higher than on S3 (used by Dropbox, Spotify and other big companies). That matters if you have a site with many videos, audio files...

Thanks and congratulations.


1. It is possible. Are you imagining something like "example.com/2060/name-of-file", where 2060 is the page ID?

2. I didn't get exactly what you want with this one. Do you mean bypassing the local filesystem completely and storing the files only on S3?
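
For point 1, sharing one bucket between several sites comes down to putting a per-site prefix in front of the object key. A small sketch of that idea, where the function name and the "site-a" prefix are hypothetical and the page-ID folder follows the example above:

```python
def object_key(site_prefix: str, page_id: int, filename: str) -> str:
    """Build an S3 object key such as 'site-a/2060/name-of-file.jpg'."""
    parts = [site_prefix.strip("/"), str(page_id), filename]
    # Skip an empty prefix so a single-site bucket keeps keys like '2060/file.jpg'.
    return "/".join(part for part in parts if part)


if __name__ == "__main__":
    print(object_key("site-a", 2060, "name-of-file.jpg"))
    print(object_key("", 2060, "name-of-file.jpg"))
```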


I think people have looked at it and found the processwire admin hard codes local folders so it would be hard to have a website only use a remote server for files without modifications to processwire.  (Only save files in s3)

On 4/1/2016 at 9:34 PM, Sérgio Jardim said:

Hi @zyON, I just submitted a pull request:

---

Now, when an image is resized via the API, the size variation is uploaded to S3 too.
I also changed line 21 to make it run on PW 3+

---

Could you kindly review it? :)

Hi @Sérgio Jardim

I'm looking for this functionality, too. Did you actually get this working, where image resizes are uploaded to S3? It seems the module is perhaps no longer being developed, but it still seems to work in PW 3 - I'd like to enhance at least my copy of it. 

Thanks

