Module: Amazon S3 / Cloudfront


zyON

Hi, 

After the move to ProcessWire on one of our major websites, I had to build myself a module/plugin so I could use the Amazon S3 / CloudFront infrastructure.

The module/plugin uses the Amazon S3 PHP SDK, and the idea was to provide a clean way to upload/back up files to S3 and distribute the assets via CloudFront.

The current version of the module/plugin will copy the page files to Amazon S3 so you can then serve them via CloudFront automatically. Note that the files are still also copied to the server where your PW installation resides.

There's an option to back up deleted files to another folder on S3, because when you delete a file in the PW admin, the file is also deleted from its folder on S3.

The module also supports Apeisa's Thumbnail module so it also stores the thumbnails on S3. Note that the native size() method to create image thumbnails on S3 is currently not supported.

As it is, the module will only upload new assets, so beware: if you already have pages created with assets you'll definitely get errors. My advice is to test this with a blank installation of PW.

If anyone wants to test it and contribute to it, I think this provides a good proof of concept for a functionality that is requested by several users (me included).

Please note that I'm not a PHP developer and my skills are (very) limited. I'm aware that it can be improved, and I'm open to suggestions, feedback and, most of all, collaborators.

The module is hosted on github here and it's available in the ProcessWire modules directory.

Nelson Mendes

I've updated the module so it's now possible to set a custom time in seconds to control the cache in the browser and in CloudFront. Previously it used the Expires header, but it's advisable to use the Cache-Control directive instead.
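For reference, this is roughly what it looks like with the AWS SDK for PHP. The bucket, key and file paths below are placeholders, and the parameter-building is factored into a function only for illustration, not the module's actual code:

```php
<?php
// Sketch only: attaching a Cache-Control header to an S3 upload with the
// AWS SDK for PHP. Bucket, key and file paths are placeholders.
use Aws\S3\S3Client;

function uploadParams($bucket, $key, $sourceFile, $maxAgeSeconds)
{
    return array(
        'Bucket'     => $bucket,
        'Key'        => $key,
        'SourceFile' => $sourceFile,
        // Cache-Control is relative, so unlike an Expires date it never
        // goes stale; CloudFront also honours it for its own TTL.
        'CacheControl' => 'max-age=' . $maxAgeSeconds,
    );
}

// $s3 = S3Client::factory(array('key' => 'ACCESS_KEY', 'secret' => 'SECRET_KEY'));
// $s3->putObject(uploadParams('my-bucket', 'files/1001/photo.jpg',
//     '/var/www/site/assets/files/1001/photo.jpg', 86400));
```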


There's a new version available. Now supporting file versioning to handle the caching issues of CloudFront. 

There's an option in the module configuration that will automatically rename uploaded files by inserting a timestamp in the file name. This way all the files have unique names, which makes it easier to replace old files already cached by CloudFront (something that can be a huge PITA, as many CloudFront, or other CDN, users know well).
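The renaming idea itself is simple; roughly like this (the exact format the module uses may differ, this is just the principle):

```php
<?php
// Sketch of the renaming: insert a timestamp before the extension so every
// upload gets a name CloudFront has never cached. The module's exact
// format may differ; this is illustrative only.

function versionedName($filename, $timestamp)
{
    $dot = strrpos($filename, '.');
    if ($dot === false) {
        return $filename . '-' . $timestamp; // no extension: just append
    }
    return substr($filename, 0, $dot) . '-' . $timestamp . substr($filename, $dot);
}

// versionedName('header.jpg', 1404093566) → 'header-1404093566.jpg'
```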


  • 4 weeks later...

New version available

The module now has the ability to update the filename when a new thumbnail is created via the thumbnails plugin. 

This is important if you use the Amazon CloudFront option to serve the files and want to show the changes you've made to the thumbnails immediately (without waiting for the cache to expire).

Because the filename of a new image crop doesn't change, I had to force this change in both the thumbnail files and the source file.


  • 3 months later...

Manol, the plugin doesn't handle the deletion of the files in the assets folder. If the plugin is active, the files will be delivered through S3 without relying on the local ones, so theoretically you could manually delete them from your local assets folder. But I think of the local copy as a backup that you can quickly start using again just by deactivating the plugin.

EDIT: Sorry, I forgot to mention (too much work) that in the admin the local files are used. So it's definitely not a good idea!


  • 2 weeks later...

Great module. Is it possible to use the module with the API only? I mean, creating pages with images and files not from the admin but via code.

Thanks. The current version of the module only works with assets uploaded via the admin.

That's on my roadmap, but I don't have a date I can share right now because I'm really short on time at the moment.


  • 3 weeks later...

To use Amazon CloudFront, does the file have to be in S3? Or is it possible to have CloudFront work with local ProcessWire files?

Great plugin! I am bookmarking it. Was just curious how Amazon works!

Thanks!


  • 2 months later...

Great module!

I'm creating a film database site with private content that authenticated users can download.

I've used your plugin and configured it to work with signed URLs so users can't distribute the download link.

Are signed URLs the best method to ensure that downloads are only available to authenticated users?
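For context, I'm generating the signed URLs roughly like this with the AWS SDK for PHP (the key path and key-pair ID below are placeholders for my real values, and the parameter array is factored out just to show its shape):

```php
<?php
// Rough sketch of CloudFront signed-URL generation with the AWS SDK for
// PHP (CloudFrontClient::getSignedUrl). Key path and key-pair ID are
// placeholders; the distribution must have a trusted key pair configured.
use Aws\CloudFront\CloudFrontClient;

function signedUrlParams($url, $ttlSeconds)
{
    return array(
        'url'         => $url,
        'expires'     => time() + $ttlSeconds, // absolute Unix timestamp
        'private_key' => '/path/to/cloudfront-private-key.pem',
        'key_pair_id' => 'APKAEXAMPLE',        // placeholder key-pair ID
    );
}

// $cf = new CloudFrontClient(array('version' => 'latest', 'region' => 'us-east-1'));
// echo $cf->getSignedUrl(signedUrlParams('https://d1234.cloudfront.net/film.mp4', 300));
```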

Is it possible to have a setting within a file upload field to flag whether to upload the file to S3/CloudFront or store it locally on the PW server?

Jonathan


  • 3 weeks later...

A few things have changed regarding Amazon S3 and CloudFront, so I've made a checklist that takes you step by step through setting it all up. Works like a charm; posting it here in case I forget the steps myself for the next site... :undecided:

1 - Set up a bucket for this site on Amazon S3

2 - Make sure it is on US Standard

3 - Go to Account => Security Credentials => Get Started with IAM Users

4 - Create a new user

5 - Download the user's security credentials

6 - Select the user and go to Inline Policy

7 - Create User Policy

8 - {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1425901261000",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:PutObject",
                "s3:PutObjectAcl"
            ],
            "Resource": [
                "arn:aws:s3:::name-of-bucket/*"
            ]
        }
    ]
}

9 - Go to the PW admin and install the AmazonS3Cloudfront module

10 - Enter the AWS access key and secret key for the user you have created

11 - Enter the name of the bucket you have created

12 - Test the module to see if images are uploaded to the bucket

13 - Go to a page in PW and drag & drop an image

14 - Refresh the page in the Amazon S3 console and you will see the image there in the page's file folder

15 - Go to the Amazon AWS console and go to CloudFront

16 - Create Distribution

17 - Web distribution => Get Started

18 - Origin Domain Name => from the dropdown choose the bucket you want

19 - Restrict Bucket Access - NO

20 - Create Distribution

21 - Allow the distribution time to deploy; once its status is Deployed, it can be used

22 - Copy the domain name of the distribution for entry in the module's settings in the PW admin

That's it, test and enjoy!


Hotlinking CloudFront solution:

S3 has a bucket policy that you can use to only allow image GET requests from certain domains and/or IPs, but CloudFront does not, so that only protects against direct linking to your S3 files. Still, you can apply some level of defensive tactics to limit hotlinking to your CloudFront distributions.

Here's what I figured out:

1 - On CloudFront, create a second distribution connected to the same bucket you are already using and wait for it to fully deploy. Then:

2 - Check that the files can be served from the new CloudFront distribution, then in the PW AmazonS3Cloudfront module change the CloudFront distribution domain name to the new one. Check that everything works and disable the old distribution; it can be deleted if you like (I would, if I knew it was being used for hotlinking).

This does not stop hotlinking right away, but it allows you to stop images from being served from your CloudFront distribution as soon as you switch.

The hotlinking sites will no longer have images showing up and will have to go to your site and get the new locations.

At least it is an easy defense technique to protect against paying for traffic for images not served via your site. Not foolproof, but it slows things down and makes it harder for leeches...

BTW, it is wise to apply a Cache-Control HTTP header to the S3 bucket and have the distribution use it, so your images are served with cache-control headers; it saves you from unnecessary traffic costs on your S3 account.

It would be nice to see this module working together with the minimize solution module. That would really make things easy!


To use Amazon CloudFront, does the file have to be in S3? Or is it possible to have CloudFront work with local ProcessWire files?

Great plugin! I am bookmarking it. Was just curious how Amazon works!

Thanks!

CloudFront uses S3. There's no other way.


I haven't yet tried to block other sites from hotlinking my CloudFront content.

You can also apply a cache-control HTTP header in the module itself. It will be applied to every file uploaded via the module.



  • 10 months later...

To work on PW 3+, you need to change two lines in AmazonS3Cloudfront.module:

Line 1 - Add the PW namespace:

<?php namespace ProcessWire;

Line 21 - Add a backslash before the class name, so it references the class in the global namespace, not PW's:

use \Aws\S3\S3Client;

  • 1 month later...

Hi @zyON, I just submitted a pull request:

---

Now when an image is resized via the API, this size variation is also uploaded to S3.
I also changed line 21 to make it run on PW 3+.

---

Could you kindly review it? :)


  • 2 weeks later...

Hi nmendes, awesome module; I don't know if you're still maintaining it. I think a couple of improvements could be made:

  1. The possibility to mirror the PW structure under a folder within a bucket, so that many different websites could use the same bucket (because the number of buckets per user is limited)
  2. This one would be the best, if you could: not just mirroring the PW structure, but using S3 as the place to save the files, because the per-GB price on a server is far more expensive than S3 (used by Dropbox, Spotify and other big companies), especially if you have a page with many videos, audios...

Thanks and congratulations.


1. It is possible. Are you imagining something like "example.com/2060/name-of-file", where 2060 is the page ID?

2. I didn't get exactly what you want with this one. Do you mean completely bypassing the local filesystem and only storing the files on S3?
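For 1, something like this on the key-building side would probably be enough (the names here are just illustrative, not what the module currently does):

```php
<?php
// Illustrative only: prefix every S3 key with a per-site folder so
// several PW installs can share one bucket. Names here are made up.

function s3Key($sitePrefix, $pageId, $filename)
{
    // Normalise the prefix so "example.com" and "/example.com/" behave alike
    return trim($sitePrefix, '/') . '/' . $pageId . '/' . $filename;
}

// s3Key('example.com', 2060, 'name-of-file.jpg') → 'example.com/2060/name-of-file.jpg'
```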


I think people have looked at it and found that the ProcessWire admin hard-codes local folders, so it would be hard to have a website use only a remote server for files (i.e. only save files in S3) without modifications to ProcessWire.


  • 1 year later...
  • 2 months later...
On 4/1/2016 at 9:34 PM, Sérgio Jardim said:

Hi @zyON, I just submitted a pull request:

---

Now when an image is resized via the API, this size variation is also uploaded to S3.
I also changed line 21 to make it run on PW 3+.

---

Could you kindly review it? :)

Hi @Sérgio Jardim

I'm looking for this functionality too. Did you actually get this working, where image resizes are uploaded to S3? It seems the module is perhaps no longer being developed, but it still seems to work in PW 3. I'd like to enhance at least my copy of it.

Thanks

