First steps toward the multi-instance auto-scaling holy grail...


nbcommunication

Hello,

We've recently been researching how to use ProcessWire in a horizontal scaling environment (multiple server instances behind a load balancer, with read replica databases), and ran an experiment using AWS Elastic Beanstalk.

Getting read replica databases up and running was easy - it's built into the core: https://processwire.com/blog/posts/pw-3.0.175/#how-to-use-it-in-processwire
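
For anyone who hasn't read that post, here's a minimal sketch of what the replica side of site/config.php can look like - the hostnames are placeholders, and the exact option name is worth double-checking against the post rather than taking from here:

```php
// site/config.php - sketch only; verify option names against the linked blog post
$config->dbHost = 'primary.db.internal';   // writable primary (placeholder host)
$config->dbName = 'pw';
$config->dbUser = 'pw';
$config->dbPass = '...';

// Read-only replica that ProcessWire can route read queries to
$config->dbReader = [
    'host' => 'replica.db.internal',       // placeholder host
];
```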

Using multiple server instances throws up one big problem: how to keep the filesystem in sync across instances, given that ProcessWire doesn't currently support using an external service (like S3 or EFS) as the filesystem.

The solution that we came up with is to use various Cloudflare services (R2, Stream, Images) to serve file assets, and we've built a module to facilitate this:

We're not using this in production yet, but our tests on EB were successful, and we're confident it will solve the main part of the problem. However, the Cloudflare Images service is still quite new and there are still features to be rolled out (e.g. WebP for flexible variants), so it can't be considered a complete solution yet.
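
To give a rough idea of the general approach (this is not the module itself, just the shape of it): the hookable Pagefile::url() can point file URLs at an external host once the files exist there. A minimal sketch, assuming files are already mirrored to an R2 bucket served from a hypothetical assets.example.com, and that Pagefile::url() is hookable in your core version:

```php
// site/ready.php - sketch only, not the module above
$wire->addHookAfter('Pagefile::url', function(HookEvent $event) {
    /** @var Pagefile $pagefile */
    $pagefile = $event->object;
    // Keep the /files/{page-id}/{basename} structure so paths stay unique
    // (assets.example.com is a placeholder for the bucket's public domain)
    $event->return = 'https://assets.example.com/files/' . $pagefile->page->id . '/' . $pagefile->basename;
});
```

Uploads still land on the local disk first, so the other half of the job is pushing new (and deleted) files to the bucket - that's the part a module has to handle.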

Additionally, we use ProCache, and this presents another multi-instance problem: if the cache is cleared on one instance, how can we clear it on all of them? Our solution is to log clears in the database and use this log to keep instances in sync. We built another module:

Again this worked well in our test, but isn't yet being used in production.
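
For anyone curious, the general idea (again, a sketch rather than the module itself) is a shared marker in the database via WireCache plus a per-instance "last seen" marker on local disk. The ProCache hook and method names below (ProCache::clearAll, $procache->clearAll()) are assumptions - check the ProCache API before relying on them:

```php
// site/ready.php - conceptual sketch of syncing ProCache clears across instances
$seenFile = $config->paths->cache . 'procache-clear-seen.txt'; // per-instance marker

// 1. Whenever this instance clears ProCache, log it in the shared database
//    (WireCache stores its values in the `caches` table, which all instances share)
$wire->addHookAfter('ProCache::clearAll', function(HookEvent $event) {
    $event->wire('cache')->save('procache-last-clear', (string) time(), WireCache::expireNever);
});

// 2. On every request, replay a clear that another instance has logged
$lastClear = (int) $cache->get('procache-last-clear');
$localSeen = is_file($seenFile) ? (int) file_get_contents($seenFile) : 0;
if($lastClear > $localSeen) {
    $procache->clearAll(); // assumed ProCache API - verify
    // Record "now" so the log entry written by our own replayed clear isn't replayed again
    file_put_contents($seenFile, (string) time());
}
```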

The main purpose of this thread, aside from sharing these potential solutions, is to ask for and discuss other experiences of hosting ProcessWire in a horizontal scaling environment. What solutions did you come up with (if you want to share them) and are there other potential issues we maybe haven't thought about?

Cheers,

Chris

Wow... is all I can say for the moment.
What amount of traffic or hits/second are you expecting with that kind of setup?

I built and ran pretty cheap and simple setups that handled up to about 30-50k hits*/day without noticeable issues - ok, those sites were ProCached and running behind the Cloudflare CDN (free tier), yet... it worked out. They probably could have handled even more.

None of my projects here scale horizontally, vertically or in any other direction 🙊 compared to your setup.

It's not in the same league as your setup by any measure - but here is how I built something back in the day that scaled very well:

  • JS files came from sub[1-3].domain.tld
    • the most essential parts were inlined
    • custom JS was pulled in via file_get_contents from external sources
  • CSS files came from sub[1-3].domain.tld
    • almost all (critical) CSS was inlined
    • custom CSS was pulled in via file_get_contents from external sources
  • IMGs came from assets[1-3].domain.tld
  • Cloudflare took care of GZIP compression and caching of the output (not sure about Brotli)
  • ProCache took care of the heavy load before anything else, as 95% of the site(s) was cached with a very long lifetime (pre-cached using a site scraper after each release)
  • Asset and file handling was fairly static and strict, without many options for custom solutions (not really necessary for those sites), as the overall page setups were minimal and simple (blog style, with minimal differences)
  • JS, CSS and IMG files came from other services rather than my host - actually, everything on a subdomain came from other services, as the hosting was too cheap to handle lots of requests. I used GitHub, Zeit (now Vercel, I guess) and some other services I can't remember for that.

It was bl**dy hell to make that work back then (BUT I had to save money I didn't have) - those were also some of my very first real projects with ProcessWire (among my first 10 public projects ever, and most of them were my own) - nowadays that setup would probably still be annoying in parts, yet more feasible and easier to handle, with far better results.

My issues back then were limited database and webserver connections (those went over the limit pretty fast) at my hosting companies (HostN*n, Dream***, Host***, Blue***, A2***, and such - super cheap), so I split all assets out to other services and served them via subdomains.

In the very early days I paid only about 0.99 USD/month for those sites, later on 2.99 USD, and even later 8.99 USD.
It only became faster and faster. About a year before selling/shutting down those projects I paid about 60 USD/month per project. STEEP!

Still, almost the same setups could easily handle double or triple the hits*/day nowadays, with far better pagespeed results than ever before.

To this day I'm happy with this kind of setup for my projects.
The moment a project reaches 50k+ hits*/day, I'll return to that approach, but with today's methods and services.

What I use nowadays (for whatever reason - you will find out 😉):

  • webgo
  • IONOS
  • Hetzner
  • Plusline Server
  • Netlify
  • Vercel
  • Cloudflare Pages
  • Cloudflare CDN
  • Cloudinary
  • Planetscale
  • Runway
  • Supabase

 

* real hits/users/sessions - no fake requests
** paid plans for super high traffic sites, otherwise free tiers
