Recommended Posts

Posted

Hello,

We've recently been researching how to use ProcessWire in a horizontal scaling environment (multiple server instances behind a load balancer, with read replica databases), and ran an experiment using AWS Elastic Beanstalk.

Getting read replica databases up and running was easy - it's built in to the core: https://processwire.com/blog/posts/pw-3.0.175/#how-to-use-it-in-processwire
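
For reference, the read replica side needs only a few lines in /site/config.php. Below is a minimal sketch with placeholder hostnames and credentials; I'm assuming the keys accepted by $config->dbReader mirror the usual $config->db* settings, with anything omitted falling back to the primary values - see the blog post above for the exact options.

<?php namespace ProcessWire;

// /site/config.php (sketch only - hostnames and credentials are placeholders)

// Primary (read/write) connection, configured as usual:
$config->dbHost = 'primary.db.example.com';
$config->dbName = 'pw';
$config->dbUser = 'pw';
$config->dbPass = 'secret';

// Read-only replica used for read queries (support added in 3.0.175).
// Assumption: keys mirror the $config->db* settings above; check the
// linked blog post for the exact options.
$config->dbReader = [
    'dbHost' => 'replica.db.example.com',
];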

Using multiple server instances throws up one big problem: how to keep the filesystem on multiple instances in sync, given that ProcessWire doesn't currently support using an external service (like S3 or EFS) as the filesystem.

The solution that we came up with is to use various Cloudflare services (R2, Stream, Images) to serve file assets, and we've built a module to facilitate this:

We're not using this in production yet, but our tests on EB were successful, and we're confident this will solve the main part of the problem. However, the Cloudflare Images service is still quite new and there are still features to be rolled out (e.g. WebP for flexible variants), so it can't be considered a complete solution yet.

Additionally, we use ProCache, and this presents another multi-instance problem - if the cache is cleared on one instance, how can we clear it on all of them? Our solution is to log clears in the database and use this log to sync up clearing across instances. We built another module:

Again, this worked well in our tests but isn't yet being used in production.
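
To make the idea concrete, here is a hypothetical sketch (not the module itself): each instance records its clears in a shared table on the primary database, and every instance periodically compares that table against a marker file on its own local disk, clearing its own copy whenever another instance has cleared more recently. The table name, marker path and the ProCache clearAll() call are assumptions for illustration.

<?php namespace ProcessWire;

// Hypothetical sketch of the "log clears in the database" approach.
// Assumes a shared table created once on the primary database:
//   CREATE TABLE procache_clears (
//       id INT AUTO_INCREMENT PRIMARY KEY,
//       cleared_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
//   );

// 1. The instance that clears ProCache also records the clear centrally:
$database->exec("INSERT INTO procache_clears () VALUES ()");

// 2. Every instance checks periodically (e.g. from a LazyCron hook),
//    comparing the shared log against a marker file on its local disk:
$marker = $config->paths->cache . 'procache-last-clear-seen.txt';
$lastSeen = is_file($marker) ? (int) file_get_contents($marker) : 0;
$latest = (int) $database->query(
    "SELECT UNIX_TIMESTAMP(MAX(cleared_at)) FROM procache_clears"
)->fetchColumn();

if($latest > $lastSeen) {
    $procache = $modules->get('ProCache');
    $procache->clearAll(); // assumes ProCache's clearAll() method
    file_put_contents($marker, (string) $latest);
}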

The main purpose of this thread, aside from sharing these potential solutions, is to ask for and discuss other experiences of hosting ProcessWire in a horizontal scaling environment. What solutions did you come up with (if you want to share them) and are there other potential issues we maybe haven't thought about?

Cheers,

Chris

Posted

Wow... is all I can say for the moment.
What amount of traffic or hits/second are you expecting for that kind of setup?

I built and ran pretty cheap and simple setups that handled up to about 30-50k hits*/day without noticeable issues - ok, those sites were ProCached and running behind the Cloudflare CDN (free tier), yet... it worked out. They probably could have handled even more.

None of my projects here scale horizontally, vertically or in any other direction compared to your setup.

It's not in the same league as your setup by any measure - but here is how I built something back in the day that scaled very well:

  • JS files came from sub[1-3].domain.tld
    • super necessary parts were inlined
    • custom JS from external sources was pulled in via file_get_contents
  • CSS files came from sub[1-3].domain.tld
    • almost all (critical) CSS was inlined
    • custom CSS from external sources was pulled in via file_get_contents
  • IMGs came from assets[1-3].domain.tld
  • Cloudflare took care of gzip compression and caching of the output (not sure about Brotli)
  • ProCache took care of the heavy load before anything else, as 95% of the whole site/s were cached with a very long lifetime (pre-cached by running a site scraper after each release - see the sketch after this list)
  • Asset and file handling were kind of static and strict, without many options for custom solutions (which wasn't really necessary for those sites), as the overall page setups were minimal and simple (blog style, with minimal differences)
  • Files like JS, CSS and IMGs came from other services rather than my host - actually, everything on a subdomain came from other services, as the hosting was too cheap to handle lots of requests. I used GitHub, ZEIT (which is Vercel now, I guess) and some other services I can't remember for that
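
For what it's worth, the "pre-cache with a site scraper after each release" step can be as simple as requesting every URL in the sitemap so the static cache is rebuilt before real visitors arrive. A rough sketch (the sitemap URL is a placeholder, and it assumes allow_url_fopen is enabled):

<?php

// Rough pre-warm sketch: fetch the sitemap, then request every URL in it
// so ProCache regenerates its static copies. The domain is a placeholder.
$xml = file_get_contents('https://www.example.com/sitemap.xml');
preg_match_all('#<loc>(.+?)</loc>#', $xml, $matches);

foreach($matches[1] as $url) {
    $start = microtime(true);
    file_get_contents($url); // a plain GET is enough to populate the cache
    printf("warmed %s in %.2fs\n", $url, microtime(true) - $start);
}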

It was a bl**dy hell to make that work back then (BUT I had to save money I didn't have) - those were also some of my very first real projects with ProcessWire (among my first 10 public projects ever, and most of them were my own). Nowadays that setup would probably still be annoying in some parts, yet more feasible and easier to handle, with far better results.

My issues back then were limited database and webserver connections at my hosting companies (HostN*n, Dream***, Host***, Blue***, A2***, and such - super cheap), which went over the limit pretty fast, so I split all assets out to other services and served them via subdomains.

In the very early days I paid something around 0.99 USD/month for those sites, later on 2.99 USD, and even later 8.99 USD.
It only became faster and faster. About a year before selling/shutting down those projects I paid about 60 USD/month/project. STEEP!

Still, almost the same setup could easily handle double or triple the hits*/day nowadays, with far better pagespeed results than ever before.

To this day I'm happy with this kind of setup for my projects.
The moment a project reaches 50k+ hits*/day, I'll return to that approach, but with today's methods and services.

What I use nowadays (for whatever reason - you will find out):

  • webgo
  • IONOS
  • Hetzner
  • Plusline Server
  • Netlify
  • Vercel
  • Cloudflare Pages
  • Cloudflare CDN
  • Cloudinary
  • Planetscale
  • Runway
  • Supabase

 

* real hits/users/sessions - no fake requests
** paid plans for super high traffic sites, otherwise free tiers

  • 2 years later...
Posted

@nbcommunication you've probably resolved this since 2023, but in my setup I use AWS with an auto scaling group (ASG) of EC2 instances. Certain folders - caches, ProCache, assets/files and logs - live on an EFS share symlinked in, while all other CMS files/templates are local to each EC2 instance.

The biggest throttle for us historically was trying to serve EVERYTHING from EFS: there's a penalty even when reading the main ProcessWire files from EFS that results in sluggishness, and in extreme load situations the EFS bandwidth can get saturated unless you pay a lot for higher throughput, which nobody really wants to do when the load is only occasionally spiky.

The solution is to serve everything LOCALLY on the EC2 instances in the ASG and put only certain things - like caches/logs that need to be shared across all instances - on the EFS mount, since those files are negligible in terms of speed/performance; basically I found it was about the same as having it all on the local EC2 EBS volume. ProCache on EFS also seems fine, since it's just static files (though I'm typing this having just seen your other module for keeping ProCache in sync across this sort of setup).
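
A rough sketch of the kind of symlinking this amounts to (the mount point and folder list are placeholders - in practice it could just as easily be a few lines of shell in the instance user data):

<?php

// Sketch only: link the shared, low-traffic folders from the EFS mount
// into an otherwise local ProcessWire install. Paths are placeholders;
// the ProCache cache directory could be linked the same way.
$efs  = '/mnt/efs/example.com/site';   // shared across all instances
$site = '/var/www/example.com/site';   // local to this EC2 instance

foreach(['assets/files', 'assets/logs', 'assets/cache'] as $dir) {
    $target = "$efs/$dir";
    $link   = "$site/$dir";
    if(!is_dir($target)) mkdir($target, 0755, true);                     // ensure the shared folder exists
    if(is_dir($link) && !is_link($link)) rename($link, "$link.local");   // move any local copy aside
    if(!is_link($link)) symlink($target, $link);                         // point the local path at EFS
}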

I've still not finalised the best sync method, but I'm very nearly at the point where the ASG "worker" nodes will automatically update when I run a simple script on the "master" node, and new nodes that come online will automatically check a .build file containing a simple timestamp to decide whether each website needs syncing. This is because you don't want to deploy a new ASG template for every minor update each day, but each node still obviously needs to be on the same modules/templates/core. When I crack the very last piece of this puzzle everything will always be in sync, and I can just create new ASG templates as needed.
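
Something along these lines, as a sketch of the .build check each worker could run per site (the paths and the rsync source are assumptions, not the actual setup):

<?php

// Per-site check a worker node could run at boot or on a schedule.
// Paths are placeholders.
$shared = '/mnt/efs/deploy/example.com/.build';  // timestamp written by the "master" node on deploy
$local  = '/var/www/example.com/.build';         // last build this worker synced to

$sharedTs = is_file($shared) ? (int) trim(file_get_contents($shared)) : 0;
$localTs  = is_file($local)  ? (int) trim(file_get_contents($local))  : 0;

if($sharedTs > $localTs) {
    // Pull core/modules/templates from the shared copy, then record the build we're now on.
    exec('rsync -a /mnt/efs/deploy/example.com/code/ /var/www/example.com/');
    file_put_contents($local, (string) $sharedTs);
}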

 

Posted

Sorry - what I'm doing is a bit different; I just realised I'd misread your post mentioning Elastic Beanstalk, so your setup isn't quite the same, but my findings may still be worth mentioning.
