shogun

Is ProcessWire's database structure for fields scalable?


Excuse my ignorance, this is a purely whimsical question, but I noticed that whenever you add a field in ProcessWire, it creates an entire database table for that field in the ProcessWire database. It seems like you could easily get to 100 tables. I'm wondering why it was done that way. Is that scalable?

I would've thought it's better to just have all ProcessWire fields added into a single database table with a few columns that give context on their associations.


Hi shogun,

In my opinion, this is the key to the flexibility and freedom you have in ProcessWire compared to other CMFs/CMSs.
As for scalability, you should find plenty of positive threads here in the forums.

In short: PW is the right choice!

 

 

7 hours ago, shogun said:

I would've thought it's better to just have all ProcessWire fields added into a single database table with a few columns that give context on their associations.

You mean like in a spreadsheet? Including the page values of the fields as well, or just the field definitions?


PW is amazing! I was just wondering. It seems adding tables for every field isn't the best database structure for efficiency, but I assume they made the right choice!

Quote

I would've thought it's better to just have all ProcessWire fields added into a single database table with a few columns that give context on their associations.

Is that not how WordPress does it?

I expect that an additional table per field is more scalable. Even if you have 100 or 1,000 field tables, that will not have any negative effect on performance.


No, I think WordPress keeps things that keep growing in something like one table, but honestly I couldn't care less about WordPress lol. ProcessWire is definitely better. I was just wondering about this. Maybe I still need to brush up on my database structure skills.

18 minutes ago, eydun said:

Even if you have 100 or 1,000 field tables, that will not have any negative effect on performance.

The DB itself won't suffer, but PW performance eventually will. Too many templates and fields won't scale endlessly.

38 minutes ago, dragan said:

The DB itself won't suffer, but PW performance eventually will. Too many templates and fields won't scale endlessly.

On the other hand, "flat" (that is, non-normalized) tables only scale so far. Where they don't scale well is update performance: if a site is highly dynamic, updates create massive locks, and those locks impact the frontend too. Also, if a CMS uses non-normalized tables, multi-valued and complex field contents have to be serialized and de-serialized to fit into their column. That's fine only until you want to search those contents. Then things get ugly, and if that search can be implemented at all, it will be a much worse performance hog than ten extra field tables in PW.
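To make the search point concrete, here is a minimal sketch using Python's built-in sqlite3 module, not ProcessWire's actual MySQL schema. All table and column names (field_tags, pages_flat, pages_id, data, tags) are assumptions chosen for illustration. It stores the same multi-valued field once as one row per value in its own table and once as a JSON string in a flat table, then looks up pages by value in both layouts.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Layout A (hypothetical): one table per field, one row per value.
conn.execute("CREATE TABLE field_tags (pages_id INTEGER, data TEXT)")
conn.execute("CREATE INDEX field_tags_data ON field_tags (data)")

# Layout B (hypothetical): flat table, multi-valued field serialized as JSON.
conn.execute("CREATE TABLE pages_flat (id INTEGER PRIMARY KEY, tags TEXT)")

sample = {1: ["cms", "php"], 2: ["php", "mysql"], 3: ["css"]}
for page_id, tags in sample.items():
    conn.executemany(
        "INSERT INTO field_tags (pages_id, data) VALUES (?, ?)",
        [(page_id, tag) for tag in tags],
    )
    conn.execute(
        "INSERT INTO pages_flat (id, tags) VALUES (?, ?)",
        (page_id, json.dumps(tags)),
    )


def find_normalized(tag):
    """Exact-match lookup; the database can use the index on data."""
    rows = conn.execute(
        "SELECT pages_id FROM field_tags WHERE data = ? ORDER BY pages_id",
        (tag,),
    )
    return [row[0] for row in rows]


def find_serialized(tag):
    """Every row is read and de-serialized before it can be tested."""
    rows = conn.execute("SELECT id, tags FROM pages_flat ORDER BY id")
    return [page_id for page_id, raw in rows if tag in json.loads(raw)]


normalized_hits = find_normalized("php")
serialized_hits = find_serialized("php")
```

Both functions return the same pages, but the serialized variant has to touch and decode every row, which is the part that tends to hurt as the table grows.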

The web server question "does it scale?" is usually not answered by database statistics. It's answered by a concise design that allows for good, hierarchical and atomic caching strategies. If a site is designed with that in mind (and even sometimes when it isn't), performance can easily be improved.

I like to take our corporate intranet as an example, with 12,000 active pages, now 125 templates, 28 repeaters, 35 site modules, 220 additional PHP scripts and everything a good book on website speed basics tells you not to do 😉 We have around 1,000 users who use it daily, and a good number of them have a few auto-refresh pages loading every 2 minutes. I'm running opcache, use ProcessWire's $cache interface to populate recurring, user-dependent patterns in some fields, and have a small home-written module that uses memcache to store and refresh data that would otherwise be retrieved from external systems with each call. First-time page load, including fonts, styles and a healthy collection of scripts, takes about 5 seconds. Time to page view for subsequent visits to the site is almost always below 2.5 seconds. Everything runs on one single server, including the database, and there's no caching proxy in front of it. That's on PW 3.0.125. I'm currently in the process of upgrading to the latest IIS, latest PW stable and latest PHP. My first impression is that I'll probably get the time to view reliably below 2 seconds with a bit of tuning.
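The idea of storing and refreshing data that would otherwise be fetched from an external system on every call can be sketched roughly as below. This is a plain in-process Python illustration, not ProcessWire's $cache API and not memcache. The class TTLCache, the function slow_lookup and the 120-second lifetime are all assumptions made up for the example.

```python
import time


class TTLCache:
    """Tiny time-to-live cache: reload a value only after it has expired."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be simulated
        self.store = {}     # key -> (expires_at, value)

    def get(self, key, loader):
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]              # still fresh: skip the slow call
        value = loader()                 # missing or expired: reload
        self.store[key] = (now + self.ttl, value)
        return value


calls = []


def slow_lookup():
    """Stand-in for a request to an external system (hypothetical)."""
    calls.append(1)
    return {"open_tickets": 7}


fake_now = [0.0]
cache = TTLCache(ttl_seconds=120, clock=lambda: fake_now[0])

first = cache.get("tickets", slow_lookup)    # loads from the "external" system
second = cache.get("tickets", slow_lookup)   # served from the cache
fake_now[0] = 121.0                          # pretend two minutes have passed
third = cache.get("tickets", slow_lookup)    # expired, so it loads again
```

With pages that auto-refresh every couple of minutes, a lifetime in that range means most requests never reach the external system.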



One thing to remember with ProcessWire is that one table per field is only the default behaviour, and that of most built-in fieldtypes. Like most things in ProcessWire, this is incredibly flexible: if you have a use case where a single table with multiple fields is a more efficient way to store and access data, ProcessWire supports that too through custom fieldtypes. Indeed, there are some built-in fieldtypes that actually work this way, such as map markers.

ProFields also includes a Table fieldtype which, as it sounds, is a database table with multiple fields.

It takes a bit of work to write a module that provides a custom fieldtype, but there are examples in the modules directory, such as the Events fieldtype, which is specifically intended as an example of how to make a fieldtype that corresponds to multiple database fields.

https://modules.processwire.com/modules/fieldtype-events/

In answer to the question "Is the ProcessWire database structure scalable?", I'd say that's entirely up to the developer, as you can really choose whatever database structure you like. It just takes a bit more work if you want to store multiple fields per table, and usually the built-in fieldtypes and default behaviour work fine.

Where I have noticed a bit of a performance hit is in large imports from CSV, where each field in the CSV file results in a separate insert operation into a different table. A traditional database table with multiple fields would need just a single insert operation per record, and potentially fewer indexes to update as well.
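The difference in insert operations during an import can be counted with a small sketch. It uses Python's sqlite3 and csv modules and a made-up schema: the tables pages, field_title, field_summary, field_price and products_wide are assumptions for illustration and do not reproduce ProcessWire's real tables or its import code.

```python
import csv
import io
import sqlite3

CSV_TEXT = "title,summary,price\nChair,Wooden chair,40\nTable,Oak table,120\n"
FIELDS = ["title", "summary", "price"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY)")
for name in FIELDS:
    # Hypothetical per-field tables, one row per page.
    conn.execute(
        f"CREATE TABLE field_{name} (pages_id INTEGER PRIMARY KEY, data TEXT)"
    )
conn.execute(
    "CREATE TABLE products_wide "
    "(id INTEGER PRIMARY KEY, title TEXT, summary TEXT, price TEXT)"
)


def import_per_field(rows):
    """One insert for the page row plus one insert per field table."""
    statements = 0
    for row in rows:
        cur = conn.execute("INSERT INTO pages DEFAULT VALUES")
        statements += 1
        for name in FIELDS:
            conn.execute(
                f"INSERT INTO field_{name} (pages_id, data) VALUES (?, ?)",
                (cur.lastrowid, row[name]),
            )
            statements += 1
    return statements


def import_wide(rows):
    """One insert per record into a single multi-column table."""
    statements = 0
    for row in rows:
        conn.execute(
            "INSERT INTO products_wide (title, summary, price) VALUES (?, ?, ?)",
            (row["title"], row["summary"], row["price"]),
        )
        statements += 1
    return statements


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
per_field_count = import_per_field(rows)
wide_count = import_wide(rows)
```

For two records with three fields each, the per-field layout issues four inserts per record against one for the wide table, and that gap grows with the number of fields.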

If you're adding single pages via the backend through user interaction, ProcessWire is plenty fast. There are also performance gains to be had at the read level if you don't need to read every single field of a page every time it's accessed in frontend templates. For example, when building a menu you probably only want the title and URL, so reading the entire page content as well would be unnecessary overhead. If the content is in a separate table and you don't access it, you don't have that overhead.
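The read-side benefit can be sketched the same way. The schema below (pages, field_title, field_body) is an assumption for illustration, built with Python's sqlite3 rather than taken from ProcessWire itself. The menu query joins only the title table, so the large body values are never read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pages (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE field_title (pages_id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE field_body (pages_id INTEGER PRIMARY KEY, data TEXT);
    """
)

sample = [
    (1, "about", "About us", "x" * 10000),
    (2, "contact", "Contact", "y" * 10000),
]
for page_id, name, title, body in sample:
    conn.execute("INSERT INTO pages (id, name) VALUES (?, ?)", (page_id, name))
    conn.execute("INSERT INTO field_title VALUES (?, ?)", (page_id, title))
    conn.execute("INSERT INTO field_body VALUES (?, ?)", (page_id, body))


def menu_items():
    """Only pages and field_title are read; field_body is never touched."""
    return conn.execute(
        "SELECT pages.name, field_title.data "
        "FROM pages JOIN field_title ON field_title.pages_id = pages.id "
        "ORDER BY pages.id"
    ).fetchall()


def full_page(page_id):
    """A detail view joins the body table only when it is actually needed."""
    return conn.execute(
        "SELECT pages.name, field_title.data, field_body.data "
        "FROM pages "
        "JOIN field_title ON field_title.pages_id = pages.id "
        "JOIN field_body ON field_body.pages_id = pages.id "
        "WHERE pages.id = ?",
        (page_id,),
    ).fetchone()


menu = menu_items()
detail = full_page(1)
```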

There are always performance tradeoffs, but I think ProcessWire has struck a good balance: a default that works well for most people, plus the means to do things differently if you really need to.

