Jump to content

Does PW use too much JOIN queries?


Hieu Do
 Share

Recommended Posts

Hi everyone,

I'm a PW newbie, I just switched from Wordpress to PW a few days ago. After installing and playing around with PW, I do have a concern about PW's database structure. I'm afraid it may not scale easily with several millions of records since it has too many tables. I'm not either a DBA or database expert but I hate joining in relational DB. It is really a pain that slows down most of my WP sites because continuous joining consumes much CPU/RAM. You know, WP has a table wp_postmeta contains metadata of posts (it's really big, at least 10x bigger than wp_posts if you have 10 fields associated with a post) and everytime I want to gather all fields of a book (assuming my site is a book library) WP has to issue a very long query, it also makes mysqld always in top of heavy processes in my VPS.

I can see PW uses separate table for each field. I don't understand the intention of this approach. Could anyone help me catch the idea behind this? Is there anything impact the performance?

I just thought that most information of a book is one-to-one relationship, why we need to separate them then have to join them again?

Thanks

Link to comment
Share on other sites

Hi Hieu Do. Welcome to PW and the forums!

PW can easily handle 100's of thousands of rows. This question has come up before. I can't find the relevant forum posts atm but am sure someone else will give you a better answer.

Meanwhile, this answers part of your question: http://processwire.com/about/what/

Edit:

Here's are other useful threads...

http://processwire.com/talk/topic/2503-question-about-extreme-scale-hundred-of-thousand-maybe-millions-of-pages/

http://processwire.com/talk/topic/37-internal-optimisation/

Edited by kongondo
  • Like 1
Link to comment
Share on other sites

Hi Hieu, first of all welcome to PW!

When i saw all the tables created from PW i was afraid too, but there are some functionalities that maybe can help:

1) You can set some fields that load only when you access them.

2) You can set "$config->dbCache = true" for caching the queries.

3) I think the search is the major problem concerning the memory consumption because by default the PW need to search in every field and so in every DB table. But i made a module that store all the keywords extracted from those fields in only one, doing so PW search in only ONE table without any JOIN.

4) There is also the cache of the entire page. Look at this module and here

I must say that currently i don't had the chance to do a site with a million records, it would be interesting if we can do a test with tons of pages :) !

USSliberty

  • Like 3
Link to comment
Share on other sites

Hi Hieu Do,

What you're worried about is one of the biggest strenght of Pw.

Having each field in a separate table does scale very well - mySQL is very fast when Joining optimized tables, like Pw generates them.

Plus you have the advantage that each field can store as much metadata as needed with different table structures.

Also in Pw you can label your fields different in each template context, this is supporting you to reuse existing fields.

For example: In your 'member' template, just give your title field the label 'Forename + Lastname'

My suggestion is: Give Pw a try, it's faster than every CMS/CMF I used before.

Additionally there is the template cache, markup cache and a module from ryan called ProCache which completely bypasses MySQL + PHP serving

you html pages :)

1) You can set some fields that load only when you access them.

That's the default behaviour. Pw only joins and loads fields you access, unless defined as autoload.

3) I think the search is the major problem concerning the memory
consumption because by default the PW need to search in every field and
so in every DB table.

Not quite sure what you mean here. You can specify in which fields you want to search your keywords

  • Like 6
Link to comment
Share on other sites

One of the reasons for table per field design is definitely performance.

I think this is a very good point - the thing is, different sections of your site will use different fields, so why would you want to pull in every field for pages that don't need them? You have to bear in mind that it's perfectly acceptable to have a template with just a Title field in ProcessWire (there are many practical uses) so why call in a body, summary or image field if there isn't any body, summary or image content? The whole system is designed to only access the data necessary for the page you're viewing.

It's also worth noting that if you are using it for a site with hundreds of thousands of pages, there are a few caching options like MarkupCache or ProCache - the latter of which will make your site lightning fast as essentially it serves a static version of a page and bypasses PHP and mySQL completely :) Of course there are situations where you might have to think about the best caching methods for your particular project, but if you ask around on the forums people will point you in the right direction.

  • Like 3
Link to comment
Share on other sites

Wow, thanks for your advices, they help me a lot and give me more confidence to keep using PW. I just forgot about indexing and caching, they're wide matter but really helpful in scalability indeed. I heard someone said that MySQL would run efficiently with optimized single queries rather than complex queries. :)

About the searching in PW, I think there are a lot of dedicated full text search servers like Sphinx, Solr, Lucence, Xapian that are feature-rich and more powerful than MySQL native search functionality. I used to use Sphinx to replace Wordpress built-in search and it worked perfectly, very fast and very less memory consumption. I can say it's a must for every database-driven website.

A little question, is there any module/plugin support integrating one of those search engines in PW yet?? 

Link to comment
Share on other sites

1) You can set some fields that load only when you access them.
 

 

that what.it do already .only autoload field.join w page

 

 

2) You can set "$config->dbCache = true" for caching the queries.
 

 

that.what it do already two .you no need two set it

 

 

3) I think the search is the major problem concerning the memory consumption because by default the PW need to search in every field and so in every DB table. But i made a module that store all the keywords extracted from those fields in only one, doing so PW search in only ONE table without any JOIN.

pw already have.this it called

fieldtypecache

  • Like 3
Link to comment
Share on other sites

  • 2 weeks later...

:o Oh God, 

3) I think the search is the major problem concerning the memory consumption because by default the PW need to search in every field and so in every DB table. But i made a module that store all the keywords extracted from those fields in only one, doing so PW search in only ONE table without any JOIN.

pw already have.this it called

fieldtypecache

:o Oh God, so 70% of my module is useless. I saw now that fieldtypecache is in the core but not listed on this site, also the name of the module was for me misleading because i always think that "cache" was for templates (html,css,js and so on).

  • Like 1
Link to comment
Share on other sites

:o Oh God, so 70% of my module is useless. I saw now that fieldtypecache is in the core but not listed on this site, also the name of the module was for me misleading because i always think that "cache" was for templates (html,css,js and so on).

I haven't tried your module, but if I remember correctly it had some support for files (pdf, doc) too? If that's really the case, that alone would make it very useful.. and that's definitely what I'd recommend focusing on, as it's pretty damn important for certain sites :)

  • Like 2
Link to comment
Share on other sites

I haven't tried your module, but if I remember correctly it had some support for files (pdf, doc) too? If that's really the case, that alone would make it very useful.. and that's definitely what I'd recommend focusing on, as it's pretty damn important for certain sites :)

In fact it was the main goal of my module. But if i only had knew the existence of fieldtypecache maybe i clould save some lines of code :D

  • Like 2
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...