Hieu Do Posted July 9, 2013 Share Posted July 9, 2013 Hi everyone, I'm a PW newbie, I just switched from Wordpress to PW a few days ago. After installing and playing around with PW, I do have a concern about PW's database structure. I'm afraid it may not scale easily with several millions of records since it has too many tables. I'm not either a DBA or database expert but I hate joining in relational DB. It is really a pain that slows down most of my WP sites because continuous joining consumes much CPU/RAM. You know, WP has a table wp_postmeta contains metadata of posts (it's really big, at least 10x bigger than wp_posts if you have 10 fields associated with a post) and everytime I want to gather all fields of a book (assuming my site is a book library) WP has to issue a very long query, it also makes mysqld always in top of heavy processes in my VPS. I can see PW uses separate table for each field. I don't understand the intention of this approach. Could anyone help me catch the idea behind this? Is there anything impact the performance? I just thought that most information of a book is one-to-one relationship, why we need to separate them then have to join them again? Thanks Link to comment Share on other sites More sharing options...
kongondo Posted July 9, 2013 Share Posted July 9, 2013 (edited) Hi Hieu Do. Welcome to PW and the forums! PW can easily handle 100's of thousands of rows. This question has come up before. I can't find the relevant forum posts atm but am sure someone else will give you a better answer. Meanwhile, this answers part of your question: http://processwire.com/about/what/ Edit: Here's are other useful threads... http://processwire.com/talk/topic/2503-question-about-extreme-scale-hundred-of-thousand-maybe-millions-of-pages/ http://processwire.com/talk/topic/37-internal-optimisation/ Edited July 9, 2013 by kongondo 1 Link to comment Share on other sites More sharing options...
Alessio Dal Bianco Posted July 9, 2013 Share Posted July 9, 2013 Hi Hieu, first of all welcome to PW! When i saw all the tables created from PW i was afraid too, but there are some functionalities that maybe can help: 1) You can set some fields that load only when you access them. 2) You can set "$config->dbCache = true" for caching the queries. 3) I think the search is the major problem concerning the memory consumption because by default the PW need to search in every field and so in every DB table. But i made a module that store all the keywords extracted from those fields in only one, doing so PW search in only ONE table without any JOIN. 4) There is also the cache of the entire page. Look at this module and here I must say that currently i don't had the chance to do a site with a million records, it would be interesting if we can do a test with tons of pages ! USSliberty 3 Link to comment Share on other sites More sharing options...
Wanze Posted July 9, 2013 Share Posted July 9, 2013 Hi Hieu Do, What you're worried about is one of the biggest strenght of Pw. Having each field in a separate table does scale very well - mySQL is very fast when Joining optimized tables, like Pw generates them. Plus you have the advantage that each field can store as much metadata as needed with different table structures. Also in Pw you can label your fields different in each template context, this is supporting you to reuse existing fields. For example: In your 'member' template, just give your title field the label 'Forename + Lastname' My suggestion is: Give Pw a try, it's faster than every CMS/CMF I used before. Additionally there is the template cache, markup cache and a module from ryan called ProCache which completely bypasses MySQL + PHP serving you html pages 1) You can set some fields that load only when you access them. That's the default behaviour. Pw only joins and loads fields you access, unless defined as autoload. 3) I think the search is the major problem concerning the memoryconsumption because by default the PW need to search in every field andso in every DB table. Not quite sure what you mean here. You can specify in which fields you want to search your keywords 6 Link to comment Share on other sites More sharing options...
apeisa Posted July 9, 2013 Share Posted July 9, 2013 One of the reasons for table per field design is definitely performance. 2 Link to comment Share on other sites More sharing options...
Alessio Dal Bianco Posted July 9, 2013 Share Posted July 9, 2013 Wanze if you have a lot of text fields you need to search in if you would have better results Link to comment Share on other sites More sharing options...
Pete Posted July 9, 2013 Share Posted July 9, 2013 One of the reasons for table per field design is definitely performance. I think this is a very good point - the thing is, different sections of your site will use different fields, so why would you want to pull in every field for pages that don't need them? You have to bear in mind that it's perfectly acceptable to have a template with just a Title field in ProcessWire (there are many practical uses) so why call in a body, summary or image field if there isn't any body, summary or image content? The whole system is designed to only access the data necessary for the page you're viewing. It's also worth noting that if you are using it for a site with hundreds of thousands of pages, there are a few caching options like MarkupCache or ProCache - the latter of which will make your site lightning fast as essentially it serves a static version of a page and bypasses PHP and mySQL completely Of course there are situations where you might have to think about the best caching methods for your particular project, but if you ask around on the forums people will point you in the right direction. 3 Link to comment Share on other sites More sharing options...
diogo Posted July 9, 2013 Share Posted July 9, 2013 Not an expert in databases myself, but it seems to me that Ryan did a good job in indexing the tables to improve performance 2 Link to comment Share on other sites More sharing options...
Hieu Do Posted July 10, 2013 Author Share Posted July 10, 2013 Wow, thanks for your advices, they help me a lot and give me more confidence to keep using PW. I just forgot about indexing and caching, they're wide matter but really helpful in scalability indeed. I heard someone said that MySQL would run efficiently with optimized single queries rather than complex queries. About the searching in PW, I think there are a lot of dedicated full text search servers like Sphinx, Solr, Lucence, Xapian that are feature-rich and more powerful than MySQL native search functionality. I used to use Sphinx to replace Wordpress built-in search and it worked perfectly, very fast and very less memory consumption. I can say it's a must for every database-driven website. A little question, is there any module/plugin support integrating one of those search engines in PW yet?? Link to comment Share on other sites More sharing options...
teppo Posted July 10, 2013 Share Posted July 10, 2013 @Hieu Do: there was some discussion about Apache Solr a while back, though I don't think anything concrete came out of it (not yet, at least.) Most of the modules publicly available are listed at the modules directory, so you might want to check that out at some point. 1 Link to comment Share on other sites More sharing options...
WillyC Posted July 10, 2013 Share Posted July 10, 2013 1) You can set some fields that load only when you access them. that what.it do already .only autoload field.join w page 2) You can set "$config->dbCache = true" for caching the queries. that.what it do already two .you no need two set it 3) I think the search is the major problem concerning the memory consumption because by default the PW need to search in every field and so in every DB table. But i made a module that store all the keywords extracted from those fields in only one, doing so PW search in only ONE table without any JOIN. pw already have.this it called fieldtypecache 3 Link to comment Share on other sites More sharing options...
Alessio Dal Bianco Posted July 19, 2013 Share Posted July 19, 2013 Oh God, 3) I think the search is the major problem concerning the memory consumption because by default the PW need to search in every field and so in every DB table. But i made a module that store all the keywords extracted from those fields in only one, doing so PW search in only ONE table without any JOIN. pw already have.this it called fieldtypecache Oh God, so 70% of my module is useless. I saw now that fieldtypecache is in the core but not listed on this site, also the name of the module was for me misleading because i always think that "cache" was for templates (html,css,js and so on). 1 Link to comment Share on other sites More sharing options...
teppo Posted July 19, 2013 Share Posted July 19, 2013 Oh God, so 70% of my module is useless. I saw now that fieldtypecache is in the core but not listed on this site, also the name of the module was for me misleading because i always think that "cache" was for templates (html,css,js and so on). I haven't tried your module, but if I remember correctly it had some support for files (pdf, doc) too? If that's really the case, that alone would make it very useful.. and that's definitely what I'd recommend focusing on, as it's pretty damn important for certain sites 2 Link to comment Share on other sites More sharing options...
Alessio Dal Bianco Posted July 19, 2013 Share Posted July 19, 2013 I haven't tried your module, but if I remember correctly it had some support for files (pdf, doc) too? If that's really the case, that alone would make it very useful.. and that's definitely what I'd recommend focusing on, as it's pretty damn important for certain sites In fact it was the main goal of my module. But if i only had knew the existence of fieldtypecache maybe i clould save some lines of code 2 Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now