hheyne Posted March 27, 2015

Hello,

I have a question about a recent project. In this project I use PW as the framework (a really amazing framework) to build a web app. In this app users can log in (via PW users and group access) and post something. I think there will be about one million posts per year (200 users, each posting 20 times per day). Every post is of course a "page".

Has anyone had any trouble with the database in such cases? Is there anything (database or ProcessWire) to keep a closer eye on?

Best regards
Henning

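As a rough check of the estimate in the question above, the arithmetic can be sketched as follows. The 200 users and 20 posts per day come from the post; the 250 working days per year is an assumption added here to show how the figure of one million could arise, and the 365-day line is the upper bound if users post every day.

```php
<?php
// Figures from the original post.
$users = 200;
$postsPerUserPerDay = 20;

// Assumption (not from the post): roughly 250 working days in a year.
$workingDaysPerYear = 250;
$calendarDaysPerYear = 365;

$postsPerDay = $users * $postsPerUserPerDay;                   // 4,000
$postsPerYearWorkdays = $postsPerDay * $workingDaysPerYear;    // 1,000,000
$postsPerYearEveryDay = $postsPerDay * $calendarDaysPerYear;   // 1,460,000

echo "Posts per day: " . number_format($postsPerDay) . "\n";
echo "Posts per year (working days only): " . number_format($postsPerYearWorkdays) . "\n";
echo "Posts per year (every day): " . number_format($postsPerYearEveryDay) . "\n";
```

Either way the site lands in the range of one to one and a half million new pages per year, which is the scale the replies below discuss.
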
diogo Posted March 27, 2015

Have a look here: https://processwire.com/talk/topic/2503-question-about-extreme-scale-hundred-of-thousand-maybe-millions-of-pages/

LostKobrakai Posted March 27, 2015

There are people running that many pages (https://processwire.com/talk/topic/9336-need-help-deleting-an-empty-field-from-a-template-with-2-million-pages/?hl=%2Bmillion+%2Bfield), and the only thing that does not automatically scale to that many pages is files/images. I can't recall where I read about it, but basically it's a filesystem limitation: a folder can only hold a certain number of subfolders/items. ProcessWire needs to be set to use multiple site/assets/files/ folders to prevent an "overflow" of the standard single folder.

Soma Posted March 27, 2015 (edited)

@lostkobrakai There's a setting in config.php for that problem:

$config->pagefileExtendedPaths = true;

Apart from that there's no limit in PW, though going for millions of pages may require a special and different strategy depending on what the site is built for. With millions of pages, the time for a fulltext search on large texts can grow linearly and quickly reach several seconds. You also have to take care with the code you write; it's easy to run into a timeout or memory limit if you aren't careful. It also depends on whether there's a lot going on, like lots of users posting something. PW handles all of that well, but it also depends on the server.

Edited March 27, 2015 by Adam Kiss: added code styling

nickie Posted March 27, 2015

I am currently managing a PW site with 2 million+ pages. It's admirably fast, and much, much faster than any other CMS we tested. Searching is also ridiculously fast when done on single fields like title. (I also just did a test search using the page finder and it took < 4 seconds to find pages which had a particular field empty from a template which has 1.63 million pages.)

The site doesn't deal with many image or file uploads (yet), but the two optimizations I have applied so far are to 1) always, always use limits when calling $pages->find(), and 2) cache the sitemaps (which contain thousands of links each) using ProCache (https://processwire.com/api/modules/procache/).

Once you know where specifically your site is using the most resources, you can apply more selective caching / database optimizations.

Thanks for starting this topic, I learned about pagefileExtendedPaths... there's always some cool feature I didn't know about and now must have!

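The first optimization mentioned above, always putting a limit on $pages->find(), could look roughly like the sketch below when a script has to touch a very large number of pages. This is an illustration under assumptions, not code from the site being discussed: eachPageInBatches() is a hypothetical helper, the template name "post" and the batch size of 500 are made up, and the call to $pages->uncacheAll() between batches is an addition here to keep memory flat. The start, limit and sort selector keywords and uncacheAll() are part of the ProcessWire API.

```php
<?php
/**
 * Hypothetical helper: run $callback on every page matching $selector,
 * loading at most $batchSize pages per query. Returns the number processed.
 *
 * $pages is ProcessWire's pages API variable, as available in template files.
 */
function eachPageInBatches($pages, string $selector, int $batchSize, callable $callback): int
{
    $start = 0;
    $total = 0;
    do {
        // start= and limit= keep every query bounded, the same way pagination does.
        $items = $pages->find("$selector, start=$start, limit=$batchSize");
        $found = count($items);
        foreach ($items as $item) {
            $callback($item);
        }
        $total += $found;
        $start += $batchSize;
        // Drop the pages loaded for this batch from the in-memory cache.
        $pages->uncacheAll();
    } while ($found === $batchSize); // a short (or empty) batch means we are done
    return $total;
}

// Example call from a template file (selector values are assumptions):
// eachPageInBatches($pages, "template=post, sort=id", 500, function ($post) {
//     echo $post->title . "\n";
// });
```

A fixed sort such as sort=id keeps the batches stable from one query to the next. For front-end listings the same idea is usually just a selector like "template=post, limit=25" combined with pagination.
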
hheyne (Author) Posted March 28, 2015

Thank you very much for all the answers and hints. I will proceed with the project and report back later (in a few months) about the result and what I have learned from it.

Pete Posted March 31, 2015

nickie said: "Searching is also ridiculously fast when done on single fields like title. (I also just did a test search using the page finder and it took < 4 seconds to find pages which had a particular field empty from a template which has 1.63 million pages.)"

Have you tried using the cache fieldtype to combine multiple fields' data into one for things such as searching? It's in the core, and Teppo explains it well here: https://processwire.com/talk/topic/5513-fieldtype-cache-please-elaborate/

enricob Posted April 1, 2015

Very impressive, @nickie. Could you please share which site it is that has more than 2 million pages?

nickie Posted April 2, 2015

Thanks for the suggestion and the link, Pete. So far our front-end search needs have been limited to the title field, but this is going to change soon as lots of social features are being implemented, so I will definitely need to learn all about the cache fieldtype.

enricob, the site is currently in open beta and lacking some functionality and server tuning we want to have in place before sharing (especially as an example of what PW can do). A lot of new features are still being implemented, and hopefully the site will be well tested and ready to showcase within about a month. I am eagerly looking forward to posting a case study about it in the forums then, and will drop you a note as a heads-up.

hheyne (Author) Posted April 2, 2015

@nickie I'm also looking forward to seeing the site. Toward the end of the year I will share the results of my project; hopefully it will be in a nearly finished state by then.

dotnetic Posted May 10, 2017

@nickie Any update or showcase for the site?