
Max number of pages?


muzzer

Hi,

Coming from MODX (Evolution), I found one restriction (which was later solved in the Revo version): a page limit of somewhere between 2000 and 5000 pages, after which performance apparently took a nosedive. I never fully tested this, but it was a restriction I was always aware of, and it did prevent me from building a couple of sites.

I would be interested to know if there is a theoretical "page limit" in PW, or alternatively, has anyone developed a site with 5000+ pages, and how well does it perform? How does it affect caching? Thanks :)


Not sure if there is a limit other than your hardware. It depends what those 5000 pages are doing, but I see no problem at all, even without caching. It's much faster than MODX. If you take care with the queries in your templates, and maybe use partial markup caching for heavy queries, I think you can build a very performant website well into the hundreds of thousands of pages, really limited only by the performance of the server. PW is quite optimized.

Anyway, it has never really been tested how far you could go, but I've heard about sites in the twenty-thousands-plus range and they still seem to be fine. Just recently there was a large-scale performance issue when saving pages, and that has already been optimized.


Highest page count I've seen on a production site so far is 30k+ and that site runs just fine. All things considered I wouldn't be too worried -- PW is surprisingly resilient :)

There's an earlier thread with almost the same question that you really should read; in his reply Ryan explains some of the things you need to consider when building a (very) large site: http://processwire.com/talk/topic/1527-largest-project-performance-multi-server/#entry13779. The performance issue Soma mentioned is discussed further here: http://processwire.com/talk/topic/1508-performance-problem-possibly-a-bug-in-pages-class/ (and yes, the related problem/bottleneck has already been fixed).


I have an app built on top of PW which is currently at 6.5k pages; of those, 3.5k are children of a single page (!). The app feels a little slower, but that is probably because I haven't done any performance review on my side.

Also, generating live statistics over those 3.5k pages takes a few (4-8) seconds, but that's only because the statistics are regenerated on every request, iterating over the pages multiple times.


I'm running 21k pages on one site right now, and there's no difference in speed compared to running a dozen pages. Though of course, I'm not doing anything that loads all 21k pages at once (I don't think that would even be possible). :)


  • 2 years later...

Is there any limitation?
I'm worried that the page ID field is (might be?) an INTEGER and can't be used with >65k pages... or that there are other limits in the core.

(Currently I'm running a system with 25k pages, but there's a chance it will break the 100k mark, so I want to be prepared.)
 


What 100k limit? Somebody here had 500k pages :D.

  1. The page_id field is int(10) - the 10 is only a display width (how many digits the db column displays), not a storage limit
  2. It is an unsigned integer field so...from the MySQL docs...

All integer types can have an optional (nonstandard) attribute UNSIGNED. Unsigned type can be used to permit only nonnegative numbers in a column or when you need a larger upper numeric range for the column. For example, if an INT column is UNSIGNED, the size of the column's range is the same but its endpoints shift from -2147483648 and 2147483647 up to 0 and 4294967295.

http://dev.mysql.com/doc/refman/5.1/en/numeric-type-attributes.html

http://stackoverflow.com/questions/8892341/what-does-int5-in-mysql-mean

http://dev.mysql.com/doc/refman/5.0/en/numeric-type-overview.html

You will be fine :D
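In concrete terms, the unsigned 32-bit range works out as follows (a quick sanity check of the numbers above, not PW-specific code):

```python
# MySQL's unsigned INT stores 32 bits, all of them non-negative.
bits = 32
max_unsigned = 2**bits - 1        # 4294967295
max_signed = 2**(bits - 1) - 1    # 2147483647

print(max_unsigned)  # 4294967295

# Even at 100k pages you are using a tiny fraction of the ID space:
print(100_000 / max_unsigned)  # roughly 0.0000233
```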


FWIW we have an extranet app running on PW, the top 4 parent pages having 2657, 8201, 2850 & 1750 children each. (There is also a non-PW db table that has about 1.6 million rows but that doesn't count.)

There are a couple of templates that take longer to load than I would be satisfied with on a public site, but that is because of the amount of processing going on when they load. The admin interface is no different in performance terms from a site with 10 pages total.

There is no caching, as the data are always changing and all users are always logged in.


I have a site with over 20k pages. I haven't run any benchmarks on it, but it's still running like a champ. I'll have around 10k more by the end of the month and I don't see any reason the site would be much slower. I will, on the other hand, be adding more resources to the VPS so that page fetches are a bit snappier. I suppose you'd just want a bit more horsepower for the MySQL back end if your uncached pages are a bit slow.


The bottleneck could be the filesystem once you reach more than 30,000 pages with assets. But ProcessWire has you covered with the pagefileExtendedPaths setting. If you expect your site to grow beyond 30,000 pages with images and/or files, enable this setting in /site/config.php.

Ryan states that this feature is beta, but I can confirm it works without any quirks.

/**
 * Use extended file mapping? Enable this if you expect to have >30000 pages in your site.
 * 
 * Warning: The extended file mapping feature is not yet widely tested, so consider it beta.
 *
 * Set to true in /site/config.php if you want files to live in an extended path mapping system
 * that limits the number of directories per path to under 2000.
 *
 * Use this on large sites living on file systems with hard limits on quantity of directories
 * allowed per directory level. For example, ext2 and its 30000 directory limit.
 *
 * Please note that for existing sites, this applies only for new pages created from this
 * point forward.
 * 
 * @var bool
 *
 */
$config->pagefileExtendedPaths = false;
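To illustrate the general idea behind extended path mapping (a simplified sketch of the technique only; ProcessWire's actual mapping may differ): splitting a numeric page ID into fixed-size chunks keeps every directory level well under the filesystem's limit on subdirectories.

```python
def extended_path(page_id: int, chunk: int = 3) -> str:
    """Split a numeric page ID into fixed-size chunks so that no
    directory level ever holds more than 10**chunk (here 1000)
    subdirectories. Illustration only, not PW's implementation."""
    s = str(page_id)
    # Zero-pad so the ID splits into whole chunks.
    width = ((len(s) + chunk - 1) // chunk) * chunk
    s = s.zfill(width)
    parts = [s[i:i + chunk] for i in range(0, len(s), chunk)]
    return "/".join(parts) + "/"

print(extended_path(1234567))  # 001/234/567/
print(extended_path(42))       # 042/
```

With three-digit chunks, even millions of pages never put more than 1000 entries in any one directory, which is why this scheme sidesteps limits like ext2's.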

Man, that is awesome, I am going to enable that setting. I foresee my site getting to around 60-70k items, as they will most likely be adding inventory in much larger batches now that all they have to do is give me a spreadsheet with the item data and the images.


(Quoting the previous post about the pagefileExtendedPaths setting.)

Ok, I am not seeing this inside my site/config.php. Is this something that doesn't exist in the latest versions, or is it something I should add?

Edit: I found it in my wire/config.php and copied it into my site/config.php. Thanks a ton, now to see if it does anything where I am now :).


