Setup #1
- one container page, say /countries/
- hundreds of subpages with children, /countries/<country>[/<state>]/<city>/<data> for example
- create a new page, parent being /countries/

Setup #2
- a large page hierarchy with not that many pages directly under one parent, but hundreds of pages with template(s) that each have the same repeater field
- structure becomes flat again, as repeaters internally use a page hierarchy of this kind: /processwire/repeaters/for-field-123/for-page-456/1234567890-1234-1 (--> .../for-page-nnn/... are all siblings even when the actual pages are not)
- create a page with a repeater field used all over the site, parent is irrelevant
Now, what I don't quite understand is what the Pages class method ___save() does just before it calls some hooks at the end (from wire/core/Pages.php):
    if(($isNew && $page->parent->id) || ($page->parentPrevious && !$page->parent->numChildren)) {
        $page = $page->parent; // new page or one that's moved, lets focus on it's parent
        $isNew = true; // use isNew even if page was moved, because it's the first page in it's new parent
    }
    if($page->numChildren || $isNew) {
        // check if entries aren't already present perhaps due to outside manipulation or an older version
        $n = 0;
        if(!$isNew) {
            $result = $this->db->query("SELECT COUNT(*) FROM pages_parents WHERE parents_id={$page->id}");
            list($n) = $result->fetch_array();
            $result->free();
        }
        // if entries aren't present, if the parent has changed, or if it's been forced in the API, proceed
        if($n == 0 || $page->parentPrevious || $page->forceSaveParents === true) {
            $this->saveParents($page->id, $page->numChildren + ($isNew ? 1 : 0));
        }
    }
So, for new (and moved) pages it always ends up calling saveParents() for the parent page of the page being saved. saveParents() then works recursively deeper into the tree, calling itself for every child page that has children of its own.
Given setup #1, this means going through every .../<country>/, .../<state>/ and .../<city>/ page there is. With setup #2, on the other hand, every single repeater instance (.../<for-page-nnn>/) with data inside will be handled. It no longer surprises me that this takes a while: saveParents() is deleting rows from pages_parents and inserting new ones for thousands of pages, unnecessarily if I'm right.
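To get a feel for the numbers, here is a toy model of the recursion described above. It is not ProcessWire code: buildTree() and countRewrites() are hypothetical helpers, the tree is a plain PHP array, and each visit stands in for one delete-and-insert pass on pages_parents. It assumes 200 countries with 5 cities each and no states.

```php
<?php
// Toy model only: page id => list of child ids. None of this is ProcessWire code.
function buildTree(int $countries, int $citiesPerCountry): array {
    $tree = [1 => []];                      // id 1 stands in for /countries/
    $nextId = 2;
    for ($c = 0; $c < $countries; $c++) {
        $countryId = $nextId++;
        $tree[1][] = $countryId;
        $tree[$countryId] = [];
        for ($k = 0; $k < $citiesPerCountry; $k++) {
            $cityId = $nextId++;
            $dataId = $nextId++;
            $tree[$countryId][] = $cityId;
            $tree[$cityId] = [$dataId];     // each city has one <data> leaf
            $tree[$dataId] = [];
        }
    }
    return $tree;
}

// Mimics the recursion as described in the post: handle this page, then
// descend into every child that has children of its own.
function countRewrites(array $tree, int $pageId): int {
    $count = 1;
    foreach ($tree[$pageId] as $childId) {
        if (count($tree[$childId]) > 0) {
            $count += countRewrites($tree, $childId);
        }
    }
    return $count;
}

// Saving a new page under /countries/ starts the walk at /countries/ itself.
echo countRewrites(buildTree(200, 5), 1), "\n"; // prints 1201
```

In this model, saving one new page directly under /countries/ touches 1201 pages (the container, 200 countries and 1000 cities), even though none of their ancestors changed.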
It does seem necessary to call saveParents() for the parent page if the page being saved is its first child (pages_parents has a row for each ancestor of the parent page, but not for the page or the parent page itself), but the recursion goes all over the place. I think it should somehow be restricted to the new/moved page and its descendants only. Also, the second argument to saveParents() gets a wrong value when $page refers to the parent page, as numChildren has already been incremented for the parent at that point (though this doesn't actually break anything).
As long as new pages can't have children when they're saved for the first time, saveParents() shouldn't be called at all when saving a new page. If child pages may exist, it of course isn't that straightforward, but the saveParents() recursion should still be restricted as stated above.
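A rough sketch of the restriction I have in mind, again as a toy model rather than a patch. walk() and restrictedWalk() are hypothetical names, the tree is a plain array of page id => child ids, and the assumption is that the parent needs a non-recursive refresh only when the saved page is its first child.

```php
<?php
// Toy model only: page id => list of child ids. Not ProcessWire code.
$tree = [
    1 => [2, 3],    // /countries/
    2 => [4],       // a country with one city
    3 => [5],       // another country with one city
    4 => [6],       // city with a <data> leaf
    5 => [7],
    6 => [],
    7 => [],
    8 => [],        // the page being saved; attached to /countries/ below
];

// Walk as described in the post: this page, then every child with children.
function walk(array $tree, int $pageId): array {
    $visited = [$pageId];
    foreach ($tree[$pageId] as $childId) {
        if (count($tree[$childId]) > 0) {
            $visited = array_merge($visited, walk($tree, $childId));
        }
    }
    return $visited;
}

// Hypothetical restricted variant: refresh the parent alone (no recursion)
// only when the saved page is its first child, then walk just the saved
// page's own subtree if it has one.
function restrictedWalk(array $tree, int $pageId, int $parentId): array {
    $visited = [];
    if (count($tree[$parentId]) === 1) {
        $visited[] = $parentId;
    }
    if (count($tree[$pageId]) > 0) {
        $visited = array_merge($visited, walk($tree, $pageId));
    }
    return $visited;
}

$tree[1][] = 8; // new childless page saved under /countries/

echo count(walk($tree, 1)), "\n";              // prints 5: walk started at the parent
echo count(restrictedWalk($tree, 8, 1)), "\n"; // prints 0: nothing to refresh
```

For a moved page that does have children, the restricted variant would still walk that page's own subtree, and it would refresh the new parent once if the moved page is its only child. Siblings and their subtrees are left alone in both cases.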
I hope my explanation makes sense to you. It's quite possible there is something lying under the hood that I just can't see, even after studying this for a while (well, hours). But it sure seems like a bug to me.