Search the Community
Showing results for 'findMany'.
-
Hard to know exactly what to suggest as many different scenarios could affect responsiveness. (Re-?)Rendering images could easily slow things down, as could page reference properties that aren't loaded until called/necessary. Depending on your setup, using findRaw() or findMany() might have a positive impact, as would your autojoin; if you think an autojoin might only be necessary for your one scenario, you could look into findJoin() and whether it accomplishes the same end-result on that page. For a ProcessWire (Pro) module assist, there's always ProfilerPro, as well.
-
Hello @dotnetic First of all, thanks for the detailed report and your suggestions. I am very busy at the moment, so please be patient; I will check everything you have reported. Timeout issue: This could be a possible solution. As far as I can remember, Ryan has also implemented queries for very large numbers of pages (like findMany) to prevent timeouts, but I have to dig into it a little more. Placeholder issue: At first glance everything seems OK, but I will test it on my local installation. A while ago I helped another user with a form using placeholders and it worked as expected. (Take a look here). So for the moment I have not found a reason why it should not work, but I have to dive in a little deeper. File size issue: You can always disable the php.ini file-size validator by setting $field->removeRule('phpIniFilesize') This removes the php.ini file-size validation, but it does not fix the wrong conversion of 10M to 100kb. I guess a simple calculation error is responsible for this mistake. I will check the function that does the conversion and I expect I can solve this problem very soon. Just to mention: things are very busy at the moment, so I do not have the time to react quickly and offer a solution promptly, but I will try to solve all these issues by the end of this week and post an update here. Best regards and thanks for reporting these issues! Jürgen
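For reference, the conversion discussed under "File size issue" can be sketched in plain PHP. This is only an illustration of how php.ini shorthand values such as 10M are usually turned into bytes (K, M and G are powers of 1024); the function name iniSizeToBytes() is hypothetical and this is not the module's actual code.

```php
<?php
/**
 * Convert a php.ini shorthand size such as "10M" into bytes.
 * Hypothetical helper for illustration; not the module's actual function.
 */
function iniSizeToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '') {
        return 0;
    }
    // The optional last character is the unit: K, M or G (case-insensitive).
    $unit = strtoupper(substr($value, -1));
    // The (int) cast reads the leading digits, so "10M" becomes 10.
    $number = (int) $value;
    switch ($unit) {
        case 'G':
            return $number * 1024 * 1024 * 1024;
        case 'M':
            return $number * 1024 * 1024;
        case 'K':
            return $number * 1024;
        default:
            return $number; // no unit suffix: the value is already in bytes
    }
}

echo iniSizeToBytes('10M'), "\n"; // 10485760
```

With this sketch, a value read via ini_get('upload_max_filesize') could be passed in after casting it to a string, and the result compared with the uploaded file size in bytes.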
-
I have read this 10 or more times now and I am still not sure my understanding is correct, so the following is a bit of a deep dive: You have pages in your site that are somewhat special beneath other pages, or at least need to be treated as special. I imagine something like this:
Spain
...
France
  Eiffel Tower (this is the special one)
Germany
...
And now you need to know whether the referring page (France) is referring to Eiffel Tower and/or whether the referring page (France) was accessed, and you need to know this in the referred page. See below...
Case 1: When you need to know whether any page uses a reference (PageReference field) to point to a specific page (Paris -> Eiffel Tower), there is a selector for that.
<?php namespace ProcessWire;
// find pages that reference a POI via the POI field
// field: this_is_a_poi = page reference field to select POI pages
// used in the template of the target page (Eiffel Tower in this case)
$who_links_to_me = $pages->findMany("this_is_a_poi=$page, parent=$page->parent");
You could put that in place for your POI pages to see what other pages reference them.
Case 2: Oh... and there is the $page->links() method, with which you can find out which (HTML) fields link to that one specific page.
Case 3: In case you want to track access to referring pages, we first need to know what kind of tracking you use, but at least PageHitCounter could be triggered via JS when a referring page is accessed. Sure, a bit more is needed for this, but I don't know if this is your case, and I won't look for a solution that deep. Sorry.
My understanding for this: poi = POI = Point of Interest. And yes... every snippet is untested. Oh... and btw. ... Welcome to the ProcessWire Forums with your first post.
-
The counter is correct: you still have 124019 pages. Perhaps the script is timing out or running out of memory. Try this variation of @AndZyk's script:
$authlogs = $pages->findMany('template=authlogs, limit=100');
foreach ($authlogs as $authlogPage) {
    $pages->delete($authlogPage);
}
Note: findMany() reduces memory usage; 'limit' reduces the number of pages selected. (And 'authologs' is changed to 'authlogs' in two places.) Does the count now drop by 100?
-
ProcessWire 3.0.198 contains a mixture of new features and issue resolutions. Below are details on a few of the new features: Support was added for runtime page cache groups. This enables pages to be cached as a group—and just as importantly—uncached as a group. While it can be used by anybody, it was added primarily to add efficiency to $pages->findMany(), so that it can cache supporting pages (like parents and page references). Previously, it would have to load a fresh copy of each supporting page used by findMany() results (for every returned page) since findMany() used no in-memory caching. If it did cache in memory, then you could potentially run out of memory on large result sets, so that strategy was avoided. Consider the case of iterating over all pages returned by findMany() and outputting their URLs... that triggers load of all parent pages for each page in the result set. And without a memory cache, it meant it would have to do it for each page in the results. Following this week's updates, now it can cache these supporting pages for each chunk of 250 pages, offering potentially significant performance improvement in many cases, without creating excess memory usage or memory leaks. When the chunk of 250 result pages is released from memory (to make room for the next chunk), all the supporting pages (parents, page references, etc.) are also released from memory, but not before. Now findMany() can offer both memory efficiency and great performance. For as long as I can remember, ProcessWire has had an apparently undocumented feature for dependent select fields that enables you to have two page reference fields using selects (single, multiple or AsmSelect) where the selected value in one select changes the selectable options in the other select (Ajax powered). Think of "categories" and "subcategories" selects, for example, where your selection for "categories" changes what options are selectable in "subcategories". 
Part of the reason it's undocumented is that it is one of those features that is a little bit fragile, and didn't work in some instances, such as within Repeater items. That changed this week, as the feature has been updated so that it can also work in Repeater items too. The $pages->findRaw() method was updated with a new "nulls" option (per Adrian's request in #1553). If you enable this new "nulls" option, it will add placeholders in the return value with null values for any fields you requested that were not present on each matching page. This reflects how PW's database works, in that if a field has no value, PW removes it from the database entirely. Without the nulls option (the existing behavior), it retains the more compact return values, which omit non-present values completely. For example, consider the following:
$items = $pages->findRaw("template=user, fields=email|first_name|last_name");
If a row in the result set had no "first_name" or "last_name" populated, then it would not appear in that row of the return value at all...
[ "email": "ryan@processwire.com" ]
By specifying the "nulls" option, it will still include placeholders for field values not present in the database, and these will have a value of null:
$items = $pages->findRaw("template=user, nulls=1, fields=email|first_name|last_name");
[ "email": "ryan@processwire.com", "first_name": null, "last_name": null ]
By the way, if you don't specify which fields you want to get (which is the same as saying "get all") then adding the nulls option makes it provide placeholders for all fields used by the page's template. As you might expect, without the nulls option, it includes only populated fields. Also included in 3.0.198 are 7 issue fixes, most of which are visible in the dev branch commits log. That's all for this week. Thanks for reading this update and have a great weekend!
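To make the difference between the compact rows and the "nulls" rows concrete, here is a small sketch in plain PHP that works on ordinary arrays rather than on ProcessWire itself. The helper name withNullPlaceholders() is hypothetical; it only mimics the effect described above and says nothing about how findRaw() is implemented internally.

```php
<?php
/**
 * Add null placeholders for requested fields that are missing from a row.
 * Hypothetical helper; it mimics the described effect of "nulls=1" on plain arrays.
 */
function withNullPlaceholders(array $row, array $fields): array
{
    // array_fill_keys() builds ['email' => null, 'first_name' => null, ...];
    // array_merge() then overwrites those nulls with the values present in $row.
    return array_merge(array_fill_keys($fields, null), $row);
}

$fields = ['email', 'first_name', 'last_name'];
// A row as it would look without the nulls option: empty fields are simply absent.
$compact = ['email' => 'ryan@processwire.com'];
$full = withNullPlaceholders($compact, $fields);
var_export($full);
```

The practical benefit of the placeholder form is that code reading a row can access $row['first_name'] directly without an isset() check for every field.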
-
OK, guys, it seems there was one more piece to the puzzle. Like @adrian, when I recreated the situation in a simple test environment running 3.0.192, I couldn’t recreate the problem. I should have tested this yesterday, but since I’d discovered that changing findMany() to find() in my code worked around the problem, I figured I’d found it. So, the following conditions are necessary but insufficient:
1. PW 3.0.192 or greater (including the current version)
2. page classes
3. findMany() instead of find()
I believe I’ve found the final ingredient. I have tested this in a simplified environment and recreated the problem:
4. getPage()
In real life, I’ve rolled my own pagination (since I’d wanted to provide users with logarithmic-like controls for moving backwards and forwards through the site, and provide them with sign-post dates); this paginating has worked well. But it means that rather than a foreach loop I cycle through results in a for loop, and then assign the item with a getPage(). So, Adrian’s code looked like:
foreach($pages->findMany('template=basic-page') as $p) {
    d($p->id . ' : ' . $p->returnParentId());
}
but mine looks more like:
$items = $pages->findMany('template=Test');
for ($idx = 0; $idx < 1; $idx++):
    $p = $items->getPage($idx);
    echo ($p->identify());
endfor;
(Yes, that’s my test system, where I’ve hard-coded it to “loop” through just one test record; it’s sufficient to show the problem, though.)
ProcessWire\TestPage #279
id: 1025
name: 'test-record-a'
parent: ''
template: 'Test'
title: 'Test record A'
data: array (1)
'title' => 'Test record A'
parent there should be '1'. I see that I can do the same thing by removing getPage() altogether:
$p = $items($idx);
In the grand scheme of things, I don’t know which is better; however, this assignment, whether implicit or via getPage(), seems to be the final piece: add this, and $this->parent() becomes a NullPage. So, to sum up: all of this works fine in 3.0.150-or-so (where I developed it) on up through 3.0.191. 
As of .192, it breaks. Switching from findMany() to find() works around the problem. Sticking with findMany() and using the customary foreach() code pattern also works around the problem. Phew! @kongondo, I agree that the commit you link to looks relevant. Do you think the (implicit or explicit) getPage() may call any of that code?
-
Hello, I would use PW pages with a "practitioner" template. Data export won't be more complicated; you can use the PW API for that. If it needs to be optimized, you have several options such as $pages->findMany(), findJoin(), findRaw(), or even a raw MySQL query. That way your practitioners are already editable from the admin without extra effort. Also check custom page classes if you need to attach your own class to these practitioner pages. It depends on your current knowledge of PW. PW has good documentation; you can start by looking at Pages and Page. The first is useful for finding your practitioner pages, the second for accessing a single page or creating a new one.
-
Thank you, @adrian, for that pointer. I’d had no idea that JS hadn’t been running. I’ll look into your example soon! But the big news is that I’ve solved the problem! @ryan, @bernhard, & @kongondo — thanks to each of you, too, for your encouragement and good ideas! Essentially, there is a bug, but it only presents itself when using findMany() instead of find(). As of 3.0.192, if you assemble a batch of pages with findMany() , and then call a function defined in a page-class definition on one of those pages, the $this variable in that function will have an empty parent property. By contrast, in 3.0.191, the $this variable in that function will have the proper value in its parent property. And also by contrast, in both 3.0.191 and in 3.0.192, if you use find() instead of findMany(), the function will have the proper value in its parent property. So, the bug shows up only in 3.0.192 or later and you’re using findMany() and you’re trying to find any parent-based information from the perspective of $this within a function of a page-class definition. For my own use today, I’m perfectly OK using find() instead of findMany(), though there may come a time when I’m dealing with too many records for that to work gracefully. So I can upgrade beyond 3.0.184! Thanks again to everyone for your help; I’ve learned a lot through this process. If an admin would like to break all this out (it has nothing to do with Ryan’s original post — I meant to reply initially to his announcement of 3.0.192!), I’d be grateful
-
Hello guys, I am building a sort of archive. It is relatively simple, although I have about 8000 records, each with 15 fields (text, int, images, url). I created a crude search system with a form (emulating the famous Skyscrapers example) to filter through the system. Everything works but it is quite slow... I have 2 related questions:
1. How can I search through the database?
2. What is a good practice for displaying many records like these?
-----------------------------------------
1. I am retrieving the results with $songs = $pages->findMany('template=nk-song'); Then I do a foreach to render them all. I am unsure if that is a good way. If I render all of them on the page, it creates thousands of divs with a bit of text, and this can take a while (10s-15s).
2. This one is even worse :D Every time I retrieve my desired records with something like $page->find("field_to_search_through~=my_query_string") I get between 20 and 200 results, but when I render them I am creating iframes with YouTube videos, and that can take up to 10s to finish. I "solved" it by only loading the iframes when they are in view, using IntersectionObserver on the client side. But I feel there is a more precise PHP / ProcessWire approach. Just to clarify, I started doing all of this custom rendering and querying because tools like ElasticSearch or SearchEngine were a bit complicated and I needed a simple way to retrieve information and then display it in my own way. Thank you!
-
I guess this is largely a matter of personal preference. My personal preference is for a written description that can be accessed other than just while coding. This means that one can get an at-a-glance overview and dip in for more detail, without having downloaded the module and fired up the IDE. Call me sad, but I sometimes read this stuff sat in an armchair, not at the workstation. I gave the example above of the readme for FieldtypeMeasurement. Perhaps the ideal approach is to have all the documentation in PHPDoc. This is pretty much the approach of PW, so that then the help documentation (API ref) can be generated automatically from the code. If that approach is chosen, then a bit more explanation in the PHPDocs would be helpful. For example, the PHPDoc for alfred is
/**
 * ALFRED - A Lovely FRontend EDitor
 * @return string
 */
There is no description of the options and their defaults (although these can be seen by inspecting the code). PW PHPDocs tend to include option descriptions. See $pages->find for a (very full) example
/**
 * Given a Selector string, return the Page objects that match in a PageArray.
 *
 * - This is one of the most commonly used API methods in ProcessWire.
 * - If you only need to find one page, use the `Pages::get()` or `Pages::findOne()` method instead (and note the difference).
 * - If you need to find a huge quantity of pages (like thousands) without limit or pagination, look at the `Pages::findMany()` method.
 *
 * ~~~~~
 * // Find all pages using template "building" with 25 or more floors
 * $skyscrapers = $pages->find("template=building, floors>=25");
 * ~~~~~
 *
 * #pw-group-retrieval
 *
 * @param string|int|array|Selectors $selector Specify selector (standard usage), but can also accept page ID or array of page IDs.
 * @param array|string $options One or more options that can modify certain behaviors. May be associative array or "key=value" selector string.
 *  - `findOne` (bool): Apply optimizations for finding a single page (default=false).
 *  - `findAll` (bool): Find all pages with no exclusions, same as "include=all" option (default=false).
 *  - `findIDs` (bool|int): 1 to get array of page IDs, true to return verbose array, 2 to return verbose array with all cols in 3.0.153+. (default=false).
 *  - `getTotal` (bool): Whether to set returning PageArray's "total" property (default=true, except when findOne=true).
 *  - `loadPages` (bool): Whether to populate the returned PageArray with found pages (default=true).
 *    The only reason why you'd want to change this to false would be if you only needed the count details from
 *    the PageArray: getTotal(), getStart(), getLimit, etc. This is intended as an optimization for $pages->count().
 *    Does not apply if $selector argument is an array.
 *  - `cache` (bool): Allow caching of selectors and loaded pages? (default=true). Also sets loadOptions[cache].
 *  - `allowCustom` (boolean): Allow use of _custom="another selector" in given $selector? For specific uses. (default=false)
 *  - `caller` (string): Optional name of calling function, for debugging purposes, i.e. "pages.count" (default=blank).
 *  - `include` (string): Optional inclusion mode of 'hidden', 'unpublished' or 'all'. (default=none). Typically you would specify this
 *    directly in the selector string, so the option is mainly useful if your first argument is not a string.
 *  - `stopBeforeID` (int): Stop loading pages once page matching this ID is found (default=0).
 *  - `startAfterID` (int): Start loading pages once page matching this ID is found (default=0).
 *  - `lazy` (bool): Specify true to force lazy loading. This is the same as using the Pages::findMany() method (default=false).
 *  - `loadOptions` (array): Optional associative array of options to pass to getById() load options.
 * @return PageArray|array PageArray of that matched the given selector, or array of page IDs (if using findIDs option).
 *
 * Non-visible pages are excluded unless an "include=x" mode is specified in the selector
 * (where "x" is "hidden", "unpublished" or "all"). If "all" is specified, then non-accessible
 * pages (via access control) can also be included.
 * @see Pages::findOne(), Pages::findMany(), Pages::get()
 *
 */
I use PHPStorm, not VSCode. It has a structure view similar to VSCode's outline. However, that just lists the method names etc. - I assume VSCode is similar* - so you need to go to the actual code to get the PHPDoc. In any case, you do need to be at the workstation and to have downloaded the module to see this. As I said, I appreciate that this is a personal thing, so please don't take it as a criticism, but you did ask whether a readme would be just as good, to which my answer is: The video is very useful to give an introduction, but is longer than it would take to view a readme. Ideally there would be both, but the readme would be more complete, but less wordy (as described above).
*PS I downloaded VSCode and I see that the outline does give variables as a drop-down, but not PHPDoc. Perhaps I should investigate it a bit more...
-
Hi, I'm sure this is maybe in the works already, given that findMany() is a recent addition to the API, but having this (and the other new find options) available to $users would be a great addition. Cheers, Chris NB Communication
-
Hi there, I'm using findMany() to find pages and it works really well, but I'm building a social portal and wanted to use it for users too. Unfortunately, it only works under an admin (superuser) account. Users have the role "user". I'm a little confused why it doesn't work under a user account (no problem with pages, only users). Any idea?
-
Another performance test using findMany()
So this looks like findMany() is a lot faster, but this is not true because creating the proper array of data takes longer than with RockFinder:
$selector = 'parent=/data';
$finder = new RockFinder($selector, ['title', 'headline', 'summary']);
t();
$result = $finder->getObjects();
d($rf = t()*1000, 'query time in ms (rockfinder)');
d(count($result), 'items');
d($result[0], 'first item');

t();
$result = $pages->findMany($selector); //$finder->getObjects();
d($fm = t()*1000, 'query time in ms (findmany)');
d($count = count($result), 'items');
d($result[0], 'first item');

t();
$arr = [];
foreach($result as $p) {
    $arr[] = (object)[
        'id' => $p->id,
        'title' => $p->title,
        'headline' => $p->headline,
        'summary' => $p->summary,
    ];
}
d($fm2 = t()*1000, 'create array');
d($arr[0]);
d("$count items: rockfinder = " . round($rf,2) . "ms | findmany = " . round($fm+$fm2,2) . "ms | " . round($rf/($fm+$fm2)*100, 2) . "%");
Some other tests:
$selector = 'parent=/persons'; // 11 items: rockfinder = 3.8ms | findmany = 7ms | 54.29%
$selector = 'parent=/dogs'; // 1000 items: rockfinder = 41ms | findmany = 722.1ms | 5.68%
$selector = 'parent=/cats'; // 5000 items: rockfinder = 221.4ms | findmany = 1660.8ms | 13.33%
$selector = 'parent=/invoices'; // 10002 items: rockfinder = 526.6ms | findmany = 3385.3ms | 15.56%
$selector = 'parent=/data'; // 35000 items: rockfinder = 7161.4ms | findmany = 27722.9ms | 25.83%
$selector = 'parent=/data2'; // 91300 items: rockfinder = 59523.6ms | findmany = 76495.8ms | 77.81%
What is very interesting (and not good) is that the time needed for RockFinder increases disproportionately when querying more than 10,000 pages: 10,000 items take about 500ms, but 3 x 10,000 items take about 7,000ms. Maybe some SQL experts have an idea?
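The percentage column and the "disproportionate" growth can be reproduced from the posted figures with a few lines of plain PHP. This is just arithmetic on the numbers quoted above; the function names relativeTime() and msPerItem() are made up for this sketch.

```php
<?php
/**
 * Percentage of the findMany() time that RockFinder needed.
 * $findManyMs is assumed to already include the array-building time.
 */
function relativeTime(float $rockFinderMs, float $findManyMs): float
{
    return round($rockFinderMs / $findManyMs * 100, 2);
}

/** Average time per item in milliseconds. */
function msPerItem(float $totalMs, int $items): float
{
    return round($totalMs / $items, 4);
}

// Figures copied from the test results above.
echo relativeTime(41.0, 722.1), "%\n";
echo msPerItem(526.6, 10002), " ms per item at 10002 items\n";
echo msPerItem(7161.4, 35000), " ms per item at 35000 items\n";
```

Looking at the per-item cost makes the problem easier to see: with linear scaling it would stay roughly constant, whereas here it rises to nearly four times the 10002-item value at 35000 items.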
-
If you don't need the entire page arrays of $shipments, then querying only the counts will always be way faster. Also, try to use findMany() instead of find(). I just ran three queries on a dataset of ~1100 pages in Tracy, and got an average of ~34ms:
<?php
$sel_2 = $pages->findMany("parent=1041, template=project, created>$month, include=all")->count;
$sel_3 = $pages->findMany("parent=1041, template=project, created<$last2Years, include=all")->count;
$sel_4 = $pages->findMany("parent=1041, template=project, images.count>0, include=all")->count;
d($sel_2);
d($sel_3);
d($sel_4);
Using just find(), it was about 10x slower. Or you can even replace find() with findIDs:
<?php
$sel_2 = $pages->findIDs("parent=1041, template=project, created>$month, include=all");
$sel_3 = $pages->findIDs("parent=1041, template=project, created<$last2Years, include=all");
$sel_4 = $pages->findIDs("parent=1041, template=project, images.count>0, include=all");
d(count($sel_2));
d(count($sel_3));
d(count($sel_4));
This takes it down to ~20ms, i.e. approx. 33% faster than findMany()->count
Of course, if you have a really huge amount of pages, then a real direct DB-query (avoiding PW's API) will always be the fastest option. Or try @bernhard's RockFinder module.
-
Sorry guys for all those posts... Found the performance-killer: It is the ORDER BY field(`pages`.`id`, 52066,52067,52068,52069,52070 ... ) part. Without retaining the sort order of the pages->findIDs it is a LOT faster (4s without sort compared to 60s with sort and 75 using findMany): 91300 items: rockfinder = 4385.5ms | findmany = 74213.9ms | 5.91% I'll add this as an additional option and switch sort order OFF by default since sorting will be done by RockGrid anyhow No problem at all. I need this stuff for my own work, so any help is welcome but of course not expected PS: again the tests without sort order 11 items: rockfinder = 4ms | findmany = 6.9ms | 57.97% 1000 items: rockfinder = 35.7ms | findmany = 744.2ms | 4.8% 5000 items: rockfinder = 165ms | findmany = 1675.4ms | 9.85% 10002 items: rockfinder = 327ms | findmany = 3359.5ms | 9.73% 35000 items: rockfinder = 1745.2ms | findmany = 28547.7ms | 6.11% 91300 items: rockfinder = 4385.5ms | findmany = 74213.9ms | 5.91% Now that looks a lot better, doesn't it?
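If the original order of the IDs is still needed after dropping the ORDER BY field(...) clause, one option is to restore it in PHP after the query. The following is a minimal sketch on plain arrays, assuming each result row is an array with an 'id' key; sortRowsByIdOrder() is a hypothetical helper and not part of RockFinder. Whether this is actually faster than sorting in SQL for a given data set would have to be measured.

```php
<?php
/**
 * Reorder rows so they follow the order of $ids.
 * Hypothetical helper: $rows stands in for an unsorted SQL result,
 * each row being an array with an 'id' key. Rows whose ID is not
 * listed in $ids are dropped.
 */
function sortRowsByIdOrder(array $rows, array $ids): array
{
    // Index the rows by their ID so each lookup below is a single array access.
    $byId = array_column($rows, null, 'id');
    $sorted = [];
    foreach ($ids as $id) {
        if (isset($byId[$id])) {
            $sorted[] = $byId[$id];
        }
    }
    return $sorted;
}

$ids = [52068, 52066, 52067]; // desired order, e.g. as returned by a sorted ID lookup
$rows = [
    ['id' => 52066, 'title' => 'A'],
    ['id' => 52067, 'title' => 'B'],
    ['id' => 52068, 'title' => 'C'],
];
$sorted = sortRowsByIdOrder($rows, $ids);
echo implode(',', array_column($sorted, 'id')), "\n";
```

The loop touches each ID once, so the work grows linearly with the number of items instead of depending on how the database evaluates a very long field() list.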
-
Well, but that can’t be right. You posted this, with findMany() working just fine yesterday: So it looks like there’s something about getPage() (implicit or explicit), first(), and presumably last(), in combination with findMany(). I think.
-
Ok, so I can confirm but it doesn't have anything to do with page classes - it appears to be all about getPage() together with findMany(). Here is a bare bones example with page classes disabled. Notice the change in parent between find and findMany:
-
@ErikMH - nice work narrowing it down to findMany(). Did you know there was a fix for a major issue with findMany() that was implemented for 3.0.195 - https://github.com/processwire/processwire/commit/1eb156f1aac64d388374c63cd401279f8d528481 Does upgrading to the latest dev fix the issue for you?
-
ProcessWire 3.0.195 contains a few fixes and optimizations relative to the previous version. While there aren't major additions, if you are running a previous version of the dev branch it's a worthwhile upgrade. Version 3.0.194 added the lazy loading fields, templates and fieldgroups, but there were still a few cases where PW would load them all. This version corrects those few cases. So if you considered 3.0.194 to be a nice upgrade, 3.0.195 is an even better version of it. There are also some useful bug fixes in this version. One notable bug fix (found by Adrian) was that the $pages->findMany() method was lagging due to some changes a couple versions back (limited to dev branch). That's been fixed so now findMany() is fast again. Also a notable addition: the included site profile (site-blank) now includes a /site/classes/ directory with an example HomePage.php file/class that serves as both a placeholder (to ensure the directory exists in Git) as well as an example custom class for the “home” template. I thought this was a good idea because many might not even know about the custom page classes feature without that obvious pointer to the feature. More core updates and additions next week. Thanks for reading and have a great weekend!
-
{solved} find a field from parent pages by ID
Gideon So replied to neophron's topic in General Support
Hi @neophron I think the findIDs function only returns the IDs of the found pages but not full wire page objects. https://processwire.com/api/ref/pages/find-i-ds/ You should try using: <?php foreach ($pages->findMany('id=1223|1224|1225, sort=-created')->children as $item) : ?> or <?php foreach ($pages->find('id=1223|1224|1225, sort=-created')->children as $item) : ?> Gideon -
If a find() is too slow, use findMany() with a suitable selector: https://processwire.com/api/ref/pages/find-many/. Or if that's still too slow, or if you find them preferable for some other reason, use findJoin() or findRaw(), as described in the blog post at https://processwire.com/blog/posts/find-faster-and-more-efficiently/. Another option would be to try RockFinder3: https://processwire.com/modules/rock-finder3/.
-
Excellent thought. I haven’t done this before, so hopefully I’ve got the details right: findMany() combined with getPage() gives pages whose parents are (incorrectly) NullPage
-
There are 20 commits between 3.0.191 and 3.0.192. It might be helpful to know which commit "broke" findMany()?
-
That's a real shame - I thought we might have been onto something. I just tested findMany() here, calling that same method I demo'd above and everything looks fine. Does this setup demonstrate the issue when you run it on your site?