Jump to content
Pete Jones

Improve performance of big query

Recommended Posts

The default pagination methods should work with the resulting pagearray.

  • Like 2

Share this post


Link to post
Share on other sites

Hi, sorry to bring this up again. After looking further down the pages, we can see some improper ordering. We are getting pages that should be found with this selector

"template=horse-note, sort=-h_notes_last_comment"

appearing before pages from this selector

"template=horse-note, $session->unread_by>0, sort=-h_notes_last_comment"

 

Share this post


Link to post
Share on other sites

If you manipulate this method (the only one) in PagesPaginatorNoDuplicates.php you could log or otherwise store found pages/selectors for each separate part. The resulting list is just a pagearray, where new pages are appended one after another.

protected function getItems($selector, $key, $start, $limit, $storage)
{
	$selector = $selector . ", start=$start, limit=$limit, id!=$storage";
	wire('log')->save('paginator', $selector);
	$pa = wire('pages')->find($selector);
	wire('log')->save('paginator', (string) $pa);
	return $pa;
}

The class(es) itself do not order things. It could be that $start / $limit values are off, but these are tested, so they should work. Could also be that some selector of yours is not working as you expect it. 

  • Like 1

Share this post


Link to post
Share on other sites

Hi LK, we appear to have an issue with where the results are starting from on a particular page. For example, we're getting items on page 2 that have already been shown on page 1. Could this be something to do with the 'no duplicates' functionality that is somehow affecting start and end points?

Share this post


Link to post
Share on other sites

Ah yeah, sure. The items on page1 never make it into the $storage variable for page2, which I didn't really realize. So you'd probably need to make your selectors a bit more concrete, so they don't overlap. Otherwise we're back to loading at least all id's of all items.

Share this post


Link to post
Share on other sites

Would it be best if we try to make sure there are no duplicates in the selectors inside our $paginator function, and then we can use the regular PagesPaginator?

Share this post


Link to post
Share on other sites

Yeah, that's probably the only way for this to work without needing to exactly know which items where in pages before the current. The NoDuplicates class is more a failed attempt :D

  • Like 1

Share this post


Link to post
Share on other sites

Is the best way to do this using a id!=$alreadyselectedelements? How would that work with a selector array?

Share this post


Link to post
Share on other sites

Actually, might be able to do it by being more specific in our current selectors. Will give that a go first.

Share this post


Link to post
Share on other sites

There is no $alreadySelectedElements, because we're not loading any items before those we really need. That's the crux in pagination. It should at best work without the runtime even knowing which items came before the current page, because otherwise there will always be scalability issues. 

  • Like 1

Share this post


Link to post
Share on other sites

Just stripping this back to basics so that I can prevent duplicates in my selectors and have noticed that the order is being controlled by the sort that I have in the selectors, rather than the order that the selectors are put into the $paginator function. That probably makes no sense, so I'll try to illustrate.

My paginator function looks like this:

$notes = $paginator(array(
	"template=horse-note, h_notes_comments.$session->unread_by>0, sort=-h_notes_last_comment",
	"template=horse-note, $session->unread_by>0, sort=-h_notes_last_comment"
), $pageNum, $limit);

h_notes_comments is a repeater with a few text fields and some checkboxes.

So what I should get is:

  • pages with 1 or more of the h_notes_comments repeaters containing a checked checkbox, sorted by the h_notes_last_comment datetime field
  • other pages without a checked checkbox inside the h_notes_comments repeater (or no repeater items at all), sorted by the h_notes_last_comment datetime field

But what's happening is the h_notes_last_comment field is taking precedence, so pages found with the second selector are appearing before pages found with the first, if the h_notes_last_comment happens to be more recent. Is that the expected behaviour or are we doing something wrong in our selectors?

As an aside, am i right in thinking that when we're doing

h_notes_comments.$session->unread_by>0

that will match any page that has at least one of the h_notes_last_comments repeaters with that checkbox checked? So even if there are 8 repeater items on a page, and 7 of them don't have the checkbox checked, it will still be returned?

Share this post


Link to post
Share on other sites

You should really test your selectors first before packing it into the paginator and trying to make sense of it. But I finally managed to setup a small test case and found a few issues, not releated to sorting, but to counting the total number of pages. It seems like there's no possible solution to wrap multiple selectors, where at least one does use or-groups, to one combined to run a $pages->count() on. 

  • Like 1

Share this post


Link to post
Share on other sites

Yes, I have noticed that renderPager() is giving more pages than needed. Before I look at that, there is a strange reordering happening. Take this pair of selectors:

$notes = $paginator(array(
		"template=horse-note, parent.h_type=2, h_notes_comments.count>0, h_notes_comments.$session->unread_by>0, sort=-h_notes_last_comment",
		"template=horse-note, parent.h_type=2, $session->unread_by>0, sort=-h_notes_last_comment"
	), $pageNum, $limit);

One of the pages from the second selector is appearing first in the list. It's definitely coming from that selector, because if I remove that from the array, that page is no longer returned. All other pages seem to be in the correct order.

I've also noticed that on page 1, where I am echoing out count($notes), this is returning 96 even though $limit is 100. It's fine on all subsequent pages though.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By LuisM
      Hi there,
      while developing a sideproject which is completly build with ProcessModules i suddenly had the urge to measure the performance of some modules 😉 as a result, say welcome to the FlowtiAppPerformance module. 
      It comes bundled with a small helper module called FlowtiModuleProfiler. In the first release, even though you could select other modules, it will track the execution of selected Site/ProcessModules. 
      This will give you the ability to gain insights how your Application behaves. 
      The Main Module itself will come with 2 Logging Options, Database or PW Logs. Select Database for Charts and Logs...well If you just want your profiles as a simple log file in PW. 
      You also could choose to dump the request profile into TracyDebugger as shown here:

      Dont wonder about my avg_sysload, somehow my laptop cant handle multiple VMs that good 😄
      Settings Screen

      Monitoring

      FlowtiLogs

      again, dont look at the sysload 😄
      I will update the Module in the future to give some filter options and aggregation, but for now it satisfies my needs. 
      I hope it is helpfull for some. 
      Module is submited to the directory and hosted at github
      https://github.com/Luis85/FlowtiAppPerformance
      Any suggestions, wishes etc. are much appreciated. 
       
      Cheers,
      Luis
    • By Anders
      I want to allow full text search on my site. There is a very nice solution that comes right out of the box:
      $selector = "title|body~=$q, limit=50"; This works, but to make it even better I would want to give higher weight to pages where the search term occurs in the title, than if it just occurs in the body. After all, a page with the title "Wine from France" is probably the best match for the search "france wine". How do I accomplish this in ProcessWire?
      I can see three possible paths, but I am not very fond of any of them:
      Do a direct SQL query, circumventing the API, along these lines. But I would prefer to abstract away the database layout if at all possible. Use something like ElasticSearch, but to be honest that would be to complicated to set up and maintain in the long run. Make multiple lookups, first for matches in the title, then for matches in the body, and merge and sort in PHP. My suspicion is that this would get complicated quite quickly. For instance, how do you deal with a page that has two of the three search terms in the title and the third in the body? Is there a magic option four I should look into? Or are any of the above options better than the others? Any input is welcome!
    • By MoritzLost
      This is a new module that provides a simple solution to clearing all your cache layers at once, and an extensible interface to perform various cache-related actions.
      The simple motivation behind this module was that I was tired of manually clearing caches in several places after deploying a change on a live site. The basic purpose of this module is a simple Clear all caches link in the Setup menu which clears out all caches, no matter where they hide. You can customize what exactly the module does through it's configuration menu:
      Expire or delete all cache entries in the database, or selectively clear caches by namespace ($cache API) Clear the the template render cache. Clear out specific folders inside your site's cache directory (/site/assets/cache) Refresh version strings for static assets to bust client-side browser caches (this requires some setup, see the full documentation for details). This is the basic function of the module. However, you can also add different cache management action through the API and execute them through the module's interface. For this advanced usage, the module provides:
      An interface to see all available cache actions and execute them. A system log and logging output on the module page to see verify what the module is doing. A CacheControlTools class with utility functions to clear out different caches. An API to add cache actions, execute them programmatically and even modify the default action. Permission management, allowing you granular control over which user roles can execute which actions. The complete documentation can be found in the module's README.
      Beta release
      Note that I consider this a Beta release. Since the module is relatively aggressive in deleting some caches, I would advise you to install in on a test environment before using it on a live site.
      Let me know if you're getting any errors, have trouble using the module or if you have suggestions for improvement!
      In particular, can someone let me know if this module causes any problems with the ProCache module? I don't own or use it, so I can't check. As far as I can tell, ProCache uses a folder inside the cache directory to cache static pages, so my module should be able to clear the ProCache site cache as well, I'd appreciate it if someone can test that for me.
      Future plans
      If there is some interest in this, I plan to expand this to a more general cache management solution. I particular, I would like to add additional cache actions. Some ideas that came to mind:
      Warming up the template render cache for publicly accessible pages. Removing all active user sessions. Let me know if you have more suggestions!
      Links
      https://github.com/MoritzLost/ProcessCacheControl ProcessCacheControl in the Module directory

    • By verdeandrea
      Hello,
      I am using ProCache v3.1.8 on ProcessWire 3.0.96.
      Everything worked fine in the past, but today I noticed that the css file serverd by procache gives a 410 error. 
      The file is there, I checked.
      I deleted the cached files, I deleted the css file, I looked into the .htaccess file looking for some clues about this problem but nothing worked.
      The only way i can see my website correctly again is disabling ProCache. 
      Has anyone any clue on what could be the cause of the problem or on what should I do to fix it?
      Thanks!
    • By abdulqayyum
      Hy Processwire community,
      There are some problem in fileCompiler cache.
      when i change under the directory \site\templates\ it must change under the directory /site/assets/cache/FileCompiler/site/templates/
      but it does not update and functionality working with /site/assets/cache/FileCompiler/site/templates/ directory.
       
      In this case please suggest me how i clear fileCompiler cache?
      what i have to clear it manually?
      Thanks AbdulQayyum.
×
×
  • Create New...