Jump to content
Pete Jones

Improve performance of big query

Recommended Posts

We have a big selector which we have broken down into 3 chunks to return a list of notes (pages) with repeaters as follows. We also allow the user to filter the results. The problem we have is that the page currently takes nearly 10 seconds to process results. Is there anything we can do to improve the performance of this? I wonder if it would be worth bringing the filters into each of the find()s. I assume that caching here wouldn't work due to querystring parameters?

$selector = "template=horse-note";

// Notes with unread comments (date order, most recent first)
$notes_with_unread_comments = $pages->find("{$selector}, h_notes_comments.count>0, h_notes_comments.{$session->unread_by}>0, sort=h_notes_last_comment");
//echo 'Notes with unread comments ('.count($notes_with_unread_comments).'):<br />'.$notes_with_unread_comments.'<br /><br />';

// Unread notes (date order, most recent first)
$notes_unread = $pages->find("{$selector}, {$session->unread_by}>0, sort=h_notes_last_comment");
//echo 'Notes unread ('.count($notes_unread).'):<br />'.$notes_unread.'<br /><br />';

// Read notes in date order (most recent first) that they were either added or that the last comment was made, whichever is most recent.
$notes_other = $pages->find("{$selector}, sort=-h_notes_last_comment");
//echo 'Notes other ('.count($notes_other).'):<br />'.$notes_other.'<br /><br />';

// create notes PageArray
$notes_total = new PageArray();
$notes_total->add($notes_other);
$notes_total->prepend($notes_unread);
$notes_total->prepend($notes_with_unread_comments);

// FILTER
// sanitize inputs
$horse = $sanitizer->text($input->get->horse);
$category = $sanitizer->int($input->get->category);
$from_date = $sanitizer->text($input->get->from_date);
$to_date = $sanitizer->text($input->get->to_date);
$comments = $sanitizer->int($input->get->comments);

// horse name
if($horse)
{
	$selector .= ", parent.h_name%=$horse";
}

// note category
if($category)
{
	$selector .= ", h_notes_category_id=$category";
}

// from date
if($from_date)
{
	$selector .= ", h_notes_last_comment>=".strtotime("$from_date 00:00:00");
}

// to date
if($to_date)
{
	$selector .= ", h_notes_last_comment<=".strtotime("$to_date 23:59:59");
}

// comments
if($comments)
{
	$selector .= ", h_notes_comments.count>0";
}

// apply filter
if($selector!='template=horse-note')
{
	$notes_total = $notes_total->find($selector);
}

// slice PageArray according to pageNum
$pageNum = $input->pageNum;
$limit = 15;
$start = ($pageNum-1)*$limit;
$notes = $notes_total->slice($start, $limit);

Share this post


Link to post
Share on other sites

Why don’t you do the three database finds after building the selector? I would imagine that to speed things up, because you won’t be loading as many pages just to throw them out later.

Instead of slicing the PageArray you can also put limit and start into the selector like so "start={$start}, limit={$limit}".

Edit: Actually, maybe I’m missing something, but it looks like you’re getting ALL horse-notes in $notes_other. I’m also pretty sure that PageArrays automatically filter out duplicates, so when you prepend the other two PageArrays, you’re not really doing anything.

  • Like 2

Share this post


Link to post
Share on other sites

You way does load all of those notes into memory even though only 15 are used later. Build your Selector first and then query the db for the total result.

  • Like 3

Share this post


Link to post
Share on other sites

Also you might want to look into building a selector with arrays, since that is more readable.

And if you only want to count the pages, you could also use the count method. That is blazing fast. We're calling more than 80 count method on one pageload searching around 40k pages and it is loads within 1 second.

  • Like 4

Share this post


Link to post
Share on other sites

I'm not sure I can do what we need with a single selector. The reason we ended up creating a pageArray was because we couldn't get the pages to be in the order we needed with a single selector. The refinement selectors are only used if the user chooses one or more of the filter options, so 99% of the time these won't be applied.

We're grabbing all pages and then using 'slice' to grab the portion of pages for the current page. E.g., if the page number is 10, we want slice(9*15, 15) of the total pages.

I know count is a lot quicker, but can't see how we can use that usefully here. Happy to take any advice though.

Share this post


Link to post
Share on other sites

If the order is your issue than use count() with those selectors, the current page num and some math to determine which pages are to be shown on the current page - then load just those.

Edit:

I've not tested it with real pages, but the test class seems to work correctly: https://github.com/LostKobrakai/Paginator

  • Like 10

Share this post


Link to post
Share on other sites

Dude! Nice one LostKobrakai...

  • Like 1

Share this post


Link to post
Share on other sites

Indeed, many thanks for putting that together for us LostKobrakai. Looking at my original code, can I now do this?
 

$result = $paginator(array(
    "template=horse-note, h_notes_comments.count>0, h_notes_comments.{$session->unread_by}>0, sort=h_notes_last_comment",
    "template=horse-note, {$session->unread_by}>0, sort=h_notes_last_comment",
    "template=horse-note, sort=-h_notes_last_comment"
), $input->pageNum, 15);

Share this post


Link to post
Share on other sites

Very nice LostKobrakai!

Share this post


Link to post
Share on other sites

@Pete

At least in theory it should work that way. As stated above I hadn't had time to test if further than the plain array testclass. Would be nice if you could report back if it works. If so I'd be also interested how much render time it saved you.

Share this post


Link to post
Share on other sites

Just giving an update on this. We are getting results. Page load time is now 1 second cf. 9 seconds using the original code. The results are not in the same order though.

We did have problems getting them into the correct order which is why we had do this:

// create notes PageArray
$notes_total = new PageArray();
$notes_total->add($notes_other);
$notes_total->prepend($notes_unread);
$notes_total->prepend($notes_with_unread_comments);

Can we influence the order in $paginator?

Share this post


Link to post
Share on other sites

It's taking each selector separately one after another, so it shouldn't bring anything out of order – it shouldn't even query later selectors if the ones before where enough to fulfill the limit of pages for the current pageNum. Just make sure it's in the correct order in the first place.

Share this post


Link to post
Share on other sites

I don't know what the context of this search query is, but if it is from a search box, you could consider to use a Google Custom Search Engine.

When the search queries become 'to complex' I would go for that option. Google gives back the URL, together with other found data. You could use the URL to get back to your own 'ProcessWire' data and enrich the result as wished. 

When you are scared having to pay for the searches, you could always store the google results temporary with MarkupCache or WireCache.

  • Like 1

Share this post


Link to post
Share on other sites

The search query returns has some specific ordering which we've been unable to create using a single selector. When the page first loads there are no filters applied so we are querying 6000 items (and counting). We need the pagination to work without having to query everything. Seems strange that the current results are not coming back in the same order as the previous query. Could it be that we are prepending the results rather than appending? What is the default order of the array?

Share this post


Link to post
Share on other sites

My pagination class has no concept of appending or prepending. It's just selector after selector after selector. Just put your selectors in the intended order from the start and you should be fine. If the first selector does not return any items it'll just skip to the next one.

  • Like 2

Share this post


Link to post
Share on other sites

Ah ok, we had a few issues with the ordering, prepending vs appending and also the order of the items in each selector. Will have a tinker with it.

Share this post


Link to post
Share on other sites

The ordering is fine with the first 2 selectors in the paginator. The third one breaks the ordering for some reason.

So this is fine:

$notes = $paginator(array(
        "template=horse-note, h_notes_comments.count>0, h_notes_comments.$session->unread_by>0, sort=-h_notes_last_comment",
        "template=horse-note, $session->unread_by>0, sort=-h_notes_last_comment"
    ), $pageNum, $limit);

But this is out of order:

$notes = $paginator(array(
        "template=horse-note, h_notes_comments.count>0, h_notes_comments.$session->unread_by>0, sort=-h_notes_last_comment",
        "template=horse-note, $session->unread_by>0, sort=-h_notes_last_comment",
        "template=horse-note, sort=-h_notes_last_comment"
    ), $pageNum, $limit);

Is there any reason you can think of that the third selector would interfere with the ordering?

Share this post


Link to post
Share on other sites

Shot in the dark: are there duplicate pages being added by the last selector? (I'm really only just following from afar, I haven't like into how this works).

Share this post


Link to post
Share on other sites

I'm just using import() to import the results of each "turn" into a single PageArray, therefore duplicates will be moved to the later places if they are returned twice. I can make some adjustments, so the class could actually prevent duplicates to be found again by later selectors.

  • Like 4

Share this post


Link to post
Share on other sites

Okay, so I've now added the 'NoDuplicates' version into my page. I'm getting the same results as before, i.e. it's fine with the first 2 selectors, but the third one breaks it.

include 'src/Paginator.php';
include 'src/PagesPaginator.php';
include 'src/PagesPaginatorNoDuplicates.php';

I feel it's 99% there though, and so much quicker than the previous query we were using.

Share this post


Link to post
Share on other sites

Did you also change the part where you're instantiating the class? 

$paginator = new PagesPaginatorNoDuplicates();

If that's not the issue I might need to look more detailed into what's happening on your end. 

Share this post


Link to post
Share on other sites

Erm, no. But only to check you were paying attention. I've done that and it now appears to be matching the order of the old query. Good work LK! Can we buy you a beer?

One thing I do need to do now is to create pagination. If I'm now only returning the results I need for the current page, can I still work out how many pages I need in total?

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By LuisM
      Hi there,
      while developing a sideproject which is completly build with ProcessModules i suddenly had the urge to measure the performance of some modules 😉 as a result, say welcome to the FlowtiAppPerformance module. 
      It comes bundled with a small helper module called FlowtiModuleProfiler. In the first release, even though you could select other modules, it will track the execution of selected Site/ProcessModules. 
      This will give you the ability to gain insights how your Application behaves. 
      The Main Module itself will come with 2 Logging Options, Database or PW Logs. Select Database for Charts and Logs...well If you just want your profiles as a simple log file in PW. 
      You also could choose to dump the request profile into TracyDebugger as shown here:

      Dont wonder about my avg_sysload, somehow my laptop cant handle multiple VMs that good 😄
      Settings Screen

      Monitoring

      FlowtiLogs

      again, dont look at the sysload 😄
      I will update the Module in the future to give some filter options and aggregation, but for now it satisfies my needs. 
      I hope it is helpfull for some. 
      Module is submited to the directory and hosted at github
      https://github.com/Luis85/FlowtiAppPerformance
      Any suggestions, wishes etc. are much appreciated. 
       
      Cheers,
      Luis
    • By Anders
      I want to allow full text search on my site. There is a very nice solution that comes right out of the box:
      $selector = "title|body~=$q, limit=50"; This works, but to make it even better I would want to give higher weight to pages where the search term occurs in the title, than if it just occurs in the body. After all, a page with the title "Wine from France" is probably the best match for the search "france wine". How do I accomplish this in ProcessWire?
      I can see three possible paths, but I am not very fond of any of them:
      Do a direct SQL query, circumventing the API, along these lines. But I would prefer to abstract away the database layout if at all possible. Use something like ElasticSearch, but to be honest that would be to complicated to set up and maintain in the long run. Make multiple lookups, first for matches in the title, then for matches in the body, and merge and sort in PHP. My suspicion is that this would get complicated quite quickly. For instance, how do you deal with a page that has two of the three search terms in the title and the third in the body? Is there a magic option four I should look into? Or are any of the above options better than the others? Any input is welcome!
    • By MoritzLost
      This is a new module that provides a simple solution to clearing all your cache layers at once, and an extensible interface to perform various cache-related actions.
      The simple motivation behind this module was that I was tired of manually clearing caches in several places after deploying a change on a live site. The basic purpose of this module is a simple Clear all caches link in the Setup menu which clears out all caches, no matter where they hide. You can customize what exactly the module does through it's configuration menu:
      Expire or delete all cache entries in the database, or selectively clear caches by namespace ($cache API) Clear the the template render cache. Clear out specific folders inside your site's cache directory (/site/assets/cache) Refresh version strings for static assets to bust client-side browser caches (this requires some setup, see the full documentation for details). This is the basic function of the module. However, you can also add different cache management action through the API and execute them through the module's interface. For this advanced usage, the module provides:
      An interface to see all available cache actions and execute them. A system log and logging output on the module page to see verify what the module is doing. A CacheControlTools class with utility functions to clear out different caches. An API to add cache actions, execute them programmatically and even modify the default action. Permission management, allowing you granular control over which user roles can execute which actions. The complete documentation can be found in the module's README.
      Beta release
      Note that I consider this a Beta release. Since the module is relatively aggressive in deleting some caches, I would advise you to install in on a test environment before using it on a live site.
      Let me know if you're getting any errors, have trouble using the module or if you have suggestions for improvement!
      In particular, can someone let me know if this module causes any problems with the ProCache module? I don't own or use it, so I can't check. As far as I can tell, ProCache uses a folder inside the cache directory to cache static pages, so my module should be able to clear the ProCache site cache as well, I'd appreciate it if someone can test that for me.
      Future plans
      If there is some interest in this, I plan to expand this to a more general cache management solution. I particular, I would like to add additional cache actions. Some ideas that came to mind:
      Warming up the template render cache for publicly accessible pages. Removing all active user sessions. Let me know if you have more suggestions!
      Links
      https://github.com/MoritzLost/ProcessCacheControl ProcessCacheControl in the Module directory

    • By verdeandrea
      Hello,
      I am using ProCache v3.1.8 on ProcessWire 3.0.96.
      Everything worked fine in the past, but today I noticed that the css file serverd by procache gives a 410 error. 
      The file is there, I checked.
      I deleted the cached files, I deleted the css file, I looked into the .htaccess file looking for some clues about this problem but nothing worked.
      The only way i can see my website correctly again is disabling ProCache. 
      Has anyone any clue on what could be the cause of the problem or on what should I do to fix it?
      Thanks!
    • By abdulqayyum
      Hy Processwire community,
      There are some problem in fileCompiler cache.
      when i change under the directory \site\templates\ it must change under the directory /site/assets/cache/FileCompiler/site/templates/
      but it does not update and functionality working with /site/assets/cache/FileCompiler/site/templates/ directory.
       
      In this case please suggest me how i clear fileCompiler cache?
      what i have to clear it manually?
      Thanks AbdulQayyum.
×
×
  • Create New...