sreeb

Faster Database search

Recommended Posts

Hi Everyone,

Since a couple of months, we have our new website live with processwire as CMS system.

In this website there is a huge database (1900+ trees), wich can be found with different filters. Take a look at www.vdberk.com/trees.
The website does not to be seen very fast when you select a couple of filters. Every time there is a delay between 3-5 seconds. 

Now we are migrate the website to another sever, with more preformance. So we hope this give us some timeprofits.

Does anyone have any kind of suggest, to make this filter faster. Now the website is on PHP5.6, is php7 better? Can this make a different? We used ProCache already.

Thanks in advance

 

  • Like 1

Share this post


Link to post
Share on other sites

Cool thanks @abdus,

Is processwire and the modules like procache and formbuilder etc ready for PHP 7. Or is this a heel of a job to make it work on php7?

We have processwire 3.0.

Are there any other sollution to make this huge database faster?

Share this post


Link to post
Share on other sites

It's usually plug and play, I dont remeber having any problems with the upgrade.

Unless you're using deprecated features, you should be ok

http://php.net/manual/en/migration70.php
http://php.net/manual/en/migration70.deprecated.php

Here are some guides on Tuts+ and DigitalOcean

https://code.tutsplus.com/tutorials/upgrading-your-linux-server-to-php-7--cms-27583
https://www.digitalocean.com/community/tutorials/how-to-upgrade-to-php-7-on-ubuntu-14-04

  • Like 1

Share this post


Link to post
Share on other sites

Definitely with @abdus on this one. PHP7 is noticeably faster than 5.6, and PW has no problems (that I'm aware of?) running under 7. If it is available, do it (taking the usual precautions, obvs).

Also, there might be optimisations you could make to your selectors and there are probably mysterious optimisations you may be able to make to your mySQL setup, depending on how much control you have within your hosting. First of those you will probably get help with here, if you are able to share some code, second is down to black magic, voodoo and selling the soul of your firstborn. Or hiring an expert.

All of that having been said, it's a very good looking website, and it works well - initial loading doesn't feel slow, just the filtering isn't instant.

<edit>Man, 3 more replies while I write this one.</edit>

  • Like 3
  • Haha 1

Share this post


Link to post
Share on other sites

Hi @sreeb

Do you cache results of searches? 

I have a similar project with a lot of filtering options. I use WireCache for caching search results ( PageArrays) and MarkupCache for result cards. It works quite fast on shared hosting. 

  • Like 1

Share this post


Link to post
Share on other sites

Superb website. It should be included in the featured sites if it isn't already.

1 hour ago, sreeb said:

The website does not to be seen very fast when you select a couple of filters.

Perhaps your filter queries are not optimal. Feel free to post the filter code if you would like any feedback on this.

  • Like 1

Share this post


Link to post
Share on other sites

A generous use of WireCache can help with generating overviews for a large number trees.

<?php namespace ProcessWire;
/** @var $cache WireCache */

foreach ($trees as $tree) {
    $overview = $cache->getFor($tree, "tree-overview");
    if (!$overview) {
        $overview = wireRenderFile('parts/tree-overview');
        $cache->saveFor($tree, 'tree-overview', WireCache::expireWeekly);
    }
    echo $overview;
}

 

Share this post


Link to post
Share on other sites

Wow, thanks a lot for al the suggets.

I will send this to our developers, so they can expertmate with it. 

@Zeka, do you have an example website so i can see the search?

Share this post


Link to post
Share on other sites

I am actually wondering if you might be better off not relying on ajax calls for each filter step. Have you considered storing all the details of all trees in a wireCache'd JSON object that you load up when the page originally loads and then filter through that in memory? I have done this before - I have 3MB of data in JSON which is cached, I make sure that data is transferred compressed (so it's only a couple of hundred KBs). The initially page load takes a couple of seconds, but after that, the filtering is instant. I think so long as you know that the data will not expand to an outrageous size, this might be a good approach.

PS - Great looking site!

  • Like 7

Share this post


Link to post
Share on other sites

Unfortunately, php7 does not give the desired speed gain. Do you have experience with Varnish Cache and PW?
Also extra server capacity did not do the job. It looks like its a code technical problem.

As i can see, all search query's are set as an URL, like:

https://www.vdberk.com/trees/?soil-type=1147&sort=0
https://www.vdberk.com/trees/?soil-type=1147&soil-moisture=1157&sort=0
https://www.vdberk.com/trees/?soil-type=1147&soil-moisture=1157&resistance=1189&sort=0

Etc.etc.

The page will not be change very often. they are quite static. It looks like below. But i'am not sure.
Unfortainly our developers cant find a sollution, so this is my last hope;)

 

 public function getTreesSelector($filterParams){
        
        $selector = array('template=tree-detail');
        
        foreach ($filterParams as $key => $value) {
            if(trim($value) != ""){
                $key = $this->getFilterKey($key);
                
                switch ($key) {
                    case "key":
                        $selector[] = "title|tree_specific_name|tree_family|tree_subfamily%=".urldecode($value);
                        break;

 

 

Share this post


Link to post
Share on other sites

i think you need to do some more debugging! i just played around with a local test-setup and got this results with 10.000 pages:

creation of pages:

//foreach($pages->find('parent=8181') as $p) $pages->delete($p);

$i=0;
$tmp = range('A', 'Z');
while($i<10000) {
    $p = new Page();
    $p->template = 'filtertest';
    $p->parent = 8181;
    $p->title = "test$i";
    $p->a = $tmp[array_rand($tmp)];
    $p->b = $tmp[array_rand($tmp)];
    $p->c = $tmp[array_rand($tmp)];
    $p->save();
    $i++;
}

dump (results):

d($pages->find('template=filtertest, a=a, b=b')->each('title'));
array (13)
14.48ms, 0.09MB

d($pages->find('template=filtertest, a|b|c=a|b|c')->each('title'));
array (3084)
1110.54ms, 6.54MB

d($pages->find('template=filtertest, a|b|c%=a|b|c')->each('title'));
array (3084)
1339.68ms, 6.54MB

your filter function looks totally weird to me. you have ID values as filter but then you use the slow %= selector. why? see my first example using " = " as selector should give you an instant result! also i don't understand why you are using the OR operator ( | ) for searching different fields. is the information spread over multiple fields?? shouldn't every filter-value be stored in a separate field?

 

PS: my template "filtertest" has fields A, B and C holding letters from A-Z

PPS: are you sure the selector is slowing the site down? maybe it is the way you count your number of results in the filter sidebar?

59e5bdbd413fb_2017-10-1710_21_03-Findtreesonline_over1600treesincl.photos.png.b8c5e5a0536365b848ebd064560462b9.png

  • Like 3

Share this post


Link to post
Share on other sites

you have 175 labels showing the tree count:

59e5f242727d4_2017-10-1714_05_09-ProgramManager.png.7267e1f4281f9a7582fa134d6b7dc622.png

i tried a loop with 200 iterations on my test-install with the SLOW selector from my above example and got all the counts within 500ms:

59e5f2435adcb_2017-10-1714_04_39-EditPage_Homehandsontable_dev.png.d6f258c818e7fee310afcc487e05ae62.png

so i think it should be no problem to get the site to returning results plus all the filter count-labels under 500ms.

but as i said in my pm: that can only be a wild guess as we don't know any details about your setup.

Share this post


Link to post
Share on other sites
11 minutes ago, bernhard said:

i tried a loop with 200 iterations on my test-install with the SLOW selector from my above example and got all the counts within 500ms:

Once PW fetches pages from DB, it saves them to memory, meaning subsequent find() operations with the same selector don't really have an effect, so it's essentially one DB operation.

// PagesLoaderCache.php

 * Cache the given selector string and options with the given PageArray
 *
 * @param string $selector
 * @param array $options
 * @param PageArray $pages
 * @return bool True if pages were cached, false if not
 *
 */
public function selectorCache($selector, array $options, PageArray $pages) {

    // get the string that will be used for caching
    $selector = $this->getSelectorCache($selector, $options, true);

    // optimization: don't cache single pages that have an unpublished status or higher
    if(count($pages) && !empty($options['findOne']) && $pages->first()->status >= Page::statusUnpublished) return false;

    $this->pageSelectorCache[$selector] = clone $pages;

    return true;
}

 

  • Like 1
  • Thanks 1

Share this post


Link to post
Share on other sites

ah, thanks, you are right :)

ok, so now i get 15seconds. if you really need the count-labels it seems you would need to implement some caching...

59e5f556a43a2_2017-10-1714_18_49-EditPage_Homehandsontable_dev.png.8a26d0eec99a1fbaacdb7dfd90888246.png

  • Like 2

Share this post


Link to post
Share on other sites

I haven't been following this closely for a while, but perhaps an SQL query would be helpful?

SELECT COUNT(id) 
FROM `pages` 
WHERE (pages.templates_id=29) 
AND (pages.title='My Page') 
AND (pages.status<1024) 
GROUP BY pages.id

You might find this snippet helpful. It will generate an SQL query from a find selector. I just modified the returned result to be a COUNT instead.

59e5ff7bae8cd_ScreenShot2017-10-17at6_02_18AM.thumb.png.fc0a6aba609e8be85a713efeb323d26b.png

Share this post


Link to post
Share on other sites

No if it makes the speed  much faster, the countlabels can go away.

I will ask to our developers to look at this. Maby this give them some information.

Is Varnish caching a good sollution?

Share this post


Link to post
Share on other sites
23 minutes ago, adrian said:

It will generate an SQL query from a find selector. I just modified the returned result to be a COUNT instead.

A simpler way could be passing a second parameter to $pages->count() method like this

$count = $pages->count('id>0', ['getTotalType' => 'calc']); // default, slower, fetches everything then counts
$count = $pages->count('id>0', ['getTotalType' => 'count']); // uses SQL query, faster

From the core:

Spoiler

<?php
/**
 * Return all pages matching the given selector.
 *
 * @param Selectors|string|array $selectors Selectors object or selector string
 * @param array $options
 * ....
 *  - `getTotalType` (string): Method to use to get total, specify 'count' or 'calc' (default='calc').
 * ....
 */
public function ___find($selectors, array $options = [])
{
    // ...
    if ($this->getTotal) {
        if ($this->getTotalType === 'count') {
            $query->set('select', ['COUNT(*)']);
            $query->set('orderby', []);
            $query->set('groupby', []);
            $query->set('limit', []);
            $stmt = $query->execute();
            $errorInfo = $stmt->errorInfo();
            if ($stmt->errorCode() > 0) throw new PageFinderException($errorInfo[2]);
            list($this->total) = $stmt->fetch(\PDO::FETCH_NUM);
            $stmt->closeCursor();
        } else {
            $this->total = (int)$database->query("SELECT FOUND_ROWS()")->fetchColumn();
        }

    } else {
        $this->total = count($matches);
    }

    // ...
}

Results:

image.png.b05a23ddb6658958e44bb0b8874b9b0b.png

 

  • Like 2

Share this post


Link to post
Share on other sites
2 minutes ago, sreeb said:

hi @adrian, how can i run that snippet?

Which one - the one to generate an SQL query from a PW selector, or the actual SQL query itself?

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now


  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By modifiedcontent
      I have one central website, with membership registration and content etc., and then several related websites with their own URL/domains, each on Processwire, all on the same server.
      I would like to access the database of the central website from the sister websites. How would I do that?
      You can't bootstrap one PW installation into another. You can include template parts from one in the other by just using the server path, but whatever you try to get/post just comes/goes to the database of the site you are on.
      Could you switch databases by including the config.php from another PW installation somewhere?
      What is the correct, secure way to do this?
    • By answersmode
      Please help me........
      DATABASE CONNECTION PROBLEM  - CREATE DATABASE IF NOT EXISTS `lwteswm664_ltingesjfo_itbd` DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci;
      Error
      SQL query:
       
      -- -- Database: `lwteswm664_ltingesjfo_itbd` -- CREATE DATABASE IF NOT EXISTS `lwteswm664_ltingesjfo_itbd` DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci;  
      MySQL said:
      #1044 - Access denied for user 'newhost'@'localhost' to database 'lwteswm664_ltingesjfo_itbd' 
       
      I'm facing this peoblem in my website xxxxx and xxxxx When I wanted to upload my databaz backup file.
      Now what can I do?
      Please help me........
       
       

    • By thmsnhl
      Hi everybody,
      we started our first Processwire driven project in my new company and for the first time, I was working on one site with more than 2 colleagues on the same site.
      It didn't take long for us to stumble across some problems when multiple developers work at the same time, conflicts with updating the database on vagrant machines, like duplicate entries for page IDs, errors when setting up fields and stuff like this. We ended up working on a dedicated database server, that we linked to our vagrant machines and most of the problems were gone, but the performance of this constellation is really bad compared to our first approach with database running on vagrant machines.
      I already tried to find a solution in the forums but I couldn't find anyone with problems like this.
      So I was wondering: how do you manage projects with multiple developers on vagrant machines in a git-based workflow?
    • By hezmann
      Hi!  I've just started on ProcessWire, coming from Wordpress.  I've done several sites in WP but this new one is just to data driven for Wordpress to handle without so many plugins and custom code it just seemed like there must be a better option so here I am.  I used to do a lot of stuff in Access (many years ago) and have some experience with PHP/MySQL.
      I've had a look at a few tutorials and created the planets website etc.  I have a specific structure that I'm looking to create so thought I would ask for some expert advice before making every mistake in the book.
      I do a lot of long distance (inn to inn) walking and am always looking for new ones which involves searching many different sites.  I've yet to find a good repository for this type of thing so thought I would build my own.  The basic structure is (more info added later but trying to get basic idea of best practices):
       
      Walks -> Walk Variations <->Walk Segments <-> Walk Towns (go into Walk Segments as Start and End Towns).
      Walks <->Links
      Walk Towns <-> Accomodation, Walk Towns <-> Dining
      Should I be using ProFields?  How should I structure this as far as fields/templates?
      Thanks in advance for your help.
      Heather
    • By eangulo
      Hello everyone,
      Usually in many CMS database tables prefixed or suffixed with "cache" can be manually cleared without a problem because the system will populate them on the "next page request". Actually in Processwire I am expecting this behaviour:
      [On PW 3.x]
      Manually clear table "caches" in database Go to "client" side (not in the admin panel) All references to my "/site/modules" in my template files does not work : wire("modules")->get(""), $modules->get("") and modules()->get("") PHP error:  Fatal error: Uncaught TypeError: Return value... My _init.php file are not able to find the references to my /site/modules/  The client side not working because this PHP fatal error. If I go to the admin panel "Admin -> Modules" and I trigger the action "Check for New Modules" in the top-right corner in the page, it populates the caches table with the required information and them the client side works.
      It is normal? Or I am doing something wrong ?
      A solution could be to manually call the script that the button "Check for New Modules" calls, but I want to know if  I am doing something wrong here.
      Thank you in advance guys !