Jump to content
sreeb

Faster Database search

Recommended Posts

Hi Everyone,

Since a couple of months, we have our new website live with processwire as CMS system.

In this website there is a huge database (1900+ trees), wich can be found with different filters. Take a look at www.vdberk.com/trees.
The website does not to be seen very fast when you select a couple of filters. Every time there is a delay between 3-5 seconds. 

Now we are migrate the website to another sever, with more preformance. So we hope this give us some timeprofits.

Does anyone have any kind of suggest, to make this filter faster. Now the website is on PHP5.6, is php7 better? Can this make a different? We used ProCache already.

Thanks in advance

 

  • Like 1

Share this post


Link to post
Share on other sites

Cool thanks @abdus,

Is processwire and the modules like procache and formbuilder etc ready for PHP 7. Or is this a heel of a job to make it work on php7?

We have processwire 3.0.

Are there any other sollution to make this huge database faster?

Share this post


Link to post
Share on other sites

It's usually plug and play, I dont remeber having any problems with the upgrade.

Unless you're using deprecated features, you should be ok

http://php.net/manual/en/migration70.php
http://php.net/manual/en/migration70.deprecated.php

Here are some guides on Tuts+ and DigitalOcean

https://code.tutsplus.com/tutorials/upgrading-your-linux-server-to-php-7--cms-27583
https://www.digitalocean.com/community/tutorials/how-to-upgrade-to-php-7-on-ubuntu-14-04

  • Like 1

Share this post


Link to post
Share on other sites

Definitely with @abdus on this one. PHP7 is noticeably faster than 5.6, and PW has no problems (that I'm aware of?) running under 7. If it is available, do it (taking the usual precautions, obvs).

Also, there might be optimisations you could make to your selectors and there are probably mysterious optimisations you may be able to make to your mySQL setup, depending on how much control you have within your hosting. First of those you will probably get help with here, if you are able to share some code, second is down to black magic, voodoo and selling the soul of your firstborn. Or hiring an expert.

All of that having been said, it's a very good looking website, and it works well - initial loading doesn't feel slow, just the filtering isn't instant.

<edit>Man, 3 more replies while I write this one.</edit>

  • Like 3
  • Haha 1

Share this post


Link to post
Share on other sites

Hi @sreeb

Do you cache results of searches? 

I have a similar project with a lot of filtering options. I use WireCache for caching search results ( PageArrays) and MarkupCache for result cards. It works quite fast on shared hosting. 

  • Like 1

Share this post


Link to post
Share on other sites

Superb website. It should be included in the featured sites if it isn't already.

1 hour ago, sreeb said:

The website does not to be seen very fast when you select a couple of filters.

Perhaps your filter queries are not optimal. Feel free to post the filter code if you would like any feedback on this.

  • Like 1

Share this post


Link to post
Share on other sites

A generous use of WireCache can help with generating overviews for a large number trees.

<?php namespace ProcessWire;
/** @var $cache WireCache */

foreach ($trees as $tree) {
    $overview = $cache->getFor($tree, "tree-overview");
    if (!$overview) {
        $overview = wireRenderFile('parts/tree-overview');
        $cache->saveFor($tree, 'tree-overview', WireCache::expireWeekly);
    }
    echo $overview;
}

 

  • Thanks 1

Share this post


Link to post
Share on other sites

Wow, thanks a lot for al the suggets.

I will send this to our developers, so they can expertmate with it. 

@Zeka, do you have an example website so i can see the search?

Share this post


Link to post
Share on other sites

I am actually wondering if you might be better off not relying on ajax calls for each filter step. Have you considered storing all the details of all trees in a wireCache'd JSON object that you load up when the page originally loads and then filter through that in memory? I have done this before - I have 3MB of data in JSON which is cached, I make sure that data is transferred compressed (so it's only a couple of hundred KBs). The initially page load takes a couple of seconds, but after that, the filtering is instant. I think so long as you know that the data will not expand to an outrageous size, this might be a good approach.

PS - Great looking site!

  • Like 7

Share this post


Link to post
Share on other sites

Unfortunately, php7 does not give the desired speed gain. Do you have experience with Varnish Cache and PW?
Also extra server capacity did not do the job. It looks like its a code technical problem.

As i can see, all search query's are set as an URL, like:

https://www.vdberk.com/trees/?soil-type=1147&sort=0
https://www.vdberk.com/trees/?soil-type=1147&soil-moisture=1157&sort=0
https://www.vdberk.com/trees/?soil-type=1147&soil-moisture=1157&resistance=1189&sort=0

Etc.etc.

The page will not be change very often. they are quite static. It looks like below. But i'am not sure.
Unfortainly our developers cant find a sollution, so this is my last hope;)

 

 public function getTreesSelector($filterParams){
        
        $selector = array('template=tree-detail');
        
        foreach ($filterParams as $key => $value) {
            if(trim($value) != ""){
                $key = $this->getFilterKey($key);
                
                switch ($key) {
                    case "key":
                        $selector[] = "title|tree_specific_name|tree_family|tree_subfamily%=".urldecode($value);
                        break;

 

 

Share this post


Link to post
Share on other sites

i think you need to do some more debugging! i just played around with a local test-setup and got this results with 10.000 pages:

creation of pages:

//foreach($pages->find('parent=8181') as $p) $pages->delete($p);

$i=0;
$tmp = range('A', 'Z');
while($i<10000) {
    $p = new Page();
    $p->template = 'filtertest';
    $p->parent = 8181;
    $p->title = "test$i";
    $p->a = $tmp[array_rand($tmp)];
    $p->b = $tmp[array_rand($tmp)];
    $p->c = $tmp[array_rand($tmp)];
    $p->save();
    $i++;
}

dump (results):

d($pages->find('template=filtertest, a=a, b=b')->each('title'));
array (13)
14.48ms, 0.09MB

d($pages->find('template=filtertest, a|b|c=a|b|c')->each('title'));
array (3084)
1110.54ms, 6.54MB

d($pages->find('template=filtertest, a|b|c%=a|b|c')->each('title'));
array (3084)
1339.68ms, 6.54MB

your filter function looks totally weird to me. you have ID values as filter but then you use the slow %= selector. why? see my first example using " = " as selector should give you an instant result! also i don't understand why you are using the OR operator ( | ) for searching different fields. is the information spread over multiple fields?? shouldn't every filter-value be stored in a separate field?

 

PS: my template "filtertest" has fields A, B and C holding letters from A-Z

PPS: are you sure the selector is slowing the site down? maybe it is the way you count your number of results in the filter sidebar?

59e5bdbd413fb_2017-10-1710_21_03-Findtreesonline_over1600treesincl.photos.png.b8c5e5a0536365b848ebd064560462b9.png

  • Like 3

Share this post


Link to post
Share on other sites

you have 175 labels showing the tree count:

59e5f242727d4_2017-10-1714_05_09-ProgramManager.png.7267e1f4281f9a7582fa134d6b7dc622.png

i tried a loop with 200 iterations on my test-install with the SLOW selector from my above example and got all the counts within 500ms:

59e5f2435adcb_2017-10-1714_04_39-EditPage_Homehandsontable_dev.png.d6f258c818e7fee310afcc487e05ae62.png

so i think it should be no problem to get the site to returning results plus all the filter count-labels under 500ms.

but as i said in my pm: that can only be a wild guess as we don't know any details about your setup.

Share this post


Link to post
Share on other sites
11 minutes ago, bernhard said:

i tried a loop with 200 iterations on my test-install with the SLOW selector from my above example and got all the counts within 500ms:

Once PW fetches pages from DB, it saves them to memory, meaning subsequent find() operations with the same selector don't really have an effect, so it's essentially one DB operation.

// PagesLoaderCache.php

 * Cache the given selector string and options with the given PageArray
 *
 * @param string $selector
 * @param array $options
 * @param PageArray $pages
 * @return bool True if pages were cached, false if not
 *
 */
public function selectorCache($selector, array $options, PageArray $pages) {

    // get the string that will be used for caching
    $selector = $this->getSelectorCache($selector, $options, true);

    // optimization: don't cache single pages that have an unpublished status or higher
    if(count($pages) && !empty($options['findOne']) && $pages->first()->status >= Page::statusUnpublished) return false;

    $this->pageSelectorCache[$selector] = clone $pages;

    return true;
}

 

  • Like 1
  • Thanks 1

Share this post


Link to post
Share on other sites

ah, thanks, you are right :)

ok, so now i get 15seconds. if you really need the count-labels it seems you would need to implement some caching...

59e5f556a43a2_2017-10-1714_18_49-EditPage_Homehandsontable_dev.png.8a26d0eec99a1fbaacdb7dfd90888246.png

  • Like 2

Share this post


Link to post
Share on other sites

I haven't been following this closely for a while, but perhaps an SQL query would be helpful?

SELECT COUNT(id) 
FROM `pages` 
WHERE (pages.templates_id=29) 
AND (pages.title='My Page') 
AND (pages.status<1024) 
GROUP BY pages.id

You might find this snippet helpful. It will generate an SQL query from a find selector. I just modified the returned result to be a COUNT instead.

59e5ff7bae8cd_ScreenShot2017-10-17at6_02_18AM.thumb.png.fc0a6aba609e8be85a713efeb323d26b.png

Share this post


Link to post
Share on other sites

No if it makes the speed  much faster, the countlabels can go away.

I will ask to our developers to look at this. Maby this give them some information.

Is Varnish caching a good sollution?

Share this post


Link to post
Share on other sites
23 minutes ago, adrian said:

It will generate an SQL query from a find selector. I just modified the returned result to be a COUNT instead.

A simpler way could be passing a second parameter to $pages->count() method like this

$count = $pages->count('id>0', ['getTotalType' => 'calc']); // default, slower, fetches everything then counts
$count = $pages->count('id>0', ['getTotalType' => 'count']); // uses SQL query, faster

From the core:

Spoiler

<?php
/**
 * Return all pages matching the given selector.
 *
 * @param Selectors|string|array $selectors Selectors object or selector string
 * @param array $options
 * ....
 *  - `getTotalType` (string): Method to use to get total, specify 'count' or 'calc' (default='calc').
 * ....
 */
public function ___find($selectors, array $options = [])
{
    // ...
    if ($this->getTotal) {
        if ($this->getTotalType === 'count') {
            $query->set('select', ['COUNT(*)']);
            $query->set('orderby', []);
            $query->set('groupby', []);
            $query->set('limit', []);
            $stmt = $query->execute();
            $errorInfo = $stmt->errorInfo();
            if ($stmt->errorCode() > 0) throw new PageFinderException($errorInfo[2]);
            list($this->total) = $stmt->fetch(\PDO::FETCH_NUM);
            $stmt->closeCursor();
        } else {
            $this->total = (int)$database->query("SELECT FOUND_ROWS()")->fetchColumn();
        }

    } else {
        $this->total = count($matches);
    }

    // ...
}

Results:

image.png.b05a23ddb6658958e44bb0b8874b9b0b.png

 

  • Like 3

Share this post


Link to post
Share on other sites
2 minutes ago, sreeb said:

hi @adrian, how can i run that snippet?

Which one - the one to generate an SQL query from a PW selector, or the actual SQL query itself?

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By Juergen
      Hello @ all,
      I am creating a new inputfield/fieldtype to store opening hours, but I am struggeling to save values from multiple dynamic created inputfields in 1 column of the database.
      Scenario:
      The user can enter one or more opening times per day in a UI.
      Fe:
      Monday open from 08:00 to 12:00 and from 14:00 to 17:00 Tuesday open from 08:00 to 12:00 and from 14:00 to 19:00 and so on
      Via a little JavaScript you can add as much opening times as you need per day - the additional inputfield will be created dynamically.
      After form submission all the values are in the POST array -> this works (see example below):
      ProcessWire\WireInputData Object ( [openinghours_mo-0-start] => 09:00 [openinghours_mo-0-finish] => 13:00 [openinghours_mo-1-start] => 14:00 [openinghours_mo-1-finish] => 18:00 [openinghours_mo-2-start] => 21:00 [openinghours_mo-2-finish] => 23:00 [openinghours_tu-0-start] => 09:00 [openinghours_tu-0-finish] => 13:00 [openinghours_tu-1-start] => 14:00 [openinghours_tu-1-finish] => 18:00 [openinghours_we-0-start] => 09:00 [openinghours_we-0-finish] => 13:00 [openinghours_we-1-start] => 14:00 [openinghours_we-1-finish] => 18:00 [openinghours_th-0-start] => 09:00 [openinghours_th-0-finish] => 13:00 [openinghours_th-1-start] => 14:00 [openinghours_th-1-finish] => 18:00 [openinghours_fr-0-start] => 09:00 [openinghours_fr-0-finish] => 13:00 [openinghours_fr-1-start] => 14:00 [openinghours_fr-1-finish] => 18:00 [openinghours_sa-0-start] => [openinghours_sa-0-finish] => [openinghours_so-0-start] => [openinghours_so-0-finish] => ) The property name is always the name attribute of the field 😉 . If the property is empty means closed on that day.
      Now I need to combine all those values into 1 array (or json array) and store it in the database in 1 column called 'hours' in my case (see screenshot below):

      In my ___processInput(WireInputData $input) method I have tried to make it work like this:
      public function ___processInput(WireInputData $input): self { $name = $this->attr('name'); $value = $this->attr('value'); //input object includes always every input on the page, so lets filter out only inputs from this field //we need to do this, because the number of values is variable - so extract only values that starts with $name.'_' $nameAttributes = []; foreach($input as $key=>$value){ if(substr($key, 0, strlen($name.'_')) === $name.'_'){ $nameAttributes[$key] = $value; } } // loop through all inputfields of this fieldtype $time_values = []; foreach($nameAttributes as $nameAttr => $value) { $time_values[$nameAttr] = $value; } } //save it in the database $input->set('hours', serialize($time_values)); return $this; } The only important part of this code is the last part with the serialize function.
      After saving it will create a record in the database, but the value is always NULL (default value) (see below).

      Checking $time_values returns all the values, but printing out "$this" shows me that the property "hours" inside the Openinghours object is empty (see below) - so the mistake must be there, but I dont know where?!?!?!?
      [title] => Home [openinghours] => ProcessWire\OpeningHours Object ( [data] => Array ( [hours] => ) ) If I check the sleepValue() method or the sanitizeValue() - they are also empty. So it seems that the values will not reach these methods. I havent found a clear documentation of whats going on behind the saving process of an inputfield.
      As far as I know the saving process starts with the form submission. The values are in the POST array and will be processed by the processInput() method. Before they will be saved in the database they will be sanitized by the sanitizeValue() mehtod and afterwards they will be prepared for storage in the sleepValue() method.  The last step is the storage itself.
      Has someone an idea what is missing by storing values from multiple fields into 1 database column or has someone a working example of such a scenario on github to help me out.
      A clear explanation of the storage process will be also helpful.
      Thanks and best regards
    • By Pip
      Hi Everyone 
      I've been working on Processwire for two months now. Structuring the website as needed. Unsure why but I'm getting this one now. Seems my fields has crashed. 
      I've tried googling some answers but can't seem to find a step by step guide on how to rectify this. Any advise? Practically new on this. 
      TIA. 

    • By Rodd
      Hi everyone!
      I have a website in a production environment and I want to duplicate it in a local environment. I exported the content of the website (with the 'Site Profile Exporter' module) but I cannot use it actually. I've got an issue with the database. I imported this one in MAMP then.

      I also exported the pages (with the 'ProcessPagesExportImport' module), but I cannot import it to my local website because the fields don't exist. So I created this fields, but I have this error :
      How can I use the elements that already exist and are presents in my database? How can I duplicate correctly the templates, fields and pages?
      Thanks by advance
      PS: Sorry if my english is bad
       
    • By DooM
      Hello guys,
      I'm trying to figure out how to sync fields and templates between staging and production environments.
      I've found Migrations module by Lostkobrakai, but with use of it all the fields and templates must be created by API, which is kind of uncomfortable.
      I also tried ProcessDatabaseBackups module which can export only certain tables, but I don't think it's the best practice to do that.
      How do you guys solve this problem? It's very annoying to setup everything three times (dev, staging, production).
      Thanks a lot :)
    • By iipa
      Hi everybody!
      I have been reading about Multisite, but it kinda bugs me that every topic talks about having both admin and database same for multiple sites.
      I have a project where customer tests it by adding content to the site, while I still need to do some changes here and there in code, maybe some in database. If something crashes for a while, customer can't keep testing, which is a bit problematic.
      Is there any way that I could have two separate versions of one site ("production" and development) that share the same database, but are otherwise independent? Just the thought of having to migrate database every time I want to show client something new gives me anxiety 😁
×
×
  • Create New...