Jump to content
teppo

How to turn on autojoin for user-defined fields (or mimic similar functionality) for single query on the fly?

Recommended Posts

I'm developing a module that lets user specify which fields she wants to show in the result list (which is intended for print use, but that's not very relevant here.) My problem is that when multiple fields are selected SQL queries tend to pile up which results in sluggish page load times.

Currently matching pages (selector is also generated based on user input, but that's another story) are loaded with $pages->find() and then looped through with foreach and rest of the fields (which, as I mentioned above, are actually chosen by user) are fetched individually. This is where things get really slow, especially if user has specified 10+ fields to show :)

One solution I've used to make things a bit faster is that most commonly used fields have autojoin setting on and thus PW doesn't require another query to fetch their content. Problem with this is that same data is used elsewhere in simple list views, which get way more hits than my little module here does and if I make all the fields autojoin it would seriously cripple those other views and in the long run possibly suffocate whole server..

Long story short, what I'm looking for is a way to turn on autojoin for certain fields on the fly before fetching pages OR mimic similar feature some other way. Any ideas how to achieve something like this are more than welcome - and don't hesitate to tell me that I'm doing it all wrong, if you know a better way to achieve same results.. :)

Share this post


Link to post
Share on other sites

You might try creating a new field using the FieldtypeCache type. This isn't installed by default, but it is included with PW, so you'll just need to click "install" from the modules menu.

This field essentially keeps the values from several fields in one field, so that they can all be loaded at once as a group. Like all caches, it's creating duplication and trading disk space for speed. These cache fields are also handy if you are building a text searching engine that needs to scan lots of fields at once... your selector can query just the cache field rather than "field1|field2|field3|etc".

Once installed, create the new "cache" field, click to its "details" tab and follow the instructions there. Don't make it autojoin. Instead, have your output generation code reference the cache field field (like $page->my_cache) before any of the others that are triggering queries. This should enable you to reduce it to just one query for all those fields.

I'll be honest and say I created this fieldtype with many good intentions, but have rarely found it necessary to use. So be certain the situation you describe really is the bottleneck, and that you aren't omitting pagination or accidentally referencing something like $page->children() or $page/s->find() without a "limit=". The reason that I mention it is that MySQL selects are extremely fast, to the point where the difference between autojoin and non-autojoin may require some scale before you can measure the difference. If you are query counting, avoid doing that, as it often doesn't correlate with performance, especially in ProcessWire. (I used to be a big query counter until I read the book High Performance MySQL, which is reflected in PW's DB query approach).

One other idea you might look at is temporarily convincing the fields in question at runtime that they are autojoin, before loading your pages:

foreach($selectedFields as $fieldName) {
 $field = wire('fields')->get($fieldName);
 $field->flags = $field->flags | Field::flagAutojoin; 
}

That's assuming all the fields in question support autojoin. Some fields, like files/images, can't be autojoined and telling it to may cause unpredictable results.

  • Like 2

Share this post


Link to post
Share on other sites
One other idea you might look at is temporarily convincing the fields in question at runtime that they are autojoin, before loading your pages

Thanks, Ryan -- this is actually exactly what I had in mind, though I didn't think it'd actually work. Going to give that a try now. I did consider using FieldtypeCache too, but since (unless I'm mistaking something here) it would require me to cache all the fields used by this template I rejected that solution.

Sad thing here is that omitting pagination / limit is intentional. This module I'm talking about produces a print-friendly view to page data and thus pagination / limits are unwelcome. Also: I'm trying not to look too much into query counter, but when there are 5000+ queries in a relatively simple situation, I'm afraid it does become a factor. Dealing with quite a lot of data here. :)

Share this post


Link to post
Share on other sites

The solution mentioned above actually helped shave seconds off some relatively complex queries. Still not fast enough, though, so I ended up using pagination anyway -- not how I'd like this to work, but I guess I can cope with that after all :)

Now I'm facing a new kind of problem though: multi-value get-parameters (param[]=x&param[]=y) don't seem to work properly with renderPager(). For an example this URL:

/processwire/custom-search/?print%5B%5D=brand_name&print%5B%5D=street_address&print%5B%5D=business_id

... is converted to this within pager links:

/processwire/custom-search/page2?print=brand_name%7Cstreet_address%7Cbusiness_id

Any ideas how to fix this, or should I just generate my own pager markup?

Share this post


Link to post
Share on other sites
The solution mentioned above actually helped shave seconds off some relatively complex queries. Still not fast enough, though, so I ended up using pagination anyway -- not how I'd like this to work, but I guess I can cope with that after all

Glad to hear that shaved off a few seconds! I think the only way to possibly go faster would be with direct SQL queries. Though it sounds like you are dealing with a large quantity of data and the scale is probably going to be a bottleneck no matter how it's done.

Any ideas how to fix this, or should I just generate my own pager markup?

I'm not sure I know exactly why it's doing that, though I've also never used array[] style in GET vars. But think the simplest solutions would be to either 1) str_replace it to where you need it, or 2) use commas to separate values in the URL rather than repeating the same var[]. For instance:

// setting
$url = $page->url . '?print=' . implode(',' $print); 

// getting
$print = explode(',', $input->get->print); 

Share this post


Link to post
Share on other sites

I guess it would be possible to avoid this problem by adding some kind of a preprocessor for certain predefined attributes; convert their values into arrays with explode if they originally were strings with certain characters in them. That's kind of ugly though, so I'll have to think about this a bit.

Part of the problem, it seems, was self-made; my module whitelisted vars after they had gone through rather strict filtering and this is where "|" characters came from. Anyway, that didn't fix the underlying problem, which is that MarkupPagerNav can't handle input arrays properly. The problem seems to be caused by this part of setGetVars method of MarkupPagerNav.module:

foreach($this->options['getVars'] as $key => $value) {
   if(is_array($value)) {
       $a = $value;
       $value = '';
       foreach($a as $k => $v) $value .= "$v,";
       $value = rtrim($value, ", ");

   }
   ...

If an array is passed as a GET var to MarkupPagerNav, it's automatically converted to one string separated by commas. This might be considered intended behavior, but it's still effectively making the use of GET arrays with MarkupPagerNav more difficult than it should be. Personally I'd prefer to keep them as they were (with urlencode included, naturally) -- or perhaps add a toggle of some sort for this.

What do you think? :)

Edit: just to clarify what I'm actually talking about here these are the changes I've made to my local MarkupPagerNav.module:

--- modules/Markup/MarkupPagerNav/MarkupPagerNav.module (revision ...)
+++ modules/Markup/MarkupPagerNav/MarkupPagerNav.module (working copy)
@@ -224,14 +224,12 @@
		 $queryString = "?";
		 foreach($this->options['getVars'] as $key => $value) {
				 if(is_array($value)) {
-							 $a = $value;
-							 $value = '';
-							 foreach($a as $k => $v) $value .= "$v,";
-							 $value = rtrim($value, ", ");
+							 foreach($value as $k => $v) $queryString .= "$key%5B%5D=" . urlencode($v) . "&";
+					 } else {
+							 // $queryString .= "$key=" . urlencode(htmlspecialchars($value)) . "&";
+							 $queryString .= "$key=" . urlencode($value) . "&";
				 }
-					 // $queryString .= "$key=" . urlencode(htmlspecialchars($value)) . "&";
-					 $queryString .= "$key=" . urlencode($value) . "&";
		 }
		 $this->queryString = htmlspecialchars(rtrim($queryString, "?&"));
 }

Share this post


Link to post
Share on other sites

I'm good with a toggle. I don't want to change the existing default behavior, just because it's been there awhile, and some people may be relying on it (including me). So dont want to produce any surprises. But I've added a toggle called 'arrayToCSV' that you can put in the $options. If you set that to false, then it'll put it in the format that you prefer. I've attached a copy of this file. Can you test to confirm that this works the way you were thinking and let me know? If all looks good, I'll commit it to the source.

MarkupPagerNav.module

  • Like 1

Share this post


Link to post
Share on other sites

Ryan, I've just tested this and the new arrayToCSV toggle does exactly what I had in mind and works like a charm! :)

Share this post


Link to post
Share on other sites

Thanks for testing it, I've committed the changes so they now appear in 2.2.6.

  • Like 1

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By GWColeman
      Hi,
      I am working on a processwire project for my college. I created an entire site about a year ago without any problem. Today however, I was making some changes to the site and I was experimenting with duplicating data across multiple pages. There is a repeater on one of the pages that I need to be the same across all pages.

      I clicked the checkbox to autojoin within the field options for the repeater and processwire crashed. I am not able to make any changes to the page to uncheck and I cannot view the site any longer.
      This is the message I get on the page.
      Fatal error: Exception: Could not execute User::__construct() (in /var/www/html/wire/core/Pages.php line 323) #0 /var/www/html/wire/core/Pages.php(323): mysqli_result->fetch_object('User', Array) #1 /var/www/html/wire/core/PagesType.php(101): Pages->getById(Array, Object(Template), 29) #2 /var/www/html/wire/core/Session.php(64): PagesType->get(41) #3 /var/www/html/wire/core/ProcessWire.php(138): Session->__construct() #4 /var/www/html/wire/core/ProcessWire.php(46): ProcessWire->load(Object(Config)) #5 /var/www/html/index.php(185): ProcessWire->__construct(Object(Config)) #6 {main} in /var/www/html/index.php on line 217

      I tried to comment out 323 without any luck. I also took a look in the database to see if I could manually turn off the autojoin option. However, I was not able to figure out where the option was located within the database.
       
      Any help or suggestions would appreciated.
      EDIT:
      I was able to solve the issue on my own. I did a google search and found that I needed to change he flag for the field from 1 to 0 in the database, and everything works again.

      Here is the post that I found that solved the issue.
       
       
    • By ceberlin
      A conversation from the GITHUB issues which I thought to be useful to have in this Forum's knowledge base.
      Ryan Cramer put some valuable information there, which should be searchable in this forum, i.m.h.o.
      My Summary:
      Use the "autojoin" option on pagefield (when allowing multiple pages) with care. 
      There could be unwanted side effects especially if you use more than one of these fields with that setup on a single template.
      Original conversation:
      tbba asked:
      I have 2 page fields (one is the clone of the other - so they are unique apart from the name).
      Both are used on a template.
      The first field DOES remember the (dragged) sort order of the pages, the second does NOT.
      Has anyone else have the problem too?
      Is this a cache problem, or corrupt indexes?
      PW latest dev / php 5.6 fastcgi / Safari+Firefox
      Update: If I change the display order of those 2 fields in the template, both fields behave the opposite:
      So the first (whatever the first field is in the template when editing the page) always WORKS the second NOT. (The 2nd field lets me still adding or deleting a page - just the order is corrupt there.)
        Update: On localhost everything works (same php version but has opcache off in MAMP - maybe this info is important)
      Update: If I switch "autojoin" off for those page fields, everything works everywhere. 
      This is my workaround for now.
      (I remember that "autojoin" has been a problem for me with page fields already in much earlier versions of PW on many sites so it is not a temporary phenomena.)
      ryancramerdesign commented If you need to retain a sort order, you'd only want to autojoin one of the page fields. Autojoin causes everything to be loaded in one query, and there can't be two different sorted result sets included in that query as far as I know. Though it's interesting it works on your localhost and not on the other server. There are some MySQL versions known to have issues with sorting and certain queries, and you may be hitting up against that on one server and not the other. Though I can definitely see how auto joining two Page fields in one query would pose challenges to retaining the sort order. So in either case, I would limit your autojoin to just one Page field, or simply not use Autojoin for page fields at all. There's not a major benefit in doing so.  
      tbba commented MYSQL on the live server is 5.6 and on localhost/MAMP it is still 5.5.
      I am very glad you can confirm that this issue is not a phantom of a defect system and can be 100% nailed down to "Autojoin". 
      I used Autojoin under the assumption that it indexes a field and makes certain searches (menu building) quicker. (Since the new cache options, I am not worried with speed on certain searches any more.)
      If Autojoin has, in the best case, no "major" effect with page fields and in the worst case totally confuses the system, why is the option nor switched off and deactivated for this field type? 
      Is Autojoin only bad for page arrays or even if the page field is set to accept only one page? 
      Are there other fields that also have a problem with "Autojoin"?
      (I must say this "mystery bug" - which sometimes happens and sometimes not - was quite a debugging nightmare and other users should be warned/hindered to set-up such a combination, i.m.h.o.)
      ryancramerdesign commented
      I am very glad you can confirm that this issue is not a phantom of a defect system and can be 100% nailed down to "Autojoin".
        Actually I can't nail it down to autojoin, but outside of using it with a simple text or number field, autojoin is an advanced option that can be affected by a lot of different factors. It's one of those things we suggest using when you find it helps, and don't use when you find it interferes. When you autojoin a multi-value field, MySQL is performing what's called a group_concat, and is at the mercy of certain MySQL memory buffers and configuration settings. It's very possible you are hitting up against that too, which is made all the more likely by auto joining two multi-value fields. That would explain why you see it on one server and not another.
      I used Autojoin under the assumption that it indexes a field and makes certain searches (menu building) quicker. (Since the new cache options, I am not worried with speed on certain searches any more.)
      It can make things quicker but it can also slow them down. It really depends on the case. The only way to tell for certain in a given instance is by testing and timing it. I do know for certain that auto-joining one or two text fields (like title and summary, for example) does provide a performance benefit to large navigation lists that use the autojoined fields, though not likely one that you would feel unless loading hundreds of thousands of pages. 
      If Autojoin has, in the best case, no "major" effect with page fields and in the worst case totally confuses the system, why is the option nor switched off and deactivated for this field type? 
      Because it is an advanced option that may be beneficial. If auto joining one multi-value field, like a small list of category page references, you may find it beneficial when outputting a 100 item navigation list. Or you may not. It really does come down to installation specific configuration, so it's one of those things you want to test when you use it to make sure it's working and worthwhile for your case. I wouldn't want to disable the option for this Fieldtype, because I think it can be worthwhile. But it's on the advanced tab for a reason. 
      Is Autojoin only bad for page arrays or even if the page field is set to accept only one page? 
      Are there other fields that also have a problem with "Autojoin"?
      Autojoin should be fine for Page fields, especially single page fields. What I would avoid is giving several multi-value Page fields autojoin on the same template. You are more likely to run into issues there, but again, the only way to know for certain if it's going to be an issue on your installation is to test it. If you need something guaranteed to be portable across systems, then limit autojoin use to single-value text fields and such. Any multi-value Fieldtype has potential to run into installation-specific limits that could cause issues, so always treat it as an advanced option and test.
    • By adinuno
      Hi there,
      Today I was testing Autojoin and Global features on fields. I have a repeater field and I activated Global and Autojoin and processwire now shows the following message:
      Unable to complete this request due to an error. Error has been logged.   The error log was this:
      2013-10-31 01:36:08    ?    http://localhost/?/    Error:     Exception: Could not execute User::__construct() (in /Applications/XAMPP/xamppfiles/htdocs/cms/wire/core/Pages.php line 323)   How can I solve this problem? I have searched the site database, but I couldn't find this option the autojoin option to toggle it manually.
      Thanks in advance
×
×
  • Create New...