Jump to content

How to turn on autojoin for user-defined fields (or mimic similar functionality) for single query on the fly?


teppo
 Share

Recommended Posts

I'm developing a module that lets user specify which fields she wants to show in the result list (which is intended for print use, but that's not very relevant here.) My problem is that when multiple fields are selected SQL queries tend to pile up which results in sluggish page load times.

Currently matching pages (selector is also generated based on user input, but that's another story) are loaded with $pages->find() and then looped through with foreach and rest of the fields (which, as I mentioned above, are actually chosen by user) are fetched individually. This is where things get really slow, especially if user has specified 10+ fields to show :)

One solution I've used to make things a bit faster is that most commonly used fields have autojoin setting on and thus PW doesn't require another query to fetch their content. Problem with this is that same data is used elsewhere in simple list views, which get way more hits than my little module here does and if I make all the fields autojoin it would seriously cripple those other views and in the long run possibly suffocate whole server..

Long story short, what I'm looking for is a way to turn on autojoin for certain fields on the fly before fetching pages OR mimic similar feature some other way. Any ideas how to achieve something like this are more than welcome - and don't hesitate to tell me that I'm doing it all wrong, if you know a better way to achieve same results.. :)

Link to comment
Share on other sites

You might try creating a new field using the FieldtypeCache type. This isn't installed by default, but it is included with PW, so you'll just need to click "install" from the modules menu.

This field essentially keeps the values from several fields in one field, so that they can all be loaded at once as a group. Like all caches, it's creating duplication and trading disk space for speed. These cache fields are also handy if you are building a text searching engine that needs to scan lots of fields at once... your selector can query just the cache field rather than "field1|field2|field3|etc".

Once installed, create the new "cache" field, click to its "details" tab and follow the instructions there. Don't make it autojoin. Instead, have your output generation code reference the cache field field (like $page->my_cache) before any of the others that are triggering queries. This should enable you to reduce it to just one query for all those fields.

I'll be honest and say I created this fieldtype with many good intentions, but have rarely found it necessary to use. So be certain the situation you describe really is the bottleneck, and that you aren't omitting pagination or accidentally referencing something like $page->children() or $page/s->find() without a "limit=". The reason that I mention it is that MySQL selects are extremely fast, to the point where the difference between autojoin and non-autojoin may require some scale before you can measure the difference. If you are query counting, avoid doing that, as it often doesn't correlate with performance, especially in ProcessWire. (I used to be a big query counter until I read the book High Performance MySQL, which is reflected in PW's DB query approach).

One other idea you might look at is temporarily convincing the fields in question at runtime that they are autojoin, before loading your pages:

foreach($selectedFields as $fieldName) {
 $field = wire('fields')->get($fieldName);
 $field->flags = $field->flags | Field::flagAutojoin; 
}

That's assuming all the fields in question support autojoin. Some fields, like files/images, can't be autojoined and telling it to may cause unpredictable results.

  • Like 2
Link to comment
Share on other sites

One other idea you might look at is temporarily convincing the fields in question at runtime that they are autojoin, before loading your pages

Thanks, Ryan -- this is actually exactly what I had in mind, though I didn't think it'd actually work. Going to give that a try now. I did consider using FieldtypeCache too, but since (unless I'm mistaking something here) it would require me to cache all the fields used by this template I rejected that solution.

Sad thing here is that omitting pagination / limit is intentional. This module I'm talking about produces a print-friendly view to page data and thus pagination / limits are unwelcome. Also: I'm trying not to look too much into query counter, but when there are 5000+ queries in a relatively simple situation, I'm afraid it does become a factor. Dealing with quite a lot of data here. :)

Link to comment
Share on other sites

The solution mentioned above actually helped shave seconds off some relatively complex queries. Still not fast enough, though, so I ended up using pagination anyway -- not how I'd like this to work, but I guess I can cope with that after all :)

Now I'm facing a new kind of problem though: multi-value get-parameters (param[]=x&param[]=y) don't seem to work properly with renderPager(). For an example this URL:

/processwire/custom-search/?print%5B%5D=brand_name&print%5B%5D=street_address&print%5B%5D=business_id

... is converted to this within pager links:

/processwire/custom-search/page2?print=brand_name%7Cstreet_address%7Cbusiness_id

Any ideas how to fix this, or should I just generate my own pager markup?

Link to comment
Share on other sites

The solution mentioned above actually helped shave seconds off some relatively complex queries. Still not fast enough, though, so I ended up using pagination anyway -- not how I'd like this to work, but I guess I can cope with that after all

Glad to hear that shaved off a few seconds! I think the only way to possibly go faster would be with direct SQL queries. Though it sounds like you are dealing with a large quantity of data and the scale is probably going to be a bottleneck no matter how it's done.

Any ideas how to fix this, or should I just generate my own pager markup?

I'm not sure I know exactly why it's doing that, though I've also never used array[] style in GET vars. But think the simplest solutions would be to either 1) str_replace it to where you need it, or 2) use commas to separate values in the URL rather than repeating the same var[]. For instance:

// setting
$url = $page->url . '?print=' . implode(',' $print); 

// getting
$print = explode(',', $input->get->print); 
Link to comment
Share on other sites

I guess it would be possible to avoid this problem by adding some kind of a preprocessor for certain predefined attributes; convert their values into arrays with explode if they originally were strings with certain characters in them. That's kind of ugly though, so I'll have to think about this a bit.

Part of the problem, it seems, was self-made; my module whitelisted vars after they had gone through rather strict filtering and this is where "|" characters came from. Anyway, that didn't fix the underlying problem, which is that MarkupPagerNav can't handle input arrays properly. The problem seems to be caused by this part of setGetVars method of MarkupPagerNav.module:

foreach($this->options['getVars'] as $key => $value) {
   if(is_array($value)) {
       $a = $value;
       $value = '';
       foreach($a as $k => $v) $value .= "$v,";
       $value = rtrim($value, ", ");

   }
   ...

If an array is passed as a GET var to MarkupPagerNav, it's automatically converted to one string separated by commas. This might be considered intended behavior, but it's still effectively making the use of GET arrays with MarkupPagerNav more difficult than it should be. Personally I'd prefer to keep them as they were (with urlencode included, naturally) -- or perhaps add a toggle of some sort for this.

What do you think? :)

Edit: just to clarify what I'm actually talking about here these are the changes I've made to my local MarkupPagerNav.module:

--- modules/Markup/MarkupPagerNav/MarkupPagerNav.module (revision ...)
+++ modules/Markup/MarkupPagerNav/MarkupPagerNav.module (working copy)
@@ -224,14 +224,12 @@
		 $queryString = "?";
		 foreach($this->options['getVars'] as $key => $value) {
				 if(is_array($value)) {
-							 $a = $value;
-							 $value = '';
-							 foreach($a as $k => $v) $value .= "$v,";
-							 $value = rtrim($value, ", ");
+							 foreach($value as $k => $v) $queryString .= "$key%5B%5D=" . urlencode($v) . "&";
+					 } else {
+							 // $queryString .= "$key=" . urlencode(htmlspecialchars($value)) . "&";
+							 $queryString .= "$key=" . urlencode($value) . "&";
				 }
-					 // $queryString .= "$key=" . urlencode(htmlspecialchars($value)) . "&";
-					 $queryString .= "$key=" . urlencode($value) . "&";
		 }
		 $this->queryString = htmlspecialchars(rtrim($queryString, "?&"));
 }
Link to comment
Share on other sites

I'm good with a toggle. I don't want to change the existing default behavior, just because it's been there awhile, and some people may be relying on it (including me). So dont want to produce any surprises. But I've added a toggle called 'arrayToCSV' that you can put in the $options. If you set that to false, then it'll put it in the format that you prefer. I've attached a copy of this file. Can you test to confirm that this works the way you were thinking and let me know? If all looks good, I'll commit it to the source.

MarkupPagerNav.module

  • Like 1
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

×
×
  • Create New...