Searching for pages with $=

btcleigh · June 10, 2011

Somewhat related to this discussion, in addition to ~= and *= there's also the $= operator, which runs a LIKE instead of MATCH.

For a site search (where modifying the MySQL index isn't an option) I'm intending to use something along the lines of:

$grabbedPages = $page->children("title~=" . $term);
if($grabbedPages->count() == 0){
    $grabbedPages = $page->children("title$=" . $term);
}

So first it searches using the FULLTEXT index, but if there's no matches then it jumps in and does a slow LIKE match.

The problem I'm having is in core on line 72 is:

$query->where("$tableField LIKE '%$v'");

Which isn't ideal for searches, it's preferable for it to be along the lines of

$query->where("$tableField LIKE '%$v%'");

I was curious if anyone knows if there's a dependancy on the line somewhere else in core that would break if it changed (did a quick flip through the core but hadn't found anything else relying on $=). Or for that matter any modules that could be disrupted.

Cheers

ryan · June 10, 2011

Good question. The $= and ^= search operators are actually designed to match text at the end ($) or beginning (^) of a field. They are supposed to serve the equivalent role of what they do in jQuery. The LIKE is only supposed to run as a secondary part of the query. In this case, the LIKE can be pretty fast because MySQL will only run the LIKE on the results of the MATCH as part of the same query. The reason you see LIKE '%$v' is because that is the proper way to match something at the end of a text field (and LIKE '$v%' matches at the beginning). MATCH can't actually determine where something is in the field, so that's where the LIKE comes into play ... it does a MATCH and then filters those results through the LIKE in the same query. A regular LIKE query can be pretty slow, but a MATCH+LIKE query can potentially be very fast.

Everything described above is specific to ProcessWire 1.x. I haven't yet fully implemented the $= and ^= search operators in PW2, nor are they documented anywhere. So I'm pretty impressed by your question–you've done research. The plan was that I would eventually implement them in PW2, but they are more specific-purpose, so it just hadn't come up yet. What you see in the PW2 source is for placeholder purposes until it can be finished. I'm not in front of the code right now, but I don't think $= (or ^=) is functional... or if it is in some way, it wasn't intended.

It sounds like you are looking for something that matches anywhere in a field, but does a LIKE without a MATCH. This is something that PW doesn't have currently, but I think it may make sense for us to add it? Perhaps as "%=" as the search operator (which nods to the LIKE syntax). True it'll be slow... but we'd use it with that understanding, and sometimes it really doesn't matter if it's slow. I think we can add it pretty easily.

btcleigh · June 10, 2011

Yep, been digging around a bit in the guts to figure out how the selectors have been implented. Didn't quite put 2 and 2 together though and figure out it's intended use for matching at the end of text.

In any case, for the code example above it's working fine. BTW, at the moment ^= is doing the same as *=, but $= has its own case.

To try and hack as little as possible I'm going to add a separate handler based on your suggestion of %= into DatabaseQuerySelectFulltext.match to implement the LIKE '%text%'. I should probably add an exception handler to my little search routine to catch the 'Unimplmented operator' WireException in case the core gets updated and my %= case isn't there any more.

Thanks for the info Ryan. Might do a bit of testing on how %= works in more complex selectors and get a patch organised.

ryan · June 10, 2011

This was a really simple thing to add so I went ahead and put it in, and it's now in the latest commit of PW 2.1:

https://github.com/ryancramerdesign/P21/commit/f1c848a9a3766bf031e88bf7f10b667495eb26a3

In testing it on a site with 3,000 'bodycopy' fields (about 12 megs worth), I found it was faster than I expected. It's finding matches in about 0.2 seconds in this relatively large site. Though it depends on the search, and I'm suspicious MySQL might be optimizing some of them to MATCH/AGAINST queries. But all in all, it seems like it's a pretty practical solution for a site that isn't going to grow massive. If you need to match short words or stopwords, %= is going to be pretty darn useful, so thanks for bringing this up.

While I was in the code, I also completed the ^= and $= search operators, so they should be fully functional now too. I use RLIKE for these rather than LIKE so that it could skip over potential markup tags or punctuation at the beginning or end of the field (something that couldn't be done with LIKE).

ryan · June 10, 2011

Documentation updated to cover these as well:

http://processwire.com/api/selectors/

btcleigh · June 16, 2011

That's awesome Ryan, thanks for adding that in.

Sign In

Searching for pages with $=

Recommended Posts

btcleigh

Link to comment

Share on other sites

ryan

Link to comment

Share on other sites

btcleigh

Link to comment

Share on other sites

ryan

Link to comment

Share on other sites

ryan

Link to comment

Share on other sites

btcleigh

Link to comment

Share on other sites

Create an account or sign in to comment

Create an account

Sign in

Recently Browsing 0 members

Browse

Activity

My Activity Streams

Support

Store

My Details