Jump to content
LMD

TIP: Using "AND" selectors across multiple fields (named selectors to the rescue)

Recommended Posts

I don't know if this is the right place, it's not really a tutorial, just a tip based on notes I wrote myself in a recent project to get it straight in my own head. I thought it might be useful for others in a similar situation.

Scenario:

Create a search function that will search for keywords "foo" and "bar" in multiple fields, but the keywords do not have to be adjacent, in order, or even all in the same field. For eample, the selector must match if "foo" is in "field_a" and "bar" is in "field_b" -- so long as both keywords are present somewhere, the page match is valid.

It is possible to just split the terms and do multiple queries on each field separately and then combine the results into a single PageArray for pagination (I believe there is a module that helps with this). However, I wanted to see if it was possible to do a basic version with a single query.

Not The Solution:

The following selector does not work when keywords appear separately in different fields (operator '~=' - contains all the words):

$selector = "title|field_a|field_b~=foo bar";

What the selector is saying:

FIND BOTH "foo" AND "bar" IN title
OR
FIND BOTH "foo" AND "bar" IN field_a
OR
FIND BOTH "foo" AND "bar" IN field_b

In this case, both "foo" and "bar" have to be in the same field (but not adjacent or in order) to match.

The Actual Solution

What we need to use is "named selectors" to let us match each individual keyword separately while still using one selector.

Using the same example as before:

$selector = "selector1=(title|field_a|field_b~=foo), selector2=(title|field_a|field_b~=bar)";

What the selector is saying at its most basic level:

FIND BOTH selector1 AND selector2

Or, to expand on this, it is saying:

(FIND "foo" IN title OR field_a OR field_b)
AND
(FIND "bar" IN title OR field_a OR field_b)

Crucially, "foo" and "bar" do not have to be in the same field to match.

Practical Method

In this example code, I am actually allowing the search for phrases (using "quoted text") as well as individual terms, so a person could enter...

"foo bar" baz

... and it will keep "foo bar" together aa one term and "baz" as a separate term and match them as an exact phrase.

// Keywords obtained from $input->get and cleaned (multiple spaces removed)/sanitized etc.
$keywords = '"foo bar" baz';

// Split into individual search terms by space (preserve spaces in quoted text)
$terms = str_getcsv($keywords, " "); // array("foo bar", "baz")

// Build up named selectors
$ns = ""; // named selectors string
$i=1; // named selector count

foreach ($terms as $term) {
    // operator '*=' - contains the exact word or phrase
    $ns .= ", ns{$i}=(title|field_a|field_b*=" . trim($term) . ")";
    $i++;
}
//$ns = ", ns1=(title|field_a|field_b*=foo bar), ns2=(title|field_a|field_b*=baz)"

// Construct the whole selector (modify/add other general selectors as needed)
$selector = "template=my-template, limit=20, sort=-date" . $ns;

// Find pages based on selector
$results = $pages->find($selector);

 

DISCLAIMER

I haven't done any tests to see if this method is more efficient than running queries on each field separately and combining the results, I just wanted to see if it was possible!

  • Like 6

Share this post


Link to post
Share on other sites

I think you can also do:

$selector = "title|field_a|field_b~=foo|bar";

 

Share this post


Link to post
Share on other sites
1 hour ago, Tom. said:

I think you can also do:


$selector = "title|field_a|field_b~=foo|bar";

 

Not quite, that would return pages where only ONE of the search terms is present in any of the fields ("foo" OR "bar").

FIND EITHER "foo" OR "bar" IN title
OR
FIND EITHER "foo" OR "bar" IN field_a
OR
FIND EITHER "foo" OR "bar" IN field_b

So it would match a page where only "foo" is contained in any (or all) of the fields -- but "bar" must also be present somewhere.

Likewise, this would not work either:

$selector = "title~=foo|bar, field_a~=foo|bar, field_b~=foo|bar";

FIND EITHER "foo" OR "bar" IN title
AND
FIND EITHER "foo" OR "bar" IN field_a
AND
FIND EITHER "foo" OR "bar" IN field_b

In this case, "foo" or "bar" must be present in all the fields, which is also not what we want

Share this post


Link to post
Share on other sites

Thanks for the tip @LMD

It is worth to note the new "tags for fields" search "shortcut" of ProcessWire 3.0.106:
https://processwire.com/blog/posts/pw-3.0.106/

I guess title|field_a|field_b in your example could be simplified too by using this new feature. Provided it supports "named selectors".

Share this post


Link to post
Share on other sites

@LMD Thanks for this post. I'll definitely keep this bookmarked.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By celfred
      Hello,
      I've just upgraded to 3.0.165 (and updated my Ubuntu version as well) and on my localhost, I am facing a weird issue : all my requests having the ~= selector cause a Mysql error with this message :

      PDOException #HY000 SQLSTATE[HY000]: General error: 3685 Illegal argument to a regular expression. I have no idea what is going on. If I change my request from

      $visualizer = $pages->get("name~=visualizer"); // Triggers the error
      to
      $visualizer = $pages->get("name=visualizer");
      or
      $visualizer = $pages->get("name~*=visualizer");
      My code works fine again.
      Any idea ? Shall I change all my requests (but from what I understand by reading the documentation, ~= exists and fits my needs : all words in any order [though I do understand that in my example above it may be useless since I have only one word])
      Thanks !
    • By Anders
      I want to allow full text search on my site. There is a very nice solution that comes right out of the box:
      $selector = "title|body~=$q, limit=50"; This works, but to make it even better I would want to give higher weight to pages where the search term occurs in the title, than if it just occurs in the body. After all, a page with the title "Wine from France" is probably the best match for the search "france wine". How do I accomplish this in ProcessWire?
      I can see three possible paths, but I am not very fond of any of them:
      Do a direct SQL query, circumventing the API, along these lines. But I would prefer to abstract away the database layout if at all possible. Use something like ElasticSearch, but to be honest that would be to complicated to set up and maintain in the long run. Make multiple lookups, first for matches in the title, then for matches in the body, and merge and sort in PHP. My suspicion is that this would get complicated quite quickly. For instance, how do you deal with a page that has two of the three search terms in the title and the third in the body? Is there a magic option four I should look into? Or are any of the above options better than the others? Any input is welcome!
    • By iipa
      Hi!
      I have two Page Reference fields: Category and Subcategory. Category is parent of Subcategory, and can have 0 to n Subcategories. Fields are selected by dropdown selection.
      I'm trying to achieve following logic:
      1. Select Category (obviously works)
      2. Change Subcategory options based on selected Category (this works)
      3. Hide the Subcategory field if Category has no Subcategories (to prevent weird dropdown with nothing to select)
      I have tried to investigate two possible alternatives to achieve step 3:
      A) Make dynamic condition into Subcategory selection's "Only visible if..." field
      B) Make a hook that fires when Category changes, then hides Subcategory field if Category has no children
      So far I haven't been successful in either. Conditions I've tried always lead to hidden field, and I haven't been able to find set up a hook that fires on field change.
      So here's the question (finally): Is there a way to alter field visibility in API, for example in ready.php or inside a hook? In API I could loop through categories, find the ones that have children and then make a selector based on their IDs.
    • By schwarzdesign
      I'm working on a news feed that will show the most recent news in a full teaser grid, and all older news as a simpler archive-type list view. My selectors so far:
      $news_full = $page->children("template=news, limit={$page->feed_count_full}"); $news_archive = $page->children("template=news, start={$page->feed_count_full}, limit=9999"); The $page->feed_count_full field controls how many items to show in the teaser grid (I've confirmed it contains the correct value, and the $news_full selector works as intended).
      This works, but I don't like the limit in the second selector. Unfortunately, if I leave it out (i.e. I only specify a start, not a limit), the start is ignored and I get all news instead. Not a big problem as we will never have more than 9999 news, but it still bothers me, as semantically speaking I don't want to set a limit in this case.
      Is this the intended behaviour of start/limit selectors? Is there a cleaner way to specify an offset (start selector) without a limit?
      ProcessWire Version 3.0.123
      Thanks!
    • By pwFoo
      I played with the Selectors object and would like to use it for a special use case...
      That is a custom Selectors object with dummy data converted to an php array to see the structure (Selectors object is a WireArray with "fields" added)
      Array ( [0] => Array ( [0] => Array ( [field] => seg1 [value] => val1 [not] => [group] => [quote] => [forceMatch] => ) [1] => Array ( [field] => seg2 [value] => val2 [not] => [group] => [quote] => [forceMatch] => ) ) )  
      But instead of "seg1" the field name is "field" with value "seg1" (= my field name).
       
      So I can't search the Selectors WireArray (= custom WireArray with added Selectors objects) with PW "find('seg1=val1')", Is there a way to search with "find()" or build a simple wrapper to make the elements searchable / filterable with find()?
×
×
  • Create New...