
Advice on which selector to use for general site search


DrQuincy

I have an older site where the client wishes to add a general site search: a single search box that will search everything. I have done this before by adding a hidden text field to each searchable template and saving a plain-text version of the page into it whenever the page is saved (using hooks).

I then run the search using ~= for fulltext. If the search term is shorter than the minimum word length for natural language search on that engine, I switch to %= and do a LIKE search instead (although searches must always be at least three characters).
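A minimal sketch of that switching logic, written in Python purely for illustration (the site itself would use ProcessWire's PHP API). The function name `choose_operator` and the per-word length check are assumptions; the minimum lengths are the MySQL defaults (`ft_min_word_len` = 4 for MyISAM, `innodb_ft_min_token_size` = 3 for InnoDB) and may differ on a given server.

```python
from typing import Optional

# MySQL defaults; a server may be configured differently.
MIN_WORD_LEN = {"myisam": 4, "innodb": 3}

# Site rule from the post: searches must be at least three characters.
MIN_QUERY_LEN = 3


def choose_operator(query: str, engine: str = "myisam") -> Optional[str]:
    """Return the selector operator to use, or None if the query is too short."""
    words = query.split()
    if not words or len(query.strip()) < MIN_QUERY_LEN:
        return None
    min_len = MIN_WORD_LEN[engine]
    # Assumption: if any word is shorter than the fulltext minimum, a
    # natural language search would skip it, so fall back to LIKE (%=).
    if any(len(word) < min_len for word in words):
        return "%="
    return "~="
```

With these defaults, `choose_operator("Acme 123")` falls back to `%=` on MyISAM because `123` is only three characters, whereas on InnoDB both words meet the minimum and `~=` is kept.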

This site (which uses MyISAM), however, is in the engineering industry, and as well as the usual pages (about, news, etc.) they have products and manuals that they want to be searchable.

I am wondering: if a user searches for Acme 123, will it only search for Acme, since 123 is too short for MyISAM, rather than give more weight to the specific 123 product? I guess it depends on the selector (see below).

I'm wondering how to approach this. In my custom work I tend to write a query that does a MATCH AGAINST with the phrase unquoted and then again with the phrase wrapped in double quotes, giving more weight to the latter via the relevancy score (you can do this with InnoDB without having to use BOOLEAN MODE).
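The weighted query described above could be put together roughly as follows. This is only a sketch in Python that builds the SQL text and its parameters: the table and column names passed in, the `id` column, the function name `weighted_match_sql` and the weight of 2.0 are all hypothetical, and it relies on the observation above that a double-quoted phrase is honoured without BOOLEAN MODE.

```python
def weighted_match_sql(table, column, phrase, phrase_weight=2.0):
    """Build a relevance-ordered query scoring the quoted phrase higher.

    `table` and `column` are interpolated directly, so they must come from
    trusted code, never from user input. The search text itself is passed
    as bound parameters (%s placeholders, as used by common MySQL drivers).
    """
    # Drop any quotes typed by the user so the phrase can be re-quoted cleanly.
    cleaned = " ".join(phrase.replace('"', " ").split())
    match = f"MATCH({column}) AGAINST (%s)"
    sql = (
        f"SELECT id, {match} + %s * {match} AS relevance "
        f"FROM {table} "
        f"WHERE {match} "
        "ORDER BY relevance DESC"
    )
    # Parameter order follows the placeholders: plain words, weight,
    # quoted phrase, then plain words again for the WHERE clause.
    params = (cleaned, phrase_weight, f'"{cleaned}"', cleaned)
    return sql, params
```

The returned pair would be handed to a driver's `execute(sql, params)`; rows that contain the exact phrase pick up the extra weighted score and sort first.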

The problem is natural language is better for general searches but LIKE is possibly better for searching product names. I'm not sure there's an easy solution other than to add a checkbox “Exact match only (useful for product searches)” — unless there is a way to do it with fulltext selectors.

I've noticed that newer versions of ProcessWire offer a wider selection of selectors. Here are the relevant ones:

FULLTEXT
*=    Contains phrase/text    Given phrase or word appears in value compared to.
~=    Contains all words    All given whole words appear in compared value, in any order.
~*=    Contains all partial words    All whole or partial words appear in value, in any order.*
~~=    Contains all words live    All whole words and last partial word appear in any order.*
~|=    Contains any words    Any given whole words appear in value, in any order.*
~|*=    Contains any partial words    Any given whole or partial words appear in value, in any order.*
**=    Contains match    Any given whole words match against value.*

LIKE
%=    Contains phrase/text like    Phrase or word appears in value compared to, using like.
~%=    Contains all words like    All whole or partial words appear in value using like, in any order.*
~|%=    Contains any words like    Any given whole or partial words appear in value using like, in any order.*

*Available in ProcessWire 3.0.160 or newer.

I'm just a bit lost as to which selectors to use. If you use the word-based fulltext selectors, do they still match phrases as well and give those a higher relevancy? Ideally, there would be a way to search for Acme 123 as an exact match first, then for Acme and 123 separately (123 would be ignored unless using InnoDB, which is fine), and still return the results by relevancy.
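One possible workaround, if a single selector cannot do this, is to run several selectors from strictest to loosest and merge the results in that order. The sketch below is Python for illustration only: the field name `search_text`, the helper names, and the simplified value quoting are assumptions (in ProcessWire itself the value would normally go through its own sanitizer), and the ordering is by tier rather than by a true aggregate relevancy score.

```python
def selector_value(query):
    """Very simplified quoting: strip quotes and commas, wrap in double quotes."""
    cleaned = query.replace('"', " ").replace(",", " ")
    return '"' + " ".join(cleaned.split()) + '"'


def tiered_selectors(field, query):
    """Selectors from strictest to loosest: phrase, all words, any words."""
    value = selector_value(query)
    # ~|= needs ProcessWire 3.0.160 or newer, per the table above.
    return [f"{field}{op}{value}" for op in ("%=", "~=", "~|=")]


def merge_tiers(tiers):
    """Merge lists of page IDs, keeping the first (strictest) occurrence."""
    seen = set()
    merged = []
    for tier in tiers:
        for page_id in tier:
            if page_id not in seen:
                seen.add(page_id)
                merged.append(page_id)
    return merged
```

For example, `tiered_selectors("search_text", "Acme 123")` gives one selector per tier, and `merge_tiers` on the three result lists keeps exact-phrase matches at the top without duplicating pages that also match the looser tiers.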

When using fulltext selectors can you combine them and have results returned by aggregate relevancy?

Any advice or experience would be appreciated.

 
