Jump to content
Anders

Full text search that gives height weight to mathces in the title than in the body?

Recommended Posts

I want to allow full text search on my site. There is a very nice solution that comes right out of the box:

$selector = "title|body~=$q, limit=50"; 

This works, but to make it even better I would want to give higher weight to pages where the search term occurs in the title, than if it just occurs in the body. After all, a page with the title "Wine from France" is probably the best match for the search "france wine". How do I accomplish this in ProcessWire?

I can see three possible paths, but I am not very fond of any of them:

  1. Do a direct SQL query, circumventing the API, along these lines. But I would prefer to abstract away the database layout if at all possible.
  2. Use something like ElasticSearch, but to be honest that would be to complicated to set up and maintain in the long run.
  3. Make multiple lookups, first for matches in the title, then for matches in the body, and merge and sort in PHP. My suspicion is that this would get complicated quite quickly. For instance, how do you deal with a page that has two of the three search terms in the title and the third in the body?

Is there a magic option four I should look into? Or are any of the above options better than the others? Any input is welcome!

Share this post


Link to post
Share on other sites
On 2/24/2020 at 8:40 AM, Anders said:

Is there a magic option four I should look into?

There's no magic option - you have to code your own search in the way that suits you.

If the number of search results is not huge and pagination is not required then you can get all the results where any field matches, and then divide off the pages that have matches in the title field, rendering those results before the others.

Otherwise you might want to use an SQL query, perhaps returning just an array of page IDs which you could then slice according to the pagination number and load to a PageArray via $pages->getById().

  • Thanks 1

Share this post


Link to post
Share on other sites

Thanks @Robin S! There can be more than 100 result, so I think I will go with the SQL then.

If I manage to make any progress, I will post some code here in case anyone else needs it.

  • Like 1

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Recently Browsing   0 members

    No registered users viewing this page.

  • Similar Content

    • By sebr
      Hi
      In my search page, I used a selector like this :
      $searchQuery = $sanitizer->entities($input->get('q')); $searchQuery = $sanitizer->selectorValue($searchQuery); $selector = 'title|subtitle|summary|html_body_noimg~=' . $searchQuery; $matches = $pages->find($selector); I don't have the same results if $searchQuery contains accent or not.
      For example,
      with « bâtiment » I have no result with « batiment » I have onea result : « Les bâtiments et les smart-city » Normally I should have the same results? How can I do that ?
      Thanks for your help
    • By SwimToWin
      I want to add a dependent SELECT field on my template page that lists pages from a parent "sub-page" in the current parent node.
      On /product1/page I have the field "photo" which is a SELECT field.
      I want the SELECT to list pages from /ROOTPARENT/photos.
      The idea is that I can reuse the same photo in many places - but only need to keep it update it once under /product1/photos.
      My page structure looks like so:
      /product1/page /product1/photos/photo3 (template=photos) /product2/photos/photo9 I have tried adding these Selector Strings on the Field (Setup -> Fields -> PHOTO -> Input tab -> Selectable Pages field group -> Selector String):
      parent=/product1/page, template=photos, sort=name WORKS (but only on children of current product). parent=page.rootParent ... parent=$page.rootParent ... parent=$page.rootParent parent=$parent ... parent=$parent1 When using a SELECT Input Field Type, the editing pages gives the fatal error "Unrecognized operator: $". parent=parent ... parent=. Returns an empty list How might I find child pages from the current "/product1/photos/ page"?
      Your inputs are appreciated. Thanks.
    • By celfred
      Hello,
      I've just upgraded to 3.0.165 (and updated my Ubuntu version as well) and on my localhost, I am facing a weird issue : all my requests having the ~= selector cause a Mysql error with this message :

      PDOException #HY000 SQLSTATE[HY000]: General error: 3685 Illegal argument to a regular expression. I have no idea what is going on. If I change my request from

      $visualizer = $pages->get("name~=visualizer"); // Triggers the error
      to
      $visualizer = $pages->get("name=visualizer");
      or
      $visualizer = $pages->get("name~*=visualizer");
      My code works fine again.
      Any idea ? Shall I change all my requests (but from what I understand by reading the documentation, ~= exists and fits my needs : all words in any order [though I do understand that in my example above it may be useless since I have only one word])
      Thanks !
    • By michelangelo
      Hello guys, I am building a sort of an archive. Relatively simple, although I have about 8000 records, each with 15 fields (text, int, images, url). I created a crude search system with a form (emulating the famous Skyscrapper example) to filter through the system. Everything works but it is quite slow... I have 2 questions which are related:

      1. How can I search through the database?
      2. What is a good practice to display many records like these?
      -----------------------------------------
      1. I am retrieving the results with
      $songs = $pages->findMany('template=nk-song'); Then I do a foreach to render them all. I am unsure if that is a good way. If I render all of them on the page, it creates thousands of divs with a bit of text, and this can take a while (10s-15s).
       
      2. This one is even worse :D as every time I retrieve my desired records with something like this:
      $page->find("field_to_search_through~=my_query_string") I get between 20 and 200, but when I render them I am creating iframes with YouTube videos and that can take up to 10s to finish. I "solved" it by only loading the iframes if they are in view with IntersectionObserver on the client-side. But I feel there is a more precise PHP / ProcessWire approach.
       
      Just to clarify, I started doing all of this custom rendering and querying because tools like ElasticSearch or SearchEngine were a bit complicated and I needed a simple to retrieve information and then display it in my own way.
      Thank you!
    • By snobjorn
      I have a website with multiple content types that I want to be accessible through search. I really like the live search on processwire.com, that sorts content types while typing. I tried to find the code to recreate this, with no luck. Does anyone know if this is jquery, specific jquery plugins, json/xml cached files, and what kind of PHP code is used? Any tip that point me in the right direction would be much apperciated.
      The search result listing seems fairly easy to create with sorting through parameters.
×
×
  • Create New...