Site search


Fluid Studios
Hi,

I'd like to also return an excerpt from the field which matched.

 

<?php namespace ProcessWire;

/**
 * Search.
 *
 * Global site search.
 *
 * @author snipped.
 * @version 1.0.0
 */

// Retrieve the GET request.
$query = $sanitizer->text($input->get->query);

if (!$query) {
    return;
}

// Sanitise the user input.
$query = $sanitizer->selectorValue($query);

// Instantiate the Search object.
new Search($query);

/**
 * Search.
 *
 * Return results from matching pages.
 *
 * @param string $query The sanitised search query.
 */

class Search {

    protected $query;
    protected $excludedFieldTypes;
    protected $templates;
    protected $matches;
    protected $output;

    public function __construct($query) {
        $this->query = $query;
        $this->excludedFieldTypes = ['FieldtypeImage', 'FieldtypeSelect', 'FieldtypeMediaManager', 'FieldtypeFieldsetTabOpen', 'FieldtypeFieldsetClose', 'FieldtypeRepeater', 'FieldtypeCheckbox', 'FieldtypeModule', 'FieldtypePassword'];
        $this->templates = $this->getTemplates();
        $this->matches = $this->getMatches();

        $this->renderResults();
    }

    /**
     * getTemplates
     *
     * Return an array of templates & fields to be searched.
     *
     * @return array Template names mapped to their searchable fields.
     */

    public function getTemplates() {

        $templates = [];

        foreach (wire('templates') as $template) {

            $fields = [];

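            foreach ($template->fields as $field) {

                // Exclude certain field types. Cast to string so the
                // Fieldtype object is compared by its class name.
                if (in_array((string) $field->type, $this->excludedFieldTypes)) {
                    continue;
                }

                $fields[] = $field->name;
            }

            // Skip templates with no searchable fields; an empty field
            // list would produce an invalid selector.
            if (!$fields) {
                continue;
            }

            // Join the field names with "|" so the selector matches any of them.
            $fields = implode("|", $fields);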

            $templates[$template->name] = [
                'template' => $template->name,
                'fields' => $fields,
            ];

        }

        return $templates;

    }

    /**
     * getMatches
     *
     * Return an array of matching pages.
     *
     * @return array The matching pages.
     */

    public function getMatches() {
        $matches = [];

        foreach ($this->templates as $template) {
            $selector = "template=" . $template['template'] . ", " . $template['fields'] . "%=$this->query";

            $results = wire('pages')->find($selector);

            foreach ($results as $result) {
                array_push($matches, $result);
            }

        }

        return $matches;

    }

    /**
     * renderResults
     *
     * Render the results of the search.
     */

    public function renderResults() {
        if (!$this->matches) {
            echo 'No results.';
            return;
        }

        foreach ($this->matches as $match) {
            $this->output .= '<p>' . $match->title . '</p>';
            $this->output .= '<p>' . $match->url . '</p>';
            // Placeholder for the excerpt: this is empty unless the page has a field named "field".
            $this->output .= '<p>' . $match->field . '</p>';
        }

        echo $this->output;
    }

}

/**
 * debug
 *
 * Helper method for visualising arrays.
 */

function debug($variable) {
    echo '<pre>';
    print_r($variable);
    echo '</pre>';
}

 


12 hours ago, Fluid Studios said:

Thanks for the quick reply, this isn't too relevant to me as I designed the search system to query every field on every template, not just the body field.

As per the last comment in the linked thread, my approach is to use a Pages::saveReady hook to save all text content on each page to a hidden "index" field in the template. Then there is just a single field to search and pull excerpts from.

I have a module that I'll get around to releasing one of these days, but the basic idea is that in the saveReady hook you loop over all fields in the page, get the text content from each field depending on its field type (e.g. strip markup from CKEditor fields, get descriptions from an images field, loop over subfields in a Repeater/PageTable, etc.) and save that to the index field.
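A minimal sketch of the indexing step described above. This is not the unreleased module: `buildIndexText()` is a hypothetical helper, and the field name `index` and the hook wiring in the trailing comment are assumptions. It only covers the "flatten to plain text" part, with nested arrays standing in for Repeater/PageTable subfields.

```php
<?php
// Hypothetical helper (not from the module mentioned above): flatten a
// list of raw field values into one plain-text string for an index field.
function buildIndexText(array $values): string
{
    $parts = [];
    foreach ($values as $value) {
        if (is_array($value)) {
            // Nested values, e.g. the subfields of one Repeater item.
            $text = buildIndexText($value);
        } else {
            // Put a space before each tag so "<p>a</p><p>b</p>" does not
            // collapse to "ab", then strip the markup and decode entities.
            $text = str_replace('<', ' <', (string) $value);
            $text = html_entity_decode(strip_tags($text), ENT_QUOTES, 'UTF-8');
        }
        // Collapse runs of whitespace and trim.
        $text = trim(preg_replace('/\s+/', ' ', $text));
        if ($text !== '') {
            $parts[] = $text;
        }
    }
    return implode(' ', $parts);
}

// Possible wiring in /site/ready.php (assumes a textarea field named
// "index" exists on the template; collecting $values per field type is
// left out here):
//
// $wire->addHookAfter('Pages::saveReady', function(HookEvent $event) {
//     $page = $event->arguments(0);
//     $page->index = buildIndexText($values);
// });
```

Because the hook runs just before the page is written, the index stays in step with the content on every save. Pages saved before the hook existed would need a one-off re-save to get indexed.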


17 hours ago, Robin S said:

As per the last comment in the linked thread, my approach is to use a Pages::saveReady hook to save all text content on each page to a hidden "index" field in the template. Then there is just a single field to search and pull excerpts from.

I have a module that I'll get around to releasing one of these days, but the basic idea is that in the saveReady hook you loop over all fields in the page, get the text content from each field depending on its field type (e.g. strip markup from CKEditor fields, get descriptions from an images field, loop over subfields in a Repeater/PageTable, etc.) and save that to the index field.

Thanks. Could collating the fields cause cross-field excerpts?

I like the idea of adding a hook to page saves, thanks.

 


9 hours ago, gmclelland said:

I would certainly be interested in seeing how that is done.  I've always wondered how to search pages that are built in sections with Repeaters/PageTables/PageReference fields.

There's a bit more work needed before my module could be released publicly but I'll put it on my "to do" list.

 

8 hours ago, Fluid Studios said:

Thanks. Could collating the fields cause cross-field excerpts?

Yes, but in my use cases that's a good thing. This approach only suits certain types of search needs, where you want to implement a broad text search across most/all content. In my cases I want the search to match as many pages as possible, so I explode the search phrase on space characters and then match each term with the %= (LIKE) operator. That catches partial words too, and the order of the terms doesn't matter:

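$selector = ''; // initialise before appending
$search_terms = $sanitizer->selectorValue($input->get->search);
$search_terms = preg_replace("/\s+/", " ", $search_terms); // replace multiple spaces with single space
$terms = explode(' ', $search_terms);
foreach($terms as $term) {
	if($term === '') continue; // skip empty terms, e.g. from an empty search
	$selector .= "index%=$term, "; // every term must match somewhere in the index field
}
//...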

And I want the excerpt to be Google-like in that it deals with the page's text content as a whole rather than caring about what fields were used behind the scenes.
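One way such an excerpt could be cut from the index text, as a rough sketch only. `makeExcerpt()` and its `$radius` parameter are hypothetical names, not part of ProcessWire or of the module discussed here. It uses byte-based string functions for brevity, so multibyte content would want the `mb_*` equivalents.

```php
<?php
// Hypothetical helper: return a short excerpt of $text centred on the
// earliest occurrence of any search term, or the start of the text if
// none of the terms is found. Byte-based; matching is case-insensitive
// for ASCII only.
function makeExcerpt(string $text, array $terms, int $radius = 60): string
{
    $pos = null;
    $len = 0;
    foreach ($terms as $term) {
        if ($term === '') {
            continue;
        }
        $found = stripos($text, $term);
        if ($found !== false && ($pos === null || $found < $pos)) {
            $pos = $found;
            $len = strlen($term);
        }
    }
    if ($pos === null) {
        $pos = 0;
    }
    // Take $radius bytes of context on each side of the match.
    $start = max(0, $pos - $radius);
    $end = min(strlen($text), $pos + $len + $radius);
    $excerpt = substr($text, $start, $end - $start);
    // Mark truncation on whichever side was cut.
    if ($start > 0) {
        $excerpt = '...' . $excerpt;
    }
    if ($end < strlen($text)) {
        $excerpt .= '...';
    }
    return $excerpt;
}
```

Called with the page's index text and the same exploded terms used for the selector, this gives an excerpt that ignores field boundaries, which is the cross-field behaviour discussed above.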


I've been using a very similar solution at weekly.pw, i.e. indexing data into a field called "search_index". The content is mostly built using PageTable, and I'm indexing content from those for the parent page's search_index field as well. No Repeaters or RepeaterMatrix fields here, but they'd require similar processing.

The whole solution is built into some methods placed in /site/init.php. Might be useful for someone, so here's the code: https://gist.github.com/teppokoivula/83036c6e73d7460be7706def620d80d4.

Note that in this case I'm not using the same index as an excerpt, but rather the summary field of the page. Excerpts would be better, and I will probably add them at some point. Once you have the index at hand, it's relatively simple.

Another thing to note here is that I'm currently using the same index field to store links in link:url format. Since tags are stripped from the index text, URLs wouldn't otherwise be a part of the index, and this also allows me to perform specific link queries, such as https://weekly.pw/search/?q=link:https://modules.processwire.com/modules/sanitizer-transliterate/
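The link:url idea could look roughly like the following. This is an illustrative sketch, not the code from the gist linked above: `extractLinkTokens()` is a hypothetical helper, and the regular expression only handles quoted `href` attributes.

```php
<?php
// Hypothetical helper: collect href values from an HTML string and
// return them as "link:URL" tokens that can be appended to the index
// text before the tags are stripped.
function extractLinkTokens(string $html): array
{
    $tokens = [];
    // Match <a ... href="..."> or <a ... href='...'>; group 2 is the URL.
    if (preg_match_all('/<a\s[^>]*href=(["\'])(.*?)\1/i', $html, $m)) {
        foreach ($m[2] as $url) {
            // Decode entities such as &amp; so the token holds the real URL.
            $tokens[] = 'link:' . html_entity_decode($url, ENT_QUOTES, 'UTF-8');
        }
    }
    // Drop duplicates while keeping a plain list.
    return array_values(array_unique($tokens));
}
```

The tokens would be collected from the raw field markup first and appended to the plain-text index afterwards, since stripping tags removes the URLs.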
