SearchEngine


teppo

SearchEngine 0.10.0 was just released. Not a whole lot of stuff in this release – basically just one new feature, one minor addition to the default theme CSS, and some PHPDoc improvements.

### Added
- New Renderer::___renderResultsJSON() method for rendering search results as a JSON string.
- Additional CSS rules to make sure that visited links appear correctly in the default output.

While building an AJAX suggest search feature it occurred to me that it would be nice if SearchEngine could return search results as JSON out of the box. The newly added renderResultsJSON() method provides this capability, and the new settings results_json_fields and results_json_options allow customising what gets returned, and how.

More details (and an example of using this feature) in the README: https://github.com/teppokoivula/SearchEngine#json-output.


  • 4 weeks later...

Hello - Thanks for your great module!

I have a special need for the site I'm working on right now: is it possible to show, instead of the "result_summary_field", the content surrounding the search match in the search field itself, with the match highlighted?

So let's say there is this content in the search_index field: 'Hey - this is just a simple little test for the search engine!'
I am searching for the phrase 'little'.
The rendered result should then show something like: '... just a simple little test for the s...'

To be honest, my PHP skills are not good enough to understand all the files of your module, but as far as I can see, I'm afraid it won't be easy to change the code for my needs.

Thanks for any reply! 


On 9/24/2019 at 11:02 PM, planmacher said:

I have a special need for the site I'm working on right now: is it possible to show, instead of the "result_summary_field", the content surrounding the search match in the search field itself, with the match highlighted?

Assuming that I correctly understood your need, this would require dynamically generating the descriptions. Technically it should be doable by hooking into Renderer::renderResultDesc(), but you'd have to provide the entire description generation and highlighting logic yourself, so it might be a bit difficult to achieve in practice.

This is actually something that I've got planned, but I'm not really sure when I'll get to it. Anyway, it's good to know that there's demand for it.
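
For anyone wanting to experiment in the meantime, the description generation and highlighting logic could look roughly like the sketch below. It is plain PHP and not part of SearchEngine: excerptAround() and its $radius parameter are hypothetical names, and wiring the result into a Renderer hook is left out because that depends on the module's internals.

```php
<?php
declare(strict_types=1);

/**
 * Build a short excerpt around the first match of $query in $text and
 * wrap the match in <strong> tags.
 * Hypothetical helper for illustration; not part of the SearchEngine module.
 */
function excerptAround(string $text, string $query, int $radius = 20): string
{
    $pos = $query === '' ? false : mb_stripos($text, $query);
    if ($pos === false) {
        // No match: fall back to the beginning of the text.
        $fallback = mb_substr($text, 0, $radius * 2);
        $suffix = mb_strlen($text) > $radius * 2 ? '...' : '';
        return htmlspecialchars($fallback) . $suffix;
    }

    $length = mb_strlen($query);
    $start = max(0, $pos - $radius);
    $end = min(mb_strlen($text), $pos + $length + $radius);

    // Escape each part separately so that only our own highlight markup is output as HTML.
    $before = htmlspecialchars(mb_substr($text, $start, $pos - $start));
    $match = htmlspecialchars(mb_substr($text, $pos, $length));
    $after = htmlspecialchars(mb_substr($text, $pos + $length, $end - $pos - $length));

    return ($start > 0 ? '...' : '')
        . $before . '<strong>' . $match . '</strong>' . $after
        . ($end < mb_strlen($text) ? '...' : '');
}

$index = 'Hey - this is just a simple little test for the search engine!';
echo excerptAround($index, 'little', 15);
// prints: ... just a simple <strong>little</strong> test for the s...
```

The match is case-insensitive and only the first occurrence is used. Multiple search terms and word-boundary trimming would need extra work.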


Couple of updates during the past week, 0.11.0 and 0.11.1. I'll attach the changelog below, but here's a summary of changes:

  • Saving just a single field (via API or perhaps some async feature) should now correctly regenerate the search index.
  • Page Reference fields are supported: the title and name values of referenced pages are stored in the index. Page names can also be searched with the field name; if, say, you have a Page Reference field called "tags", you can find specific matches by searching for "tags:[tag-name]" ("tags:tailwind" etc.)
  • One relatively big change behind the scenes is that the search index is now generated after the page has been saved, by hooking into Pages::savedPageOrField and triggering a new save for just the index field, quietly and with hooks disabled.

Originally the module hooked into Pages::saveReady and generated the index so that it got saved along with any user-provided changes. It turns out that this could result in some rather obscure bugs: output formatting gets enabled on the fly (which is intentional, and required for the most representative search index), and this in turn triggered interesting side effects under some specific conditions. So far the new method hasn't resulted in any unexpected side effects as far as I can tell, but it's worth pointing out that this is a pretty big update.

## [0.11.1] - 2019-09-26

### Changed
- Index value gets saved in Pages::savedPageOrField instead of Pages::saved.

## [0.11.0] - 2019-09-26

### Added
- Support for indexing Page Reference fields.
- Support for indexing non-field Page properties (id, name).
- New hookable method Indexer::___getPageReferenceIndexValue().

### Changed
- Index value gets saved in Pages::saved instead of Pages::saveReady so that we can avoid messing with the regular save process.

### Fixed
- Fixed the "save" behaviour of the Indexer::indexPage() method.

 


Hi Teppo, is there a way to make the search language aware? I mean getting the right context from multilanguage fields per active language. Probably this would mean indexing per language?

If not, I wonder if at least all language text entries from a multi language field get indexed? - so a search in the non default language might find maybe too much, but not too little?


On 9/30/2019 at 11:01 AM, ceberlin said:

Hi Teppo, is there a way to make the search language aware? I mean getting the right context from multilanguage fields per active language. Probably this would mean indexing per language?

Sorry for the late reply. This is definitely something I'll have to figure out, one way or another, just haven't had the time (or need for that matter) yet.

I'll have to do some testing first and get back to this later.


  • 1 month later...

Hi Teppo, I'm close to building my first ever search feature in ProcessWire and wonder if you have made progress on the multilanguage side of your cool module?
Not a big deal anyway, otherwise I will try another approach.

Thanks!


Just a FYI @teppo I'll definitely need a site search on my next project so I'll try your module soon. I've found https://markjs.io/ today and it seems to be great (and MIT). Maybe this could be implemented into your module for highlighting the results? Or did you have any other solutions in mind?

Would you be interested in me bringing markjs into your module or would you prefer if I build something separate?


1 hour ago, bernhard said:

Just a FYI @teppo I'll definitely need a site search on my next project so I'll try your module soon. I've found https://markjs.io/ today and it seems to be great (and MIT). Maybe this could be implemented into your module for highlighting the results? Or did you have any other solutions in mind?

Would you be interested in me bringing markjs into your module or would you prefer if I build something separate?

Hey Bernhard!

SearchEngine already provides a way to highlight hits found from the generated summary, but since there's currently no built-in way to show dynamic summaries on the search page, this doesn't apply to the entire index. See here for an example: https://wireframe-framework.com/search/?q=wireframe.

Please let me know if I've completely misunderstood what markjs does – I'm interested, but not quite sure if it would be beneficial, and/or how it would differ from the current situation.

 


1 hour ago, 3fingers said:

Hi Teppo, I'm close to building my first ever search feature in ProcessWire and wonder if you have made progress on the multilanguage side of your cool module?
Not a big deal anyway, otherwise I will try another approach.

This is at the top of my to-do list now.


37 minutes ago, teppo said:

Hey Bernhard!

SearchEngine already provides a way to highlight hits found from the generated summary, but since there's currently no built-in way to show dynamic summaries on the search page, this doesn't apply to the entire index. See here for an example: https://wireframe-framework.com/search/?q=wireframe.

Please let me know if I've completely misunderstood what markjs does – I'm interested, but not quite sure if it would be beneficial, and/or how it would differ from the current situation.

Thx, seems that I missed that.


@ceberlin @3fingers

As of 0.12.0 SearchEngine supports multi-language indexing and searching. This is based on the native language features, so the results you see depend on the language of the current user etc. While I don't have a good test case at hand right now, I did some quick testing on one of my own sites and it seemed to work pretty much as expected – though please let me know if there are problems with the latest version.

What you need to do to enable this is convert the index field (which is by default called search_index) from FieldtypeTextarea to FieldtypeTextareaLanguage.


Thanks @teppo! You rock!

I will report back on my experience with your module as soon as my client gives me the OK for the implementation.

Once again, thanks!

Edit:

I found a couple of typos in the README inside the "manual approach" section (a double echo and a $searchEngine declaration referenced as $searchengine without camel casing). The corrected version is below:

<?php namespace ProcessWire;
$searchEngine = $modules->get('SearchEngine');
?>
<head>
    <?= $searchEngine->renderStyles(); ?>
    <?= $searchEngine->renderScripts(); ?>
</head>

<body>
    <?php
    // Note: results are rendered before form because this way the form instantly
    // has access to whitelisted query string (if a search was already performed).
    $results = $searchEngine->renderResults();
    $form = $searchEngine->renderForm();
    echo $form . $results;
    ?>
</body>

 

Also, in the "Advanced Use" section:

if ($q = $sanitizer->selectorValue($input->get->q)) {
    // This finds pages matching the query string and returns them as a PageArray:
    $results = $pages->find('search_index%=' . $query_string . ', limit=25');

    // Render results and pager with PageArray::render() and PageArray::renderPager():
    echo $results->render(); // PageArray::render()
    echo $results->renderPager(); // PageArray::renderPager()

    // ... or you iterate over the results and render them manually:
    echo "<ul>";
    foreach ($results as $result) {
        echo "<li><a href='{$result->url}'>{$result->title}</a></li>";
    }
    echo "</ul>";
}

 

Shouldn't the "$query_string" variable be "$q", or am I missing something?


I'm making progress!

Everything is working fine except for renderPager, which loses the query string on subsequent pages:

$searchEngine = $modules->get('SearchEngine');
$form = $searchEngine->renderForm();
echo $form;

$out= '';
if ($q = $sanitizer->selectorValue($input->get->q)) {
    $results = $pages->find('search_index%=' . $q . ', limit=25');
	foreach ($results as $result) {
	$out.= // my code here, everything is great and populated, even repeater matrix elements whaoo! :)
	}
}
$out.= $results->renderPager();
echo $out;

On page http://localhost:7880/my_url/root/it/search/?q=my_query everything is ok.
As soon as I go to page2 the url is http://localhost:7880/my_url/root/it/search/page2 (the query is gone and nothing more is shown).

Btw my "search.php" template has "Allow page numbers" in the backend set to ON.

Any clue?


18 hours ago, 3fingers said:

On page http://localhost:7880/my_url/root/it/search/?q=my_query everything is ok.
As soon as I go to page2 the url is http://localhost:7880/my_url/root/it/search/page2 (the query is gone and nothing more is shown).

Btw my "search.php" template has "Allow page numbers" in the backend set to ON.

Any clue?

The trick is to whitelist "q", after which MarkupPagerNav automatically sticks with it. Add this after you've validated the $q variable:

$input->whitelist('q', $q);

Note also that if this is your literal code, you should move the renderPager call into the if statement: if $q isn't set or validated, $results will be null, and this will cause an error.
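
Put together, the two points above could look like the sketch below. It assumes the code runs inside a ProcessWire template file, where $modules, $sanitizer, $input and $pages are provided. The list markup in the loop is only a placeholder for whatever gets rendered per result.

```php
<?php namespace ProcessWire;

// Sketch only: assumes a ProcessWire template file with the usual API variables.
$searchEngine = $modules->get('SearchEngine');
echo $searchEngine->renderForm();

$out = '';
if ($q = $sanitizer->selectorValue($input->get->q)) {
    // Whitelist the validated query so that MarkupPagerNav keeps it in pager links.
    $input->whitelist('q', $q);

    $results = $pages->find('search_index%=' . $q . ', limit=25');
    foreach ($results as $result) {
        // Placeholder markup; replace with your own per-result output.
        $out .= "<li><a href='{$result->url}'>{$result->title}</a></li>";
    }

    // The pager is rendered inside the if block, where $results is guaranteed to exist.
    $out = "<ul>{$out}</ul>" . $results->renderPager();
}
echo $out;
```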


Just released SearchEngine 0.13.0. This version adds more validation regarding the search index field: there's a warning if the field is of wrong type, an option to automatically create it if it doesn't exist (i.e. it has been removed after the module was installed), and there's a notice in the module settings screen if the index field exists and is valid but hasn't been added to any templates yet.

Additionally there's a fix for an issue where FieldtypeTextareaLanguage wasn't recognised as a valid index field type in module settings. This could've resulted in the index field setting getting unintentionally cleared if/when module settings were saved.


Hi @teppo, another gentle question for you.

I'm trying to style the pagination, using this code:

$config->SearchEngine = [

    // These are the settings I already used, successfully, elsewhere in the site to style my pagination.

    'pager_args' => [
        'nextItemLabel' => "Next",
        'previousItemLabel' => "Prev",
        'listMarkup' => "<ul class='pagination'>{out}</ul>",
        'itemMarkup' => "<li class='pagelink {class}'>{out}</li>",
        'linkMarkup' => "<a href='{url}'>{out}</a>",
        'currentLinkMarkup' => "{out}",
        'currentItemClass' => "current",
    ],

    // other settings here stripped out in this example.
];

Those settings seem to be ignored though. I'm surely missing something, can you point me in the right direction?


4 minutes ago, 3fingers said:

Those settings seem to be ignored though. I'm surely missing something, can you point me in the right direction?

If you're using the code you posted earlier, i.e. "$results = $pages->find(...)" and "$out.= $results->renderPager()", SearchEngine settings won't kick in at all. This is plain old native ProcessWire stuff, so you'd have to provide pager settings directly to the renderPager() method.

SearchEngine pager_args only apply if you're using its native rendering features. Though please let me know if I'm missing the point here!
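
As a rough sketch of the first option, the options from the earlier post can be handed straight to renderPager(). This again assumes a ProcessWire template file with $sanitizer, $input and $pages available. The option names are the ones quoted above; nothing here goes through SearchEngine's pager_args.

```php
<?php namespace ProcessWire;

// Sketch only: assumes a ProcessWire template file with the usual API variables.
if ($q = $sanitizer->selectorValue($input->get->q)) {
    $input->whitelist('q', $q);
    $results = $pages->find('search_index%=' . $q . ', limit=25');

    // Pager options are passed directly to renderPager() instead of $config->SearchEngine.
    echo $results->renderPager([
        'nextItemLabel' => 'Next',
        'previousItemLabel' => 'Prev',
        'listMarkup' => "<ul class='pagination'>{out}</ul>",
        'itemMarkup' => "<li class='pagelink {class}'>{out}</li>",
        'linkMarkup' => "<a href='{url}'>{out}</a>",
        'currentLinkMarkup' => '{out}',
        'currentItemClass' => 'current',
    ]);
}
```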


You're right! Passing them directly to the renderPager() method now works. Just to clarify, SearchEngine settings kick in, but not those dedicated to pagination.

Thanks @teppo, glad you're in the same time zone as me, you're always super fast.


  • 2 months later...

Hey @teppo great module thanks!

Would it be possible to add support for ProFields: Table fields? 

Also, is there (or can I request) a method to get the whole index for JSON output? I want to use this with a client-side fuzzy search library.

Currently my needs aren't that complicated, I am indexing two templates so I can just get the index field from those and the other data I need around it manually which is perfectly fine. Perhaps this is the correct way to do it.

 

 


Thanks @Mikie!

I'll definitely look into the table part.

There's no built-in way to get the entire index at the moment, but it might make sense to have one -- though what you're doing sounds like a perfectly good way to solve this for the time being. Anyway, I'll look into this as well.

Just for the record, the renderResultsJSON method was added primarily to solve client side search needs (but I'm guessing that's not exactly what you're after).


9 minutes ago, teppo said:

I'll definitely look into the table part.

Cheers! I am storing magazine style story credits (role, name, website url etc) in the Table. I feel that since Table only accepts text based fields this is an ok candidate for indexing. Can try to hack away at your module myself for now, no rush.

9 minutes ago, teppo said:

Just for the record, the renderResultsJSON method was added primarily to solve client side search needs (but I'm guessing that's not exactly what you're after).

Yeah, I originally was thinking I wanted to use a client-side library to handle the search itself (fuzzy style to handle misspellings, e.g. https://rawgit.com/farzher/fuzzysort/master/test.html), but now I am thinking that is overboard and probably confusing. Will most likely stick with your module / pw search, and just do client side querying / rendering.

Cheers!


Hi @teppo, I've come up against a weird bug with certain special characters. It's hard to isolate, but for example the character à is throwing "SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value" in CKEditor fields (but not title fields).
