teppo

SearchEngine


SearchEngine 0.10.0 was just released. Not a whole lot of stuff in this release – basically just one new feature, one minor addition to the default theme CSS, and some PHPDoc improvements.

### Added
- New Renderer::___renderResultsJSON() method for rendering search results as a JSON string.
- Additional CSS rules to make sure that visited links appear correctly in the default output.

While building an AJAX suggest search feature it occurred to me that it would be nice if SearchEngine could return search results as JSON out of the box. The newly added renderResultsJSON() method provides this capability, and the new settings results_json_fields and results_json_options allow customising what gets returned, and how.

More details (and an example of using this feature) in the README: https://github.com/teppokoivula/SearchEngine#json-output.
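For the AJAX use case described above, a search template could answer XHR requests with JSON roughly like this. It's only a sketch: it assumes renderResultsJSON() picks up the query from the request the same way renderResults() does, and the README linked above remains the authoritative reference for results_json_fields and results_json_options.

```php
<?php namespace ProcessWire;

// Sketch of a search template that answers AJAX requests with JSON.
// Assumption: renderResultsJSON() reads the query from the request on its own,
// the same way renderResults() does.
$searchEngine = $modules->get('SearchEngine');

// $config->ajax is true when the request has the X-Requested-With: XMLHttpRequest
// header; fetch() doesn't send it by default, so add it on the client side.
if ($config->ajax) {
    header('Content-Type: application/json');
    echo $searchEngine->renderResultsJSON();
    return $this->halt();
}

// Regular page view: results are rendered first so that the form has access
// to the whitelisted query string.
$results = $searchEngine->renderResults();
echo $searchEngine->renderForm() . $results;
```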


Hello - Thanks for your great module!

I have a special need for the site I'm working on right now: is there a way to show, instead of the "result_summary_field", the content near the search match in the search index field itself, with the match highlighted?

So let's say there is this content in the search_index field: ' Hey - this is just a simple little test for the search engine!'
I am searching for the phrase 'little'.
The rendered result should then show something like: '... just a simple little test for the s...'

To be honest, my PHP skills are not good enough to understand all the files of your module, but as far as I can see, I'm afraid it won't be easy to change the code for my needs.

Thanks for any reply! 

On 9/24/2019 at 11:02 PM, planmacher said:

I have a special need for the site I'm working on right now: is there a way to show, instead of the "result_summary_field", the content near the search match in the search index field itself, with the match highlighted?

Assuming that I correctly understood your need, this would require dynamically generating the descriptions. Technically it should be doable by hooking into Renderer::renderResultDesc(), but you'd have to provide the entire description generation and highlighting logic yourself, so it might be a bit difficult to achieve in practice.

This is actually something that I've got planned, but I'm not really sure when I'll get to it. Anyway, it's good to know that there's demand for it.
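To make the "description generation and highlighting logic" a bit more concrete, here's a minimal plain-PHP sketch of the excerpt part. excerptAroundMatch() is a hypothetical helper, not something SearchEngine provides, and it needs the mbstring extension. Wiring it into a hook on Renderer::renderResultDesc() is left out, since the hook's arguments aren't covered in this thread.

```php
<?php

/**
 * Hypothetical helper: return a short excerpt around the first match of $query
 * in $text, with the match wrapped in <strong>.
 */
function excerptAroundMatch(string $text, string $query, int $radius = 20): string {
    // Case-insensitive search; an empty query is treated as "no match".
    $pos = $query === '' ? false : mb_stripos($text, $query);
    if ($pos === false) {
        // No match: fall back to the beginning of the text.
        return htmlspecialchars(mb_substr($text, 0, $radius * 2));
    }

    $length = mb_strlen($query);
    $start = max(0, $pos - $radius);
    $end = min(mb_strlen($text), $pos + $length + $radius);

    // Split the window into the parts before, at, and after the match.
    $before = mb_substr($text, $start, $pos - $start);
    $match = mb_substr($text, $pos, $length);
    $after = mb_substr($text, $pos + $length, $end - $pos - $length);

    // Escape each part separately so that only our own <strong> tag ends up as markup.
    return ($start > 0 ? '... ' : '')
        . htmlspecialchars($before)
        . '<strong>' . htmlspecialchars($match) . '</strong>'
        . htmlspecialchars($after)
        . ($end < mb_strlen($text) ? ' ...' : '');
}

echo excerptAroundMatch(
    ' Hey - this is just a simple little test for the search engine!',
    'little',
    17
);
```

The window is cut at a fixed number of characters, so it can split words in half, just like the example in the question. Snapping to word boundaries would be the obvious next step.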


A couple of updates during the past week, 0.11.0 and 0.11.1. I'll attach the changelog below, but here's a summary of changes:

  • Saving just a single field (via API or perhaps some async feature) should now correctly regenerate the search index.
  • Page Reference fields are supported: the title and name values of referenced pages are stored in the index. Page names can also be searched with the field name; if, say, you have a Page Reference field called "tags", you can find specific matches by searching for "tags:[tag-name]" ("tags:tailwind" etc.)
  • One relatively big change behind the scenes is that the search index is now generated after the page has been saved, by hooking into Pages::savedPageOrField and triggering a new save for just the index field, quietly and with hooks disabled.

Originally the module hooked into Pages::saveReady and generated the index so that it got saved along with any user-provided changes. It turns out that this could result in some rather obscure bugs: output formatting gets enabled on the fly (which is intentional, and required for the most representative search index), and under some specific conditions that triggered interesting side effects. So far the new method hasn't resulted in any unexpected side effects as far as I can tell, but it's worth pointing out that this is a pretty big update.
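For anyone curious about the general pattern, this is roughly what "index after save, then save just the index field quietly and with hooks disabled" looks like in a hook. It's a sketch of the idea rather than the module's actual code: buildIndexValue() is a hypothetical stand-in for the real indexing logic, and the "body" field is an assumption.

```php
<?php namespace ProcessWire;

// site/ready.php

// Hypothetical stand-in for the real indexing logic; assumes a "body" field.
function buildIndexValue(Page $page): string {
    return trim($page->get('title') . ' ' . strip_tags((string) $page->get('body')));
}

$wire->addHookAfter('Pages::savedPageOrField', function(HookEvent $event) {
    $page = $event->arguments(0);
    if (!$page->hasField('search_index')) return;

    // Switch output formatting on while building the index, so that it
    // reflects what visitors see, then restore the previous state.
    $of = $page->of();
    $page->of(true);
    $value = buildIndexValue($page);
    $page->of($of);

    // Save just the index field: "quiet" leaves the modified time and user
    // alone, and "noHooks" keeps this save from triggering the hook again.
    $page->setAndSave('search_index', $value, ['quiet' => true, 'noHooks' => true]);
});
```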

## [0.11.1] - 2019-09-26

### Changed
- Index value gets saved in Pages::savedPageOrField instead of Pages::saved.

## [0.11.0] - 2019-09-26

### Added
- Support for indexing Page Reference fields.
- Support for indexing non-field Page properties (id, name).
- New hookable method Indexer::___getPageReferenceIndexValue().

### Changed
- Index value gets saved in Pages::saved instead of Pages::saveReady so that we can avoid messing with the regular save process.

### Fixed
- Fixed the "save" behaviour of the Indexer::indexPage() method.

 


Hi Teppo, is there a way to make the search language aware? - I mean getting the right context from multilanguage fields per active language? Probably this would mean an indexing per language?

If not, I wonder if at least all language text entries from a multi language field get indexed? - so a search in the non default language might find maybe too much, but not too little?

On 9/30/2019 at 11:01 AM, ceberlin said:

Hi Teppo, is there a way to make the search language aware? - I mean getting the right context from multilanguage fields per active language? Probably this would mean an indexing per language?

Sorry for the late reply. This is definitely something I'll have to figure out, one way or another, just haven't had the time (or need for that matter) yet.

I'll have to do some testing first and get back to this later 🙂


Hi Teppo, I'm close to building my first-time-ever search engine in ProcessWire and wonder if you have made progress on the multilanguage side of your cool module 🙂
Not a big deal anyway, otherwise I will try another approach 🙂

Thanks!


Just an FYI @teppo I'll definitely need a site search on my next project so I'll try your module soon. I've found https://markjs.io/ today and it seems to be great (and MIT). Maybe this could be implemented into your module for highlighting the results? Or did you have any other solutions in mind?

Would you be interested in me bringing markjs into your module or would you prefer if I build something separate?

1 hour ago, bernhard said:

Just an FYI @teppo I'll definitely need a site search on my next project so I'll try your module soon. I've found https://markjs.io/ today and it seems to be great (and MIT). Maybe this could be implemented into your module for highlighting the results? Or did you have any other solutions in mind?

Would you be interested in me bringing markjs into your module or would you prefer if I build something separate?

Hey Bernhard!

SearchEngine already provides a way to highlight hits found from the generated summary, but since there's currently no built-in way to show dynamic summaries on the search page, this doesn't apply to the entire index. See here for an example: https://wireframe-framework.com/search/?q=wireframe.

Please let me know if I've completely misunderstood what markjs does – I'm interested, but not quite sure if it would be beneficial, and/or how it would differ from current situation 🙂

 

1 hour ago, 3fingers said:

Hi Teppo, I'm close to building my first-time-ever search engine in ProcessWire and wonder if you have made progress on the multilanguage side of your cool module 🙂
Not a big deal anyway, otherwise I will try another approach 🙂

This is on the top of my to-do list now 🙂

37 minutes ago, teppo said:

Hey Bernhard!

SearchEngine already provides a way to highlight hits found from the generated summary, but since there's currently no built-in way to show dynamic summaries on the search page, this doesn't apply to the entire index. See here for an example: https://wireframe-framework.com/search/?q=wireframe.

Please let me know if I've completely misunderstood what markjs does – I'm interested, but not quite sure if it would be beneficial, and/or how it would differ from current situation 🙂

Thx, seems that I missed that 🙂 


@ceberlin @3fingers

As of 0.12.0 SearchEngine now supports multi-language indexing and searching. This is based on the native language features, so the results you see depend on the language of current user etc. While I don't have a good test case at hand right now, I did some quick testing on one of my own sites and it seemed to work pretty much as expected – though please let me know if there are problems with the latest version 🙂

What you need to do to enable this is convert the index field (which is by default called search_index) from FieldtypeTextarea to FieldtypeTextareaLanguage.


Thanks @teppo ! You rock! 🙂

I will report back on my experience with your module as soon as my client gives me the OK for the implementation.

Once again, thanks!

Edit:

I found a couple of typos in the README inside the "manual approach" section (a double echo, and a $searchEngine declaration referenced as $searchengine without camel casing). I've pasted the corrected version below:


<?php namespace ProcessWire;
$searchEngine = $modules->get('SearchEngine');
?>
<head>
    <?= $searchEngine->renderStyles(); ?>
    <?= $searchEngine->renderScripts(); ?>
</head>

<body>
    <?php
    // Note: results are rendered before form because this way the form instantly
    // has access to whitelisted query string (if a search was already performed).
    $results = $searchEngine->renderResults();
    $form = $searchEngine->renderForm();
    echo $form . $results;
    ?>
</body>

 

Also, in the "Advanced Use" section:


if ($q = $sanitizer->selectorValue($input->get->q)) {
    // This finds pages matching the query string and returns them as a PageArray:
    $results = $pages->find('search_index%=' . $query_string . ', limit=25');

    // Render results and pager with PageArray::render() and PageArray::renderPager():
    echo $results->render(); // PageArray::render()
    echo $results->renderPager(); // PageArray::renderPager()

    // ... or you iterate over the results and render them manually:
    echo "<ul>";
    foreach ($results as $result) {
        echo "<li><a href='{$result->url}'>{$result->title}</a></li>";
    }
    echo "</ul>";
}

 

The "$query_string" variable here... shouldn't it be "$q", or am I missing something?


I'm making progress!

Everything is working fine except for renderPager, which loses the query string on subsequent pages:

$searchEngine = $modules->get('SearchEngine');
$form = $searchEngine->renderForm();
echo $form;

$out= '';
if ($q = $sanitizer->selectorValue($input->get->q)) {
    $results = $pages->find('search_index%=' . $q . ', limit=25');
	foreach ($results as $result) {
	$out.= // my code here, everything is great and populated, even repeater matrix elements whaoo! :)
	}
}
$out.= $results->renderPager();
echo $out;

On page http://localhost:7880/my_url/root/it/search/?q=my_query everything is ok.
As soon as I go to page2 the url is http://localhost:7880/my_url/root/it/search/page2 (the query is gone and nothing more is shown).

Btw my "search.php" template has "Allow page numbers" in the backend set to ON.

Any clue? 🙂

18 hours ago, 3fingers said:

On page http://localhost:7880/my_url/root/it/search/?q=my_query everything is ok.
As soon as I go to page2 the url is http://localhost:7880/my_url/root/it/search/page2 (the query is gone and nothing more is shown).

Btw my "search.php" template has "Allow page numbers" in the backend set to ON.

Any clue? 🙂

The trick is to whitelist "q", after which MarkupPagerNav automatically sticks with it. Add this after you've validated the $q variable:

$input->whitelist('q', $q);

Note also that if this is your literal code, you should move the renderPager call into the if statement – if $q isn't set or doesn't pass validation, $results is never defined, and calling renderPager() on it will cause an error 🙂
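Putting both points together, the template from the question could look something like the sketch below. The result markup is just a placeholder for your own rendering code.

```php
<?php namespace ProcessWire;

$searchEngine = $modules->get('SearchEngine');

$out = '';
if ($q = $sanitizer->selectorValue($input->get->q)) {
    // Whitelisting "q" makes MarkupPagerNav carry it over to the pager links.
    $input->whitelist('q', $q);

    $results = $pages->find('search_index%=' . $q . ', limit=25');
    foreach ($results as $result) {
        // Placeholder markup; replace with your own rendering.
        $out .= "<p><a href='{$result->url}'>{$result->title}</a></p>";
    }

    // The pager is rendered inside the if block, where $results is always defined.
    $out .= $results->renderPager();
}

// The form is rendered after the query has been whitelisted, so that it has
// access to the whitelisted value.
echo $searchEngine->renderForm() . $out;
```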


I will do that on Monday, in the meantime thanks for your support and kindness :)


Just released SearchEngine 0.13.0. This version adds more validation regarding the search index field: there's a warning if the field is of wrong type, an option to automatically create it if it doesn't exist (i.e. it has been removed after the module was installed), and there's a notice in the module settings screen if the index field exists and is valid but hasn't been added to any templates yet.

Additionally there's a fix for an issue where FieldtypeTextareaLanguage wasn't recognised as a valid index field type in module settings. This could've resulted in the index field setting getting unintentionally cleared if/when module settings were saved.


Hi @teppo, another gentle question for you 🙂

I'm trying to style the pagination, using this code:

$config->SearchEngine = [

    // These are the settings I already used, successfully, elsewhere in the site to style my pagination.

    'pager_args' => [
        'nextItemLabel' => "Next",
        'previousItemLabel' => "Prev",
        'listMarkup' => "<ul class='pagination'>{out}</ul>",
        'itemMarkup' => "<li class='pagelink {class}'>{out}</li>",
        'linkMarkup' => "<a href='{url}'>{out}</a>",
        'currentLinkMarkup' => "{out}",
        'currentItemClass' => "current",
    ],

	// other settings here stripped out in this example.
];

Those settings seem to be ignored though. I'm surely missing something, can you point me in the right direction? 🙂

4 minutes ago, 3fingers said:

Those settings seem to be ignored though. I'm surely missing something, can you point me in the right direction? 🙂

If you're using the code you posted earlier, i.e. "$results = $pages->find(...)" and "$out.= $results->renderPager()", SearchEngine settings won't kick in at all. This is plain old native ProcessWire stuff, so you'd have to provide pager settings directly to the renderPager() method 🙂

SearchEngine pager_args only apply if you're using its native rendering features. Though please let me know if I'm missing the point here!
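In other words, with the manual approach the same options go straight to renderPager(). A sketch, reusing the values from the question and assuming $q has already been sanitized and whitelisted as discussed earlier in the thread:

```php
<?php namespace ProcessWire;

// Assumes $q has already been sanitized and whitelisted.
$results = $pages->find('search_index%=' . $q . ', limit=25');

// These are regular MarkupPagerNav options, passed directly to the pager.
echo $results->renderPager([
    'nextItemLabel' => "Next",
    'previousItemLabel' => "Prev",
    'listMarkup' => "<ul class='pagination'>{out}</ul>",
    'itemMarkup' => "<li class='pagelink {class}'>{out}</li>",
    'linkMarkup' => "<a href='{url}'>{out}</a>",
    'currentLinkMarkup' => "{out}",
    'currentItemClass' => "current",
]);
```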


You're right! Passing them directly to the renderPager() method now works. Just to clarify, SearchEngine settings kick in, but not those dedicated to pagination 🙂

Thanks @teppo, glad you're on the same time-zone as mine, you're always super fast 🙂


Hey @teppo great module thanks!

Would it be possible to add support for ProFields: Table fields? 

Also, is there (or could I request) a method to get the whole index as JSON output? I want to use this with a client-side fuzzy search library.

Currently my needs aren't that complicated, I am indexing two templates so I can just get the index field from those and the other data I need around it manually which is perfectly fine. Perhaps this is the correct way to do it.

 

 


Thanks @Mikie!

I'll definitely look into the table part.

There's no built-in way to get the entire index at the moment, but it might make sense to have one -- though what you're doing sounds like a perfectly good way to solve this for the time being. Anyway, I'll look into this as well.

Just for the record, the renderResultsJSON method was added primarily to solve client side search needs (but I'm guessing that's not exactly what you're after) 🙂
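As for getting the index out manually for a client-side library in the meantime, a sketch along these lines should do. The template names "article" and "story" are made up for the example, and search_index is the default name of the index field.

```php
<?php namespace ProcessWire;

// "article" and "story" are hypothetical template names; use your own.
$items = [];
foreach ($pages->find('template=article|story') as $p) {
    $items[] = [
        'id' => $p->id,
        'title' => (string) $p->getUnformatted('title'),
        'url' => $p->url,
        // Unformatted value, so that no textformatters touch the index.
        'index' => (string) $p->getUnformatted('search_index'),
    ];
}

header('Content-Type: application/json');
echo json_encode($items);
```

For a large number of pages this loads everything into memory at once, so caching the output would probably be a good idea.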

9 minutes ago, teppo said:

I'll definitely look into the table part.

Cheers! I am storing magazine style story credits (role, name, website url etc) in the Table. I feel that since Table only accepts text based fields this is an ok candidate for indexing. Can try to hack away at your module myself for now, no rush.

9 minutes ago, teppo said:

Just for the record, the renderResultsJSON method was added primarily to solve client side search needs (but I'm guessing that's not exactly what you're after) 🙂

Yeah, I was originally thinking I wanted to use a client side library to handle the search itself (fuzzy style to handle misspellings, e.g. https://rawgit.com/farzher/fuzzysort/master/test.html ), but now I am thinking that is overboard and probably confusing. Will most likely stick with your module / pw search, and just do client side querying / rendering.

Cheers!


Hi @teppo, I've come up against a weird bug with certain special characters. It's hard to isolate, but for example the character à is throwing "SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value" in CKEditor fields (but not title fields).


Actually this isn't occurring on the staging server. Feels like something is up with my MySQL.
