Search with spaces


davo
I'm using the following line to search for articles:

$matches = $pages->find("Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~=$q, template=dmc, limit=50");

But when a user searches for something like 'Sri Lanka', it doesn't recognise the two words. One of the fields searched is the keyword field, so I wouldn't mind storing the keyword with a hyphen, like 'Sri-Lanka', as long as the user could still type 'sri lanka'.

Any suggestions on how I could make this happen?


Try this-

$keywords = preg_split("/[\s-]+/", $q);

$selector = '';

foreach($keywords as $kw){
    $selector .= "Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~=$kw, ";
}

$selector .= "template=dmc, limit=50";

//echo $selector;

$matches = $pages->find($selector);

You can uncomment the penultimate line for debugging (and/or understanding) purposes. That won't help with the regex, but essentially the important bit is within the square brackets, where \s means a whitespace character (a space, tab, or newline) and the dash after it means a literal dash. You could add other separator characters there.

Note: written in browser, but should work. (Now where's that fingers-crossed emoticon?)
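
For anyone who wants to see what the split produces without touching a live site, here is a minimal standalone sketch of the same idea. The function name buildKeywordSelector() is made up for illustration (it is not a ProcessWire function), and it only builds the selector string; you would still pass the result to $pages->find().

```php
<?php
// Hypothetical helper, not part of ProcessWire: it only builds the selector string.
function buildKeywordSelector(string $q, string $fields): string
{
    // Split on runs of whitespace and/or hyphens, dropping empty pieces.
    $keywords = preg_split('/[\s-]+/', $q, -1, PREG_SPLIT_NO_EMPTY);

    $selector = '';
    foreach ($keywords as $kw) {
        // One clause per keyword; comma-separated clauses must all match.
        $selector .= "{$fields}~={$kw}, ";
    }

    return $selector . 'template=dmc, limit=50';
}

$fields = 'Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email';
echo buildKeywordSelector('sri lanka', $fields), "\n";
echo buildKeywordSelector('Sri-Lanka', $fields), "\n";
```

Both calls produce one ~= clause per word, so 'sri lanka' and 'Sri-Lanka' end up as the same pair of keywords apart from letter case.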


  • 2 weeks later...

It was almost there, I think. The debug output was this:

Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~=sri, Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~=lanka, template=dmc, limit=50

and yet, it didn't return the sri lanka page?


and yet, it didn't return the sri lanka page?

Hmm, that would be because MySQL fulltext indexes don't include words under 4 letters long by default (on MyISAM tables this is the ft_min_word_len setting). So 'Sri' won't be found. If you have enough control of your MySQL setup, you can lower that limit, probably to 3 letters (unless there are even shorter words you would need to find), and then rebuild the fulltext indexes, although at the cost of increasing index sizes. (The limit is there for a reason.)

There are a couple of workarounds. The easiest is to not search for words under 4 letters long.

foreach($keywords as $kw){
    if(strlen($kw) >= 4){
        $selector .= "Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~=$kw, ";
    }
}

...which might introduce other undesirable side effects.

You could also vary the selector operator based on keyword length.

foreach($keywords as $kw){
    if(strlen($kw) >= 4){
        $op = "~=";
    } else {
        $op = "%=";
    }
    $selector .= "Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email{$op}{$kw}, ";
}

(I don't think the curly braces are strictly necessary, but they do enhance readability.)

And, of course, this might also introduce other side effects, although my own hunch would be that this is likeliest to work well. 
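
To try the operator-switching idea outside of ProcessWire first, a minimal standalone sketch might look like this. The function name buildLengthAwareSelector() and the $minWordLen parameter are my own inventions for illustration; the default of 4 assumes the fulltext minimum word length discussed above, and strlen() counts bytes, so accented characters would need extra care.

```php
<?php
// Hypothetical helper, not part of ProcessWire: it only builds the selector string.
function buildLengthAwareSelector(string $q, string $fields, int $minWordLen = 4): string
{
    $keywords = preg_split('/[\s-]+/', $q, -1, PREG_SPLIT_NO_EMPTY);

    $selector = '';
    foreach ($keywords as $kw) {
        // ~= relies on the fulltext index, so it cannot see short words.
        // %= is the LIKE-based "contains" operator, which can.
        $op = strlen($kw) >= $minWordLen ? '~=' : '%=';
        $selector .= "{$fields}{$op}{$kw}, ";
    }

    return $selector . 'template=dmc, limit=50';
}

echo buildLengthAwareSelector('sri lanka', 'Meta_keywords|DMCcontact_Country'), "\n";
```

With the default minimum, 'sri' gets %= and 'lanka' gets ~=. The trade-off is that a short keyword matched with %= can also match inside longer words.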

I have another method in mind, but it would be a lot of code and might be unnecessary. (It involves two searches, then combining and de-duplicating the page arrays, and it still might not work :rolleyes:)

See how you get on with these variations and we'll take it from there.


  • 2 weeks later...

I'm afraid I may have cheated and taken the easy option with this, due to the unwanted results...


$keywords = preg_split("/[\s-]+/", $q);

$selector = '';

// Special case: force the Sri Lanka page, since 'sri' is too short for the fulltext index
if ($q == "sri lanka") {
    $selector = "id=1330, ";
} else {
    foreach($keywords as $kw){
        $selector .= "Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~=$kw, ";
    }
}

$selector .= "template=dmc, limit=50";

// echo $selector;

$matches = $pages->find($selector);

Thank you for your help Pete.

At the moment there is only one country that has this short-word issue when split, but if there are more I may have to build them into an array.
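
If the list does grow, the array version could be sketched roughly like this. The 'sri lanka' entry and id 1330 come from the code above; the function name buildSelector() and the $forcedResults array are hypothetical names for illustration, and the function only builds the selector string rather than running the search.

```php
<?php
// Forced results keyed by lowercase search phrase.
// 'sri lanka' => id 1330 comes from the post above; further entries would go here.
$forcedResults = [
    'sri lanka' => 'id=1330',
];

// Hypothetical helper: returns a forced selector when the phrase is listed,
// otherwise falls back to one ~= clause per keyword.
function buildSelector(string $q, array $forcedResults, string $fields): string
{
    // Normalise case and whitespace so "Sri  Lanka " still matches the key.
    $key = strtolower(trim(preg_replace('/\s+/', ' ', $q)));

    if (isset($forcedResults[$key])) {
        return $forcedResults[$key] . ', template=dmc, limit=50';
    }

    $selector = '';
    foreach (preg_split('/[\s-]+/', $key, -1, PREG_SPLIT_NO_EMPTY) as $kw) {
        $selector .= "{$fields}~={$kw}, ";
    }

    return $selector . 'template=dmc, limit=50';
}

$fields = 'Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email';
echo buildSelector('Sri Lanka', $forcedResults, $fields), "\n";
echo buildSelector('Oman', $forcedResults, $fields), "\n";
```

Keeping the special cases in one array means a new country is a one-line change rather than another if statement.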


Excuse me for butting in...

There are a couple of workarounds. The easiest is to not search for words under 4 letters long.

Whenever I see that I imagine what a client would make of that answer.

Eventually I'll have to solve this for an existing client with very specific ideas about how keywords should work. The current site (not PW) gives power users a choice of searching based on whole words, partial words, or phrases, and ignores certain punctuation characters (there's a whole list of things, some more idiosyncratic than others). The code builds some pretty ugly looking queries but it's not a high traffic site and performance is fine. I'm hoping to find some kind of API friendly way to add support for options like that. If anybody has any thoughts about this (or more ambitious things such as stemming) please send me a message or post here if relevant to Davo's question. Thanks.


Hi Steve, just to clarify what's going on I'll explain my scenario.

Searching normally works just fine. A while ago, though, I changed the search operator to ~= because %= brought back too many irrelevant results, e.g. 'Oman' inside 'rOMANian'.

From Dave p's explanation I understand that, even after splitting 'Sri Lanka', 'Sri' falls under MySQL's four-letter minimum word length, which is presumably there to exclude words like 'and' and 'is'.

Although rather crass, my simple method of "if this, then force the result" should actually work for me. I'll probably eventually build this into an array, either hard-coded or stored as a set of pages in PW. Sometimes the client wants the search results to be quite manipulated.


Sometimes the client wants the search results to be quite manipulated.

And that is entirely fair enough. Search should always return the 'right results', whether that means 'the most accurate & impartial results possible' (Google? Hah!) or 'what we want searchers to find' (say an ecommerce site), or somewhere in between.

I have probably mentioned this here before, but I once wrote the search logic for an ecommerce site for one of the biggest fishing tackle retailers in the UK. One of our major brands was 'Daiwa', but a fair proportion (nearly half) of searchers spelt that 'Diawa'. It would have been stupid for me not to have catered for that exact situation and return what people were expecting to find. (SteveB, it used the Porter Stemmer.)
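
As a rough illustration of catering for a known misspelling (this is not the code from that site, and it is not stemming), a plain alias table applied to each keyword before the search would do it. The names $aliases and normaliseKeywords() are hypothetical.

```php
<?php
// Hypothetical alias table: common misspelling => the term to search for instead.
$aliases = [
    'diawa' => 'daiwa',
];

// Hypothetical helper: lowercases and splits the query, then swaps known misspellings.
function normaliseKeywords(string $q, array $aliases): array
{
    $keywords = preg_split('/[\s-]+/', strtolower($q), -1, PREG_SPLIT_NO_EMPTY);

    // Replace a keyword only when it appears in the alias table.
    return array_map(fn(string $kw): string => $aliases[$kw] ?? $kw, $keywords);
}

print_r(normaliseKeywords('Diawa reels', $aliases));
```

The corrected keywords can then be fed into whatever builds the selector, so searchers who type the misspelling still land on the brand they meant.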


  • 1 month later...

OK, I've worked on this and found a more hybrid solution.

I've created a list of search terms and corresponding results.

I test the search term against the predetermined terms; if it matches one, that entry is used to create the selector. If it doesn't, the code continues with the organic search.

The part I'm stuck on is when someone enters an ampersand. It breaks the page, resulting in this:

Error: Exception: Unknown Selector operator: '~=&' -- was your selector value properly escaped? (in /home/content/s/o/m/somniaweb/html/dudmc/wire/core/Selectors.php line 247)

And yet the AJAX request processes it fine?

Any ideas, please?


<?php

 


/**
 * Search template
 *
 */

$out = '';

if($q = $sanitizer->selectorValue($input->get->q)) {

	// Send our sanitized query 'q' variable to the whitelist where it will be
	// picked up and echoed in the search box by the head.inc file.
	$input->whitelist('q', $q); 

	// Search the keyword, country, address and email fields for our query text.
	// Limit the results to 50 pages.
	// Only include results that use the 'dmc' template.

// load my directed search results table


$keywords = preg_split("/[\s-]+/", $q);

$selector = '';

//turn the query to lowercase
$q = strtolower($q);

//load my list of specific search terms
$dictionary_words = $pages->find("template=search_results");

// Loop through them to see if the query matches any of them
foreach($dictionary_words as $word) {
	// If it matches, create the selector (the trailing comma separates it from what follows)
	if($word->title == $q) {
		$selector = "id=" . $word->search_result . ", ";
	}
}

// If it didn't match, break the term up into keywords
if($selector == "") {
	foreach($keywords as $kw) {
		$selector .= "Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~=$kw, ";
	}
}


$selector .= "template=dmc, limit=50";

// echo $selector;

$matches = $pages->find($selector);









//	$matches = $pages->find("Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~=$q, template=dmc, limit=50"); 

	$count = count($matches); 

	if($count) {
		
		if(!$config->ajax) {
			$out .= "<h2>Found $count DMC matching your query:</h2>";
		}
		$out .= "<ul class='nav'>";

		foreach($matches as $m) {
			if($m->DMC_represented==1) {
				// Partner DMC: starred and linked to its full profile
				$rep = "<span class='glyphicon glyphicon-star-empty'></span>";
				$out .= "<li  data-toggle='tooltip' data-placement='top' title='{$m->title} is a partner DMC of destinations UNLIMITED. Click the link for their full profile'><p><a href='{$m->url}'>{$rep} Full Partner -   {$m->title} - Representing {$m->DMC_country_represented->title}</a></p></li>";
			} else {
				// Non-partner DMC: plain-text suggestion with its email address
				$rep = "";
				$out .= "<li  data-toggle='tooltip' data-placement='top' title='{$m->title} are not represented by destinations UNLIMITED therefore a suggestion only' id='nonrep' ><p>{$m->title} - Representing {$m->DMC_country_represented->title} - {$m->DMC_Email} </p></li>";
			}
		}

		$out .= "</ul>";

	} else {
		$out .= "<h2>Sorry, no results were found.</h2>";
	}
} else {
	$out .= "
                  ";
}

// Note that we stored our output in $out before printing it because we wanted to execute
// the search before including the header template. This is because the header template 
// displays the current search query in the search box (via the $input->whitelist) and 
// we wanted to make sure we had that setup before including the header template. 

if(!$config->ajax) include("./dmcheader.inc");

?>

 <div class="container">

	<div class="row">

		


		<div class="col-md-6">
		<?php if(!$config->ajax){ echo "<h2>Please enter a destination</h2>";} ?>

	
<form id='my_search_form' action='<?php echo $config->urls->root?>dmc-search/' method='get'>
				<input type='text' name='q' id='my_search_query' class='form-control' value='<?php echo htmlentities($input->whitelist('q'), ENT_QUOTES, 'UTF-8'); ?>' />
				<button type='submit' id='search_submit'>Search</button>
			</form>

			<?php echo "$out"; ?>
		</div>
	</div>	
</div>

<script type="text/javascript">

$(document).ready(function(){
    $("[data-toggle=tooltip]").tooltip({ placement: 'right'});
});



</script>

<?php



if(!$config->ajax) include("./dmcfooter.inc"); 

                        		


Got there in the end :)


<?php

 


/**
 * Search template
 *
 */

$out = '';

if($q = $sanitizer->selectorValue($input->get->q)) {

	// Send our sanitized query 'q' variable to the whitelist where it will be
	// picked up and echoed in the search box by the head.inc file.
	$input->whitelist('q', $q); 

	// Search the keyword, country, address and email fields for our query text.
	// Limit the results to 50 pages.
	// Only include results that use the 'dmc' template.

// load my directed search results table

$q = htmlentities($q);

$keywords = preg_split("/[\s-]+/", $q);

$selector = '';

//turn the query to lowercase
$q = strtolower($q);

//load my list of specific search terms
$dictionary_words = $pages->find("template=search_results");

// Loop through them to see if the query matches any of them
foreach($dictionary_words as $word) {
	// If it matches, create the selector
	if($word->title == $q) {
		$value = htmlentities($word->search_result);
		$selector = "id=" . $value . ", ";
	}
}

// Catch "usa"
if($q == "usa") {
	$selector = "parent=1065, ";
}

// If it didn't match, break the term up into keywords
if($selector == "") {
	foreach($keywords as $kw) {
		$selector .= "Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~='$kw', ";
	}
}


$selector .= "template=dmc, limit=50";

// echo $selector;

$matches = $pages->find($selector);









//	$matches = $pages->find("Meta_keywords|DMCcontact_Country|DMCcontact_Address_2|DMC_Email~=$q, template=dmc, limit=50"); 

	$count = count($matches); 

	if($count) {
		
		if(!$config->ajax) {
			$out .= "<h2>Found $count DMC matching your query:</h2>";
		}
		$out .= "<ul class='nav'>";

		foreach($matches as $m) {
			if($m->DMC_represented==1) {
				// Partner DMC: starred and linked to its full profile
				$rep = "<span class='glyphicon glyphicon-star-empty'></span>";
				$out .= "<li  data-toggle='tooltip' data-placement='top' title='{$m->title} is a partner DMC of destinations UNLIMITED. Click the link for their full profile'><p><a href='{$m->url}'>{$rep} Full Partner -   {$m->title} - Representing {$m->DMC_country_represented->title}</a></p></li>";
			} else {
				// Non-partner DMC: plain-text suggestion with its email address
				$rep = "";
				$out .= "<li  data-toggle='tooltip' data-placement='top' title='{$m->title} are not represented by destinations UNLIMITED therefore a suggestion only' id='nonrep' ><p>{$m->title} - Representing {$m->DMC_country_represented->title} - {$m->DMC_Email} </p></li>";
			}
		}

		$out .= "</ul>";

	} else {
		$out .= "<h2>Sorry, no results were found.</h2>";
	}
} else {
	$out .= "
                  ";
}

// Note that we stored our output in $out before printing it because we wanted to execute
// the search before including the header template. This is because the header template 
// displays the current search query in the search box (via the $input->whitelist) and 
// we wanted to make sure we had that setup before including the header template. 

if(!$config->ajax) include("./dmcheader.inc");

?>

 <div class="container">

	<div class="row">

		


		<div class="col-md-6">
		<?php if(!$config->ajax){ echo "<h2>Please enter a destination</h2>";} ?>

	
<form id='my_search_form' action='<?php echo $config->urls->root?>dmc-search/' method='get'>
				<input type='text' name='q' id='my_search_query' class='form-control' value='<?php echo htmlentities($input->whitelist('q'), ENT_QUOTES, 'UTF-8'); ?>' />
				<button type='submit' id='search_submit'>Search</button>
			</form>

			<?php echo "$out"; ?>
		</div>
	</div>	
</div>

<script type="text/javascript">

$(document).ready(function(){
    $("[data-toggle=tooltip]").tooltip({ placement: 'right'});
});



</script>

<?php



if(!$config->ajax) include("./dmcfooter.inc"); 
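
As far as I can tell from the error message, the unquoted & was being read as part of the operator, which is why quoting the value in the version above cures it. Another way to get the same protection, sketched below, is to strip the troublesome characters from each keyword and skip any keyword that ends up empty. This is a simplified standalone stand-in, not ProcessWire's own sanitizer, and the names quoteKeyword() and buildQuotedSelector() are hypothetical. It assumes the query is valid UTF-8.

```php
<?php
// Hypothetical helper: makes one keyword safe to embed in a selector string.
function quoteKeyword(string $kw): string
{
    // Keep letters, digits, and a few harmless characters; drop everything else
    // (including &, =, quotes, and commas, which a selector parser can misread).
    $clean = preg_replace('/[^\p{L}\p{N}@._]+/u', '', $kw) ?? '';

    return $clean === '' ? '' : "'" . $clean . "'";
}

// Hypothetical helper: builds the selector string with every value quoted.
function buildQuotedSelector(string $q, string $fields): string
{
    $selector = '';
    foreach (preg_split('/[\s-]+/', $q, -1, PREG_SPLIT_NO_EMPTY) as $kw) {
        $value = quoteKeyword($kw);
        if ($value === '') {
            continue; // keyword was punctuation only, e.g. a lone "&"
        }
        $selector .= "{$fields}~={$value}, ";
    }

    return $selector . 'template=dmc, limit=50';
}

echo buildQuotedSelector('Trinidad & Tobago', 'Meta_keywords|DMCcontact_Country'), "\n";
```

Here the lone ampersand is dropped entirely, so the search runs on 'Trinidad' and 'Tobago' only.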

                        		
