KentBrockman Posted July 10, 2013
Once again, I need your help. I integrated a search into a customer's site. The search performs as expected if I use only one "word", for example "foo". If I search for a more complex string like "foo bar" or "foo-bar", the result is empty, although some pages contain these strings.
if($q = $sanitizer->selectorValue($input->get->q)) {
    $input->whitelist('q', $q);
    $matches = $pages->find("teaserBody|contentHeadline|contentBody~=$q, limit=20");
The code is pretty much the same as in the basic site profile; I only adjusted the fields to search in. What am I doing wrong? Thank you!
Alessio Dal Bianco Posted July 10, 2013
Hi Kent, have you tried my module? After installing it, you search only the indexer field, regardless of how many fields you have. So this line:
$matches = $pages->find("teaserBody|contentHeadline|contentBody~=$q, limit=20");
will be replaced by
$matches = $pages->find("indexer~=$q, limit=20");
Let me know if you run into any problems.
KentBrockman Posted July 10, 2013
Alessio, thank you for your hint. I looked up your module and read the support forum, but it is not at all clear to me whether the module will solve my problem with the "complex" search strings. I tried to install and use it, but when I use the "Reindex" function to index all existing pages for the first time, it fails with:
Incorrect string value: '\xE3\xBCber ...' for column 'data' at row 1
Thanks.
Alessio Dal Bianco Posted July 10, 2013
Hmm, do you have any PDF or DOC files on your site? I suspect it throws an error because my module doesn't extract the text from those files correctly.
KentBrockman Posted July 10, 2013
If your module searches all fields by default, there will surely be some PDF files. The site is running on Microsoft IIS; perhaps this could be a problem too. But if it did run, would it solve my problem?
teppo Posted July 10, 2013
Selector operator "~=" means that at least one of the fields you've defined (teaserBody, contentHeadline, contentBody) has to contain all specified words. Order doesn't matter, though. Are you absolutely sure that's the case?
This could also have something to do with either stop words or length limitations, so you might want to test it with longer words and/or change the selector operator from "~=" to "%=". This is a bit slower, but so far I've been using it pretty much everywhere without any severe performance issues.
KentBrockman Posted July 10, 2013
Quote: Selector operator "~=" means that at least one of the fields you've defined (teaserBody, contentHeadline, contentBody) has to contain all specified words. Order doesn't matter, though. Are you absolutely sure that's the case?
Yes.
Quote: This could also have something to do with either stop words or length limitations, so you might want to test it with longer words and/or change selector from "~=" to "%=". This is a bit slower, but so far I've been using it pretty much everywhere without any severe performance issues.
It works! Perfect, thank you for the hint. I had read about it, but it wasn't clear to me that the result would be different. I just thought "~=" was more compatible.
kongondo Posted July 10, 2013
Is there a way to match non-adjacent partial words (note the plural), e.g. "processw install"? I haven't found a way so far. I know %=$q will match a single partial word.
kongondo Posted July 10, 2013
Good tip, thanks Soma. If I get stuck, I'll pick your brains later... Or, if you can't wait for that, you might as well point me to some sample code now to get me going, hehe. Thanks!
DaveP Posted July 10, 2013
Unless you have many thousands of text-heavy pages, the %= operator isn't noticeably slower than any of the others. Just to expand on Soma's answer above, it's easy to achieve 'AND' or 'OR' functionality in selectors. Nik's Selector Test module (mods.pw/2r) is great for quickly testing selectors, BTW.
'OR' first:
$pages->find("body%=process|examp");
will find pages including either of the partial words on either side of the pipe symbol '|'.
'AND' isn't really much trickier:
$pages->find("body%=process,body%=examp");
Pages found must match all selectors, so that's an 'and'. I usually build my selector string without search terms, then explode() the search terms on space characters and foreach() them onto the end of the selector string. It's all about thinking the PW way.
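The explode()/foreach() approach described above could look roughly like the sketch below. This is only an illustration, not code from the thread: buildAndSelector() is a hypothetical helper, the 'body' field and 'limit=20' base selector are assumptions, and the str_replace() call is a crude stand-in for $sanitizer->selectorValue(), which a real template should use instead.

```php
<?php
/**
 * Hypothetical helper (not part of ProcessWire): build a selector string
 * in which every search term must match the field partially (%=),
 * so the terms are effectively ANDed together.
 */
function buildAndSelector(string $query, string $field = 'body', string $base = 'limit=20'): string
{
    // Start with the part of the selector that contains no search terms.
    $selector = $base;

    // Split the query on space characters, as described in the post above.
    foreach (explode(' ', trim($query)) as $term) {
        // Crude stand-in for $sanitizer->selectorValue(): strip characters
        // that have a special meaning inside a selector string.
        $term = str_replace([',', '|', '"', "'", '='], '', $term);

        // Repeated spaces produce empty pieces; skip them.
        if ($term === '') {
            continue;
        }

        // Each appended "field%=term" must match, which gives the AND.
        $selector .= ", {$field}%={$term}";
    }

    return $selector;
}

echo buildAndSelector('processw install'), PHP_EOL;
// prints: limit=20, body%=processw, body%=install
```

In a template, the resulting string would then be passed to $pages->find(). Passing a pipe-separated field list such as 'teaserBody|contentHeadline|contentBody' as $field should keep each term matching in any one of those fields.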
kongondo Posted July 10, 2013
Thanks DaveP. I am mulling over how to implement this in a search function (building on the search.php that comes with the default PW install)...
DaveP Posted July 10, 2013
Ryan often points people to the Skyscrapers profile (http://processwire.com/demo/) for some good ideas about implementing quite complex searches. You can see the source for search.php from that profile on GitHub: https://github.com/ryancramerdesign/SkyscrapersProfile/blob/master/site/templates/search.php