HerTha Posted May 18, 2019 Share Posted May 18, 2019 I am setting up a front-end form to allow visitors to do a fulltext site search. When external input arrives at the system to be processed in database queries, security is a main issue. Therefore, all the (text) input for this form goes through a $sanitizer->text(), as a first step. Then, I thought using a $sanitizer->selectorValue() would be a good idea for building the selector: $results = $pages->find("template=contentpage,title|headline|body%={$sanitizer->selectorValue($search_term)}"); Unfortunately, it turned out to be quite easy to crash the whole thing, for instance by typing a single ! in the search which gives a 500 server error for the visitor. Obviously, there are a number of scenarios (including the ! char) which are not catched by the sanitizer, but need to be taken care of. Anyone who can share a list of forbidden chars or an algorithm to handle the filtering more securely in such a situation? Of course, the whole filtering should not be overly strict, so the visitor can still find a few things... Link to comment Share on other sites More sharing options...
Autofahrn Posted May 18, 2019 Share Posted May 18, 2019 Does the error log tell you more about the 500 error? Just wondering since I basically do the same on the pwflex.team-tofahrn.de search: if($config->ajax && (($query = $input->get->q) != '')) { $query = $sanitizer->selectorValue($query); $search = "text_index~=$query"; $selector = "{$search}, limit=".($Limit+1); $matches = $pages->find($selector); Only difference may be, that my script requires three chars before it sends the query (makes no difference, just tried). 1 Link to comment Share on other sites More sharing options...
HerTha Posted May 18, 2019 Author Share Posted May 18, 2019 You are using "~=" while I'm prefering "%=" in this place - that's the only difference I can see right now. Should not make a difference, anyway? The error message, when admin, is like: Error: Exception: Unknown Selector operator: '%=!' -- was your selector value properly escaped? (in /home/abcdefg/public_html/wire/core/Selectors.php line 419) Other characters can also do such harm, for instance "?" or "," That site is running on PW 3.0.123, BTW Link to comment Share on other sites More sharing options...
Autofahrn Posted May 18, 2019 Share Posted May 18, 2019 The search on that site can actually do both, depending if the search phrase starts with a ". No, it does not make a difference (just tried). Just logged my resulting $selector and saw the ! is encapsulated in ": 'text_index%="!", limit=16' Which version of PW are you running (I'm on 3.0.123)? Maybe there's a fix in $sanitizer->selectorValue(). (Edit: Ok, should read the last paragraph as well). Link to comment Share on other sites More sharing options...
Autofahrn Posted May 18, 2019 Share Posted May 18, 2019 31 minutes ago, HerTha said: Other characters can also do such harm, for instance "?" or "," Seems like your $sanitizer->selectorValue() does not work at all for some reason. Could you check with Tracy? $search_term = '!'; $val = "template=contentpage,title|headline|body%={$sanitizer->selectorValue($search_term)}"; d($val); Should output: "template=contentpage,title|headline|body%="!"" (45) 1 Link to comment Share on other sites More sharing options...
HerTha Posted May 18, 2019 Author Share Posted May 18, 2019 Okay, as I have multiple calls to find() on that page, I tried to narrow it down a bit and removed all but one. If the search is for, e.g., "hallo" then all is fine. However, if searching for "!hallo", that particular error (see above) occurs. The offending selector, as Tracy bd() reports: "sort=name,limit=50,template=downfile,title|name|indexer%="!hallo"" I can't see anything wrong with it... Link to comment Share on other sites More sharing options...
Autofahrn Posted May 18, 2019 Share Posted May 18, 2019 39 minutes ago, HerTha said: I can't see anything wrong with it... neither me. Could you set $config->debug=true; in your site/config.php? Then there should be some more information in that error message. Link to comment Share on other sites More sharing options...
Zeka Posted May 18, 2019 Share Posted May 18, 2019 @HerTha I use such code: if (input('get', 'q')) { $query = input('get', 'q', 'text'); input()->whitelist('q', $query); $selector = [ ['template', 'post'], ['title|builder.post_wysiwyg|builder.blockquote_wysiwyg', '%=', $query], ['limit', '10'] ]; } and it also works with '!' without errors. Consider using selectors as arrays as it more clear way and have on a small benefit that you don't have to use $sanitizer->selectorValue() Quote Using array syntax, it's no longer necessary to use the $sanitizer->selectorValue() method to sanitize text that goes into selectors. That makes it simpler and safer when building a selector that involves user input. That's because the separation fields and values is handled at the PHP language level (as an array declaration), rather than as a runtime construction. Of course, you should still sanitize user input, but you no longer need to sanitize a value to be selector-friendly. 4 Link to comment Share on other sites More sharing options...
Autofahrn Posted May 18, 2019 Share Posted May 18, 2019 3 minutes ago, Zeka said: I use such code Nice, always forget... ? Selector Upgrades Link to comment Share on other sites More sharing options...
Zeka Posted May 18, 2019 Share Posted May 18, 2019 1 minute ago, Autofahrn said: Nice, always forget... ? Selector Upgrades Also functions API and docs Link to comment Share on other sites More sharing options...
HerTha Posted May 18, 2019 Author Share Posted May 18, 2019 @Autofahrn thanks for your input! Initially, I thought I am dealing with a general problem. Now, it seems more related to a special combination of details. @Zeka array syntax looks good - worth trying, I'd say! I'll report back once I find more... 1 Link to comment Share on other sites More sharing options...
HerTha Posted May 18, 2019 Author Share Posted May 18, 2019 I have rearranged the code to use array syntax for the selector: $a_sel = [ ['sort', 'name'], ['limit', $limit_per_section], ['template', 'downfile'], ['title|name|indexer', '%=', $a_terms] ]; $results = $pages->find($a_sel); Now, if $a_terms contains search term "!hallo" then all is fine. Problem solved!? Well, not really. Searching for "&hallo" still crashes: Quote Exception: Unknown Selector operator: '%=&' -- was your selector value properly escaped? field='title', value='hallo', selector: 'include=all, title%=&hallo' (in /home/abcdefg/public_html/wire/core/Selectors.php line 419) #0 /home/abcdefg/public_html/wire/core/Selectors.php(460): ProcessWire\Selectors->create('title', '%=&', 'hallo') #1 /home/abcdefg/public_html/wire/core/Selectors.php(142): ProcessWire\Selectors->extractString('') ... I understand this should not happen, my debugging skills are a bit limited, though... Link to comment Share on other sites More sharing options...
Autofahrn Posted May 18, 2019 Share Posted May 18, 2019 13 minutes ago, HerTha said: Searching for "&hallo" Thanks for the pointer, searching for "&something" does not work with AjaxSearch either, since it sends the field unescaped through a get request, so "something" effectively gets a second parameter and the search string is empty (does not contain the ampersand). But that's another story. Could you try this line (or better run $a_terms through $sanitizer->selectorValue()): ['title|name|indexer', '%=', "'$a_terms'" ] Link to comment Share on other sites More sharing options...
HerTha Posted May 18, 2019 Author Share Posted May 18, 2019 well, that mixed lot of quotes makes the error disappear ?️ <scratchinghead>how did you come up with that?</scratchinghead> Link to comment Share on other sites More sharing options...
Autofahrn Posted May 18, 2019 Share Posted May 18, 2019 6 minutes ago, HerTha said: how did you come up with that? It's just manually enforcing the string is escaped in quotes, but single quotes this time. Since I guess that double quotes are removed somewhere, you may try with this one as well: ['title|name|indexer', '%=', '"'.$a_terms.'"' ] In both cases you really need to ensure the search term does not contain quotes itself, searching for "it's" will fail now. That's normally something $sanitizer->selectorValue() takes care of. 1 Link to comment Share on other sites More sharing options...
Zeka Posted May 19, 2019 Share Posted May 19, 2019 @HerTha Just curious is this code fails? $a_sel = [ ['sort', 'name'], ['limit', $limit_per_section], ['template', 'downfile'], ['title|name|indexer', '%=', $sanitizer->selectorValue($a_terms)] ]; $results = $pages->find($a_sel); 3 Link to comment Share on other sites More sharing options...
HerTha Posted May 19, 2019 Author Share Posted May 19, 2019 25 minutes ago, Zeka said: $sanitizer->selectorValue($a_terms) that extra sanitizer removes the error, Tracy reports $a_sel to be: So, for this case, the documentation seems to be wrong: 19 hours ago, Zeka said: Using array syntax, it's no longer necessary to use the $sanitizer->selectorValue() method to sanitize text that goes into selectors. 1 Link to comment Share on other sites More sharing options...
HerTha Posted May 19, 2019 Author Share Posted May 19, 2019 further investigation - I simplified the code to: $a_sel = [ ['template', 'downfile'], ['title|name|indexer', '%=', $a_terms] ]; $results = $pages->find($a_sel); So, there are 3 fields to search: "title", "name" and "indexer" (a textarea field). Still using $a_sel = "&hallo". It turns out, that the error occurs only if: "name" field is present and at least one more field is present The combination "title" with "indexer" works just fine. I have never filed an issue report, but this one could be worth it. Anyone who can duplicate this error? 1 Link to comment Share on other sites More sharing options...
Autofahrn Posted May 19, 2019 Share Posted May 19, 2019 35 minutes ago, HerTha said: Anyone who can duplicate this error? yes, instantly! Probably because name is handled something special. Maybe worth triggering @ryan... Link to comment Share on other sites More sharing options...
dragan Posted May 19, 2019 Share Posted May 19, 2019 I found some other strange stuff (only tested in Tracy): $q = "*"; $query = $sanitizer->selectorValue($q); $pgs = $pages->find("parent=1041, include=all, vertec|title|name|remarks%=$query"); d($pgs); This returns everything. Every single page inside parent id 1041. ! returns a seemingly random amount of results (74 out of 1061) - completely wrong. ? does the same, but returns 73 instead of 74. So, I guess you'd have to manually sanitize the query yourself: Check if minimum string length is 3. Remove charactes such as ! & *. And of course, make it a habit to narrow results down (according to template, parent, not in trash, etc.). I would expect PW to throw some sort of error if a search string is less than 3 characters. Maybe the results may differ if I actually used it in a template (outside Tracy) and not logged in as superuser, or with debug off. But still, this behavior truly puzzles me. 1 Link to comment Share on other sites More sharing options...
Autofahrn Posted May 19, 2019 Share Posted May 19, 2019 $sanitizer->selectorValue("*"); effectively returns an empty string. So searching for nothing obviously isn't intended, but probably correct to return everything... ? Link to comment Share on other sites More sharing options...
dragan Posted May 19, 2019 Share Posted May 19, 2019 6 minutes ago, Autofahrn said: So searching for nothing obviously isn't intended, but probably correct to return everything I strongly disagree. An empty string is useless. (maybe "correct" in a strictly SQL / theoretical / technical view, but not in the context discussed here) It's funny that * returns empty string, but ? or ! return ""?"" and ""!"". I'm really surprised (knowing how important security is for @ryan). Does that mean that a malicious attacker could potentially bring a server to its knees by searching with wildcard automatically x times per second? Link to comment Share on other sites More sharing options...
LostKobrakai Posted May 19, 2019 Share Posted May 19, 2019 Using %= means your value will be escaped and wrapped in "%$value%". Using that with LIKE in sql means you'll get all results returned for an empty string value. Both – returning an empty or a full resultset – can be valid expected results for a filtering operation so it's best to align with the default sql behavior and not adding special cases, which is at least the less surprising option. If you don't want to allow searches for empty string is ultimately a business decision you have to take care for on your own. 29 minutes ago, dragan said: Does that mean that a malicious attacker could potentially bring a server to its knees by searching with wildcard automatically x times per second? If that's a concern I'm wondering if you don't set limit on your query. Limit (and in extension pagination) makes the problem of an empty string not filtering the results a non-issue. 1 Link to comment Share on other sites More sharing options...
dragan Posted May 19, 2019 Share Posted May 19, 2019 (edited) 2 hours ago, LostKobrakai said: If that's a concern I'm wondering if you don't set limit on your query. Limit (and in extension pagination) makes the problem of an empty string not filtering the results a non-issue. Oh, don't worry, I always do that, in addition to all sorts of user-input sanitation/validation. I was rather just speaking theoretically. Thankfully, PW code in the site profiles has plenty of useful examples with comments, which will hopefully prevent newbies to omit such checks. After all, we can't expect PW to handle everything for us; it's always up to the developer to take care of security and best practice. Edited May 19, 2019 by dragan re-phrasing Link to comment Share on other sites More sharing options...
HerTha Posted May 19, 2019 Author Share Posted May 19, 2019 Thank you guys for digging into this! This forum is just awesome! I am trying to summarize: We found out that array syntax is able to handle "!" properly. We also found that there are other workarounds for accepting (or filtering) "&". So, that are all interesting details, good to know. However, the question from my original post is still circulating my head: On 5/18/2019 at 12:21 PM, HerTha said: Obviously, there are a number of scenarios (including the ! char) which are not catched by the sanitizer, but need to be taken care of. Anyone who can share a list of forbidden chars or an algorithm to handle the filtering more securely in such a situation? I thought this has been done a thousand times... Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now