Using special characters in selectors


DrQuincy

This might be a silly question but wire('sanitizer')->selectorValue() seems to remove characters like ^ and = rather than escape them. Does that mean you cannot, for example, use pages()->get() to match a field that contains any of these characters? Or is there an escape function I'm missing?

I don't actually need to yet, but I wondered if this was a limitation. If so, which characters are and aren't allowed? E.g. can you only use a-zA-Z0-9'"-_?

Thanks. 🙂

Looking at the source and the docs it seems like you can't escape special characters and the following aren't allowed:

"\\0", "\\", "`", "|", '=', '*', '%', '~', '^', '$', '#',  '<', '>', '[', ']', '{', '}', "\r", "\n", "\t"

I guess it doesn't matter so much in a natural language search where these kinds of things are filtered out anyway but where you are finding pages using field=value selectors this could trip you up.

Is there a built in way to filter these characters out of a field when you save it so you know when you use exact match selectors on them it will be reliable?

E.g. a product page with a field 'bid' with a value '$100'. I run pages()->find('template=product, bid=' . wire('sanitizer')->selectorValue('$100')). This will fail to find my product, won't it, as it will be looking for ' 100' not '$100'? I know in the real world you probably wouldn't store the '$', but I am just using this as an example.

Or do you just assume that any exact match fields should be more predictable values (e.g. numbers, preset categories) and that anything that allows special characters would only ever be searched by a FULLTEXT index?

Thanks.

I have thought about this and I think if this is the case there are a few options available:

  1. Call wire('sanitizer')->selectorValue() via a hook on save or in the template
  2. Limit the characters with regex in the text input disallowing the above
  3. Have an extra field that stores the unfiltered text and then have a hidden field that stores a filtered version (managed via hooks); show the unfiltered version in the front end but search via the filtered hidden one (this would mean, using my example, '100' and '$100' are the same when searching)
  4. If there aren't going to be loads of options use some kind of enumeration (1=$100, 2=$200) via another template, select options, etc and search the number instead of the value
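
Options 2 and 3 above could be sketched in plain PHP roughly as follows. The names isSelectorSafe() and stripSelectorSpecials() are hypothetical (they are not part of ProcessWire), and the character list is simply the one quoted earlier in this thread rather than anything taken from the sanitizer itself. In a real site the second function would presumably be called from a save hook to fill the hidden field.

```php
<?php

// Characters quoted earlier in this thread as not allowed by selectorValue().
// Assumption: copied from the thread, not verified against ProcessWire's source.
const SELECTOR_SPECIALS = [
    "\\0", "\\", '`', '|', '=', '*', '%', '~', '^', '$', '#',
    '<', '>', '[', ']', '{', '}', "\r", "\n", "\t",
];

/** Option 2 (hypothetical helper): true when $value contains none of the listed characters. */
function isSelectorSafe(string $value): bool
{
    foreach (SELECTOR_SPECIALS as $char) {
        if (strpos($value, $char) !== false) {
            return false;
        }
    }
    return true;
}

/** Option 3 (hypothetical helper): build the filtered copy for the hidden search field. */
function stripSelectorSpecials(string $value): string
{
    // Remove each listed character, then collapse any leftover whitespace runs.
    $stripped = str_replace(SELECTOR_SPECIALS, '', $value);
    return trim(preg_replace('/\s+/', ' ', $stripped));
}
```

With this, '100' and '$100' both reduce to '100', so a search against the hidden field treats them as the same value, as described in option 3.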

If you are using FULLTEXT search I think this is irrelevant as it ignores these characters anyway (unless using BOOLEAN MODE, does PW support this?).

Can someone just confirm though that PW does not support exact match searching with the following?

"\\0", "\\", "`", "|", '=', '*', '%', '~', '^', '$', '#',  '<', '>', '[', ']', '{', '}', "\r", "\n", "\t"

I guess I am thinking about edge cases here, as usually filtered values are simple and anything more complex would be FULLTEXT.

Thanks.

On 11/12/2020 at 7:16 AM, DrQuincy said:

Looking at the source

With the much-improved docs these days, this should not be your first port of call 😃.

On 11/12/2020 at 7:16 AM, DrQuincy said:

the docs

Yes, this should be the first place you check.

On 11/12/2020 at 7:16 AM, DrQuincy said:

the docs it seems like you can't escape special characters and the following aren't allowed:

"\\0", "\\", "`", "|", '=', '*', '%', '~', '^', '$', '#',  '<', '>', '[', ']', '{', '}', "\r", "\n", "\t"

By default, yes. However, there are options to refine how the sanitizer should work. Did you have a look at the white/blacklist options?

This works fine with a Textfield called bid with a value of $100.

<?php namespace ProcessWire;

// Let "$" through the sanitizer in addition to the characters it allows by default
$allow = ['$'];
$selector = $sanitizer->selectorValue('$100', ['whitelist' => $allow]);
$bids = $pages->find("bid=$selector");

 

Edited by kongondo
link to docs
Aha, I knew there must've been a simpler solution, thanks! I don't know how I missed the whitelist option.

After running a few tests, it seems that as long as your selector value doesn't contain double quotes, you can wrap it in double quotes and it will accept anything. And even then you can escape the double quote with a backslash.

$selector = '"This is a \"valid\" selector string \'^%$!"'; // This works as is

Is there an API function to prepare a string in this way? Unless I'm missing something wouldn't a simpler solution be to have an escapeSelectorValue() type function that adds " to the beginning and the end and escapes double quotes? I'm not being critical, just trying to understand the rationale behind the API.
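
A minimal sketch of the kind of helper being asked about here, in plain PHP. escapeSelectorValue() is the hypothetical name from this post, not an existing ProcessWire function, and the rule it applies (wrap in double quotes, backslash-escape embedded double quotes) rests only on the test results reported above, not on a check of ProcessWire's selector parser. How a literal backslash in the value should be treated is left open.

```php
<?php

/**
 * Hypothetical helper (not part of ProcessWire): wrap a value in double
 * quotes and backslash-escape any double quotes inside it.
 */
function escapeSelectorValue(string $value): string
{
    return '"' . str_replace('"', '\\"', $value) . '"';
}

// Example: build a selector string for an exact match on a text field.
$selector = 'bid=' . escapeSelectorValue('$100');
```

The resulting string could then be passed to pages()->find() in place of the sanitized value, assuming quoted values behave as the tests above suggest.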

Thanks!

8 minutes ago, DrQuincy said:

After running a few tests, it seems that as long as your selector value doesn't contain double quotes, you can wrap it in double quotes and it will accept anything.

Glad it works! There's more info here about double versus single quotes and use of backslash.

12 minutes ago, DrQuincy said:

Is there an API function to prepare a string in this way? Unless I'm missing something wouldn't a simpler solution be to have an escapeSelectorValue() type function that adds " to the beginning and the end and escapes double quotes?

I don't think there is, but then again I am not 100% up to date. Not necessarily the same thing but selectorValue() has this:

Quote
  • useQuotes (bool): Allow selectorValue() function to add quotes if it deems them necessary? (default=true)

You can also use this:

Quote
  • quotelist (array): Additional characters that should always trigger quoted value. (default=[])

Someone more knowledgeable could chime in here 😊.

Ah yes, that explains it. It says:

Quote

Because the need to match double quotes is rare, a simpler approach is just to disallow double quotes from appearing in your selector values by filtering them out of user input

I can't help thinking that escaping the string, rather than filtering things out, makes more sense (as you would with a standard SQL query).
