Problems with Commas in Page Names


MatthewSchenker
Greetings,

I have been using front-end forms to create pages in a few ProcessWire projects, and it works great! However, I just created my first project that requires commas in the titles, and suddenly I get this error whenever I create or edit pages with a comma in the name:

Error Exception: Unknown Selector operator: '' -- was your selector value properly escaped? (in .../public_html/wire/core/Selectors.php line 165)

I have a form collecting information to create a page. I then use an $np variable to populate the fields for new pages. As long as there is no comma in the title, everything is great. Here are the coding approaches I have tried to handle this problem (I show the title line for clarity):

Attempt 1. Sanitizing the title, then just making the name = title:

$np->title = $sanitizer->text($input->post->title);
$np->name = $np->title;

Attempt 2. Sanitizing the name separately (as described in the API):

$np->title = $sanitizer->text($input->post->title);
$np->name = $sanitizer->name($input->post->title);

Attempt 3. Another approach via the API:

$np->title = $sanitizer->text($input->post->title);
$np->name = $sanitizer->pageName($input->post->title);

Attempt 4. Finally, I have tried not setting the "name" at all, since Ryan has suggested elsewhere that it is auto-generated from the title.

In each case, the error occurs if there is a comma in the title.  If there are any other non-url characters, there is no problem.

Is there another syntax I should be using in my templates to deal with commas in API-created pages?

Thanks,

Matthew


Hi Matthew,

Commas need to be escaped because they are also used to separate fields in the Selector. The correct method is this:

$np->title = $sanitizer->selectorValue($input->post->title);

You should sanitize all values that you use in a selector like this.

Use text(), textarea() when you are outputting your data on the website, e.g. in form fields.

Btw you are right, PW does set a name based on the title if you omit setting ->name.

Cheers


Matthew, I'm assuming this error occurs after you perform a $pages->find(), $pages->get(), or $page->children() call somewhere with a selector that looks like "title=some title, with a comma in it"? Wanze is right that you would want to use selectorValue on the "some title, with a comma in it" before bundling it into the selector. But if you are just creating pages, rather than querying pages for a title, then let me know.
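
To make the failure mode concrete, here is a toy illustration in plain PHP. It is not ProcessWire's actual parser, and findBrokenFragments() is a made-up name; it only assumes that a selector string is split on commas and that each piece is expected to contain an operator, which is consistent with the "Unknown Selector operator: ''" message.

```php
<?php
// Toy illustration only: NOT ProcessWire's real selector parser.
// Splits a selector string on commas and reports the pieces that
// do not look like "field<operator>value".
function findBrokenFragments(string $selector): array {
    $broken = [];
    foreach (explode(',', $selector) as $fragment) {
        $fragment = trim($fragment);
        // A well-formed piece starts with a field name followed by an operator.
        if (!preg_match('/^[\w.|]+\s*(=|!=|\*=|~=|%=|\^=|\$=|<=|>=|<|>)/', $fragment)) {
            $broken[] = $fragment;
        }
    }
    return $broken;
}

$title = 'Company, Inc.';
// The unquoted comma in the title produces a stray piece with no operator.
print_r(findBrokenFragments("parent=/parent/, title=$title"));
```

The stray piece after the comma has no field name and no operator, which is the kind of input the error message complains about.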


Greetings,

AHA!

A little voice told me there must be something like this going on, because of two things:

1. It seemed odd to get a "selector" issue when creating pages.

2. There's always an easy solution to any "issues" in ProcessWire.

Here's what happened.  In this project, I added a method to check for duplicate titles upon submitting the page (creating or editing), and THAT is where the "selector" issue was happening.

For others dealing with this same idea, here's my code...

Originally, I had this, which DID NOT work:

$matchedtitle = $input->post->title;
$checktitles = $pages->find("parent=/parent/, title=$matchedtitle"); // Look for existing pages with requested title.

I edited it as follows, and IT WORKS:

$matchedtitle = $sanitizer->selectorValue($input->post->title); // selectorValue() makes the title safe to use in a selector, so a comma no longer breaks it.
$checktitles = $pages->find("parent=/parent/, title=$matchedtitle"); // Look for existing pages with requested title.

The second code snippet is the one to use.

Thanks,

Matthew

EDIT: The code above is for the create page routine.  The edit page routine has a bit more going on.

Edited by MatthewSchenker

$matchedtitle = $sanitizer->selectorValue($input->post->title); // selectorValue() makes the title safe to use in a selector, so a comma no longer breaks it.
$checktitles = $pages->find("parent=/parent/, title=$matchedtitle|$thistitle"); // Look for existing pages with requested title, including the current page title.

Just wanted to mention that $thistitle should also be sanitized as a selector value (if it isn't already). 


  • 1 month later...

Dear Ryan and All,

I'm a wee bit confused about the $sanitizer->selectorValue() issue, versus using handrolled quotes or escapes.

I'm also running a routine to check for duplicates, on new pages (looking for account names).

So, when a user types an account name with a comma, like "Company, Inc.", the normal get routine breaks, as it did for Matthew.

However, the cheat sheet states that the selectorValue sanitizer will replace disallowed characters (which ones?) with spaces, and then will place quotes if necessary, and will then limit the length to 100 characters.

That seems potentially harmful or at least inaccurate with some data values, like a company name.

Then, on the API page, it states:

If your selector value needs to contain a comma, you should surround your selector value in quotes, i.e.

  • body*="sushi, tobiko"

However... every example I've seen of selectors has the double quotes *outside* of the selector string, which is what I have in my example code:

$check_field_dupe_id = $pages->get( "$field_name=$field_value, include=hidden, check_access=0" )->id;

It would seem that I need to place the quotes around the value, so does that mean I don't need the outside quotes, e.g.

$check_field_dupe_id = $pages->get( $field_name="$field_value", include=hidden, check_access=0 )->id;

Also, this doesn't take care of potential double quotes in the value, so I'm wondering if the other acceptable way to do it is to manually escape the commas and double quotes? Are they the only two problematic characters in this type of query (after doing a normal text sanitization)?

$field_value_search = str_replace(',', '\,', $field_value);
$field_value_search = str_replace('"', '\"', $field_value);

$check_field_dupe_id = $pages->get( "$field_name=$field_value_search, include=hidden, check_access=0" )->id;

Thanks,

Peter


Dear All,

The code above that I typed was incorrect, because it read the value of $field_value twice:

$field_value_search = str_replace(',', '\,', $field_value);
$field_value_search = str_replace('"', '\"', $field_value);

I modified it, and this seems to work:

$field_value_search = $field_value;
$field_value_search = str_replace(',', '\,', $field_value_search);
$field_value_search = str_replace('"', '\"', $field_value_search);

$check_field_dupe_id = $pages->get( "$field_name=$field_value_search, include=hidden, check_access=0" )->id;

I tried using quotes around the internal value, i.e. $field_name="$field_value", but it didn't work (this was with the $field_value not being escaped), and no outside quotes.
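
As a sketch of the corrected escaping above, the two replacements can also be done in a single str_replace() call with arrays, which avoids the overwritten-variable mistake entirely. escapeForSelector() is a hypothetical helper name, not part of ProcessWire, and whether the selector engine honors backslash-escaped commas and quotes is only what was observed here, not something this snippet confirms.

```php
<?php
// Sketch: escape commas and double quotes in one pass.
// escapeForSelector() is a hypothetical name, not a ProcessWire API.
function escapeForSelector(string $value): string {
    // With arrays, each replacement is applied to the running result,
    // so the second replacement cannot discard the first.
    return str_replace([',', '"'], ['\,', '\"'], $value);
}

echo escapeForSelector('Company, "Inc."'), PHP_EOL;
```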

EDIT: I'm wondering now, with a front-end web app with many, many get() and find() calls that interact with data fields, if it's safer to always escape commas and double quotes, as a habit. I realize that $sanitizer->selectorValue() exists, but as I mentioned above, it seems problematic to me if it replaces certain characters with a space, and truncates at 100 chars.

What characters in $sanitizer->selectorValue() get stripped?

Is it safe enough to just escape commas and double quotes?

I welcome any thoughts on this.

Peter


Peter, have a look at /wire/core/Sanitizer.php and the selectorValue() function in there, as I think it'll answer your questions better than I can here. But to attempt an answer here, these are the lines where the character replacement occurs, which shows which characters it replaces. For the most part, these are characters that might be used as operators. Technically, it doesn't need to replace them anywhere other than at the beginning or end of the string, but it currently replaces them no matter where they are (it's on my todo list to optimize that): 

$value = str_replace(array('*', '~', '`', '$', '^', '+', '|', '<', '>', '!', '='), ' ', $value);
$value = str_replace(array("\r", "\n", "#", "%"), ' ', $value);
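
To see what those two lines do to a value, they can be run on their own outside ProcessWire. The wrapper name replaceOperatorChars() is made up for this sketch, and it covers only the two quoted lines, not the rest of selectorValue() (length limit, quoting, and so on).

```php
<?php
// The two replacement lines quoted above, wrapped in a hypothetical helper
// so their effect can be inspected in isolation.
function replaceOperatorChars(string $value): string {
    $value = str_replace(array('*', '~', '`', '$', '^', '+', '|', '<', '>', '!', '='), ' ', $value);
    $value = str_replace(array("\r", "\n", "#", "%"), ' ', $value);
    return $value;
}

// "+", "#" and "%" each become a space; the comma is left alone by these lines.
var_dump(replaceOperatorChars('Rock+Roll #1, 100%'));
```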

Using $sanitizer->selectorValue() is going to be valuable primarily when dealing with user input. If you are dealing with API-level stuff that doesn't involve user input, then it should be just fine to sanitize yourself rather than using selectorValue. I would just quote your value and make sure it doesn't already have quotes in it:

if(strpos($field_value_search, '"') !== false) throw new WireException("Sorry value can't have quotes");

Quote the value in your selector. Since the value is already surrounded in quotes, PHP requires you to escape the embedded quotes: 

$check_field_dupe_id = $pages->get("$field_name=\"$field_value_search\", include=hidden, check_access=0" )->id;
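
Put together, the check and the quoted selector might look like the sketch below. buildDupeSelector() is a hypothetical helper, and InvalidArgumentException stands in for WireException only so the snippet can run outside ProcessWire.

```php
<?php
// Sketch of the quote-and-reject approach described above.
// buildDupeSelector() is a hypothetical helper, not a ProcessWire API.
function buildDupeSelector(string $fieldName, string $fieldValue): string {
    if (strpos($fieldValue, '"') !== false) {
        // Stand-in for WireException so this runs without ProcessWire.
        throw new InvalidArgumentException("Sorry, value can't have quotes");
    }
    // The double-quoted value keeps an embedded comma from splitting the selector.
    return "$fieldName=\"$fieldValue\", include=hidden, check_access=0";
}

echo buildDupeSelector('title', 'Company, Inc.'), PHP_EOL;
// Inside ProcessWire the string would then be passed on, e.g. $pages->get($selector)->id
```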

In cases where you are using $sanitizer->selectorValue() and it surrounds the value with quotes, then you don't need to worry about escaping embedded quotes as they are already present in the string. Meaning, you can do this:

$cleanValue = $sanitizer->selectorValue($dirtyValue); 
$pages->get("field_name=$cleanValue"); 

Dear Ryan,

Thank you for your input. I think the issue that I've run into is that I'm looking for duplicate values in the database, of a page title that is input by the user, where the page title can legally have a variety of characters that the sanitizer function strips out. My primary purpose for this is when I convert the page title to a page name (i.e. url), because I don't want to have duplicate page urls. I'm not allowing the users to type the page urls, in this case.

So, someone could type something for the page title that could have all types of characters, including commas, quotes, etc. Once I confirm that the page title doesn't exist, I convert that name to a url.

What I ran into specifically was that the get and find search functions choked on a comma in the page title.

I think that the sanitizer function would strip out too much data in the above scenario, so I'm wondering if, from the point of view of get() and find(), the characters that would break that search are simply the commas and the double quotes. Thus, I used:

$field_value_search = str_replace(',', '\,', $field_value_search);
$field_value_search = str_replace('"', '\"', $field_value_search);

and then I didn't quote the search string. (But thank you for clarifying the method of escaping the outer quotes!)

The page titles would allow double quotes and commas, so I don't want to reject those characters.

Am I missing something, or would the above two lines take care of the issue with get() and find(), when searching for duplicates?

Thanks again,

Peter


Rather than querying by page titles (or some other thing), why not just query by page name? That is probably going to be a more reliable indicator of whether a title is actually taken. And since the goal is to have unique page names, you might as well start with that in the first place. 

$name = $sanitizer->pageName($title, true); 
$test = $pages->get("/path/to/parent/$name/"); 
if($test->id) {
  // page name/title is already taken, make user choose another
} else {
  // page name/title is unique, so good to go...
  $newPage = new Page();
  $newPage->template = 'something';
  $newPage->parent = '/path/to/parent/';
  $newPage->title = $title;
  $newPage->name = $name; 
  $newPage->save();
}

Dear Ryan,

That's a good idea. I thought of that method too, and probably would have selected it, but I also don't want to have duplicate page titles, in most instances, so I have to check for those in any case. And, since the logic to convert a title to a page name is the same for each record, I'm hoping that it will resolve correctly.

I figured it would be easier to create a generic routine that would check any field for duplicate values, if that field is listed in the "no duplicates" array defined in a config file. I think that escaping commas and double quotes should do the trick, but if you think I'm incorrect in that, do let me know. :-)

Thanks!

Peter


No need to escape commas. Just make sure your string is wrapped in quotes (whether double or single quotes), and that it doesn't already have those quotes within it. Chances are wrapping it in double quotes will suit you better, since a page title is more likely to have a single-quote style apostrophe than a double quote.
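
A minimal sketch of that advice, assuming a value that contains both quote styles simply has to be rejected: prefer double quotes, fall back to single quotes when the value itself contains a double quote. quoteSelectorValue() is a hypothetical name, not a ProcessWire method.

```php
<?php
// Sketch: wrap a selector value in whichever quote style it does not contain.
// quoteSelectorValue() is a hypothetical helper, not a ProcessWire API.
function quoteSelectorValue(string $value): string {
    $hasDouble = strpos($value, '"') !== false;
    $hasSingle = strpos($value, "'") !== false;
    if ($hasDouble && $hasSingle) {
        // Neither quote style is safe here, so refuse rather than guess.
        throw new InvalidArgumentException('Value contains both quote styles');
    }
    // Double quotes by default; single quotes only when the value has a double quote.
    $quote = $hasDouble ? "'" : '"';
    return $quote . $value . $quote;
}

echo 'title=' . quoteSelectorValue("O'Brien, Ltd."), PHP_EOL;
echo 'title=' . quoteSelectorValue('The "Best", Inc.'), PHP_EOL;
```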


  • 3 years later...

I know this is an old thread, but I ran into a similar issue today. I was trying to select a row in a profields Table by the contents of its text field, but selectorValue() was stripping out the "#" character from the selector value, preventing the match from being found.

I had always sort of assumed that selectorValue just escaped the value to make it safe for a selector, without actually altering the value. But now I see that this isn't the case (and it also limits the length to 100 characters by default).

It looks like the only way to achieve what I'm looking to do is to strip out double quotes from within the input string that I get and then wrap the whole string in double quotes (as Ryan describes above). If I understand correctly, this would be a safe way to escape the input.

The only problem with this is that double quotes then cannot be used in the text field, which could be an issue.

Is there any way to create a safe selector that can handle double quotes in addition to commas, hashes, and the other characters that selectorValue() strips out?


@thetuningspoon, the below seems to work well with any of the SQL LIKE operators. You could easily add it as a new $sanitizer method via hook if you wanted.

Test title: [screenshot: 2017-06-07_194149.png]

Code:

function prepSelectorValue($str) {
    return '"' . addslashes($str) . '"';
}

$p = $pages(1107); // the page with the test title
$title = prepSelectorValue($p->title);
$items = $pages->find("title%=$title");

echo '<h2>Results</h2>';
echo $items->each("<p>{title}</p>");

Result: [screenshot: 2017-06-07_194239.png]
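
For readers without the screenshots, the string that prepSelectorValue() produces can be checked in plain PHP. The sample title below is made up for illustration. Note that addslashes() also escapes single quotes and backslashes, not just double quotes.

```php
<?php
// Same helper as in the code above, run outside ProcessWire
// to show the string that ends up in the selector.
function prepSelectorValue($str) {
    return '"' . addslashes($str) . '"';
}

// Hypothetical title containing double quotes, a comma and a hash.
echo 'title%=' . prepSelectorValue('Say "hi", #1'), PHP_EOL;
```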


5 hours ago, thetuningspoon said:

Is there a reason why this wouldn't work for the exact match operator (=) ?

I guess the SQL LIKE operators can recognise an escaped quote and treat it as such, whereas the other operators cannot. So that leaves you a bit stuck if you must use those other operators because you'll always have a potential clash of quote characters if you cannot escape quotes.
