ruuskju, posted March 3, 2023

Hi, I have a template with URL segments allowed and no other restrictions in the template. My test template is simple: it just tries to get and echo the URL segment. The problem is that if I use _ (or - or .) together with uppercase characters, my template file is never processed. The 404 redirect happens before the template runs.

This works: /url/aa_aa (or /url/aa-aa)

But these fail, also with - and . in place of _:

    /url/Aa_aa
    /url/aA_aa
    /url/aa_Aa
    /url/aa_aA

So it doesn't matter where the uppercase character is.

    <?php namespace ProcessWire;
    if($input->urlSegment1) {
        echo $input->urlSegment1;
    }

ProcessWire version 3.0.210 with PHP 8.2.

    $config->maxUrlSegments = 8;
    $config->pageNameCharset = 'UTF8';
    $config->pageNameWhitelist = '-_.abcdefghijklmnopqrstuvwxyz0123456789æåäßöüđжхцчшщюяàáâèéëêěìíïîõòóôøùúûůñçčćďĺľńňŕřšťýžабвгдеёзийклмнопрстуфыэęąśłżź-FISEN';

My site is multilingual, which might have something to do with it. In my 404 logs I get entries like this:

    2023-03-03 12:35:19 41 /url/aa-aA page doesn't exist [IP: 127.xxx.xxx.xxx] [UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36]
    2023-03-03 12:35:24 41 /fi/url/aa-aA page doesn't exist [IP: 127.xxx.xxx.xxx] [UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36]

Any help appreciated. We actually have this bug in production at the moment...

ruuskju (author), posted March 3, 2023

$config->requestUrl() seems to stay as it should. But around line 110 in ProcessPageView.module, no page is found:

    $page = $this->getPage();

$page is empty in the cases that don't work.

ruuskju (author), posted March 3, 2023

It appears that pagesPathFinder.php checks for bad names around line 519, and if it finds one, the response is set to 400. Although we have set $config->pageNameCharset to UTF8 and have a $config->pageNameWhitelist, it still doesn't work. Without pageNameCharset it works, so the behavior differs with and without UTF8.

The bad-name check uses $sanitizer->pageNameUTF8($name), which always returns a lowercase name. When that result is compared with the original name containing uppercase characters, the two differ and the name ends up in the $badNames array.

We had to move fast, so we changed the $namePrev assignment to $namePrev = strtolower($name);

    foreach($parts as $n => $name) {
        $lastPart = $name;
        if(ctype_alnum($name)) continue;
        // $namePrev = $name; ORIGINAL
        $namePrev = strtolower($name); // QUICK FIX
        $name = $sanitizer->pageNameUTF8($name);
        $parts[$n] = $name;
        if($namePrev !== $name) $badNames[$n] = $namePrev;
    }
    if($result['response'] < 400 && count($badNames)) {
        $result['response'] = 400; // 400=Bad request
        $this->addResultError('pathBAD', 'Path contains invalid character(s)');
    }
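
To make the comparison described above easier to follow, here is a minimal standalone sketch in plain PHP. It is not ProcessWire code: standInPageNameUTF8() is a hypothetical stand-in that only lowercases its input, which is the one behavior of $sanitizer->pageNameUTF8() reported in this thread, and findBadNames() is an illustrative helper that mirrors the quoted loop. The sketch also suggests why /url/AABB passes: purely alphanumeric parts are skipped by ctype_alnum() before any comparison happens.

```php
<?php
// Hypothetical stand-in for $sanitizer->pageNameUTF8().
// ASSUMPTION: it only lowercases, as reported in the post above.
function standInPageNameUTF8(string $name): string {
    return strtolower($name);
}

// Illustrative helper mirroring the quoted loop; not part of ProcessWire.
function findBadNames(array $parts, bool $quickFix): array {
    $badNames = [];
    foreach($parts as $n => $name) {
        // Purely alphanumeric parts are never compared at all
        if(ctype_alnum($name)) continue;
        // Original code keeps the name as typed; the quick fix lowercases it first
        $namePrev = $quickFix ? strtolower($name) : $name;
        $name = standInPageNameUTF8($name);
        if($namePrev !== $name) $badNames[$n] = $namePrev;
    }
    return $badNames;
}

var_dump(findBadNames(['url', 'AABB'], false));  // empty: skipped by ctype_alnum()
var_dump(findBadNames(['url', 'aa_aa'], false)); // empty: already lowercase
var_dump(findBadNames(['url', 'aa_Aa'], false)); // index 1 flagged as a bad name
var_dump(findBadNames(['url', 'aa_Aa'], true));  // empty: quick fix compares lowercase
```

One caveat about the quick fix: as of PHP 8.2, strtolower() converts only ASCII letters. If the real sanitizer also lowercases non-ASCII letters in UTF8 mode (an assumption, not verified here), segments containing uppercase characters such as Ä could still be flagged.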

Robin S, posted March 3, 2023

    10 hours ago, ruuskju said: "together with uppercase characters"

According to the docs, uppercase letters are not valid for URL segments: https://processwire.com/docs/admin/setup/templates/#allow-url-segments

    Quote: URL segments must follow the same format as page names. Meaning, they can be any combination of lowercase ASCII letters (a-z), numbers (0-9), dashes, underscores and periods.

ruuskju (author), posted March 3, 2023

Yeah, but it seems they can be used, and they behave differently with and without the UTF8 settings? There was also this issue: https://github.com/processwire/processwire-issues/issues/488

ruuskju (author), posted March 3, 2023

And isn't the idea of the sanitizer here to sanitize the value to a safe value, i.e. to lowercase? For example, /url/AABB works and is sanitized to /url/aabb, so the only difference is the underscore character. Shouldn't that work in UTF8 mode too?

martind, posted April 7, 2023

I also see this underscore/uppercase-triggered 301 redirect behavior for my URL-segment-based AJAX requests after upgrading from 3.0.171 to 3.0.210.