
Posted

Hi,

I have a template with urlSegments allowed and no other restrictions in the template. My test template is simple: it just reads the URL segment and echoes it.
The problem is that if I use _ (or - or .) together with uppercase characters, my template file is never processed. The 404 redirect happens before it runs.

This works:
/url/aa_aa (or /url/aa-aa)

But these fail, and the same happens with - and . in place of _:
/url/Aa_aa
/url/aA_aa
/url/aa_Aa
/url/aa_aA

So it doesn't matter where the uppercase character is. 

<?php namespace ProcessWire;

if($input->urlSegment1) {
  echo $input->urlSegment1;
}

ProcessWire version 3.0.210 with PHP 8.2

$config->maxUrlSegments = 8;
$config->pageNameCharset = 'UTF8';
$config->pageNameWhitelist = '-_.abcdefghijklmnopqrstuvwxyz0123456789æåäßöüđжхцчшщюяàáâèéëêěìíïîõòóôøùúûůñçčćďĺľńňŕřšťýžабвгдеёзийклмнопрстуфыэęąśłżź-FISEN';

My site is multilingual, and this might have something to do with that. In my 404 logs I get something like this:

2023-03-03 12:35:19	41	/url/aa-aA	page doesn't exist [IP: 127.xxx.xxx.xxx] [UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36]
2023-03-03 12:35:24	41	/fi/url/aa-aA	page doesn't exist [IP: 127.xxx.xxx.xxx] [UA: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36]

Any help appreciated. We actually have this bug in production at the moment...

  • ruuskju changed the title to urlSegments: Underscores together with uppercase characters cause 404 redirect
Posted

$config->requestUrl() seems to stay as it should.
But around line #110 in ProcessPageView.module it can't find a page. 

$page = $this->getPage();

$page is empty in the cases that fail.

Posted

It appears that PagesPathFinder.php checks for bad names at line 519, and if there is a bad name the response is set to 400. Although we have set $config->pageNameCharset to UTF8 and have a $config->pageNameWhitelist, it still doesn't work. Without pageNameCharset it works, so the behavior differs with and without UTF8.

The bad-name check uses $sanitizer->pageNameUTF8($name), which always returns a lowercase name. When that result is compared with the original name containing uppercase characters, the two differ and the original ends up in the $badNames array. We had to move fast, so we changed the assignment to $namePrev = strtolower($name);

foreach($parts as $n => $name) {
  $lastPart = $name;
  if(ctype_alnum($name)) continue;
  // $namePrev = $name; ORIGINAL
  $namePrev = strtolower($name); // QUICK FIX
  $name = $sanitizer->pageNameUTF8($name);
  $parts[$n] = $name;
  if($namePrev !== $name) $badNames[$n] = $namePrev;
}

if($result['response'] < 400 && count($badNames)) {
  $result['response'] = 400; // 400=Bad request
  $this->addResultError('pathBAD', 'Path contains invalid character(s)');
}
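
To make the comparison above easier to follow, here is a minimal, self-contained sketch of the same check outside ProcessWire. It is not the core code: `sanitizeNameSketch()` and `findBadNames()` are hypothetical helpers, and the stand-in sanitizer models only the lowercasing described in this post (ASCII only), not the rest of what `$sanitizer->pageNameUTF8()` does.

```php
<?php
// Hypothetical stand-in for $sanitizer->pageNameUTF8(). It models only the
// lowercasing behavior described in this thread, for ASCII input.
function sanitizeNameSketch(string $name): string {
    return strtolower($name);
}

// Returns the path parts that the before/after comparison would flag as bad.
// $lowercaseFirst = true models the quick fix (strtolower before comparing).
function findBadNames(array $parts, bool $lowercaseFirst = false): array {
    $badNames = [];
    foreach ($parts as $n => $name) {
        // Purely alphanumeric parts are skipped and never compared.
        if (ctype_alnum($name)) continue;
        $namePrev = $lowercaseFirst ? strtolower($name) : $name;
        $name = sanitizeNameSketch($name);
        if ($namePrev !== $name) $badNames[$n] = $namePrev;
    }
    return $badNames;
}

var_dump(findBadNames(['url', 'AABB']));        // empty array: skipped by ctype_alnum
var_dump(findBadNames(['url', 'aa_aa']));       // empty array: already lowercase
var_dump(findBadNames(['url', 'aa_Aa']));       // index 1 is flagged as 'aa_Aa'
var_dump(findBadNames(['url', 'aa_Aa'], true)); // empty array with the quick fix
```

Under these assumptions, an uppercase character is only flagged when the part also contains a non-alphanumeric character such as _ , - or . , because purely alphanumeric parts never reach the comparison.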
Posted

And isn't the idea of the sanitizer here to convert the value to a safe one, i.e. to lowercase? For example, /url/AABB works and is sanitized to /url/aabb, so the only difference is the underscore character. Shouldn't that also work in UTF8 mode?

  • 1 month later...
Posted

I also see this 301 redirect behavior, caused by underscores together with uppercase characters, for my urlSegment-based AJAX requests after upgrading from 3.0.171 to 3.0.210.
