URL hook with named parameters does not work with non-ASCII characters


Posted (edited)

Dear community,

I have run into a strange problem that I don't have time to investigate in detail. I'm hoping someone knows why it happens, or whether this is simply something PW cannot do at all.

We have redone a client's redirect rules to map old URLs to new ones. One of the rules covers blog posts, whose old URLs sometimes (but not always) contain German umlauts (öäü), and those need to be redirected too. There are a lot of posts, so hard-coding every one of them is unfeasible. I came up with this simple hook to handle all of the blog redirects at once. It works like a charm as long as there are no umlauts in the URL.

<?php namespace ProcessWire;

/* this is in ready.php */

$wire->addHook('/post/(post_slug:[\wöäü]+)/?', function(HookEvent $hookEvent) {
    // find a matching page to redirect to, or fall back to a 404
    $post = $hookEvent->arguments('post_slug');
    $postTranslated = $hookEvent->sanitizer->pageNameTranslate($post);
    $requestedPage = $hookEvent->pages->get('name=' . $hookEvent->sanitizer->selectorValue($postTranslated));
    if ($requestedPage->id) {
        $hookEvent->session->redirect($requestedPage->url);
        $hookEvent->return = true;
    } else {
        $hookEvent->return = false;
    }
});

The page name translation and funky regex are me trying to get the regex to match with Umlauts. But the problem is somewhere else because the hook doesn't get called at all if there are non-ascii characters in the URL.

Does someone know more details about this?

Thank you a lot!

Edited by poljpocket
fix missing selectorValue usage

Hmm. What's your .htaccess like? Could this be related? Also referenced is the page name whitelist. I wonder if these are kicking in somehow for the slug.

  # PW-PAGENAME
  # -----------------------------------------------------------------------------------------------
  # 16A. Ensure that the URL follows the name-format specification required by PW
  # See also directive 16b below, you should choose and use either 16a or 16b.
  # -----------------------------------------------------------------------------------------------

  RewriteCond %{REQUEST_URI} "^/~?[-_.a-zA-Z0-9/]*$"

  # -----------------------------------------------------------------------------------------------
  # 16B. Alternative name-format specification for UTF8 page name support. (O)
  # If used, comment out section 16a above and uncomment the directive below. If you have updated
  # your $config->pageNameWhitelist make the characters below consistent with that.
  # -----------------------------------------------------------------------------------------------
  
  # RewriteCond %{REQUEST_URI} "^/~?[-_./a-zA-Z0-9æåäßöüđжхцчшщюяàáâèéëêěìíïîõòóôøùúûůñçčćďĺľńňŕřšťýžабвгдеёзийклмнопрстуфыэęąśłżź]*$"

  # END-PW-PAGENAME
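
To illustrate why a request with umlauts may never reach the hook while 16A is active, here is a small standalone sketch that applies both patterns to two example paths. It assumes Apache tests the decoded request path against the condition. The constant and function names are hypothetical, and the 16B character class is shortened for readability.

```php
<?php
// Patterns adapted from the 16A and 16B RewriteCond lines above.
// Assumption: the 16B class is shortened to a few German characters.
const ASCII_RULE = '#^/~?[-_.a-zA-Z0-9/]*$#';
const UTF8_RULE  = '#^/~?[-_./a-zA-Z0-9äöüß]*$#u';

// Hypothetical helper: true if the whole path consists of allowed characters.
function passesRule(string $pattern, string $requestUri): bool
{
    return preg_match($pattern, $requestUri) === 1;
}

$paths = ['/post/schoene-gruesse/', '/post/schöne-grüße/'];
foreach ($paths as $path) {
    printf(
        "%s  16A: %s  16B: %s\n",
        $path,
        passesRule(ASCII_RULE, $path) ? 'pass' : 'fail',
        passesRule(UTF8_RULE, $path) ? 'pass' : 'fail'
    );
}
```

A path that fails the active condition is not handed to ProcessWire by that rule, which would match the symptom of the hook not being called at all.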



 


Posted (edited)

@netcarver wow, perfect! It is exactly that.

I have switched to the UTF8 line and changed $config->pageNameCharset over to UTF8. Now it works perfectly. Even without any funky regex.
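
For anyone following along, the config side of this change looks roughly like the fragment below, alongside commenting out 16A and uncommenting 16B in .htaccess. This is a sketch: the whitelist line is optional, and the shortened character list is only an illustration, not the shipped default.

```php
<?php namespace ProcessWire;

// site/config.php (fragment)

// Allow UTF-8 page names instead of ASCII-only names.
$config->pageNameCharset = 'UTF8';

// Optional: restrict which characters are accepted in UTF-8 page names.
// Assumption: shortened example list; keep it consistent with the
// character class used in the 16B RewriteCond.
// $config->pageNameWhitelist = '-_.abcdefghijklmnopqrstuvwxyz0123456789äöüß';
```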

So, a follow-up question: for new pages, the translation rules in InputfieldPageName are now disabled. That makes sense, because we get UTF8 page names now.

I guess we can only have one of the two worlds, right? We absolutely don't want non-ASCII page names on the new website. I guess I have to resort to some other form of redirecting outside of PW to accomplish both at once?

Edited by poljpocket

Posted (edited)

For further reference, this is the hook to accomplish this with UTF8 mode for page names:

<?php namespace ProcessWire;

// we are in ready.php

$wire->addHook('/post/{post_slug}/?', function(HookEvent $hookEvent) {
    // find post to use, fallback 404:
    $post = $hookEvent->arguments('post_slug');
    $postTranslated = $hookEvent->sanitizer->pageNameTranslate($post);
    $requestedPage = $hookEvent->pages->get('name=' . $hookEvent->sanitizer->selectorValue([$postTranslated, $post]));
    if ($requestedPage->id) {
        $hookEvent->session->redirect($requestedPage->url);
        $hookEvent->return = true;
    } else {
        $hookEvent->return = false;
    }
});

Changes from the hook above: it uses a simple named path argument and accepts both the untranslated and the translated version of the post slug (e.g. when UTF8 mode was enabled after some pages had already been created with translated names).

Edited by poljpocket
fix missing selectorValue usage

8 hours ago, poljpocket said:

So, a follow-up question: for new pages, the translation rules in InputfieldPageName are now disabled. That makes sense, because we get UTF8 page names now.

I guess we can only have one of the two worlds, right? We absolutely don't want non-ASCII page names on the new website. I guess I have to resort to some other form of redirecting outside of PW to accomplish both at once?

I did not try, but maybe you could redirect any non-ascii url to some custom php file (like redirect.php in the site root) and that file could bootstrap PW and redirect to the new page? Something like this:

# .htaccess: send URLs containing umlauts to redirect.php
RewriteCond %{REQUEST_URI} [äöüÄÖÜ]
RewriteRule ^(.*)$ redirect.php?url=$1 [QSA,L]

<?php
// redirect.php (untested): bootstrap ProcessWire
include __DIR__ . "/index.php";

// Get the requested path from the GET parameter 'url'
$url = isset($_GET['url']) ? $_GET['url'] : '';

// Use only the last path segment, since a page name cannot contain slashes
$segments = explode('/', trim($url, '/'));
$slug = end($segments);

// Translate umlauts etc. into an ASCII page name
$pageName = $wire->sanitizer->pageNameTranslate($slug);

// Find the page with the translated name
$page = $wire->pages->get('name=' . $wire->sanitizer->selectorValue($pageName));

// Redirect to the found page, or to the homepage if none was found
if ($page->id) {
  $wire->session->redirect($page->url);
} else {
  $wire->session->redirect('/');
}

 
