Prepend date in page name by hook creates improper results on some existing pages


sebibu

Hey Procis, yesterday I wrote my very first hook, „Prepend date in page name“.
Once you get it, this system and its API are magnificent. Thank you @ryan and all contributors!

On new (news) pages it works great so far. E.g.:
Title: Testnews mit Ümläüten
Altered page name: 2024-09-24-testnews-mit-ümläüten

But on existing pages special characters are removed. E.g.
Title: Partnerschaftliche Unterstützung - Ihr werdet gesucht!
Altered page name: 2023-07-27-fans1991-partnerschaftliche-untersttzung-ihr-werdet-gesucht

Another example is attached.
There you can see that the ProcessPageEdit message shows the correct path!?

I have this in my site/config.php to allow only German special characters:

$config->pageNameCharset = 'UTF8';
$config->pageNameWhitelist = '-_abcdefghijklmnopqrstuvwxyz0123456789äöüß';

Temporarily uninstalling the „PagePathHistory“ module brought no luck.

The code of my hook (suggestions for improvement are very welcome!):

/* NEWS pages only: prepend date in page-name */
$wire->addHookAfter('Pages::saveReady', function($event) {
  // get current Page object — in this case this is the first argument for the $event object
  $page = $event->arguments(0);
  
  // Only for pages with news template
  if($page->template == 'news') {
    // Test if field day exists, otherwise exit
    if (!$page->template->hasField('day')) {
      $event->message("Field 'day' is missing!");
      return;
    }
    $title = $page->get('title');
    // Sanitize title and substitute special characters
    $optimizedTitle = wire('sanitizer')->pageNameUTF8($title);
    //$optimizedTitle = wire('sanitizer')->pageName($title, Sanitizer::okUTF8); // or translate option

    $date = wireDate('Y-m-d', $page->day);
    // Set output formatting state off, for page manipulation
    $page->of(false);
    $page->name = $date.'-'.$optimizedTitle;
    //$page->save('name');
    $event->message("New saved name is $page->name");
    //$event->message("Path of new saved page is $page->path");
  }
});

Any idea why this doesn't work on all existing news pages, too?

Thx and 👋
Sebastian

Processwire hook prepend date in pagename bug.png


2 hours ago, sebibu said:

Hey Procis, yesterday I wrote my very first hook, „Prepend date in page name“.
Once you get it, this system and its API are magnificent. Thank you @ryan and all contributors!

Congrats and welcome to a new world of superpowers 😉 

I don't have details but I'm always using pageNameTranslate:

VG26qJz.png

Not sure why that would work differently on new or existing pages...

2 hours ago, sebibu said:

The code of my hook (suggestions for improvement are very welcome!):

Instead of if(...) { ... } you can use an early-exit strategy, which is usually cleaner in hooks. This is also called a "guard clause".

So you could write it like this:

/* NEWS pages only: prepend date in page-name */
$wire->addHookAfter('Pages::saveReady', function($event) {
  // get current Page object — in this case this is the first argument for the $event object
  $page = $event->arguments(0);
  
  // Only for pages with news template
  if($page->template != 'news') return;

  // Test if field day exists, otherwise exit
  if (!$page->template->hasField('day')) {
    $event->message("Field 'day' is missing!");
    return;
  }

  $title = $page->get('title');
  // Sanitize title and substitute special characters
  $optimizedTitle = wire('sanitizer')->pageNameUTF8($title);
  //$optimizedTitle = wire('sanitizer')->pageName($title, Sanitizer::okUTF8); // or translate option

  $date = wireDate('Y-m-d', $page->day);
  // Set output formatting state off, for page manipulation
  $page->of(false);
  $page->name = $date.'-'.$optimizedTitle;
  //$page->save('name');
  $event->message("New saved name is $page->name");
  //$event->message("Path of new saved page is $page->path");
});

The difference is minimal here, but if(...) { ... } is usually harder to read because there could be an else { } somewhere further down the hook, which means more to keep in your head when trying to understand it. When using if($template != 'whatever') return; you instantly know the hook only applies to 'whatever' templates.


Thank you @bernhard for helping to optimize the code!👍
For the field check I already used a guard clause, but it makes sense to use one for the template check too, and it is more readable.

I would like to keep the German special characters like äöüß instead of replacing them (->pageNameTranslate) with ae, oe, ue, ss.
So wire('sanitizer')->pageNameUTF8($title); should be the way to go. Right?


On 9/25/2024 at 2:13 PM, bernhard said:

When using if($template != 'whatever') return; you instantly know the hook only applies to 'whatever' templates.

For such cases, this is even more readable:

$wire->addHookAfter('Pages::saveReady(template=news)', function($event) {
   // [...]
});

This way, the hook doesn't even get called if template isn't news.

Also, I think you can even get rid of the second guard clause like this:

$wire->addHookAfter('Pages::saveReady(template=news,day!=)', function($event) {
   // [...]
});

If the page doesn't have a day field, the selector doesn't match and the hook isn't called. This is untested though 😊

Edited by poljpocket

  • 2 months later...

Unfortunately, I have to come back to my page name problem, which still exists.

It does not have anything to do with the hook.
In general, the page name is shown correctly in the preview before saving:
1290604736_Processwirehookprependdateinpagenamebug2.png.e875d237ea3f4c589bf9c46fd3f9cc25.png

But after saving, special characters are removed:

857664679_Processwirehookprependdateinpagenamebug3.png.8be7096e03c88ce7a54957962f6f37bd.png

What am I doing wrong?

I'm using this Extended Page Names feature: https://processwire.com/blog/posts/page-name-charset-utf8/

site/config.php:

$config->pageNameCharset = 'UTF8';
$config->pageNameWhitelist = "$-_.+!*'(),äöüßÄÖÜabcdefghijklmnopqrstuvwxyz0123456789";

.htaccess:

RewriteCond %{REQUEST_URI} "^/~?[-_/a-zA-Z0-9äöüß.+!*'(),]*$"

How can I debug this? (I haven't configured PHP debugging in VS Code with DDEV yet.😆)

🙏👋


This uses the PageNameTranslate rules which you can configure in Modules > Configure > InputfieldPageName. By default, it will replace ä with a, ö with o and ü with u.

One of the first things I change on new sites.

BTW: @bernhard's RockMigrations has an automation for this.

Edited by poljpocket

You are obviously right! I am sorry...

I tried your settings in a quick-and-dirty local Docker install using the blank site profile.

I can use your whitelist without any problem:

image.png.9ecd42ba7f07c2fdd7b8f6fbc709637e.png

 

This is what I added to the files:

image.thumb.png.d25b2a916fd740b72ddd7cd38eaef70b.png

Edited by poljpocket

You could start by finding out where the problem occurs. For example, you can use Tracy's bd() function to output the page name before and after every saveReady and saved hook. This will at least show you which part of the code you have to dig into further.
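
A minimal sketch of that kind of tracing for site/ready.php, assuming TracyDebugger is installed so that bd() is available. The labels and the priority values are illustrative choices, not something posted in this thread; the hooks simply dump the name of the page passed as the first argument.

// site/ready.php
// Dump the page name as early and as late as possible around a save.
// A lower priority number runs before other hooks, a higher one after (default is 100).
$wire->addHookBefore('Pages::saveReady', function($event) {
  $page = $event->arguments(0);
  bd($page->name, 'saveReady, before other hooks');
}, ['priority' => 10]);

$wire->addHookAfter('Pages::saveReady', function($event) {
  $page = $event->arguments(0);
  bd($page->name, 'saveReady, after other hooks');
}, ['priority' => 200]);

$wire->addHookAfter('Pages::saved', function($event) {
  $page = $event->arguments(0);
  bd($page->name, 'saved');
}, ['priority' => 200]);

If the first and second dump differ, another saveReady hook changes the name; if all three are correct but the "Settings" tab is not, the problem lies after the hooks.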

Do you have any hooks already in use for Pages::saveReady and Pages::saved?

On which PW version are you?

Edited by poljpocket

Thanks for testing, @poljpocket!
Great to see that it is basically working.

Testing with a fresh install would have been my next step to narrow it down.👍

Interesting. I see you are using in your .htaccess

RewriteCond %{REQUEST_URI} "^/~?[-_/a-zA-Z0-9äöüß.+!*'(),]

instead of

RewriteCond %{REQUEST_URI} "^/~?[-_/a-zA-Z0-9äöüß.+!*'(),]*$"

So the trailing *$" is missing. It is only needed for resolving the page request, though, not for building the page name.
👋

 


1 hour ago, poljpocket said:

Do you have any hooks already in use for Pages::saveReady and Pages::saved?

Only one, and it applies only to pages with one specific template.
But the problem occurs for pages of all templates, even if the hook is disabled.

1 hour ago, poljpocket said:

On which PW version are you?

The newest stable one, 3.0.229.
All modules up to date.😇

BTW.. PHP mbstring extension is enabled.


1 hour ago, sebibu said:

Interesting. I see you are using in your .htaccess

RewriteCond %{REQUEST_URI} "^/~?[-_/a-zA-Z0-9äöüß.+!*'(),]

You are right, copy-pasta mistake. I just copied your stuff over. But I corrected it right now and it doesn't make a difference (and yes, it's only about the rewrite rules and doesn't affect the PW admin in any way). Maybe you will find something with the hooks I have talked about above.


I tested a lot over the last few days, also with fresh installs of the latest stable version with the blank site profile, both locally in DDEV and on the server.
I exported the database, changed the charset from utf8mb3 to utf8mb4 and reimported it.

Now the German special characters äöü work, but strangely enough not always!😳🤔

✓ Die Kölner „Miteinander“ → die-kölner-miteinander
✗ Köln gegen Köln in Köln mit allen Köln Fans – die Kölner „Miteinander“ → köln-gegen-kln-in-kln-mit-allen-köln-fans-die-kölner-miteinander (two missing ö)

Furthermore:

✗ Fuß → s (?? should be fuß)
✗ Füße → füsse (should be füße)

I thought PageNameTranslate is not used when using

$config->pageNameCharset = 'UTF8';

 and ß is in the whitelist!?
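
To rule out the whitelist itself, a plain-PHP check (no ProcessWire involved) can list which characters of a title are missing from the whitelist string. This is only a sketch: charsOutsideWhitelist() is a hypothetical helper, not part of the ProcessWire API, and it assumes the mbstring extension.

<?php
// Whitelist from the site/config.php posted earlier in the thread
// (single-quoted here, so only the apostrophe needs escaping)
$whitelist = '$-_.+!*\'(),äöüßÄÖÜabcdefghijklmnopqrstuvwxyz0123456789';

// Hypothetical helper: returns the distinct characters of the lowercased
// $title that do not occur in $whitelist.
function charsOutsideWhitelist(string $title, string $whitelist): array {
    $chars = preg_split('//u', mb_strtolower($title, 'UTF-8'), -1, PREG_SPLIT_NO_EMPTY);
    $missing = [];
    foreach ($chars as $char) {
        $inWhitelist = mb_strpos($whitelist, $char, 0, 'UTF-8') !== false;
        if (!$inWhitelist && !in_array($char, $missing, true)) {
            $missing[] = $char;
        }
    }
    return $missing;
}

foreach (['Fuß', 'Füße', 'Köln gegen Köln'] as $title) {
    $missing = charsOutsideWhitelist($title, $whitelist);
    echo $title, ': ', $missing
        ? 'not whitelisted: "' . implode('", "', $missing) . '"'
        : 'all characters whitelisted', "\n";
}

For Fuß and Füße the helper finds nothing outside the whitelist, which suggests that the ß gets lost at a later step and not because of the whitelist string.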

@poljpocket and others: Would you be so kind as to test whether these page titles generate the same page names in your instances?

Thanks a lot!🙏👋


There you go, running on the same test install from above:

image.png.388272219694a695ee29131ed8009c92.png

image.png.70c8face18f46810376697c4953bb03e.png

image.png.5fc8955bc30d07a29b78d182f093c6ee.png

image.png.22643bb259d8373e06c21f6c04e6dc5b.png

You are on to something here... that looks very inconsistent! Note how the page name in the log message is always correct but is shown incorrectly in the "Settings" tab.

Edited by poljpocket

Thanks for testing, @poljpocket!
Interesting. Consistently inconsistent at least!😉

 

42 minutes ago, poljpocket said:

And even more. This is how the page names and titles show up in the database. Looks like it is using some non-standard punycode to encode the names and this is also where everything seems to break:

Yes, I discovered the punycode in the ASCII-only name column.
But you are right, this seems to be non-standard when I compare it with, e.g., the results from punycoder.com (a Punycode/IDN converter).
I wonder whether the conversion to punycode is still appropriate at all!?
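
One way to look at that conversion in isolation is to run the titles through the sanitizer in TracyDebugger's console and compare the UTF-8 name, its ASCII form and the result of converting back. This is an untested sketch: it assumes that Sanitizer::toAscii and Sanitizer::toUTF8 can be passed as the second argument of pageName(), in the same way as the Sanitizer::okUTF8 flag used in the hook, so please check that against the Sanitizer documentation for your version.

// Titles reported as problematic in this thread
$titles = [
  'Fuß',
  'Füße',
  'Köln gegen Köln in Köln mit allen Köln Fans – die Kölner „Miteinander“',
];

foreach($titles as $title) {
  $utf8 = $sanitizer->pageNameUTF8($title);
  // Assumed flags, see the note above
  $ascii = $sanitizer->pageName($utf8, \ProcessWire\Sanitizer::toAscii);
  $back = $sanitizer->pageName($ascii, \ProcessWire\Sanitizer::toUTF8);
  echo "title: $title<br>utf8: $utf8<br>ascii: $ascii<br>back: $back<br>";
  echo ($back === $utf8 ? 'round trip ok' : 'round trip CHANGED the name') . '<br><br>';
}

If the name already differs from the expected one in the utf8 line, the sanitizer settings are the place to look; if it only changes in the back line, the ASCII conversion is.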


Ah, did you try the „ß“ examples?

14 hours ago, sebibu said:

✗ Fuß → s (?? should be fuß)
✗ Füße → füsse (should be füße)

I thought PageNameTranslate is not used when using

$config->pageNameCharset = 'UTF8';

 and ß is in the whitelist!?


  • 4 weeks later...

What could be the reason that page names generated by my Pages::saveReady hook differ between a save that is

  1. done by hand in the ProcessWire backend with a " in the title, and
  2. fired in TracyDebugger's console via the API with $p->save(), like this:

foreach($pages->find("template!=admin, has_parent!=2, include=all") as $p) {
  $p->save();
  echo "Page {$p->title} (#$p) updated to {$p->name}<br>";
}

E.g.
Title: "Foo!" ich muss los
Pagename via backend: 2025-01-10-foo-ich-muss-los ✓
Pagename via API: 2025-01-10-quot-foo-quot-ich-muss-los ✗

I would expect this to be the same result.

Any ideas?


1 hour ago, sebibu said:
  • by hand over the Processwire backend with a " in the title differs from this
  • fired in TracyDebugger's console via API $p->save(); like this:

What exactly do you mean / did you do?


25 minutes ago, bernhard said:

What exactly do you mean / did you do?

(While the Pages::saveReady-hook is placed in ready.php ...)

1. I click the save button on the page with the page title "Foo!" ich muss los.
2. I call $p->save() via the API in TracyDebugger's console to save the page with the page title "Foo!" ich muss los, and get a wrong page name with quot in it where the " was.
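
One possible explanation, which nobody in this thread has confirmed: the hook reads the title with $page->get('title') before it calls $page->of(false). If output formatting is on at that moment and the title field uses an HTML entity encoder text formatter, the hook sees &quot; instead of ", and a name sanitizer then keeps the letters quot. The plain-PHP sketch below only illustrates that effect; slugify() is a hypothetical stand-in, not ProcessWire's sanitizer.

<?php
// Hypothetical stand-in for a page-name sanitizer: lowercase the text, then
// collapse every run of characters outside a-z, 0-9, ä, ö, ü, ß into one hyphen.
function slugify(string $text): string {
    $text = mb_strtolower($text, 'UTF-8');
    $text = preg_replace('/[^a-z0-9äöüß]+/u', '-', $text);
    return trim($text, '-');
}

$raw = '"Foo!" ich muss los';
// What an entity-encoding formatter would hand to the hook
$formatted = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

echo slugify($raw), "\n";       // foo-ich-muss-los
echo slugify($formatted), "\n"; // quot-foo-quot-ich-muss-los

If that is the cause, reading the raw value in the hook with $page->getUnformatted('title') should give the same name in both cases.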
