LanguageLocalizedURL


mcmorry

LanguageLocalizedURL module

Localized URL generator and parser

You can find the latest version here: http://modules.proce...-localized-url/

Or from the repository on github: https://github.com/m...e-localized-URL

This module generates localized URLs that use the language code as the first 'folder', followed by the localized titles of the nested pages.

(Removed the previous instructions to avoid maintaining them in two places.)

See the module's README for instructions and more information:

https://github.com/m...aster/README.md

Edited by Soma
Link to comment
Share on other sites

Nice job with this module. Thanks for your work here. Looks very well put together and seems to solve the need nicely.

Just to make sure I understand the usage correctly: This is meant to be combined with use of FieldtypePageTitleLanguage, FieldtypeTextLanguage and FieldtypeTextareaLanguage?

Does this look like the sort of site structure one would create with this module? Basically, a traditional structure, but then with the languages added as root level pages, but with no children:

/about/

/about/background/

/about/history/

/contact/

/products/

/product1/

/product2/

/en/ <- language: English

/es/ <- language: Spanish

/fr/ <- language: French

So if I hit the URL: /es/sobre/fondo/ it would know to display /about/background/ (using Spanish fields)?

Would I replace all of my $page->url() instances with $page->mlUrl() ?

Thanks,

Ryan

Link to comment
Share on other sites

Hi Ryan,

thanks for your nice words :)

Yes, you got the point exactly.

Initially I was thinking of hooking the path() method to generate all the URLs of the website, but I wasn't sure whether that would be too invasive.

But yes, you could replace $page->url() with $page->mlUrl() without any problems (to be tested, of course).
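
The lookup Ryan asks about (/es/sobre/fondo/ resolving to /about/background/) could work roughly as sketched below. This is a standalone illustration, not the module's code: the `$pages` array, `asciiSlug()` and `resolveLocalizedUrl()` are hypothetical names, and the slug helper only handles plain ASCII titles.

```php
<?php
// Hypothetical page records: id => parent id and titles per language code.
$pages = [
    1 => ['parent' => 0, 'titles' => ['en' => 'Home', 'es' => 'Inicio']],
    2 => ['parent' => 1, 'titles' => ['en' => 'About', 'es' => 'Sobre']],
    3 => ['parent' => 2, 'titles' => ['en' => 'Background', 'es' => 'Fondo']],
];

// Simplified slug helper for this sketch (ASCII titles only).
function asciiSlug(string $title): string
{
    return trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($title)), '-');
}

// Returns [language code, page id] for a localized URL, or null if no page matches.
function resolveLocalizedUrl(string $url, array $pages): ?array
{
    $segments = array_values(array_filter(explode('/', $url), 'strlen'));
    if (!$segments) return null;

    $lang = array_shift($segments); // the first "folder" is the language code
    $currentId = 1;                 // start the walk at the homepage

    foreach ($segments as $segment) {
        $found = null;
        foreach ($pages as $id => $p) {
            // A child matches when its title in this language slugs to the segment.
            if ($p['parent'] === $currentId
                && isset($p['titles'][$lang])
                && asciiSlug($p['titles'][$lang]) === $segment) {
                $found = $id;
                break;
            }
        }
        if ($found === null) return null; // no child with that localized title
        $currentId = $found;
    }
    return [$lang, $currentId];
}

print_r(resolveLocalizedUrl('/es/sobre/fondo/', $pages));
```

The walk goes one level per segment, so a Spanish URL only matches Spanish titles; `/es/about/` returns null in this sketch.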

Link to comment
Share on other sites

Hey mcmorry

Thanks again for putting this together. :) Some feedback.

I guessed it, but newcomers might not realize that URL segments need to be enabled in the PW template settings for this to work. :) Maybe also suggest making the proxy pages hidden, so they don't show up in the navigation.

Unfortunately, the module doesn't work well with the special characters found in many languages, like umlauts (ÜÖÄäöü), àåéê and a lot more. I end up with /de/1001_-ber-uns/ instead of /de/1001_ueber-uns/. Note that those conversions are used in the default title -> name relation when creating a new page, and can be configured in the page name inputfield module in ProcessWire. Maybe there is a way to load that config from the module to build the slug. If not, add it to this module locally.

mlUrl() seems to work, except that the link for the homepage "/" doesn't work. It comes out as "".

Also, to avoid duplicates in the alternative languages, the id feature always has to be enabled, as there's nothing preventing me from entering the same title for two pages under the same tree. Or maybe there's an easy way to prevent it with a hook that does some additional validation on the page title language fields.

One idea I had would be an install method in the module that installs the needed template with urlSegments enabled and prepares the default and other proxy pages according to the language settings in PW.

Link to comment
Share on other sites

The module unfortunately doesn't work well with special chars in many languages

The toSlug function that I used in the other thread came from a google search... I don't remember from where exactly. It's still that one that is being used, right?

It would definitely be better to use the same method that is used by PW.

edit: Have to go to bed, this dawn I will fly to the land of the Umlauts and Eszetts

Link to comment
Share on other sites

Thanks for your feedback. Much appreciated.

I guessed it, but newcomers might not realize that URL segments need to be enabled in the PW template settings for this to work. :) Maybe also suggest making the proxy pages hidden, so they don't show up in the navigation.

....

One idea I had would be an install method in the module that installs the needed template with urlSegments enabled and prepares the default and other proxy pages according to the language settings in PW.

Yes it's the best solution. I'll try to implement it.

Unfortunately, the module doesn't work well with the special characters found in many languages, like umlauts (ÜÖÄäöü), àåéê and a lot more. I end up with /de/1001_-ber-uns/ instead of /de/1001_ueber-uns/. Note that those conversions are used in the default title -> name relation when creating a new page, and can be configured in the page name inputfield module in ProcessWire. Maybe there is a way to load that config from the module to build the slug. If not, add it to this module locally.

I tried to use the same method used in the Pages->save() function:

$this->fuel('sanitizer')->pageName($localizedTitle, true);

but the behavior is the same. The JavaScript code of the admin panel, on the other hand, contains a long list of characters to be replaced with Latin ones.

It should be integrated into the Sanitizer object as well. In the meantime I'll implement it locally.
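
A local implementation along those lines might look like the following sketch. It is standalone PHP, not the module's code: `toSlugSketch()` and the short `$map` are hypothetical, the real replacement list would come from the page name inputfield settings, and it assumes the mbstring extension and UTF-8 source files.

```php
<?php
// Lowercase the title, apply a replacement map, then hyphenate the rest.
function toSlugSketch(string $title, array $replacements): string
{
    // Lowercase first so the map only needs lowercase keys.
    $str = mb_strtolower($title, 'UTF-8');
    // Swap each mapped character for its ASCII stand-in (ü -> ue, ...).
    $str = strtr($str, $replacements);
    // Anything still outside a-z and 0-9 collapses into a single hyphen.
    $str = preg_replace('/[^a-z0-9]+/', '-', $str);
    return trim($str, '-');
}

// Hypothetical, deliberately short map for illustration.
$map = ['ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss', 'é' => 'e', 'à' => 'a'];

echo toSlugSketch('Über uns', $map), "\n"; // ueber-uns
```

Applying the map before stripping is the important part: if unknown characters are stripped first, "Über uns" loses its first letter and ends up as "-ber-uns", which is the behavior reported above.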

mlUrl() seems to work, except that the link for the homepage "/" doesn't work. It comes out as "".

Oops, fixed. I'll release the fix soon, together with more features.

Also, to avoid duplicates in the alternative languages, the id feature always has to be enabled, as there's nothing preventing me from entering the same title for two pages under the same tree. Or maybe there's an easy way to prevent it with a hook that does some additional validation on the page title language fields.

The best option could be to store a unique localized name when a localized title is entered, using the same method that PW uses for the page name. An important question is: what should happen if you change a title? Should the localized name change accordingly? From an SEO point of view this could be a problem and should be avoided. A checkbox could be added to force it, but I don't know how to add one to FieldtypePageTitleLanguage.

Any suggestions?

Thanks

mcmorry

Link to comment
Share on other sites

The toSlug function that I used in the other thread came from a google search... I don't remember from where exactly. It's still that one that is being used, right?

It would definitely be better to use the same method that is used by PW.

Yes it's still your code. I'll improve it using the PW Sanitizer plus a map of replacement characters.

Link to comment
Share on other sites

You could get the page name inputfield replacements settings like this:

$pn = $modules->get("InputfieldPageName");

if(array_key_exists('replacements',$pn->data)) $replacements = $pn->data['replacements'];
else $replacements = InputfieldPageName::$defaultReplacements;

print_r($replacements);

Edit: There was some discussion about this subject, and about whether the sanitizer should support the replacements from the page name inputfield. It seems that making this consistent throughout the system is still an open question. See this thread for Ryan's thinking: http://processwire.c...t-module-issue/

Link to comment
Share on other sites

Hello mcmorry, are you still working on this?

Mind creating a repo on GitHub for this module, so we can collaborate on it more easily? (If you want, I could create one on my account.)

However...

While playing around with multilang and this module I found a simple way to use sanitizer (which Ryan already modified to allow replacements).

The slug method can be changed to:

public function toSlug($str) {
    $str = mb_strtolower($str); // needed for uppercase cyrillic chars
    $str = $this->sanitizer->pageName($str, Sanitizer::translate);
    return $str;
}

I also had to change or add various code to support urlSegments and pageNum, so that it doesn't throw a 404 but loads the last page it could resolve when it reaches segments that don't resolve to a page.

I also changed the hook to

$this->addHookAfter('Page::path', $this, 'mlUrl');

So the system internally also works, and no change is required to use mlUrl. It works great so far.

I kind of collaborated with MadeMyDay on this over the weekend, as he's building a website that uses these language URLs together with the Multisite module. So we made similar changes, although I think he made more. He also got it working with both modules together. So this could turn into an interesting alternative for multilanguage sites.

Edit:

I also tried to find an easy solution for defining published languages. I created a page field that lists the proxy language pages as checkboxes, so you can turn a language off. The module then checks this field and throws a 404. For this to work, the navigation code, the language switch and all list generation have to take the field into account, but it works OK.

A language switch would be as simple as:

$lang = $user->language;
$langname = $lang->name == 'default' ? 'en' : $lang->name;

$user->language = $languages->get("default");
$st = $langname == 'en' ? " class='on'" : '';
echo $page->language_published->has($pages->get("/en/")) ? "<li><a$st href='{$page->url}'>EN</a></li>" : '';

$user->language = $languages->get("de");
$st = $langname == 'de' ? " class='on'" : '';
echo $page->language_published->has($pages->get("/de/")) ? "<li><a$st href='{$page->url}'>DE</a></li>" : '';

$user->language = $lang;
Link to comment
Share on other sites

Wow, I'm really impressed by how much interest there is in this module :)

Sorry that I haven't worked on it yet, but I'm focused on an important milestone of a different project that, I hope, will be released tomorrow.

I would really like to work with you to improve this module, and of course hosting it on GitHub would be the best solution.

I can see that I don't have much experience with PW, but I'll do my best.

Thanks again

Link to comment
Share on other sites

I would really like to work with you to improve this module, and of course hosting it on GitHub would be the best solution.

I can see that I don't have much experience with PW, but I'll do my best.

I am, and I think others are too, very impressed by this module. The issues you are solving are among the hardest there are, and they go pretty deep into PW. So great respect from here, and a warm welcome to the community!

Link to comment
Share on other sites

Mind creating a repo on GitHub for this module, so we can collaborate on it more easily? (If you want, I could create one on my account.)

I created it : https://github.com/mcmorry/PW-language-localized-URL

The slug method can be changed to:

public function toSlug($str) {
    $str = mb_strtolower($str); // needed for uppercase cyrillic chars
    $str = $this->sanitizer->pageName($str, Sanitizer::translate);
    return $str;
}

Yes it works perfectly.

I also changed the hook to

$this->addHookAfter('Page::path', $this, 'mlUrl');

So the system internally also works, and no change is required to use mlUrl. It works great so far.

I tried it, but the admin breaks: it tries to load everything with /en/ when loading the list of pages. Have you resolved that already?

I'll try to understand better what is wrong.

Link to comment
Share on other sites

Thanks for creating it. I was just about to create one today. :)

OK, I'll look at it again later, but to get it to work in the admin you need to exclude the admin template; then it works like a charm.

Here's my mlUrl. Note the $page->template == 'admin' check.

I also fixed the homepage URL in here.

public function mlUrl(HookEvent $event) {
	$page = $event->object;

	// leave admin pages untouched, otherwise the admin breaks
	if($page->template == 'admin') return;

	$includePageId = false;
	$includeParentId = false;

	// optional arguments: only read them if they were actually passed
	if (count($event->arguments) >= 1) {
		$includePageId = $event->arguments(0);
	}
	if (count($event->arguments) >= 2) {
		$includeParentId = $event->arguments(1);
	}

	// add the language code at the beginning of the url
	$lang = $this->user->language->name;
	if (!$lang || $lang=='default') $lang = $this->getDefaultLanguage();

	// if on homepage return current language root
	if($page->id === 1) return $event->return = "/$lang/";

	// generate the url using titles and, optionally, ids
	$path = '';
	$parents = $page->parents();
	foreach($parents as $parent) {
		if($parent->id > 1) {
			$path .= "/".($includeParentId?$parent->id.'_':'').$this->toSlug($parent->title);
		}
	}
	$url = $path . '/'.($includePageId?$page->id.'_':'').$this->toSlug($page->title) . '/';

	$event->return = '/'.$lang.$url;
}
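
To see the URL-building logic in isolation, here is a standalone sketch that runs without ProcessWire. The `$pages` array, `titleSlug()` and `localizedUrl()` are hypothetical stand-ins for the page tree, the module's toSlug() and the hook above, and only the page itself gets the optional id prefix here.

```php
<?php
// Hypothetical page tree: id => parent id and titles per language code.
$pages = [
    1    => ['parent' => 0,    'titles' => ['de' => 'Startseite']],
    1001 => ['parent' => 1,    'titles' => ['de' => 'Firma']],
    1002 => ['parent' => 1001, 'titles' => ['de' => 'Kontakt']],
];

// Simplified slug helper for this sketch (ASCII titles only).
function titleSlug(string $title): string
{
    return trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($title)), '-');
}

function localizedUrl(int $pageId, string $lang, array $pages, bool $includeId = false): string
{
    // The homepage maps to the language root.
    if ($pageId === 1) return "/$lang/";

    $parts = [];
    // Walk up from the page to the homepage, collecting localized slugs.
    for ($id = $pageId; $id > 1; $id = $pages[$id]['parent']) {
        $slug = titleSlug($pages[$id]['titles'][$lang]);
        // Optional "id_" prefix, applied to the requested page only.
        if ($includeId && $id === $pageId) $slug = $id . '_' . $slug;
        array_unshift($parts, $slug);
    }
    return "/$lang/" . implode('/', $parts) . '/';
}

echo localizedUrl(1002, 'de', $pages), "\n";       // /de/firma/kontakt/
echo localizedUrl(1002, 'de', $pages, true), "\n"; // /de/firma/1002_kontakt/
```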
Link to comment
Share on other sites

I would like to throw some ideas to the discussion.

  • As I understand it, if you put the id in the segment, the lookup will be faster and less expensive, but the content will be the same as at the URL without the id (am I right? I'm not sure...). Won't this duplicated content be bad for SEO? Maybe the module could have an option to turn this on and off. That way you would do one thing or the other, but not both at the same time.
  • It would be nice to just use $page->url normally. Isn't it possible to replace $page->url for these pages, instead of creating the new mlUrl method?
  • The main thing I don't like about this method is losing the ability to use urlSegments. I was trying to think of a way to somehow recognize which segments are being used for this and which are not. Maybe the module could let the developer declare on the template that the last (n) segments should be ignored and used normally. Of course there would have to be some kind of readjustment to make $input->urlSegment(1) behave like $input->urlSegment(1+(number-of-segments-used-by-the-module)) on the template.
Link to comment
Share on other sites

Hi Diogo,

thanks for your suggestions.

I've just updated the github repository (https://github.com/mcmorry/PW-language-localized-URL) with the last changes from Soma.

Almost all your suggestions have been implemented:

  • $page->url instead of $page->mlUrl()
  • the use of the id is based on the module settings
  • support for urlSegments, though the segments are not automatically readjusted (I have to look into this more)

I'll wait for your feedback.

Link to comment
Share on other sites

Thanks mcmorry for the pull.

I would like to throw some ideas to the discussion.

  • As I understand it, if you put the id in the segment, the lookup will be faster and less expensive, but the content will be the same as at the URL without the id (am I right? I'm not sure...). Won't this duplicated content be bad for SEO? Maybe the module could have an option to turn this on and off. That way you would do one thing or the other, but not both at the same time.

I don't see this as an issue. It depends on how you build the URLs on your site. If you consistently use one of the URL approaches, search engines will never know there is a URL without the "1002_".
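
How the module reads the prefix is not shown in this thread. As a rough sketch, assuming segments of the form "id_slug", a helper like the hypothetical `parseSegment()` below could split the two, so that a page can be fetched directly by id when the prefix is present:

```php
<?php
// Split an optional numeric "id_" prefix from a URL segment.
function parseSegment(string $segment): array
{
    if (preg_match('/^(\d+)_(.+)$/', $segment, $m)) {
        return ['id' => (int) $m[1], 'slug' => $m[2]];
    }
    // No prefix: the page has to be found by its localized slug instead.
    return ['id' => null, 'slug' => $segment];
}

print_r(parseSegment('1002_ueber-uns'));
print_r(parseSegment('ueber-uns'));
```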

  • It would be nice to just use $page->url normally. Isn't it possible to replace $page->url for these pages, instead of creating the new mlUrl method?

This was mentioned in the last few posts. It now works like this, and you can use $page->url normally.

  • The main thing I don't like about this method is losing the ability to use urlSegments. I was trying to think of a way to somehow recognize which segments are being used for this and which are not. Maybe the module could let the developer declare on the template that the last (n) segments should be ignored and used normally. Of course there would have to be some kind of readjustment to make $input->urlSegment(1) behave like $input->urlSegment(1+(number-of-segments-used-by-the-module)) on the template.

I don't see any problems here. URL segments are now supported and can still be used if you enable urlSegments on the template. I implemented something that won't throw a 404 if a segment doesn't resolve to a page and urlSegments are enabled.

Of course, the array also contains the segments that come from the language root pages (/en/, /de/). But you can easily slice them off using the $page->parents count, so you end up with an array containing only the segments that belong to the current page.

As an example, put something like this in head.inc:

$urlSegments = null;
if($page->template->urlSegments)
    $urlSegments = array_slice($input->urlSegments, count($page->parents));

And later you can use the $urlSegments array in your template code:

if($urlSegments){
    echo "<br/>urlSegments: ";
    print_r($urlSegments);
}

Or just use $input->urlSegment as usual; you just have to be aware of which level you are using it at.

You could also simply use

$input->urlSegment(1 + count($page->parents));
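
The offset arithmetic can be checked with plain arrays. This sketch assumes a hypothetical request /de/firma/kontakt/archive/2012/, where the page "kontakt" has two parents (home and "firma") and the segment array is 1-based like $input->urlSegments; `ownSegments()` and `ownSegment()` are made-up helper names.

```php
<?php
// Segments as seen by the /de/ proxy page for the hypothetical request above.
$urlSegments = [1 => 'firma', 2 => 'kontakt', 3 => 'archive', 4 => '2012'];
$parentCount = 2; // what count($page->parents) would report for "kontakt"

// Drop the segments that only spell out the localized page path.
function ownSegments(array $urlSegments, int $parentCount): array
{
    // array_slice() reindexes from 0, so the result is a plain list.
    return array_slice($urlSegments, $parentCount);
}

// Read the n-th "real" segment (1-based), or '' if it does not exist.
function ownSegment(array $urlSegments, int $parentCount, int $n): string
{
    return $urlSegments[$n + $parentCount] ?? '';
}

print_r(ownSegments($urlSegments, $parentCount));
echo ownSegment($urlSegments, $parentCount, 1), "\n"; // archive
```

The parent count works as the offset because a page with two parents (home plus one ancestor) sits exactly two localized segments below the language root.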

It also works now with pageNum for pagination, if that is enabled on the template of the current page. No need to change anything.

There are some things to account for when using this module, and it has various side effects, but so far most of them can be worked around with the excellent system and its simple API. I think it will work quite well for simple multilanguage sites. I wouldn't recommend building a big and complex site with this approach yet.

Link to comment
Share on other sites

Thanks mcmorry for the pull.

Thanks to you for your help ;)

For the next push it would be better to work on the "develop" branch, so that master gets updated correctly, together with the documentation.

Now I'll update the readme file to include the new features.

Of course, the array also contains the segments that come from the language root pages (/en/, /de/). But you can easily slice them off using the $page->parents count, so you end up with an array containing only the segments that belong to the current page.

Why do you think it's not a good idea to implement this behavior directly inside the module? Maybe with another configuration setting to disable it.

There are some things to account for when using this module, and it has various side effects, but so far most of them can be worked around with the excellent system and its simple API. I think it will work quite well for simple multilanguage sites. I wouldn't recommend building a big and complex site with this approach yet.

Could you give more info about this? What kind of side effects does it have? And why wouldn't you recommend it for a big site?

Link to comment
Share on other sites

Ah, sorry, I didn't realize you had made a dev branch. You could have waited for me to change it, or committed it manually to your dev branch instead of pulling into master. Sorry for the inconvenience.

urlsegments remapping

My idea was to do it in the module, and I did, but I commented out the code because it requires adding a method to the core WireInput class that allows unsetting entries of the urlSegments array, which is protected and can't be changed otherwise (or I don't know how). Maybe if Ryan is willing to implement an unsetUrlSegment($key) function this could work out. I'm not sure what this would mean for the system, and I haven't done much testing. So yes, if it came to be, there would have to be an option to enable or disable it. Off by default, maybe; I don't know.

side-effects or drawbacks

It's not as bad as I originally thought before this subject and module came up. But there are still some things to consider.

I'm mainly thinking about what could stop working or is not possible (for now) with this approach of having multiple languages on one page. Access management, for example: you can't manage languages separately and give separate access to them.

Publishing has to be done with a workaround, which then has to be taken into account in all template code and possibly module code.

Third-party modules: AdminBar, for example, doesn't work together with this module. I think it's mostly front-end modules that are affected, depending on how they're built. Having PW's ->url replaced really does solve an issue that would be worse if you had to use ->mlUrl().

The system outputs the default language if the alternative isn't populated. As I understand it, this behavior is there because it was built for the admin. I don't want English text to be shown if a German text is left out. I haven't really looked into it, or into how it should be treated.

Generating the slug from the language's title could cause trouble. Changing the title is too easy, and it results in a different URL. Also, the alternative language fields aren't required.

Having "id_" will solve the issue of duplicates under one tree. However, I'm not happy using it, because it adds an element that can stop modules that use the url or path from working. Then again, going back to the $page->mlUrl method and using it only for the site's navigation would still cause trouble in any module that uses the url or path.

Mainly I was worried about replacing the url method of PW. It works better than I thought, once the admin template is excluded. But what will this mean for the development of other modules? MadeMyDay uses this module together with the Multisite module, which also hooks the path method. He seems to have solved it without much trouble. That also speaks for PW being the flexible and well-thought-out system Ryan has put together here, so most of the issues can be solved one way or another. And Ryan is open to new stuff and will certainly help us and implement necessary functions or features.

Edit:

Another thing I noticed: if a parent of the current page is unpublished, the page isn't available either. This is due to the way the module parses the urlSegments.

Link to comment
Share on other sites

Thanks for explaining in detail all these problems.

I have just released an update on the master branch with new documentation.

Plus, I've also added a hook to the Page object to retrieve the adjusted urlSegments.

I tried to add the hook to WireInput, so as to have

$input->mlUrlSegments

but it doesn't work.

Link to comment
Share on other sites

As I wrote before, it doesn't work without hacking the WireInput class; it doesn't have any hookable functions, and the urlSegments array is protected.

To get it to work, uncomment my code and add this method to the WireInput class in Input.php, around line 226, after setUrlSegment($num, $value):

/**
 * Unset a URL segment value
 *
 * @param int $num Number of this URL segment (1 based)
 *
 */
public function unsetUrlSegment($num) {
    unset($this->urlSegments[$num]);
}

In PW, all hookable functions have a "___" prefix.
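
Why a prefix makes a method hookable can be shown with a deliberately simplified sketch. This is not ProcessWire's implementation: the `HookableSketch` class and its `addHookAfter()` signature are invented for illustration, and only the general idea (public calls to path() are routed through __call() to ___path(), with hooks run afterwards) is the point.

```php
<?php
// Simplified illustration only; not ProcessWire's actual hook system.
class HookableSketch
{
    private array $afterHooks = [];

    public function addHookAfter(string $method, callable $hook): void
    {
        $this->afterHooks[$method][] = $hook;
    }

    // path() is not defined, so a call to it lands here.
    public function __call(string $method, array $args)
    {
        $real = '___' . $method;
        if (!method_exists($this, $real)) {
            throw new BadMethodCallException("Method $method is not hookable");
        }
        $result = $this->$real(...$args);
        // Each "after" hook may replace the return value.
        foreach ($this->afterHooks[$method] ?? [] as $hook) {
            $result = $hook($result, $args);
        }
        return $result;
    }

    protected function ___path(): string
    {
        return '/about/';
    }
}

$page = new HookableSketch();
$page->addHookAfter('path', fn(string $path) => '/en' . $path);
echo $page->path(), "\n"; // /en/about/
```

A class without such an interception layer, or a method without the prefix, gives hooks nothing to attach to, which matches the WireInput problem described above.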

Link to comment
Share on other sites

I didn't want to hook into a specific method; I wanted to add a new property to that class.

How does it work when I want to add a new method or property to a specific object using hooks?

Of course I can't search for a function with "___" prefix.

EDIT: ... Probably the class should extend Wire. Is that correct?

Link to comment
Share on other sites

I don't think it's possible. WireInput doesn't extend Wire/WireData; I'm not sure that's right, but I think so.

I'm not sure what you mean by "I can't search for a function with "___" prefix".

I don't know if adding a new property to WireInput would be good. I'm eager to hear Ryan's opinion on this too, as I'm not sure what would be best.

However, you could also add a method to the module that makes an mlUrlSegments array available in the template:

public function mlUrlSegments(){
   $urlSegments = null;
   if($this->page->template->urlSegments) 
       $urlSegments = array_slice($this->input->urlSegments,count($this->page->parents));
   return $urlSegments;
}

In the template, call it like this:

$urlSegments = $modules->get("LanguageLocalizedURL")->mlUrlSegments();
Link to comment
Share on other sites

OK, never mind. I'm not the best at English, sorry :-[

Anyway, I've already added the method you just suggested, but I created a hook with addHookProperty on the Page class to get the urlSegments.

I'm not sure it's the best approach. It's probably better to access the module method directly, as you said.

I'll change it.

If Ryan is later available to create unsetUrlSegment, that would be best.

P.S. I've seen that there is a limit of 4 urlSegments, and of course if you have a deep tree of localized pages you could be very limited and soon run into problems. Do you know why this limit exists and whether it's possible to change or remove it?

Link to comment
Share on other sites

Thanks for the changes, mcmorry! Sorry that I didn't understand that some of the things I asked for were already implemented. I must confess that the development of this module is way beyond my understanding of how PW works... I will keep following it, mainly as an observer ;)

This was mentioned in the last few posts. It now works like this, and you can use $page->url normally.

Yep, I'm also sorry about this. I guess I'm better at reading English than at reading code :)

P.S. I've seen that there is a limit of 4 urlSegments, and of course if you have a deep tree of localized pages you could be very limited and soon run into problems. Do you know why this limit exists and whether it's possible to change or remove it?

You can change this limit in the config.php file:

$config->maxUrlSegments = 4;
Link to comment
Share on other sites

I just tried to extend WireInput, but no luck, as I don't understand it completely.

Yeah, it could also be a method added to Page.

PS:

Yes, you can, in config.php: $config->maxUrlSegments = 4

Link to comment
Share on other sites
