
Progress on ProcessWire 2.2 and overview of multi-language support


ryan


I've been making good progress here on ProcessWire 2.2 which we're targeting for the end of the year. The main drive of this version is adding multi-language support to the admin. This multi language support is being sponsored by Avoine, thanks to Antti Peisa. I just wanted to outline the approach we're taking as well as get any feedback and suggestions you have. The direction taken with multi language support is based on conversations Antti and I had a few months ago as well as feedback from users in the forum.

What ProcessWire 2.2 focuses on


There are lots of components to supporting multiple languages in a CMS, and the area we're focused on with this version is supporting multiple languages for PW's admin, modules and 3rd party modules. This is so that your clients can make edits to their site without having to know English. While PW hasn't had specific multi-language tools for your front end, it can support multiple languages any number of ways there (as many users have done, with language trees, multiple fields, etc). But the back-end tools have always been in English without an obvious way around that. So the biggest need has been a way to support multiple languages for PW's admin tools, core modules, 3rd party modules, etc. Basically, the non-dynamic side. And that's what PW 2.2 tackles.

The direction being used


We originally looked at PHP's gettext tools (http://www.php.net/manual/en/book.gettext.php), as they've been successfully used to provide this capability for CMSs like WordPress and Drupal, and they are a ready-to-go solution that is built into PHP and adds multi-language capability with little overhead. I like gettext() from the PHP side: it's really easy to code with and really efficient. But I felt that it puts a lot of burden on translators, with a lot of technical jargon, file formats and other tools. I want the translation tools to have the same simplicity as the rest of PW, and it looked like that was going to be a hard sell with gettext.

Instead, we're using a home-brewed solution that essentially does the same thing as gettext, but is a whole lot simpler for translation. It's simple enough that if you find something that needs translation, you can just go and edit it in PW like you would edit anything else. It also means that these tools will be very simple for 3rd party module developers to use. They will also have use in your own sites and templates for your non-dynamic content.

The tradeoff is that it takes longer to code this way (for me) and that it can't possibly be as efficient as gettext (given that gettext is built into PHP). Translations take up memory. However, PW doesn't have files with tens of thousands of lines of code and doesn't need to keep thousands of translations in memory at any given time. When it gets down to the needs of PW and the needs of those using it, I don't believe we'd ever benefit from the efficiency of gettext. So the home-brewed solution won out.

The actual solution is coded as a module. So if you don't need anything other than English in the admin, then you'll just leave the 'Languages' module uninstalled.

How it works from the admin side


The Languages module will be provided with the core, ready for a 1-click install. As soon as you click 'install', a 'Languages' tool is added to your Setup menu, a 'language' template is added to your templates, and a 'language' field is added to your fields and appended to the 'user' template with the default language (English) selected.

When you go to Setup > Languages, you'll see a list of languages that are installed. From here you can click to edit a language or click 'Add New' to add a new language. Each language is technically a page (just like with users), so you can add additional fields to the 'language' template should it suit your needs. By default, each language template has a name (ISO-639 code, e.g. 'en'), a title (e.g. 'English') and a 'translations' file field. Each file in this field contains translations for a file in ProcessWire (whether a module, core, template, etc.). These files are created by another module in ProcessWire (to be discussed below), but a files field is provided here so that you can easily share your translations with other people. Likewise, you'll be able to download new language packs from the ProcessWire site and just upload them here, ready to use.

How to make a translation


ProcessWire keeps track of language translations on a per-file basis. So if you want to translate a file in ProcessWire, you have to tell it which one. ProcessWire will then load the file into memory and look for function calls that indicate translatable text. Then it presents you with a screen of all the phrases it found for translation, along with inputs for providing a translation for each one. See the attached screenshot for an example of this. When you hit 'save', it saves those translations in a JSON file that PW's languages module uses for runtime translation. This JSON file can also be shared with others, distributed with a module you've created, or zipped up with others as part of a language pack.
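As a rough illustration of the scanning step described above (not the module's actual code), here is a minimal sketch that pulls translatable phrases out of PHP source text and stores them as JSON. The function name findTranslatablePhrases(), the regular expression, and the JSON layout (English phrase as key, translation as value) are all assumptions made for this example.

```php
<?php
// Hypothetical sketch: find translatable phrases in PHP source text.
// Matches ->_('text'), __('text') and __($this, 'text') with quoted literals.
function findTranslatablePhrases(string $source): array {
    $pattern = '/(?:->_|\b__)\(\s*(?:\$this\s*,\s*)?([\'"])((?:\\\\.|(?!\1).)*)\1/s';
    preg_match_all($pattern, $source, $matches);
    // Rough unescaping of \' and \\, then drop duplicates while keeping order
    $phrases = array_map('stripslashes', $matches[2]);
    return array_values(array_unique($phrases));
}

$source = <<<'PHP'
$value = $this->_('Add New Page Here');
echo __('Submit Form');
echo __($this, 'Submit Form');
PHP;

$phrases = findTranslatablePhrases($source);

// Start with every phrase untranslated (empty string), keyed by the English text
$translations = array_fill_keys($phrases, '');
$translations['Submit Form'] = 'Enviar formulario';

// Assumed storage format: one JSON object per translated file
echo json_encode($translations, JSON_PRETTY_PRINT | JSON_UNESCAPED_UNICODE), "\n";
```

A real parser would more likely tokenize the file than rely on a regular expression, but the idea is the same: the English phrase found in the source is what the translation gets attached to.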

As soon as the English version native to the translated file changes, the translated versions are considered out of date. We don't want to make guesses about the scope of the text change. So your translation screen will show you any existing translations that are considered 'out of date' along with new entry fields to provide new translations.

Any fields left blank on the translation screen are considered untranslated and thus the original language (English) is substituted for any untranslated phrases.
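The fallback rule can be sketched in a few lines. This assumes translations are held in a plain array keyed by the English phrase; translate() is a hypothetical helper for this example, not a ProcessWire function.

```php
<?php
// Hypothetical lookup: return the translation, or the English original when
// the phrase is unknown or its translation was left blank.
function translate(string $text, array $translations): string {
    $translated = $translations[$text] ?? '';
    return $translated !== '' ? $translated : $text;
}

$translations = [
    'Submit Form' => 'Enviar formulario',
    'Add New Page Here' => '',   // left blank on the translation screen
];

echo translate('Submit Form', $translations), "\n";        // translated
echo translate('Add New Page Here', $translations), "\n";  // falls back to English
echo translate('Delete', $translations), "\n";             // not in the file at all
```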

How it works from the code side


I mentioned earlier that I like how gettext works from the developer side, and ProcessWire works in a very similar manner in this regard. Meaning, indicating text as translatable involves a "_" function call followed by the text. ProcessWire also needs a context, so you call this function with '$this'. Here's an example:

$value = $this->_('Add New Page Here'); 

So $this->_('text') identifies the text as translatable. You have to use $this->_('text') rather than the gettext format of _('text') for two reasons. First, the _('...') function is native to PHP and actually calls gettext, so that's already taken! Second, ProcessWire needs a context for the function call: it needs to know what class or file it was called from, otherwise all translations would share one global namespace. So ProcessWire figures that out behind the scenes with a context of the calling class and uses the Reflection API to determine the file. If you didn't understand that last sentence, then don't worry, because you don't need to: ProcessWire is taking care of the technical details for you.
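For those curious about the Reflection part, here is a minimal sketch of how a _() method could work out which file the calling class lives in. The class names TranslatableBase and PageAddButton and the $translationsByFile array are made up for this example; this is not ProcessWire's actual implementation.

```php
<?php
// Hypothetical base class, for illustration only.
abstract class TranslatableBase {
    // Translations grouped by source file: [file path => [English => translated]]
    public static $translationsByFile = [];

    public function _(string $text): string {
        // Reflecting on $this gives the class of the calling object, so
        // getFileName() reports the file where that (sub)class is defined.
        $file = (new ReflectionClass($this))->getFileName();
        $translated = self::$translationsByFile[$file][$text] ?? '';
        return $translated !== '' ? $translated : $text;
    }
}

class PageAddButton extends TranslatableBase {
    public function label(): string {
        return $this->_('Add New Page Here');
    }
}

// Register translations for the file this class is defined in
TranslatableBase::$translationsByFile[__FILE__] = [
    'Add New Page Here' => 'Agregar nueva página aquí',
];

echo (new PageAddButton())->label(), "\n";
```

Because the lookup is keyed by file, two modules can use the same English phrase and still be translated differently.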

In addition to the $this->_('text') you can use a function format that would most likely be more suitable for translations performed in your template files. In this case, because "_" is already taken by gettext, we use "__" like WordPress does. So you can do this:

echo __('Submit Form');

ProcessWire figures out the context of that call automatically and groups the translation with others from your template file. This type of call can also be used in 3rd party modules, but you'll have to tell PW the context, i.e.

<?php
echo __($this, 'Submit Form');  // these two lines do exactly the same thing 
echo $this->_('Submit Form'); 

Best practices for pre-translated text


Just like with gettext, when entering text that's ultimately going to be translated, you need to avoid putting anything dynamic in it. For instance, a string like "Found $count products" is not good for translation because it contains a variable, so the text to translate would be different for every value of $count. Instead, you'd keep the translatable string constant and fill in the value afterwards: sprintf($this->_('Found %d products'), $count). For more details, see the 'Marking Strings for Translation' and 'Best Practices' sections in WordPress's i18n page. They use gettext, but the same applies to us:
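To make the difference concrete, here is a small self-contained sketch. The t() function is a stand-in for $this->_() or __() that looks phrases up in a plain array; it is an assumption for this example, not part of ProcessWire.

```php
<?php
// Stand-in for $this->_() / __(): looks the phrase up in a translation array
// and falls back to the English text when no translation exists.
function t(string $text, array $translations): string {
    $translated = $translations[$text] ?? '';
    return $translated !== '' ? $translated : $text;
}

$translations = ['Found %d products' => 'Se encontraron %d productos'];
$count = 12;

// Good: the translatable string is constant; the number is inserted afterwards
$good = sprintf(t('Found %d products', $translations), $count);
echo $good, "\n";

// Bad: the phrase changes with $count ("Found 12 products", "Found 13 products", ...),
// so no stored translation can ever match it and the English text comes back
$bad = t("Found $count products", $translations);
echo $bad, "\n";
```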

Marking strings for translation

http://codex.wordpress.org/I18n_for_WordPress_Developers#Marking_Strings_for_Translation

Best practices

http://codex.wordpress.org/I18n_for_WordPress_Developers#Best_Practices

New API variables


The Languages module installs a new API variable called $languages. Its interface is identical to that of $users in that it can be iterated as a PageArray or you can use get() and find() calls to pull individual languages from it, for example:

<?php
foreach($languages as $language) {
   echo $language->title . "<br />";
}

…and…

<?php
$french = $languages->get('fr'); 

The Languages module also adds a $language variable to every $user. So you can check what the current language is like this:

<?php
if($user->language->name == 'es') echo "Hola!"; 

Of course, $language is just a Page object, so it will also contain any other fields you have added to your 'language' template.  

Because ProcessWire's native language is English, the language is assumed to be English by default. So if you want it to assume a different language by default for your site, then you would just edit the 'guest' user and select a different language.

Next Steps


So that's the current state of ProcessWire 2.2 and how its multilanguage support works. But it's not the only part of it. Because ProcessWire's admin is itself a site developed in PW, and because PW's output is not all non-dynamic, we need multilanguage support in our dynamic fieldtypes and inputfields in order to be truly multilingual. This is what I'll be focusing on in November, so I will have more updates on that side of it then. Of course, this side may also be useful for the front-end (your sites) too, though my feeling is that a multi-tree approach is better for accessibility and SEO even if you do have multilingual fields. But even with a multi-tree approach, having multilingual fields will no doubt be useful with shared assets and more.

I'm also hoping to have a beta version ready for those that are interested in testing within a month (or in the next few days, if interested in testing before multilingual fields are in place).

Please post your questions, suggestions, feedback, etc. This is a work in progress and nothing is set in stone.

Edit: added 'New API variables' section.

post-1-132614279413_thumb.png

post-1-132614279439_thumb.png

Link to comment
Share on other sites

Great to read such detailed information, Ryan. This looks great!

Thanks for building this and can't wait to start translating pw!

EDIT: Read this again with more time, and gotta say that you really nailed this. I don't find even a single shortcoming, at least for use cases in our company.

Link to comment
Share on other sites

By default, each language template has a name (ISO-639 code, i.e. 'en')

It might be better to use IETF language tag as a name. As wikipedia says:

The tag system is extensible to region, dialect, and private designations.

http://en.wikipedia.org/wiki/Language_code

That means en-GB (for British English) and en-US (for US English), fi-FI for Finnish etc.. Allows even funny stuff like en-US-x-pirate for pirate talk ;)

But as languages are pages, this is more a question of the default naming convention, and we can actually set the names however we want? Then it might be good to keep "en" as the default, and if someone really cares they can create en-GB or even that pirate stuff...

Link to comment
Share on other sites

It might be better to use IETF language tag as a name.

Good point–I think you are definitely right about that. I've not put too much thought into it because the eventual names are up to the user. When they add a language, it'll ask them to type in what they want the name to be. But I think suggesting a name format is a good idea. Not to mention, when we distribute language packs, we'll probably want to use the IETF rather than ISO name.

Read this again with more time, and gotta say that you really nailed this. I don't find even a single shortcoming, at least for use cases in our company.

Great! Glad it sounds good to you so far. Most of this is stuff you and I already figured out before in those emails a few months ago, so a lot of credit for the solution here goes to you. I've not yet covered everything we talked about, but do plan to. I actually have the modules working at this stage, but not yet exactly as described (though close). So part of the purpose in writing it all up was to outline what's already been done and what's left to do. I worked on the multi-language capabilities for 3 days straight and have to work on something else the rest of the week, so I wanted to get it all written down while it was fresh in my mind. I also wanted to get feedback from others who work with multi-language systems more than I do, just in case I'm missing anything major. My experience in this area is somewhat limited, so it's a learning process here. And it may take us a couple of revisions to get everything right, but the goal is to provide the best multi-language capabilities out there.

Link to comment
Share on other sites

Ryan, this is just great! I would be happy to beta-test it on my local server and maybe to begin translation if it's already possible.

Everything is clear, but I didn't quite understand where the translation is taken from during runtime. Is it kept in memory as an object or is it parsed from JSON file every time translation is needed?

Link to comment
Share on other sites

Thanks, I will definitely appreciate your help here. It keeps a separate JSON translation file for each .php or .module file that is translated. Meaning, if you translated ProcessPageEdit.module, then it would keep a languageName.ProcessPageEdit.json file containing all the translations for ProcessPageEdit (in 1 language). When you edit a page, all the translations for the current language in ProcessPageEdit get loaded. So it's not as memory efficient as loading each phrase translation at the time it's needed, but it is of course much faster. ProcessWire is not very 'wordy' :) so there isn't actually a lot of translation to keep in memory at any given time. But if memory becomes an issue, the plan is to provide an option for each translation that designates it as 'always-load' or 'on-demand'. Currently all are 'always-load'. But 'on-demand' would be useful for things like error messages that rarely appear. Perhaps I'll provide this as an optional second boolean param to the $this->_() function that lets you mark a given phrase as 'on-demand' for such cases.
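To picture the 'always-load' behaviour described above, here is a minimal sketch of a loader that reads one JSON file per translated source file and keeps the result in memory. Only the languageName.ProcessPageEdit.json naming comes from the description; the TranslationLoader class, its forFile() method and the JSON layout are assumptions for this example.

```php
<?php
// Hypothetical loader: reads one JSON file per translated source file
// and caches the decoded array so each file is parsed at most once.
class TranslationLoader {
    private $dir;
    private $languageName;
    private $cache = [];

    public function __construct(string $dir, string $languageName) {
        $this->dir = $dir;
        $this->languageName = $languageName;
    }

    // Returns all translations for one source file, e.g. "ProcessPageEdit"
    public function forFile(string $baseName): array {
        if (!isset($this->cache[$baseName])) {
            $path = "{$this->dir}/{$this->languageName}.{$baseName}.json";
            $data = is_file($path) ? json_decode(file_get_contents($path), true) : null;
            $this->cache[$baseName] = is_array($data) ? $data : [];
        }
        return $this->cache[$baseName];
    }
}

// Demo: write a sample file to the temp directory, then load it
$dir = sys_get_temp_dir();
file_put_contents("$dir/es.ProcessPageEdit.json", json_encode(['Save' => 'Guardar']));

$loader = new TranslationLoader($dir, 'es');
$translations = $loader->forFile('ProcessPageEdit');
echo $translations['Save'], "\n";
```

An 'on-demand' variant would skip the up-front load and look up a single phrase only when it is actually requested.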

Link to comment
Share on other sites

Hello,

New PW user here, just discovered it a few days ago. I like it, so as a French guy I plan to help with the French translation (fr-FR). I'm a newbie in PHP (I'm the java-for-work and python-for-fun type), but the post by Ryan reassures me about the simplicity of the translation work.

Link to comment
Share on other sites

Wow, it sounds great at this point. Here are some of my ideas and questions for this:

- I would like to have an official translation (maybe in a subpoint of "Download" on this site), which is updated with every new release. So there would be a central point for the core translations.

- A great way to add languages after the installation is used by "TextPattern". There you can download new languages directly into the admin area and install it. Maybe it's worth a look ;)

- I guess it's possible to create URLs like "www.example.com/en/..." with a language tree, isn't it?

Greets,

Nico

Link to comment
Share on other sites

- I would like to have an official translation (maybe in a subpoint of "Download" on this site), which is updated with every new release. So there would be a central point for the core translations.

That is definitely planned.

- A great way to add languages after the installation is used by "TextPattern". There you can download new languages directly into the admin area and install it. Maybe it's worth a look

Sounds good, I will take a look at that. What we have now may be close. The way it's currently set up is that you have a files field into which you can drag a ZIP of a full translation, or individual translation files. Then you can choose to edit the translations in the file if you want to. Or, you can create a new one just by clicking the 'new' link at the bottom. If you create a new one, then it'll appear in your files list and you can download and share it with others too. See the attached screenshots. I may add a URL input to the upload area of the files fieldtype so that one could just paste in the URL directly to the file (at processwire.com) and add it that way too.

- I guess it's possible to create URLs like "www.example.com/en/..." with a language tree, isn't it?

Yes, I think this is the way many are already setting up multi language sites with PW.

I can do the British English translation - en-GB

Thanks Pete! And I'll take care of en-US :)

Well, I'm German so I would like to help with the German (de-DE) translation

As I promised earlier: pl-PL

This is great, I can't wait to get this finished and get these translations going. Thanks again to all of you that are offering to make translations.

post-1-132614279498_thumb.png

post-1-132614279767_thumb.png

post-1-132614279793_thumb.png
