Multi-language field translation export/import

In this post we cover the details of a new module that enables export and import capabilities for multi-language fields in ProcessWire.

Typically the way you translate page field values in ProcessWire is to edit a page, view the text in one language, and translate the text into another language, directly in the page editor.

This new ProcessLanguageFieldExportImport module provides an alternative to that process, moving the translation task out of the ProcessWire admin and enabling it to be completed with external tools. It does this by making the multi-language field values exportable and importable via JSON and/or CSV files. Translation tools can be as simple as a spreadsheet or something more dedicated that can work with the JSON translation files.

With this module installed, the translators do not need access to the ProcessWire admin environment in order to perform language translation. In addition, because a single translation file can cover any number of pages and fields, it also means that translation can likely be completed much more quickly than it could be in the admin environment.

Note that this is for exporting/importing page field values, and not static translations of text in PHP files — those already have a built-in export and import function.

Acknowledgements

Special thanks to Update AG and Oliver Arnoczky for providing the support and direction that made this module possible.

Export/import file formats

This module provides export and import of multi-language field translations in JSON or CSV format. Both files (downloadable and uploadable) and text (copy/paste) are supported for export or import.

The JSON format is useful for importing into external translation tools where the translations can be edited and then imported back into this module. The CSV format is useful if translating from within a spreadsheet.

The primary export/import format used by this module is functionally similar to the XLIFF format except that it is JSON rather than XML, and is applicable to the “web content” context, where each translation is for a specific page and field (or subfield).

Supported multi-language field types

The most common multi-language Fieldtypes in ProcessWire are currently supported by this module:

  • Text (and multi-language types derived from it)
  • Textarea (both plain and rich text/markup)
  • File/Image (description text only)
  • Repeater
  • FieldsetPage
  • PageTable
  • ProFields Repeater Matrix
  • ProFields Table
  • ProFields Textareas

Support for ProFields Combo (FieldtypeCombo) is also planned. Support for additional types is likely to be added in future versions.

Standard usage

  1. Access the module in your ProcessWire admin from: “Setup > Translation export/import.”

  2. Use the “Export” tab to define what you want to export. See the “Exporting” section of this post for more details. Download the exported file.

  3. Edit the exported file in other software to perform the translation (or send it to your translator). Whoever performs the translation should update the file, filling in the translated text.

  4. Once you have a translated file, return to this module and this time choose the “Import” tab. See the “Importing” section of this post for more details on all of the options available.
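Step 3 above can also be scripted. The following is a minimal Python sketch, assuming the JSON layout shown in the "JSON format example" section of this post. The helper names `untranslated` and `fill_targets` are hypothetical (they are not part of the module), and the sample translation is made up for illustration.

```python
import json

# A tiny export in the JSON layout shown in this post (values are illustrative).
export_text = '''
{
  "source_language": "default (English)",
  "target_language": "es (Español)",
  "version": 2,
  "items": [
    {"page": 3171, "field": "title", "type": "text",
     "source": "Mosel River Tour Through Time", "target": ""},
    {"page": 3185, "field": "title", "type": "text",
     "source": "What is a Bike & Barge Tour?",
     "target": "Que es un tour de Bici + barco?"}
  ]
}
'''

def untranslated(export):
    """Return the items whose target value is still empty."""
    return [item for item in export["items"] if not item["target"].strip()]

def fill_targets(export, translations):
    """Fill empty targets from a {(page, field): text} mapping (hypothetical helper)."""
    for item in untranslated(export):
        key = (item["page"], item["field"])
        if key in translations:
            item["target"] = translations[key]
    return export

export = json.loads(export_text)
todo = untranslated(export)  # what the translator still has to do

# Made-up translation, keyed by page ID and field name.
fill_targets(export, {(3171, "title"): "Recorrido por el río Mosela"})

# ensure_ascii=False keeps non-ASCII characters readable; the file stays UTF-8.
translated_text = json.dumps(export, ensure_ascii=False, indent=2)
```

The "page" and "field" values are left untouched, since each translation is tied to a specific page and field.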

Exporting text for translation

This section describes the settings available on the Export tab. Exporting is a one-step process: you select what you want to export and then click the submit button. An export file (or copy/paste text) is immediately generated.

Source language and target language

You select a source language and a target language. The source language is the language that the translator will be translating from, and the target language is the language they will be translating to. For instance, if translating from English to Spanish, the source language will be English and the target language will be Spanish.

Selecting fields to export

You can optionally select which fields you want to include in your export. If you do not select any fields then all supported fields are included in the export.

Selecting pages to export

This module exports a set of pages matching a selector that you define (using the InputfieldSelector module). This is the same means of selection that you use for the "Filters" in ProcessWire's admin search engine (Lister).

Export format

Both JSON and CSV export formats are supported, and you can choose whether to receive the export as a file download or as visible text that you can copy/paste. If performing a large export matching hundreds of thousands of pages, chances are that you'll want to use the file download, whereas copy/paste text is more useful for getting familiar with the format and testing exports.

Export options

  • Omit already translated rows
    If some text is already translated, check this option to exclude it from the export. This ensures that your export focuses exclusively on text that is not yet translated.

  • Use compact format (omits hierarchy indicators)
    Check this option for a smaller export size. It omits some indicators that are not needed for import, but that may be helpful (for developers or translators) to identify where the translation exists in the site. The compact format differs from the regular format mostly when exporting types with subfields or repeatable types, each with their own fields.

  • Repeat source value in target when not yet translated
    When this option is checked the untranslated text will be repeated in both the source and target value. This is useful if the translator prefers to translate by modifying existing text.
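To make the effect of two of these options concrete, here is a rough Python sketch of how they could be modeled on a list of export items. It is an illustration only, not the module's actual code: the function name `apply_export_options` and its parameters are hypothetical, and the compact format option is left out because its exact output is not described in this post.

```python
def apply_export_options(items, omit_translated=False, repeat_source=False):
    """Model the 'omit already translated' and 'repeat source' export options."""
    rows = []
    for item in items:
        row = dict(item)  # copy so the caller's items are not modified
        translated = bool(row["target"].strip())
        if omit_translated and translated:
            continue  # already translated: leave it out of the export
        if repeat_source and not translated:
            row["target"] = row["source"]  # translator edits the text in place
        rows.append(row)
    return rows

items = [
    {"page": 3171, "field": "title", "type": "text",
     "source": "Mosel River Tour Through Time", "target": ""},
    {"page": 3185, "field": "title", "type": "text",
     "source": "What is a Bike & Barge Tour?",
     "target": "Que es un tour de Bici + barco?"},
]

only_pending = apply_export_options(items, omit_translated=True)
prefilled = apply_export_options(items, repeat_source=True)
```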

Importing translated text

Importing is a simple two-step process:

  1. In the first step you either upload or paste in the translated JSON or CSV content. When you click the Continue button your translated content is evaluated and made ready for import.

  2. In the second step the module tells you how many pages and translations were found. From here you can choose what you want to import and how you want it imported:

Select pages to import by template

A list of all found page templates will be shown, enabling you to select which pages you want to import (by template). If you do not select any templates then it will import all pages in the import data.

Import options

You can check one or more of these options to adjust how the import is performed:

  • Overwrite already translated field values?
    When checked, page fields that are already translated will be overwritten with the translation from the import rather than skipped.

  • Confirm that source value has not changed before importing?
    When checked the import tool will make sure that the source value is identical to the one in the import data before assuming that the target value is valid for import. This ensures that imported translations are not potentially out of date with the website content.

  • Show verbose debug notifications?
    When checked more details about the import process will be revealed via ProcessWire notifications. This can be helpful in particular for identifying why a particular translation may have been skipped during import.

  • Test import without committing changes?
    When checked it will perform a dry-run of the import without making any changes. This is how you can test an import before committing to it. It is a good idea to use this option to test any new translations before committing them to the database.
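The way these options interact can be sketched in a few lines of Python. This is a simplified model written for this post's data layout, not the module's implementation: `plan_import`, `run_import`, and the `current` dictionary (standing in for the values presently stored on the site) are all hypothetical, and skipping items with an empty target or an unknown page/field is an assumption.

```python
def plan_import(items, current, overwrite=False, confirm_source=False):
    """Decide for each imported item whether it would be imported or skipped."""
    plan = []
    for item in items:
        key = (item["page"], item["field"])
        live = current.get(key)  # values currently stored on the site
        if live is None:
            decision = "skip: page or field not found"      # assumption
        elif not item["target"].strip():
            decision = "skip: no translation provided"      # assumption
        elif confirm_source and live["source"] != item["source"]:
            decision = "skip: source value has changed"
        elif live["target"].strip() and not overwrite:
            decision = "skip: already translated"
        else:
            decision = "import"
        plan.append((key, decision))
    return plan

def run_import(items, current, dry_run=True, **options):
    """Apply the plan, or only report it when dry_run is True."""
    plan = plan_import(items, current, **options)
    if not dry_run:
        decisions = dict(plan)
        for item in items:
            key = (item["page"], item["field"])
            if decisions[key] == "import":
                current[key]["target"] = item["target"]
    return plan

current = {
    (3171, "title"): {"source": "Mosel River Tour Through Time", "target": ""},
    (3185, "title"): {"source": "What is a Bike and Barge Tour?",
                      "target": "Que es un tour?"},
}
items = [
    {"page": 3171, "field": "title",
     "source": "Mosel River Tour Through Time",
     "target": "Recorrido por el Mosela"},
    {"page": 3185, "field": "title",
     "source": "What is a Bike & Barge Tour?",
     "target": "Que es un tour de Bici + barco?"},
]

# Dry run with the source check enabled: nothing is written to `current`.
report = run_import(items, current, dry_run=True, confirm_source=True)
```

Returning the per-item decisions is roughly what the verbose debug option is for: it shows why a particular translation was skipped.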

Translator guidelines

Exports use the UTF-8 character set and import files should maintain this character set.

When an item’s type is indicated as "markup" then HTML and entity encoding are allowed. Otherwise the translation (target) value should contain no markup or entity encoded characters.

Also when the type is "markup" the translator should maintain existing markup while translating the text within the markup.
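A translated file can be checked against these guidelines before it is imported. The sketch below is one possible check in Python, under the assumption that "maintaining existing markup" means the target keeps the same sequence of tags as the source. That is a strict reading (a translator may legitimately reorder inline tags), and `check_item` is a hypothetical helper, not something the module provides.

```python
import re
from html.parser import HTMLParser

# Matches an HTML tag or a named/numeric character entity.
TAG_OR_ENTITY = re.compile(r"<[a-zA-Z/][^>]*>|&(?:[a-zA-Z]+|#\d+|#x[0-9a-fA-F]+);")

class TagCollector(HTMLParser):
    """Collect opening and closing tag names in document order."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_endtag(self, tag):
        self.tags.append("/" + tag)

def tag_sequence(html):
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags

def check_item(item):
    """Return a list of guideline problems for one translated item."""
    problems = []
    target = item["target"]
    if not target:
        return problems  # nothing translated yet, nothing to check
    if item["type"] == "markup":
        if tag_sequence(item["source"]) != tag_sequence(target):
            problems.append("markup differs from source")
    elif TAG_OR_ENTITY.search(target):
        problems.append("markup or entity in plain-text value")
    return problems

ok = check_item({
    "type": "markup",
    "source": "<p><strong>It is about the boat…</strong></p>",
    "target": "<p><strong>Es acerca del barco…</strong></p>",
})
bad = check_item({"type": "text", "source": "Bike & Barge",
                  "target": "Bici &amp; barco"})
```

A bare "&" or "+" in a plain-text value passes the check; only actual tags and encoded entities are flagged.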

File format examples

Note that some values have been truncated for brevity in these examples.

JSON format example

{
  "source_language": "default (English)",
  "target_language": "es (Español)",
  "version": 2,
  "exported": "2022-08-04 11:00:50",
  "items": [
    {
      "page": 3171,
      "field": "title",
      "type": "text",
      "source": "Mosel River Tour Through Time",
      "target": ""
    },
    {
      "page": 3171,
      "field": "browser_title",
      "type": "text",
      "source": "Mosel River Tour Through Time - Germany, France",
      "target": ""
    },
    {
      "page": 3185,
      "field": "title",
      "type": "text",
      "source": "What is a Bike & Barge Tour?",
      "target": "Que es un tour de Bici + barco?"
    },
    {
      "page": 3185,
      "field": "body",
      "type": "markup",
      "source": "<p><strong>It is about the boat…</strong></p>\n\n<p>Just the other day...</p>",
      "target": "<p><strong>Es acerca del barco…</strong></p>\n\n<p>Justo el otro día...</p>"
    },
    {
      "page": 3185,
      "field": "browser_title",
      "type": "text",
      "source": "What is a Bike & Barge Tour? - Travel Tips",
      "target": "Que es un tour en bici + barco? Tips para viajar"
    }
  ],
  "fields": {
    "title": "PageTitleLanguage",
    "body": "TextareaLanguage",
    "browser_title": "TextLanguage"
  }
}
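For anyone who wants to work with this JSON in their own tooling, here is a short Python sketch that flattens it into rows with the same columns as the CSV format (page, field, type, source, target). The column mapping is inferred from the two examples in this post, and `export_json_to_csv` is a hypothetical helper. Quoting may differ from the module's own CSV output while still being equivalent CSV.

```python
import csv
import io
import json

def export_json_to_csv(json_text):
    """Flatten a JSON export into CSV text, one row per item."""
    data = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # The source and target column headers carry the language labels.
    writer.writerow(["page", "field", "type",
                     data["source_language"], data["target_language"]])
    for item in data["items"]:
        writer.writerow([item["page"], item["field"], item["type"],
                         item["source"], item["target"]])
    return buffer.getvalue()

sample = '''
{
  "source_language": "default (English)",
  "target_language": "es (Español)",
  "version": 2,
  "items": [
    {"page": 3171, "field": "browser_title", "type": "text",
     "source": "Mosel River Tour Through Time - Germany, France",
     "target": ""}
  ]
}
'''

csv_text = export_json_to_csv(sample)
```

The csv module takes care of quoting values that contain commas, quotes or line breaks, which matters for markup fields.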

CSV format example

page,field,type,"default (English)","es (Español)"
3171,title,text,"Mosel River Tour Through Time",""
3171,browser_title,text,"Mosel River Tour Through Time - Germany, France",""
3185,title,text,"What is a Bike & Barge Tour?","Que es un tour de Bici + barco?"
3185,body,markup,"<p>It is about the boat…</p>","<p>Es acerca del barco…</p>"
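Reading this CSV back in a script is straightforward with a real CSV parser. The Python sketch below assumes the column order shown above, with the language labels in the fourth and fifth header columns. `read_translation_csv` is a hypothetical helper name.

```python
import csv
import io

def read_translation_csv(csv_text):
    """Parse CSV export text into (source_language, target_language, items)."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, body = rows[0], rows[1:]
    source_language, target_language = header[3], header[4]
    items = [
        {"page": int(page), "field": field, "type": type_,
         "source": source, "target": target}
        for page, field, type_, source, target in body
    ]
    return source_language, target_language, items

sample = '''
page,field,type,"default (English)","es (Español)"
3171,browser_title,text,"Mosel River Tour Through Time - Germany, France",""
3185,title,text,"What is a Bike & Barge Tour?","Que es un tour de Bici + barco?"
'''

source_language, target_language, items = read_translation_csv(sample)
```

Splitting lines on commas by hand would break on values like "Germany, France", so the quoted fields need to be handled by the parser.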

Future

I'd like us to be able to also support the XLIFF 2.0 format but need help from someone more familiar with this format. Specifically, how do we retain our required "page" and "field" properties in XLIFF, and how do we provide a value for XLIFF's "file" and "skeleton" fields?

Are there any other formats that would be worthwhile to support? I'd be interested in learning about them to see if we might be able to support them.

Support for Combo fields is likely to come next (if there is need/interest). Support for other multi-language field types is also likely to be added where there is interest.

Download

This module is publicly available for download in the modules directory and on GitHub.

This module is also designed to work with several of the ProFields multi-language modules. VIP support is available in the ProFields or ProDevTools support boards.

Blog / 5 August 2022