Skip to content

Internationalization (i18n) and localization (l10n)

Babel

Marking strings for translation

In order to translate strings, they have to be marked. This ensures that they will be picked up by the extraction process later on. Only strings which are visible in the UI have to be translated.

Python

Warning

Do not use f-strings for translatable text. Interpolation is not supported for these types of strings (see Issue).

- _(f"Hi {username}")
+ _("Hi {username}".format(username=username))

Marking strings in python is done via the Flask-BabelEx package. First, make sure to import the package and the *gettext function. We usually import it as _ to be consistent throughout different packages.

from flask_babelex import lazy_gettext as _

Now strings can be marked for translation:

error_message = _("Invalid scheme")

Certain data, identifiers or keywords do not have to be translated. This can be achieved with string interpolation:

_("Invalid {scheme}").format(scheme)

Adding notes for translators helps them to better understand the context where this string will be used:

# NOTE: This is displayed in ....
_("Invalid {scheme}").format(scheme)

Jinja Templates

The gettext function is made available in jinja templates as well:

<p>
    {{ _('Only published versions are displayed.') }}
</p>

JavaScript

Marking strings in JavaScript is done via the i18next module. First, make sure to import i18next from the module.

import { i18next } from "@translations/invenio_app_rdm/i18next";

Now strings can be marked for translation:

<p>
    {i18next.t('Only published versions are displayed.')}
</p>

For strings containg HTML/React nodes, the Trans component is the way to go (full documentation):

import { Trans } from '@translations/i18next';

const title = record.title;
<Trans>
    The record {{ title }} can <b>only</b> be accessed by <b>users specified</b> in the permissions.
</Trans>

For more complex strings, custom variables can be passed to the Trans component as well.

<Trans
    defaults="On <bold>{{ date }}</bold> the record and the files will automatically be made publicly accessible. Until then, the record and the files can <bold>only</bold> be accessed by <bold>users specified</bold> in the permissions."
    values={{ date: fmtDate }}
    components={{ bold: <b /> }}
/>
This will interpolate the nodes from the values and components properties and is a good approach, if the nodes are to be used multiple times.

Singular & Plural

Sometimes, certain words or phrases in a sentence change based on a number or the amount of objects. There is also support for this, by passing the count property.

const records = [{ title: "Record 1" }, { title: "Record 2" },];

<Trans
    i18nKey="numberOfRecords"
    count={records.length}>
    There are {{ count: records.length }} records.
</Trans>
This will generate the following two ids for translation:
"numberOfRecords": "There is one record."
"numberOfRecords_pluar": "There are {{ count }} records."

Lazy vs non-lazy

Coming soon

Storage of strings and timestamps

Strings should be stored in UTF-8. Timestamps should be stored in UTC and localized when displayed.

EDTF localization

The EDTF (Extended Date Time Format) is used when dealing with imprecise dates and for their region specific display.

Python

Here we make use of the babel-edtf module. It allows to pass the locale to the format_edtf function. As always, make sure to import it before you use it.

from babel_edtf import format_edtf

format_edtf('2020-01', locale='en')
'Jan 2020'

It also allows to pass a certain format (one of: short, medium, long, full):

format_edtf('2020-01/2020-09', format='long', locale='en')
'January – September 2020'

Extracting Strings

After strings have been marked for translation, it is now time to extract them. Instructions on how to do so can be found in the respective package in the file located at .tx/config. Depending on the packag, only Python or JavaScript may be available for translation.

Python

Configuration

  • In the root folder of your package update the .tx/config file:

    [main]
    host = https://www.transifex.com
    
    # python conf
    [<parent_namespace_of_your_transifex_projext>.<package_name>-messages]
    file_filter = <package_name>/translations/<lang>/LC_MESSAGES/messages.po
    source_file = <package_name>/translations/messages.pot
    source_lang = en
    type = PO
    

  • also in the root folder add babel.ini if not there:

    # Extraction from Python source files
    [python: **.py]
    encoding = utf-8
    
    # Extraction from Jinja2 templates
    [jinja2: **/templates/**.html]
    encoding = utf-8
    extensions = jinja2.ext.autoescape, jinja2.ext.with_
    

  • also in the root folder add these configuration line in setup.cfg if not there:

    [compile_catalog]
    directory = <package_name>/translations/
    use-fuzzy = True
    

  • remember to add this line for MANIFEST.in

    recursive-include <package_name> *.po *.pot *.mo
    

  • also the entypoint in setup.py

        entry_points={
        ....
                'invenio_i18n.translations': [
                'messages = <package_name>',
            ],
        },
    

  • remember to add your new language to the instance configuration by modifying I18N_LANGUAGES variable, e.g. I18N_LANGUAGES = [('de', _('German'))]

Usage

In the root folder of the package run

python setup.py extract_messages
This command will gather all the marked strings from .py and jinja template files, group them by their ID and place everything in the source file specified in the config (usually at <package_name>/translations/messages.pot).

Push translation sources (.pot and .po) to transifex. It will add the missing strings to the translation awaiting list.

tx push -s -t

NOTE: When there is no resource for this package on transifex, one will be created automatically on the first push.

For the next step you need to have some translations done on the Transifex website. Otherwise next step will not be successfully done, because there will be no translated content to work with. Translations are stored under the namespace defined in your .tx/config file.

Pull translations from Transifex:

tx pull -l <lang>

JavaScript

Configuration

Navigate to the <package_name>/theme/assets/semantic-ui/translations/<package_name> folder. Create following folder and file structure:

|-messages/
           |-index.js
|-scripts/
           |-compileCatalog.js
           |-initCatalog.js
|-i18next.js
|-i18next-scanner.config.js
|-package.json

in the root folder of your package update the .tx/config file:

[main]
host = https://www.transifex.com

# react conf
[<parent_namespace_of_your_transifex_projext>.<package_name>-messages-ui]
file_filter = <package_name>/assets/semantic-ui/translations/<package_name>/messages/<lang>/messages.po
source_file = <package_name>/assets/semantic-ui/translations/<package_name>/translations.pot
source_lang = en
type = PO

where: - i18next.js is your translation configuration - i18next-scanner.config.js is translation string scanner configuration - compileCatalog.js is a script transforming Transifex *.po output translations into .json files used by react - initCatalog.js is a script initializing a new language configuration - package.json configuration and dependencies of the translation mini-app

example content of above files available here:

Usage

Install i18n dev dependencies via (dependencies from package.json)

npm install
Now run to add a new language
npm run init_catalog lang <your_new_lang_two_letters_abbreviation>

Now run

npm run extract_messages

This command will gather all the marked strings from .js files, group them by their ID and place everything in the source file specified in the config (usually at <package_name>/theme/assets/semantic-ui/translations/<package_name>/translations.pot).

Update the ./messages/index.js file to add the configuration of your new language to the react translation configuration map

import TRANSLATE_<lang> from './<lang>/translations.json'
export const translations = {
...rest,
<lang>: { translation: TRANSLATE_<lang> }
}

If you don't already have it, install the python transifex client. You will need also the transifex.com account and your private API key

pip install transifex-client

Push translation sources (.pot and .po) to transifex. It will add the missing strings to the translation awaiting list.

tx push -s -t

NOTE: When there is no resource for this package on transifex, one will be created automatically on the first push.

For the next step you need to have some translations done on the Transifex website. Otherwise next step will not be successfully done, because there won't be no translated content to work with. Translations are stored under the namespace defined in your .tx/config file Pull translations from Transifex:

tx pull -l <lang>
You can use -f to overwrite existing .po files. After this step you should see new .po files pulled to package_name>/theme/assets/semantic-ui/translations/<package_name>/messages/<lang>/ folder. If there are no new files, the translations were empty.

Compile the .json files for react to use.

npm run compile_catalog
npm run compile_catalog lang <lang> # for specific language