Internationalization

小牛编辑
2023-12-01

Flarum features a powerful translation system (based on Symfony's translator and ICU MessageFormat) that allows the interface to display information in virtually any language. You should consider taking advantage of Flarum's translator as you develop your extension, even if you have no intention of using it in more than a single language.

For one thing, this system will allow you to change the information displayed by your extension without editing the actual code. It will also give you the tools needed to efficiently deal with phenomena such as pluralization and agreement for gender. And you never know: it may come in handy later if you decide you want to add more languages and make your extension available to users around the world!

This guide will show you how to create a locale file containing language resources for your extension, and how to use the translator to access those resources from within your code. You will learn how to deal with more complex translations involving variables and HTML tags, as well as pluralization and gender.

We will also describe the standard format to be followed when developing language resources for Flarum, and offer some tips that can help you make your language resources easier to localize. But first, let's begin with an overview of how Flarum prioritizes resources when displaying output for a third-party extension.

In your code, instead of hardcoding strings in a language, you will call the trans method on a translator object, providing a unique translation key and optionally some variables. Then, when users visit the page, Flarum will look in its registered translation files for a matching translation.

Flarum pulls translations from several sources:

  • Locale files included in extension source code.
  • Language packs, which include translations for multiple extensions.

As a rule, a Flarum site can only display translations corresponding to the language packs that have been installed on that site. But Flarum will do its level best — within this limitation — to render your extension's output in some sort of user-readable language:

  1. It will begin by looking for a translation in the user's preferred display language.
  2. If it can't find one, it will look for a translation in the forum's default language.
  3. As a last-ditch effort, it will look for a "generic" English translation of the output.
  4. If none of the above are available, it will give up and display a translation key.
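The lookup order above can be sketched in a few lines of JavaScript. This is only an illustration of the fallback chain as described here, not Flarum's actual implementation; the `catalogs` object and the `resolveTranslation` function are hypothetical names made up for the example.

```javascript
// Hypothetical in-memory catalogs keyed by locale code (not Flarum's real data structure).
const catalogs = {
  fr: { 'acme-hello-world.alert.hello_text': 'Bonjour, le monde !' },
  en: {
    'acme-hello-world.alert.hello_text': 'Hello, world!',
    'acme-hello-world.alert.bye_text': 'Goodbye!',
  },
};

// Try the user's locale, then the forum default, then English.
// If nothing matches, return the translation key itself.
function resolveTranslation(key, userLocale, defaultLocale) {
  for (const locale of [userLocale, defaultLocale, 'en']) {
    const catalog = catalogs[locale];
    if (catalog && Object.prototype.hasOwnProperty.call(catalog, key)) {
      return catalog[key];
    }
  }
  return key;
}

console.log(resolveTranslation('acme-hello-world.alert.bye_text', 'fr', 'de'));
// Goodbye!
```

Note how the French user in this sketch still sees English for the key that has no French translation, which is why a complete set of English resources matters.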

Since English translations could be the only thing standing between forum users and unsightly translation keys, we strongly recommend including a complete set of English resources with every extension. (You will also need to include English resources if you wish to list your extension on the Flarum Marketplace, when that launches.) Generally, extension developers will include English translations in their extension, and other language translations in a language pack.

Flarum's language resources use the YAML file format. The locale files for a third-party extension need to be stored in the extension's locale folder. Each locale file should be named using the ISO 639-1 code for the language it contains. For example, a file containing French translations should be named "fr.yml".

Tip: If you wish to provide support for a specific locale, you can add an underscore followed by an ISO 3166-1 alpha-2 country code; the filename for French as spoken in Canada, for example, would be "fr_CA.yml". But you should be sure to include a locale file containing "generic" translations for the same language, so Flarum will have something it can fall back on in the event a user specifies a locale that you haven't provided for.
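To make the naming convention concrete, here is a small sketch that lists the locale files one might look for, most specific first. The function name `localeFileCandidates` is hypothetical, and the order shown is an assumption based on the fallback described above rather than a description of Flarum's internals.

```javascript
// Build the list of locale filenames to try for a requested locale such as "fr_CA".
function localeFileCandidates(locale) {
  const [language, country] = locale.split('_');
  const candidates = [];
  if (country) {
    candidates.push(`${language}_${country}.yml`); // region-specific file first
  }
  candidates.push(`${language}.yml`); // generic file for the language as a fallback
  return candidates;
}

console.log(localeFileCandidates('fr_CA'));
// [ 'fr_CA.yml', 'fr.yml' ]
```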

The extension skeleton includes a locale/en.yml template where you can put your extension's English translations. If you want to add resources for another language or locale, just duplicate the template and give it an appropriate filename. Then open the file and start adding your translations!

Each translation in the locale file must be preceded by a key, which you will use as the identifier for that translation. The ID key should be in snake_case and followed by a colon and a space, as shown below:


sample_key: This is a sample translation.

You can also use keys to namespace your translations. For starters, the first line of the locale file should consist of a single key that collects all the translations for your extension in a single namespace. This key should exactly match your extension's ID, kebab-case and all.

Additional keys can be used to divide the extension namespace into groups. This comes in handy when you want to organize your translations according to where they appear in the user interface, for example. These intermediate namespacing keys should always be in snake_case.

Each namespacing key should also be followed by a colon. Keys should be nested according to the YAML outline format, adding two spaces of indentation for each level in the hierarchy. Put this all together, and the locale file for the Quick Start tutorial might look something like this:


acme-hello-world:                # Namespacing for the extension; unindented.
  alert:                         # Namespacing for alerts; indented 2 spaces.
    hello_text: "Hello, world!"  # Identifier/translation; indented 4 spaces.

Once you have this information in place, you can form the full translation key that you will use to access a translation by listing its keys in order from extension namespace to an identifier, with periods as delimiters. For example, the full translation key for the "Hello, world!" translation would be:


'acme-hello-world.alert.hello_text'
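If it helps to see the key-building rule as code, the sketch below flattens a nested object (standing in for the parsed locale file) into full translation keys. The names `localeData` and `flattenKeys` are hypothetical, and this is not how Flarum itself loads locale files.

```javascript
// The nested structure of the sample locale file, written as a plain object.
const localeData = {
  'acme-hello-world': {
    alert: {
      hello_text: 'Hello, world!',
    },
  },
};

// Walk the nested object and join each path from namespace to identifier with periods.
function flattenKeys(node, prefix = '') {
  const flat = {};
  for (const [key, value] of Object.entries(node)) {
    const fullKey = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === 'object') {
      Object.assign(flat, flattenKeys(value, fullKey));
    } else {
      flat[fullKey] = value;
    }
  }
  return flat;
}

// The result has a single entry: 'acme-hello-world.alert.hello_text' mapped to 'Hello, world!'.
console.log(flattenKeys(localeData));
```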

That's really all you need to know about the mechanics of key creation. Please be aware, however, that there is a standard format that developers need to follow when creating language resources for Flarum. The rules for namespacing translations and naming ID keys may be found in Appendix A.

The examples in the preceding section have already given you the basics: you type an ID key — followed by a colon and a space — then you type in the translation. It's that easy! Here we would just like to add a few details that will help you deal with longer and more complex translations.

Quotation Marks

You may have noticed that only one of the two sample translations in the previous section was enclosed in quotation marks. It is generally not necessary to delimit translations in this way. You should, however, use double quotes to enclose any translation that includes one or more of the following characters:


`  !  @  #  %  &  *  -  =  [  ]  {  }  |  :  ,  <  >  ?

Since Flarum uses curly brackets and angle brackets to denote placeholders for variables and HTML tags, respectively, it goes without saying that any translation that includes such placeholders will also need to be enclosed in double quotes.

Furthermore, you should use single quotes to enclose any translation that includes one or more double quote (") or backslash (\) characters. This rule takes precedence! So if a translation were to include both double quotes and one or more characters from the list above (for instance, a translation in which a variable placeholder is set off by quotation marks), you would need to enclose it in single quotes.
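The two quoting rules can be summarized as a small decision function. This is a sketch of the rules exactly as stated in this guide, not a complete YAML quoting checker; `chooseQuoting` and `NEEDS_DOUBLE_QUOTES` are hypothetical names.

```javascript
// Characters from the list above that call for double quotes.
const NEEDS_DOUBLE_QUOTES = /[`!@#%&*\-=\[\]{}|:,<>?]/;

// Decide how a translation should be delimited in the locale file.
function chooseQuoting(text) {
  // Double quotes or backslashes in the text: the single-quote rule takes precedence.
  if (text.includes('"') || text.includes('\\')) {
    return 'single';
  }
  if (NEEDS_DOUBLE_QUOTES.test(text)) {
    return 'double';
  }
  return 'none';
}

console.log(chooseQuoting('This is a sample translation.')); // none
console.log(chooseQuoting('Hello, world!')); // double
console.log(chooseQuoting('Search all discussions for "{query}"')); // single
```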

When you need a translation to appear as more than a single line of text, the translation should be added as a literal block. Enter a vertical bar (|) character where you would normally begin the translation, then add the translation below the ID key, indenting each line by an extra two spaces:


literal_block_text: |
  These lines will be displayed as shown here, line breaks and all.


      Extra indentation is also preserved: this line will be indented 4 spaces!


  Quote marks are unnecessary, even when the block contains special characters.

The literal block ends with the last line to be indented at least two spaces more than the ID key. Quotation marks are not needed because the block is effectively delimited by this extra two spaces of indentation.

Flarum's core language resources employ literal blocks mainly for email body content.

It's not uncommon to use the same bit of text in more than one location or context. Let's assume that you want your extension to display the phrase "Edit Stuff" in two locations in the user interface:

  • As a button that users can click when they want to edit some stuff
  • As the title of a dialog box displayed when users click that button

Your instinct might be to add a single translation — let's call it "edit_stuff" — and use that ID key twice in your code. This approach is efficient, but it lacks flexibility: in some languages, it may not be possible to use the same phrase for both the button and the dialog title! A better way would be to define two keys for use in your code, then set them both to reference the same translation, like so:


edit_stuff_button: => edit_stuff    # Used in the code that creates the button.
edit_stuff_title: => edit_stuff     # Used in the code that creates the dialog.


edit_stuff: Edit Stuff              # Not used in the code.

You can set one key to reference another by replacing its translation with an equal sign (=), a greater-than sign (>), and a space, followed by the full translation key to be referenced. When the extension is installed, Flarum's compiler will resolve these references to create a complete set of translations it can use.
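As a rough picture of what resolving references involves, the sketch below follows each `=> ` value to the translation it points at. It works on a flat object of keys and values and is only an illustration under that assumption; `resolveReferences` is a hypothetical name and Flarum's own compiler may behave differently, especially for circular or missing references.

```javascript
const REFERENCE_PREFIX = '=> ';

// Follow "=> key" references until a literal translation is reached.
function resolveReferences(translations) {
  const resolved = {};
  for (const key of Object.keys(translations)) {
    let value = translations[key];
    const seen = new Set([key]);
    while (typeof value === 'string' && value.startsWith(REFERENCE_PREFIX)) {
      const target = value.slice(REFERENCE_PREFIX.length).trim();
      const exists = Object.prototype.hasOwnProperty.call(translations, target);
      if (seen.has(target) || !exists) {
        // Circular or dangling reference: keep the raw value in this sketch.
        value = translations[key];
        break;
      }
      seen.add(target);
      value = translations[target];
    }
    resolved[key] = value;
  }
  return resolved;
}

const compiled = resolveReferences({
  edit_stuff_button: '=> edit_stuff',
  edit_stuff_title: '=> edit_stuff',
  edit_stuff: 'Edit Stuff',
});
console.log(compiled.edit_stuff_button); // Edit Stuff
```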

There's more to be said about referencing — for one thing, we've totally ignored the issue of namespacing in the above example! And you may be wondering why we suggested creating three keys when two would suffice. For an explanation, see the rules for reusing translations in Appendix A.

Once you've added a translation to your locale file, with appropriate namespacing and identifier keys, you can use the app.translator.trans() method to reference that translation in your code. For instance, the js/forum/src/index.js file for the Quick Start tutorial might end up looking like this:


app.initializers.add('acme-hello-world', function() {
  alert(app.translator.trans('acme-hello-world.alert.hello_text'));
});

This shows the basic translation method, with no bells or whistles attached. Below you'll find examples of more complex translations involving things like variables and HTML tags. (Please note that we've omitted the namespacing in the following examples to keep them simple; if you look at the actual code, you'll find the translations are properly namespaced according to the standard format.)

You can include variables in translations. As an example, let's look at the code that creates the first item in Flarum's search results dropdown. This button quotes the search query entered by the user — information that is passed to the translator along with the translation key, as an additional parameter:


{LinkButton.component({
  icon: 'search',
  children: app.translator.trans('all_discussions_button', {query}),
  href: app.route('index', {q: query})
})}

A matching placeholder in the translation lets the translator know where it should insert the variable:


all_discussions_button: 'Search all discussions for "{query}"'

Since curly brackets are used to denote the placeholder, the translation as a whole needs to be enclosed in quotation marks. Normally, double quotes would be used; but since this particular translation is using double quotes to set off the search query, the single-quote rule takes precedence.
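To show what the placeholder substitution amounts to, here is a deliberately simplified stand-in for the translator. It handles only plain `{name}` placeholders and none of the ICU MessageFormat features; `interpolate` is a hypothetical helper, not part of Flarum.

```javascript
// Replace each {name} placeholder with the matching parameter, if one was passed.
// Placeholders without a matching parameter are left untouched.
function interpolate(translation, parameters) {
  return translation.replace(/\{(\w+)\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(parameters, name) ? String(parameters[name]) : match
  );
}

const query = 'cats';
console.log(interpolate('Search all discussions for "{query}"', { query }));
// Search all discussions for "cats"
```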

Abstracting translations from HTML can pose a unique challenge: How do you deal with HTML elements that affect just part of the sentence? Fortunately, Flarum gives you a way to add tags to your translations.

You begin by adding a key to the parameters argument for each element that you want the translator to handle. The following snippet — from the Edit Group modal of the admin interface — shows a translation key accompanied by a parameters object with one item.


<div className="helpText">
  {app.translator.trans('icon_text', {
    a: <a href="https://fortawesome.github.io/Font-Awesome/icons/" tabindex="-1"/>
  })}
</div>

Note that each parameter is defined using a single HTML tag, with a slash added before the closing angle bracket. You can then use HTML-style opening and closing tags in your locale file to specify which part of the translation is affected by each element. Once again, double quotes are required. You can see that not every tag is passed as an argument: only those that have attributes need to be.


icon_text: "Enter the name of any <a>FontAwesome</a> icon class, <em>without</em> the <code>fa-</code> prefix."

Of course, you can give a parameter any name you like — you could use <fred> and </fred> to enclose your link text if you really wanted to! But we recommend sticking as close as possible to the actual HTML tags being represented, so your localizers will be able to understand what's going on.

Localizers can reorder elements as needed, and even choose to omit tags if they so desire. But they can't add any tags of their own: the translator will simply ignore any HTML-style tag that doesn't correspond to a properly defined parameter in the code.

On occasion, you may need to provide alternate versions of a translation to accommodate pluralization of a word or phrase. Flarum uses the ICU MessageFormat syntax to support selecting versions of translations based on variables. The most frequent use cases are pluralization and gender agreement.


const remaining = this.minPrimary - primaryCount;
return app.translator.trans(
  'choose_primary_placeholder',
  { count: remaining }
);

This example is from the Choose Tags modal of the Tags extension, where it tells the user how many more primary tags can be selected. Here's the English translation in ICU MessageFormat for the above code:


choose_primary_placeholder: "{count, plural, one {Choose a primary tag} other {Choose {count} primary tags}}"

In this case, we call the pluralization variable count. This isn't required, but we strongly recommend using count for consistency.

Of course English has only two variants: singular or plural. You may want to provide additional variants when creating translations for a language that has more than one plural form.
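If you are curious which plural forms a given language distinguishes, the standard `Intl.PluralRules` API available in JavaScript runtimes can tell you. This is independent of Flarum's translator and is shown only to illustrate why some languages need more than two variants; the `pluralCategories` helper is a hypothetical name, and the categories reported for a locale depend on the locale data shipped with your runtime.

```javascript
// Report which plural category each sample number falls into for a locale.
function pluralCategories(locale, numbers) {
  const rules = new Intl.PluralRules(locale);
  return numbers.map((n) => `${n}: ${rules.select(n)}`);
}

// English distinguishes only "one" and "other".
console.log(pluralCategories('en', [1, 2, 5]));

// Languages such as Russian use additional categories; compare the output.
console.log(pluralCategories('ru', [1, 2, 5]));
```

Each category name reported this way (one, few, many, other, and so on) corresponds to a variant keyword that can appear in an ICU MessageFormat plural expression.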

See the Symfony Translation docs for more complex examples.

Support for grammatical gender will be added in a future version of Flarum. Detailed instructions will be provided here once that functionality becomes available.

Translation is generally handled by Flarum's front-end client. However, you can use translations in your PHP code if you need to.

First, you'll need to get an instance of the Flarum\Locale\Translator class. You'll usually do this by typehinting this class in the constructor of your class so a translator instance is injected by the IoC Container. If dependency injection is not available, you can use resolve(Translator::class). Don't forget a use Flarum\Locale\Translator; statement at the top of your PHP file!

Then, the API is similar to the JavaScript Translator class. You can use $translator->trans like you'd use app.translator.trans in JavaScript. You can learn more about the Translator's methods in Symfony's Translator documentation, which Flarum's Translator extends.

There's one last thing you need to do before Flarum can use your translations. You need to register them. Fortunately, Flarum makes this pretty easy. Add the following to your extend.php:


new Extend\Locales(__DIR__ . '/locale'),

The following guidelines were created for use in the Flarum core and bundled extensions. Their purpose is to ensure that translation keys are organized and named in a consistent fashion, so Flarum's localizers will be able to create and maintain high-quality language packs with a minimum of difficulty.

Developers who wish to contribute to the development of Flarum are expected to follow these guidelines. Third-party developers may also wish to follow them, so experienced Flarum localizers who undertake the translation of third-party extensions will find themselves working in familiar surroundings.

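All translations are to be organized in categories, using namespacing keys arranged in up to three levels. Each level provides localizers with an important bit of information about where the translation is used: the first level identifies the component (the core or an extension), the second level identifies the interface, and the third level identifies the location within that interface.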

The namespacing for translation keys used in official Flarum components, including bundled extensions, should match the name of the language pack locale file for the component in question. The namespaces for Flarum's non-extension components are fixed as shown below:


core:        # Translations used by the Flarum core
validation:  # Translations used by Laravel's validator

Translation keys used in an extension, including any third-party extension, need to be namespaced using the extension's name in vendor-package format, where the flarum- and flarum-ext- prefixes are stripped from the package (e.g., flarum-tags for the Tags extension and foo-bar for a foo/flarum-ext-bar extension).

There should be only one first-level prefix in any locale file; it should be the first line in the locale file.
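The naming rule for the first-level key can be written out as a short function. This is a sketch of the rule as stated above; `extensionNamespace` is a hypothetical helper, and the package name `flarum/tags` used in the example is an assumption about how the Tags extension is published.

```javascript
// Derive the first-level namespace from a "vendor/package" name.
function extensionNamespace(packageName) {
  const [vendor, pkg] = packageName.split('/');
  // Strip the longer prefix first so "flarum-ext-bar" becomes "bar", not "ext-bar".
  const stripped = pkg.replace(/^flarum-ext-/, '').replace(/^flarum-/, '');
  return `${vendor}-${stripped}`;
}

console.log(extensionNamespace('flarum/tags')); // flarum-tags
console.log(extensionNamespace('foo/flarum-ext-bar')); // foo-bar
```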

Since Flarum doesn't have all that many interfaces, we've come up with a short list of second-level keys for you to choose from. We've included the more frequently used ones in the locale file template created with the extension skeleton. Below you will find the complete list, with explanations:


admin:       # Translations used by the admin interface.
forum:       # Translations used by the forum user interface.
lib:         # Translations used by either of the above.
views:       # Translations used outside the normal JS client.
api:         # Translations used in messages output by the API.
email:       # Translations used in emails sent by Flarum.

The first four keys correspond roughly to the directories containing the code where the translations in that namespace will be used. (Most of your keys will probably go in admin or forum.) The remaining two keys are a bit different: the api namespace is for translations used in messages output by the API, while the email namespace contains the resources for all emails sent by the forum.


ref:         # Translations referenced by more than one key.
group:       # Translations used as default group names.

These two keys don't correspond to interfaces; they're for translations that require special handling. We'll explain how to use the ref namespace when we talk about reusing translations. The group namespace holds the default group names, which are translated by the server rather than at the front end.

The keys in the third level are not so rigidly defined. Their main purpose is to chop the UI up into manageable chunks, so localizers can find the translations and see for themselves how they are used by the software. (Third-level keys aren't used in the ref and group namespaces, which don't need chopping.)

If you're modifying an existing location — to add a new setting to the Settings page, for example — you should copy our namespacing so experienced Flarum localizers will know at a glance exactly where the new translations are displayed. See English translations in flarum/core and bundled extensions for examples.

If your extension adds a new location — such as a new dialog box — you should feel free to create a new third-level key to namespace the translations that go there. Take a couple minutes to familiarize yourself with the namespacing in the locale files linked above, then create a new key that fits in with that scheme.

As a general rule, third-level keys should be short — no more than one or two words — and expressed in snake_case. They should be descriptive of the locations where the translations are used.

Like the third-level namespacing keys, identifier keys should be expressed in snake_case. ID keys should be arranged in alphabetical order within each namespace, so they'll be easy for developers to find. (There is one exception to this rule! ID keys in the email namespace should be listed just as they appear in your mail client: subject first, then body.)

The typical ID key consists of two parts, a root and a suffix, each of which may be omitted in certain circumstances. Just as the namespacing keys tell localizers where the translation is used, each part of the ID key provides a further bit of information about the translation: the suffix indicates how the translation is used, and the root summarizes what it says.

We'll start with the suffix because it's the most important part of the key name. It tells localizers what sort of object they should look for when trying to find the translation in the interface. For example, the suffixes in the following list are used for GUI objects more or less related to user operations:


_button:        # Used for buttons (including dropdown menu items).
_link:          # Used for links that are not shown graphically as buttons.
_heading:       # Used for headings in tables and lists.
_label:         # Used for the names of data fields, checkbox settings, etc.
_placeholder:   # Used for placeholder text displayed in fields.

These suffixes are used for informative or descriptive text elements:


_confirmation:  # Used for messages displayed to confirm an operation.
_message:       # Used for messages that show the result of an operation.
_text:          # Used for any text that is not a message, title, or tooltip.
_title:         # Used for text displayed as the title of a page or modal.
_tooltip:       # Used for text displayed when the user hovers over something.

And there are two suffixes that are used only in the email namespace:


_body:          # Used for the content of the email message.
_subject:       # Used for the subject line of the email message.

The above is a complete listing of the suffixes available for use in locale files. You should take care to use them consistently, as doing so will make life easier for localizers. If you feel something is missing from the list, please file an issue with the developers; we'll consider adding a new suffix if the situation warrants it.

Suffixes should be omitted from ID keys for reused translations in the ref: namespace. This is because you can't be sure these translations will always be used in the same context. Adding a new context would mean changing the key name everywhere it's referenced … so it's best to keep these translations generic.

The root of the ID key should be a brief summation of the translation's content. If the translation is a very short phrase, no more than a few words long, you may want to use it verbatim (in snake_case, of course). If it's very long, on the other hand, you should try to boil it down to as few words as possible.

In some cases, the summary may be replaced by a description of the object's function. This is commonly seen in buttons. The button that submits a form should be identified as a submit_button regardless of whether the translation says "OK" or "Save Changes". In a similar vein, the button that dismisses a dialog or message box is always a dismiss_button, even if it actually reads "OK" or "Cancel".

The root may also be omitted in certain cases — usually when the suffix alone is sufficient to identify the translation. For example, it's unlikely that a page or dialog box will have more than one title … and like as not, the content of the title is already given in the third-level namespacing! So the suffix can stand alone.

Suffixes are also sufficient to identify the subject and body components of an email message since each email will have only one subject line and one body. Note that the leading underscore is omitted in such cases: you would use just title, subject, or body as the ID key.
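The naming rules for ID keys lend themselves to a simple automated check. The sketch below is one possible way to lint key names against the suffix list given above; `checkIdKey` and `SUFFIXES` are hypothetical names, and keys in the ref namespace, which omit the suffix on purpose, would need to be exempted by the caller.

```javascript
// Suffixes listed in this guide, without the leading underscore.
const SUFFIXES = [
  'button', 'link', 'heading', 'label', 'placeholder',
  'confirmation', 'message', 'text', 'title', 'tooltip',
  'body', 'subject',
];

// Return 'ok' or a short description of the first problem found.
function checkIdKey(key) {
  if (!/^[a-z0-9]+(_[a-z0-9]+)*$/.test(key)) {
    return 'not snake_case';
  }
  const parts = key.split('_');
  // A bare suffix such as "title" is allowed: the root may be omitted.
  if (!SUFFIXES.includes(parts[parts.length - 1])) {
    return 'missing suffix';
  }
  return 'ok';
}

console.log(checkIdKey('submit_button')); // ok
console.log(checkIdKey('title')); // ok
console.log(checkIdKey('logIn')); // not snake_case
console.log(checkIdKey('log_in')); // missing suffix
```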

Flarum's unique key references fulfill the same role as YAML anchors, but they're better in one respect: you can even reference keys in a different file! For this reason, you need to use the full translation key when referencing. Here's a more realistic example of how referencing works, complete with namespacing:


core:


  forum:
    header:
      log_in_link: => core.ref.log_in


    log_in:
      submit_button: => core.ref.log_in
      title: => core.ref.log_in


  ref:
    log_in: Log In

As you can see, we want to reuse a single translation in three different contexts (including two locations): (1) as a link displayed in the site header, (2) as a button displayed in the Log In modal, and (3) as the title of that modal. So all three of these keys have been set to reference the same full translation key.

The reused translation is identified by a key that omits the suffix — as specified above — and is placed in the ref namespace. The latter measure is needed to avoid conflicts that can occur if a reused translation is given the same name as a second-level namespacing key. (The email keys are a case in point.)

The ref namespace also makes it easy to track translation reuse. Imagine what would happen if you set the scalars for the button and title to reference core.forum.header.log_in_link instead:


core:


  forum:
    header:
      log_in_link: Log In


    log_in:
      submit_button: => core.forum.header.log_in_link  # Never reference keys
      title: => core.forum.header.log_in_link          # that aren't in "ref"!

It would be very easy to change the translation for the header link without realizing that you're also changing things in the Log In modal, to say nothing of any extensions that might also be referencing that key! This sort of thing will be less likely to occur if you keep your reused translations in the ref namespace.

For this reason, any key references in the core resources must point to keys in the core.ref namespace. Key references in the resources for bundled extensions may point to either of two locations:

  • Extension-specific translations should go in the ref namespace of the extension's locale file.

  • Translations used by both the extension and the core should go in the core.ref namespace.

Third-party extensions are welcome to reference keys in the core.ref namespace, but please be aware that we cannot add translations to this namespace based on reuse in third-party extensions. A third-party dev who wants to reuse a translation from a namespace other than core.ref will need to add a properly keyed duplicate translation to the extension's locale file.

No extension should ever reference a key from another extension, as doing so will result in a dependency.

One final caution: translation keys in the ref namespace should never be inserted directly in code. This is because localizers may end up creating a translation to replace every reference to a reused translation key — in which case they would be within their rights to remove that key from the locale file. If such a key were being used in the code, it would end up without a matching resource to translate it!
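The referencing rules in this section could also be checked mechanically. The sketch below scans a flat map of full translation keys and reports any reference that points outside a ref namespace, or into another extension. It is an illustration of the rules only; `findBadReferences` is a hypothetical helper, and the key `acme-foo.forum.widget.save_button` is invented for the example.

```javascript
// Report keys whose "=> target" reference breaks the rules described above.
function findBadReferences(flat) {
  const problems = [];
  for (const [key, value] of Object.entries(flat)) {
    if (typeof value !== 'string' || !value.startsWith('=> ')) continue;
    const target = value.slice(3).trim();
    const [targetNamespace, secondLevel] = target.split('.');
    const ownNamespace = key.split('.')[0];
    // A reference may point at the file's own namespace or at core, never at another extension.
    const allowedNamespace = targetNamespace === ownNamespace || targetNamespace === 'core';
    if (secondLevel !== 'ref' || !allowedNamespace) {
      problems.push(key);
    }
  }
  return problems;
}

const sample = {
  'core.forum.header.log_in_link': '=> core.ref.log_in',
  'core.forum.log_in.title': '=> core.forum.header.log_in_link', // not in "ref"
  'acme-foo.forum.widget.save_button': '=> other-ext.ref.save', // another extension
  'core.ref.log_in': 'Log In',
};
console.log(findBadReferences(sample));
// [ 'core.forum.log_in.title', 'acme-foo.forum.widget.save_button' ]
```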

Comments (and empty lines) should be added to locale files so localizers will find them easier to navigate.

We've included some block comments in the locale file template included with the extension skeleton. They are there to remind developers of some basic concepts: translation keys should not be used in more than one place; keys for reused translations should never be inserted directly in code; and so forth.

You should add comment lines above every second- or third-level namespacing key, to give localizers a more complete description of the location covered by that namespace. When copying existing keys, be sure to copy the comment as well (and modify it if necessary). If you add a new third-level key, remember to preface it with a comment to explain the location being added.

You may also wish to add inline comments after a specific translation to provide localizers with further information about that translation (such as the fact that a translation can be pluralized if necessary).

In this appendix, we'd like to offer a few tips that may help you to avoid some of the more common pitfalls in the internationalization process. Abstracting language from code is easily one of the more humdrum tasks a programmer has to deal with, but if you don't give due attention to the subtleties involved, you're likely to end up causing your localizers an unnecessary headache or two.

Of course, it's not just about making life easier for localizers. Because when they get headaches, they will come to you for help — often months (or even years) after you've put the project behind you and moved on to something else! It's the sort of situation where an ounce or two of prevention may indeed be worth several pounds of cure further down the road … for everybody involved.

It's probably impossible to avoid localization issues completely; there are just too many variables. But by following a few simple guidelines, you should be able to head off many issues before they happen.

The first guideline is to avoid hardcoding text. This probably goes without saying. After all, if you're going to go to the trouble of extracting translations, you might as well finish the job, right? Well, yes, but that's easier said than done. It's really quite unusual to find a program that doesn't have at least a few bits of hardcoded text floating around somewhere.

Even tiny bits of text can cause problems for localizers. A comma here, a colon there … perhaps a pair of brackets inserted to make the page more legible: such things can and do cause issues for localizers. After all, not all languages use the same glyphs for these things! Just one hardcoded space can be a problem for someone trying to translate the interface into a language that doesn't use spaces to separate words.

Generally speaking, any displayed text that isn't supplied by a variable or the result of a calculation must be included in the locale files. That's easily said, but actually doing it takes a bit of perseverance.

Hardcoded text isn't the only way that code can create problems for localizers. Hardcoded syntax issues occur when the code forces words into a specific order that just won't work in other languages. They are generally the result of using two translations where just one would be more appropriate.

For an example, let's look at the line of text that's printed near the bottom of the Log In modal:

Don't have an account? Sign Up

We originally coded this line as two translations, which were separated by a hardcoded space:


before_sign_up_link: "Don't have an account?"
sign_up: Sign Up

There were good reasons for doing it this way. For one thing, it made it easy to turn the second half into a link. And since the second translation is reused elsewhere, keeping it separate seemed like a no-brainer.

But there were problems with this approach. The hardcoded space seemed likely to pose issues for some localizers, as mentioned above. And splitting this text into two strings would make it impossible to render the line as a single sentence with the link embedded in the middle:

If you don't have an account yet, you can sign up instead!

Since it seemed possible that a localizer might need this sort of flexibility, we opted to abstract the line as a single translation, using HTML-style tags to enclose the link text:


sign_up_text: "Don't have an account? <a>Sign Up</a>"

This puts the (formerly hardcoded) space in the translation, so localizers who don't need it can remove it. And the tags can be positioned freely within the translation, making this approach much more flexible.

The moral of this story is: when you've got two adjacent chunks of text which seem related to each other grammatically — or even semantically — you should think about finding a way to incorporate them both in a single translation. Your localizers may have cause to thank you!

We might also conclude that the presence of little bits of hardcoded text — such as the hardcoded space in this example — may be a sort of code smell indicating the presence of a deeper issue. This isn't always the case, but it's a possibility that's worth taking into consideration.

Opportunities for pluralization are surprisingly easy to overlook. Of course, the need for pluralization support is fairly obvious here:

Toby likes this. You like this. Toby and Franz like this.

And the situation is complicated by the presence of the second-person pronoun, since it always takes the plural form in English, even when we're talking about one person. That's why the app.translator call is so complicated in the code that outputs the above sentences for the Likes extension.

Now look at this similar set of sentences, output by similar code in the Mentions extension:

Toby replied to this. You replied to this. Toby and Franz replied to this.

In English, the simple past tense is not affected by pluralization. Since the verb phrase is always the same, it would be fairly easy to ignore the plurals and use the app.translator.trans() method to produce the single necessary translation. Of course, that simply wouldn't work for the French localizer, who needs to inflect the verb differently in each of the above three sentences.

This is the sort of situation where the humdrum chore of language abstraction requires a bit of extra care and attention. Remember to ask yourself whether each noun (or pronoun) can be pluralized. If it can, make sure to pass an appropriate variable to the translator. Of course, you don't need to provide any variant translations in the English resources …


mentioned_by_text: "{users} replied to this."       # Can be pluralized ...
mentioned_by_self_text: "{users} replied to this."  # Can be pluralized ...

… but it would be a very good idea to add a comment after the translations in question, to alert localizers to the fact that the code will support the addition of such variants, should they be necessary.

If the namespacing keys combine to form a complete specification of where a translation is used, and the ID key specifies exactly how the translation is used and what it says, then it stands to reason that every full translation key must be unique. In other words: you should never use the same key more than once!

Although this may sound inefficient, there's a good reason for doing things this way: it's the easiest way to ensure that localizers will have the flexibility they need. If you reuse keys in your code, you'll eventually hit a snag. Your localizers will be unable to find a single expression that fits every context where you've used some key or other … and then they'll start bugging you to fix your code.

Fortunately, you can avoid many such issues if you simply take care to namespace translations correctly, name your ID keys appropriately, and always reuse translations instead of keys. Though it may seem like a bother, in the long run, the standard format will make localization much easier for everyone.