Migrating a multilingual site

小牛编辑
2023-12-01

This page explains how to migrate a translated Docusaurus v1 site to Docusaurus v2.

i18n differences

Docusaurus v2 i18n is conceptually quite similar to Docusaurus v1 i18n, with a few differences.

It is not tightly coupled to Crowdin, and you can use Git or another SaaS instead.

Different filesystem paths

On Docusaurus v2, localized content is generally found at website/i18n/<locale>.

Docusaurus v2 is modular and based on a plugin system, and each plugin is responsible for managing its own translations.

Each plugin has its own i18n subfolder, like: website/i18n/fr/docusaurus-plugin-content-blog
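To make this layout concrete, here is a minimal sketch that builds the translation folder for a given locale and plugin. The helper name `pluginI18nDir` and the `website` default are assumptions made for illustration; they are not part of the Docusaurus API.

```javascript
const path = require('path');

// Hypothetical helper: folder holding one plugin's translations for one locale.
function pluginI18nDir(locale, pluginName, siteDir = 'website') {
  // POSIX separators keep the result identical on every platform.
  return path.posix.join(siteDir, 'i18n', locale, pluginName);
}

console.log(pluginI18nDir('fr', 'docusaurus-plugin-content-blog'));
// prints "website/i18n/fr/docusaurus-plugin-content-blog"
```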

Updated translation APIs

With Docusaurus v1, you translate your pages with <translate>:

const translate = require('../../server/translate.js').translate;
<h2>
  <translate desc="the header description">
    This header will be translated
  </translate>
</h2>;

On Docusaurus v2, you translate your pages with <Translate>:

import Translate from '@docusaurus/Translate';
<h2>
  <Translate id="header.translation.id" description="the header description">
    This header will be translated
  </Translate>
</h2>;
note

The write-translations CLI still works to extract translations from your code.

The code translations are now added to i18n/<lang>/code.json using the Chrome i18n JSON format.
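As an illustration of that format, the sketch below shows what an entry for the <Translate> example above could look like: each translation id maps to a `message` and an optional `description`. The `messageFor` helper and its fallback to the id are assumptions for this example only, not how Docusaurus resolves translations internally.

```javascript
// Possible content of i18n/<lang>/code.json for the header example above.
const codeJson = {
  'header.translation.id': {
    message: 'This header will be translated',
    description: 'the header description',
  },
};

// Hypothetical lookup helper: return the message, or the id when no entry exists.
function messageFor(translations, id) {
  const entry = translations[id];
  return entry ? entry.message : id;
}

console.log(messageFor(codeJson, 'header.translation.id'));
// prints "This header will be translated"
```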

Stricter Markdown parser

Docusaurus v2 uses MDX to parse Markdown files.

MDX compiles Markdown files to React components, is stricter than the Docusaurus v1 parser, and will make your build fail on error instead of rendering some bad content.

Also, HTML elements must be replaced by JSX elements.

This is particularly important for i18n: if your translations on Crowdin are of poor quality and contain invalid markup, your translated v2 site might fail to build, and you may need to clean up the translations to fix the errors.
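A rough way to spot such markup before building is to scan the translated files for HTML that is not valid JSX. The sketch below is a hypothetical helper, not a Docusaurus or MDX feature, and it only covers two patterns (string `style` attributes and unclosed void elements). It is regex-based, so it will also flag HTML that appears inside code blocks.

```javascript
// Hypothetical checks for HTML patterns that are not valid JSX.
const checks = [
  // <span style="color: red"> must become <span style={{color: 'red'}}>
  { name: 'string style attribute', pattern: /<[a-z][^>]*\sstyle=["']/i },
  // <br> and <img src="..."> must be self-closed: <br />, <img src="..." />
  { name: 'unclosed void element', pattern: /<(br|hr|img)\b(?:[^>]*[^/>])?>/i },
];

// Return the names of all checks that match the given Markdown text.
function findJsxProblems(markdown) {
  return checks
    .filter((check) => check.pattern.test(markdown))
    .map((check) => check.name);
}

console.log(findJsxProblems('Line one<br>Line two'));
console.log(findJsxProblems('Line one<br />Line two'));
```

The first call reports the unclosed `<br>`, while the second returns an empty array because `<br />` is already valid JSX.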

Migration strategies

This section will help you figure out how to keep your existing v1 translations after you migrate to v2.

There are multiple possible strategies to migrate a Docusaurus v1 site using Crowdin, with different tradeoffs.

caution

This documentation is a best effort to help you migrate. Please help us improve it if you find a better way!

Before anything else, we recommend that you:

danger

Don't try to migrate without understanding both Crowdin and Docusaurus v2 i18n.

Create a new Crowdin project

To avoid any risk of breaking your v1 site in production, one possible strategy is to duplicate the original v1 Crowdin project.

info

This strategy was used to upgrade the Jest website.

Unfortunately, Crowdin does not have any "Duplicate/clone Project" feature, which makes things complicated.

  • Download the translation memory of your original project in .tmx format (https://crowdin.com/project/<ORIGINAL_PROJECT>/settings#tm > View Records)
  • Upload the translation memory to your new project (https://crowdin.com/project/<NEW_PROJECT>/settings#tm > View Records)
  • Reconfigure crowdin.yml for Docusaurus v2 according to the i18n docs
  • Upload the Docusaurus v2 source files with the Crowdin CLI to the new project
  • Mark sensitive strings like id or slug as "hidden string" on Crowdin
  • On the "Translations" tab, click on "Pre-Translation > via TM" (https://crowdin.com/project/<NEW_PROJECT>/settings#translations)
  • Try first with "100% match" (more content will be translated than with "Perfect"), and pre-translate your sources
  • Download the Crowdin translations locally
  • Try to run/build your site and see if there are any errors

You will likely have errors on your first try: the pre-translation might try to translate things that should not be translated (frontmatter, admonitions, code blocks...), and the translated md files might be invalid for the MDX parser.
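One way to catch translated frontmatter early is to compare sensitive keys between a source file and its translation. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the parser only handles simple `key: value` lines rather than full YAML.

```javascript
// Read a simple `key: value` pair from the frontmatter block of a Markdown file.
function frontMatterValue(markdown, key) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---/.exec(markdown);
  if (!match) return undefined;
  for (const line of match[1].split(/\r?\n/)) {
    const index = line.indexOf(':');
    if (index !== -1 && line.slice(0, index).trim() === key) {
      return line.slice(index + 1).trim();
    }
  }
  return undefined;
}

// List the keys whose value differs between the source and the translation.
function translatedKeys(source, translated, keys = ['id', 'slug']) {
  return keys.filter(
    (key) => frontMatterValue(source, key) !== frontMatterValue(translated, key)
  );
}

const source = '---\nid: intro\ntitle: Introduction\n---\n\nHello';
const translated = '---\nid: présentation\ntitle: Introduction\n---\n\nBonjour';
console.log(translatedKeys(source, translated)); // prints [ 'id' ]
```

Any key reported here is a candidate for being marked as a hidden string on Crowdin.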

You will have to fix all the errors until your site builds. You can do that by modifying the translated md files locally, and fixing your site one locale at a time using docusaurus build --locale fr.
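If you have many locales, a small script can run that command for each one and report which locales still fail. This is a sketch only: it assumes the script runs from the website folder, that `npx` can find the docusaurus binary, and that the function names are free to choose. On Windows, `spawnSync` may need the `shell: true` option to run `npx`.

```javascript
const { spawnSync } = require('child_process');

// Arguments equivalent to `docusaurus build --locale <locale>`.
function buildArgs(locale) {
  return ['docusaurus', 'build', '--locale', locale];
}

// Build each locale in turn and return the ones whose build did not exit with 0.
function findFailingLocales(locales) {
  return locales.filter((locale) => {
    const result = spawnSync('npx', buildArgs(locale), { stdio: 'inherit' });
    return result.status !== 0;
  });
}
```

Calling `findFailingLocales(['fr', 'ja'])` would then build both locales and return the ones that still need fixing.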

There is no ultimate guide we could write to fix these errors, but common errors are due to:

  • Not marking enough strings as "hidden strings" in Crowdin, leading to pre-translation trying to translate these strings.
  • Having bad v1 translations, leading to invalid markup in v2: bad HTML elements inside translations and unclosed tags.
  • Anything rejected by the MDX parser, like using HTML elements instead of JSX elements (use the MDX playground for debugging)

You might want to repeat this pre-translation process, possibly trying the "Perfect" option and limiting pre-translation to only some languages/files.

tip

Use mdx-code-block around problematic Markdown elements: Crowdin is less likely to mess things up with code blocks.
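To show what such a wrapped element looks like, the sketch below surrounds a snippet with an mdx-code-block fence. The helper name `wrapInMdxCodeBlock` is hypothetical; in practice you would usually add the fence by hand in the source Markdown file.

```javascript
// Wrap a Markdown/JSX snippet in an mdx-code-block fence.
function wrapInMdxCodeBlock(snippet) {
  // The fence is built at runtime so that this example can itself sit in a code block.
  const fence = '`'.repeat(3);
  return [fence + 'mdx-code-block', snippet.trim(), fence].join('\n');
}

console.log(wrapInMdxCodeBlock("import Tabs from '@theme/Tabs';"));
```

The output is the original snippet between an opening fence tagged `mdx-code-block` and a closing fence.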

note

You will likely notice that some things were translated on your old project, but are now untranslated in your new project.

The Crowdin Markdown parser evolves over time, and each Crowdin project has a different parser version, which can prevent pre-translation from matching all the strings.

This parser version is undocumented: you will have to ask Crowdin support to find out your project's parser version and to pin it to one specific version.

Using the same CLI version and parser version across the two Crowdin projects might give better results.

danger

Crowdin has an "upload translations" feature, but in our experience it does not give very good results for Markdown.

Use the existing Crowdin project

If you don't mind modifying your existing Crowdin project and accept the risk of messing things up, it may be possible to use the Crowdin branch system.

caution

This workflow has not been tested in practice. Please let us know how well it works.

This way, you wouldn't need to create a new Crowdin project, transfer the translation memory, apply pre-translations, and try to fix the pre-translation errors.

You could create a Crowdin branch for Docusaurus v2, where you upload the v2 sources, and merge the Crowdin branch to master once ready.

Use Git instead of Crowdin

It is possible to migrate away from Crowdin and add the translation files to Git instead.

Use the Crowdin CLI to download the v1 translated files, and put these translated files at the correct Docusaurus v2 filesystem location.
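The move itself can be scripted. The sketch below only computes the target path for a docs file, and it rests on two assumptions that you should check against your own setup: that v1 translations were downloaded to website/translated_docs/<locale>/, and that the v2 docs plugin reads the current version from website/i18n/<locale>/docusaurus-plugin-content-docs/current/. The function name `v2DocPath` is hypothetical, and versioned docs or locale codes that differ between v1 and v2 are not handled.

```javascript
const path = require('path');

// Assumed v1 layout: website/translated_docs/<locale>/<doc path>
// Assumed v2 layout: website/i18n/<locale>/docusaurus-plugin-content-docs/current/<doc path>
function v2DocPath(v1Path) {
  const match = /^website\/translated_docs\/([^/]+)\/(.+)$/.exec(v1Path);
  if (!match) return null; // not a v1 translated doc
  const [, locale, docPath] = match;
  return path.posix.join(
    'website', 'i18n', locale, 'docusaurus-plugin-content-docs', 'current', docPath
  );
}

console.log(v2DocPath('website/translated_docs/fr/intro.md'));
// prints "website/i18n/fr/docusaurus-plugin-content-docs/current/intro.md"
```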