ContentMultilanguagePlugin Development

This is the topic to discuss development of ContentMultilanguagePlugin.

If you need support, go to Support.ContentMultilanguagePlugin, where you can ask questions and find answers to previously asked questions.

If you want to report a bug or request a feature, go to Tasks.ContentMultilanguagePlugin, where you can see already submitted issues and submit a new bug report or feature request.

Active Items

Discussion

Discussion about content multilanguage abilities in Foswiki

Since we have a couple of international companies as our customers we are looking for an easy content multilanguage solution within Foswiki. We are facing several problems with the existing solutions.

TopicTranslationsPlugin connects topics via virtual "multilanguage bundles" using a suffix in the topic name (-EN, -DE, -FR etc.).

If people actually make use of the topic name, e.g. by entering it into the URL bar or search forms, it's hard to use this plugin from a point of view other than that of the "base language". For example, suppose English is the base language and translations are created from the English articles. In that case, TopicTranslationsPlugin will presumably work well, since a lot of people can actually understand ContinuousImprovementProcess-DE. If, however, the base language is German, chances are that far fewer English-speaking users of the wiki will know what KontinuierlicherVerbesserungsProzess-EN means.

The situation gets better if you use TopicTitle meta fields to provide a different title for each topic, but then you have to do a lot of work on system topics such as indices, search pages, etc.

So, doing that, we are facing the following issues:
  1. Difference between topic name and topic title: we have to hack a lot, since the native search and sorting functions operate on the physical topic name
  2. What happens if the topic is moved? The whole content multilanguage bundle gets broken!

Goals

  • Retain the ability to give each translated version a completely individual topic name
  • Store information about which articles are translations of other articles in a robust and hopefully elegant way
  • Add UI to make it easy to navigate and extend multilanguage content
  • Add UI/hooks to prevent possible trip-ups (e.g. when copying/moving articles, make sure that no erroneous language relationships are created)

Brainstorming new approaches

Data structure for translation mappings

This section focuses on how to represent the actual mapping between different versions of a topic.

Idea:

Here, we simply keep references to every other translated version of a topic in its metadata (e.g. "Tree" has a reference to "Baum" and "Arbre"; "Baum" has a reference to "Tree" and "Arbre" etc.).

Pros
  • Good performance
Cons
  • What happens if topics are touched in the frontend / backend?
  • Metadata grows quadratically with the number of languages (with n languages, each of the n topics stores n-1 references)
  • Hard to maintain, especially with many languages; desyncs pretty much guaranteed
  • Lots of effort if one of the topics is moved
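
The growth and move-cost concerns above can be made concrete with a minimal sketch. It uses a plain Python dictionary as a hypothetical stand-in for topic metadata; the function names are illustrative and nothing here reflects Foswiki's actual storage format or API.

```python
# Hypothetical in-memory stand-in for topic metadata:
# topic name -> set of names of its translated versions.

def link_translations(store, topics):
    """Give every topic in `topics` a reference to every other one."""
    for topic in topics:
        store[topic] = {other for other in topics if other != topic}


def total_references(store):
    """Count all stored references across all topics."""
    return sum(len(refs) for refs in store.values())


def rename_topic(store, old, new):
    """Rename a topic and fix the reference held by every peer.

    Returns the number of *other* topics that had to be rewritten.
    """
    peers = store.pop(old)
    store[new] = peers
    for peer in peers:
        store[peer].discard(old)
        store[peer].add(new)
    return len(peers)


store = {}
link_translations(store, ["Tree", "Baum", "Arbre"])
```

With three languages this stores 3 × 2 = 6 references, and renaming one topic means rewriting the metadata of the two others. If any one of those rewrites is skipped, the set is out of sync.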

Translation set IDs

Idea:

Every topic gets a translation set ID in its metadata (e.g. "Baum" and "Tree" both get the TSID 1, "Car" and "Auto" get the TSID 2, and so on). Translations can be fetched by finding other topics containing the same translation set ID. Each topic also identifies the language it is written in using another meta field.

Pros
  • High robustness: the mapping does not break if the topic is renamed or moved
Cons
  • Poor performance? Every time a topic is displayed, we need to perform a search to find the translated versions
  • What happens if the topic is copied (either via the web interface or directly on the server)? Suddenly we have a translation set with too many topics!
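
As a rough illustration of the lookup and of the copy problem, here is a sketch in Python. The dictionary is a hypothetical stand-in for the two meta fields (translation set ID and language); a real implementation would have to search topic metadata instead of scanning a dictionary.

```python
from collections import defaultdict

# Hypothetical metadata: topic name -> (translation set ID, language code).
topics = {
    "Main.Baum": (1, "de"),
    "Main.Tree": (1, "en"),
    "Main.Auto": (2, "de"),
    "Main.Car": (2, "en"),
}


def find_translations(topics, name):
    """Scan all topics for others that share the TSID of `name`."""
    tsid, _ = topics[name]
    return {
        other: lang
        for other, (other_tsid, lang) in topics.items()
        if other_tsid == tsid and other != name
    }


def find_language_clashes(topics):
    """Report (TSID, language) pairs claimed by more than one topic,
    which is what a plain copy of a topic would produce."""
    seen = defaultdict(list)
    for name, (tsid, lang) in topics.items():
        seen[(tsid, lang)].append(name)
    return {key: sorted(names) for key, names in seen.items() if len(names) > 1}
```

Renaming a topic only changes its key, so the mapping survives. Copying "Main.Tree" together with its metadata would make `find_language_clashes` report two English members of set 1, which is one possible hook for detecting the problem.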

Database index

Idea:

The translation table is stored within a database. That is, each topic gets a unique ID and we map pairs of translated articles together (or we store translation set IDs in the database; see roughly the next idea).

Pros
  • Good performance
Cons
  • Possibly prone to desyncs: what happens if topics are touched in the backend? If we move a topic, the ID is no longer properly associated with it... unless we also store it in the topic itself, and then we have to figure out which source of information is authoritative and we again have more potential for desyncs
  • Another DB to maintain
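
A minimal sketch of the second variant (translation set IDs stored in a database), using Python's built-in sqlite3 module. The table layout and column names are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE translation (
        topic_id INTEGER PRIMARY KEY,   -- unique ID per topic
        name     TEXT NOT NULL UNIQUE,  -- current topic name (can go stale)
        set_id   INTEGER NOT NULL,      -- translation set ID
        lang     TEXT NOT NULL,
        UNIQUE (set_id, lang)           -- at most one topic per language and set
    )
    """
)
conn.executemany(
    "INSERT INTO translation (name, set_id, lang) VALUES (?, ?, ?)",
    [
        ("Main.Tree", 1, "en"),
        ("Main.Baum", 1, "de"),
        ("Main.Arbre", 1, "fr"),
        ("Main.Car", 2, "en"),
    ],
)


def translations_of(conn, name):
    """Return (name, lang) of every other topic in the same translation set."""
    return conn.execute(
        """
        SELECT peer.name, peer.lang
        FROM translation AS src
        JOIN translation AS peer ON peer.set_id = src.set_id
        WHERE src.name = ? AND peer.topic_id != src.topic_id
        ORDER BY peer.lang
        """,
        (name,),
    ).fetchall()
```

The lookup is a single indexed query, and the `UNIQUE (set_id, lang)` constraint rejects a second English topic in the same set. The `name` column shows the desync risk: if a topic is moved on the server without updating this row, the lookup by name silently fails.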

Using a "proxy topic" to join translations

Idea:

Instead of using a database to speed up lookups, have a separate, hidden web that contains "translation set topics". These topics are placeholders that store, presumably in their metadata, links to all the topics in the translation set. The name could be auto-generated à la TranslationSet00001.

I think this is essentially what Sven suggested, too, based on his explanations on IRC, ca. here. -- JanKrueger - 09 Jul 2012
  • y, my 2 suggestions are implementations of this idea, both trying to show different ways to use existing functionalities - Sven

Pros
  • Good performance
  • No extra database needed
Cons
  • Translation sets web will be unwieldy
    • Not necessarily - it shouldn't be too hard to build a wikiapp to help users manage the translation sets, change management and editing of topic content.
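
To make the proxy-topic idea more tangible, here is a small Python sketch. It assumes that each content topic also stores the name of its translation set topic as a back-reference; the idea above does not state how a topic finds its proxy, so that part is an assumption, as are all names in the code.

```python
# Hypothetical hidden web: translation set topic -> {language: content topic}.
translation_sets = {
    "TranslationSet00001": {"en": "Main.Tree", "de": "Main.Baum"},
}

# Assumed back-reference kept in each content topic's metadata.
set_of = {
    "Main.Tree": "TranslationSet00001",
    "Main.Baum": "TranslationSet00001",
}


def translations_of(topic):
    """Two direct lookups, no search: topic -> set topic -> members."""
    members = translation_sets[set_of[topic]]
    return {lang: name for lang, name in members.items() if name != topic}


def move_topic(old, new):
    """On a move, only the moved topic and its one set topic are updated."""
    set_name = set_of.pop(old)
    set_of[new] = set_name
    members = translation_sets[set_name]
    for lang, name in members.items():
        if name == old:
            members[lang] = new
```

Compared with direct references between all translations, a move touches one proxy topic instead of every peer.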

Presenting translated content to the user

Here, the focus is on how we make sure that users see the right language in each article.

Translated content in a separate subweb / web and a plugin to map content (SvenDowideit)

i.e. the System/de web would hold the German version of all the topics, with an en->de translation table that the plugin uses to show alternative content

Pros
  • uses a known topic translation mapping system - à la TWikiCompatibilityPlugin
  • relatively simple to create a wikiapp to simplify management and updating of translations
  • Good performance?
  • Simplifies search and search results by searching only the selected language topics
Cons
  • need something really clever wrt WebSearch and indexing
  • possible additional difficulties if the different translations should be controlled by different workflows etc. -- JanKrueger - 09 Jul 2012

Dividing the topics into separate webs per language doesn't seem to be strictly necessary for this to work... -- JanKrueger - 09 Jul 2012
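
A rough Python sketch of how such a translation table might be consulted when a topic is requested. The table contents, the function name and the fallback to the base-language topic are all assumptions for illustration; this is not the behaviour of any existing plugin.

```python
# Hypothetical en->de translation table: base-language topic -> topic in the
# System/de subweb. The entries are illustrative only.
TRANSLATION_TABLES = {
    "de": {
        "System.WebHome": "System/de.WebHome",
        "System.WikiSyntax": "System/de.WikiSyntax",
    },
}


def resolve_topic(requested, user_lang, base_lang="en"):
    """Return the topic to render for a user.

    Assumed fallback: if no translation is known, show the requested
    base-language topic unchanged.
    """
    if user_lang == base_lang:
        return requested
    return TRANSLATION_TABLES.get(user_lang, {}).get(requested, requested)
```

Restricting search to one language then amounts to restricting it to one web, which is where the simpler search results in the list of pros come from.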

Translated content in a separate subweb / web and a skin to map content (SvenDowideit)

i.e. the System/de web would hold the German version of all the topics, with an en->de translation table that some clever skin manipulation uses to show alternative content - similar to TalkContrib

Pros
  • uses a known topic translation mapping system - à la TalkContrib
  • relatively simple to create a wikiapp to simplify management and updating of translations
  • Good performance?
  • Simplifies search and search results by searching only the selected language topics
Cons
  • need something clever wrt WebSearch and indexing
  • possible additional difficulties if the different translations should be controlled by different workflows etc. -- JanKrueger - 09 Jul 2012
Dividing the topics into separate webs per language doesn't seem to be strictly necessary for this to work... -- JanKrueger - 09 Jul 2012

Just show the requested version (JanKrueger)

Similar to Wikipedia, just show the topic that was requested. For example, if the user requested (or was linked to) Main.Baum, show that (the German version), and provide a list of translated versions (e.g. Main.Tree) somewhere in the UI.

Pros
  • no extra work
    • Sven reckons you're wrong about this :)
Cons
  • might not feel quite as seamless (but still well-known to users from Wikipedia et al)
  • still needs a way to disambiguate translations when languages clash on the same topic name within the same translation set (BugItem12EN, BugItem12DE)
  • Search results will contain topics in multiple languages, or you need a complicated and plausibly slow filter on searching and results
  • change management UI and translation linking tools still need to be written
  • Sven thinks you have essentially the same wiki app complexity as with the partitioned approach, plus more scalability issues.
Topic revision: r6 - 11 Jul 2012, SvenDowideit