[GNC-dev] How to manage documentation translations


Geert Janssens-4
Somewhere in the long thread about future documentation directions the issue
of translating documentation was raised. Rightfully so, because this is
currently a very challenging task.

The initial translation, while a huge job, is relatively straightforward: one
takes the English DocBook files and translates each paragraph and header, one
by one. In the end one has the same DocBook files in the target language.

Documentation updates on the other hand are more challenging. Whenever the
English document is changed the only clue as to what has changed is in the git
logs. Git, while very powerful, is not very translator-friendly. It's
targeting a completely different use case.

For translations other tools exist. The best-known to us is gettext. This
works by creating a message catalog of all translatable strings with tooling
to help in translating them. Whenever an original string changes it is
relatively easy for the translator to figure out which string has changed and
make the necessary changes in the translation as well.

These two methods behave very differently in case of partially translated
documentation:

* With our old method only documentation that has been translated is available
in that language. Untranslated parts are not available at all.
* With the gettext method the full documentation is always presented to the
user. For parts that have been translated, this translation will be shown. For
parts that have not been translated, the original English text will be shown.
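
As a rough illustration of the two behaviours, here is a minimal Python sketch. It is not how gettext is implemented: the catalog is a plain dict rather than a real .po file, the function names are made up for the example, and the "<xx: ...>" strings are placeholders for real translated text.

```python
# Hypothetical English source paragraphs and a partial translation catalog.
# The "<xx: ...>" values stand in for real translated text.
ENGLISH = [
    "Open the account register.",
    "Enter a new transaction.",
    "Reconcile the account.",
]
CATALOG = {
    "Open the account register.": "<xx: Open the account register.>",
    "Reconcile the account.": "<xx: Reconcile the account.>",
}


def render_old_method(paragraphs, catalog):
    """Old method: only parts that were translated exist in the output."""
    return [catalog[p] for p in paragraphs if p in catalog]


def render_gettext_method(paragraphs, catalog):
    """gettext method: untranslated parts fall back to the English text."""
    return [catalog.get(p, p) for p in paragraphs]
```

With this data, `render_old_method(ENGLISH, CATALOG)` yields two paragraphs, while `render_gettext_method(ENGLISH, CATALOG)` yields all three, the middle one in English.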

Also, with the old method the documentation doesn't change unless a translator
makes modifications. With the gettext method the translation may change whenever
the English original changes. Even a simple change of punctuation would hide a
translation from the end user (at least that's what happened in our current
Italian translation, which is based on the gettext method).
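
The punctuation problem follows from the catalog being keyed on the exact English string. The sketch below models that with a plain dict lookup. It is a simplification of what the real toolchain does, and the sample strings and the `lookup` helper are invented for the example.

```python
# Hypothetical catalog with one translated string; "<it: ...>" is a
# placeholder for the real Italian text.
CATALOG_IT = {
    "Click OK to continue": "<it: Click OK to continue>",
}


def lookup(msgid, catalog):
    """Return the translation for an exact source-string match, else English."""
    return catalog.get(msgid, msgid)


before_edit = "Click OK to continue"
after_edit = "Click OK to continue."  # only a full stop was added

shown_before = lookup(before_edit, CATALOG_IT)  # the translation
shown_after = lookup(after_edit, CATALOG_IT)    # back to the English text
```

The existing translation is still in the catalog, but it no longer matches the source string, so the reader sees English until a translator confirms the entry again.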

I honestly don't know which end result is preferred by non-English speaking
end users. Perhaps we should poll for this in our non-English mailing lists.

That's what we have now and I think we should be able to do better, both for
our translators and for our end users.

In light of the upcoming major rework of the guide, the old method will end in
a lot of translator frustration. Translators will be required to interpret git
logs and diffs to learn what has happened. As said, git is not a very good
tool for non-developers to deal with.

So I'm inclined to look for improvements in the gettext method.

For the direct issues mentioned above:
1. Losing translations on something as simple as a punctuation change.
We could avoid this by not running gettext extraction automatically. In a way
that would make the gettext method a hybrid between the two methods. The
workflow would become:
a. A translator runs gettext.
b. The translator looks for new or changed text and updates the translation.
c. This translation is used from now on,
d. until the translator reruns gettext.
=> Technical note: this really means we should copy/cache the original English
documentation for each language we support. This copy/cache should only be
updated on request of the translator.
The advantages of this approach are:
* all of gettext tooling is available to support the translator
* the translated documentation the end user sees will always be what the
translator intended and never change automatically behind the translator's
back.
Drawbacks:
* if the translation was not complete, there will be English parts in there
still.
* translator needs to be aware of the requirement to rerun gettext (or more
precisely to update the copy/cache)
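
The copy/cache idea could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the `LanguageSnapshot` class and its methods are hypothetical, paragraphs are plain strings, and the catalog is a dict instead of a real gettext catalog.

```python
class LanguageSnapshot:
    """Per-language copy of the English source plus its translations.

    Hypothetical helper: the cached English text changes only when the
    translator explicitly calls refresh().
    """

    def __init__(self, english_paragraphs):
        self.cached_english = list(english_paragraphs)
        self.catalog = {}  # English paragraph -> translated paragraph

    def refresh(self, english_paragraphs):
        """Update the cache on request; return paragraphs needing translation."""
        self.cached_english = list(english_paragraphs)
        return [p for p in self.cached_english if p not in self.catalog]

    def render(self):
        """Build the document from the cached English, not the latest one."""
        return [self.catalog.get(p, p) for p in self.cached_english]


# Usage: upstream English changes do not affect the output until refresh().
snapshot = LanguageSnapshot(["Save the file"])
snapshot.catalog["Save the file"] = "<xx: Save the file>"
upstream = ["Save the file."]          # punctuation changed upstream
unchanged_output = snapshot.render()   # still the translated paragraph
todo = snapshot.refresh(upstream)      # translator asks for an update
```

After `refresh()`, `todo` lists the changed paragraph, and `render()` shows it in English until the translator updates it, which matches the drawbacks listed above.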

Another approach would be to tweak gettext behaviour to not hide slightly
altered (fuzzy) translations.
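
A minimal sketch of what "not hiding fuzzy translations" could mean, using Python's standard `difflib` for the similarity test. The 0.8 cutoff, the helper name and the sample strings are assumptions made for the example. In real gettext tooling, `msgfmt` has a `--use-fuzzy` option that includes fuzzy entries in the compiled catalog; whether that fits our documentation toolchain would need to be checked.

```python
import difflib


def lookup_with_fuzzy(msgid, catalog, cutoff=0.8):
    """Prefer an exact match, then a close match, then the English text."""
    if msgid in catalog:
        return catalog[msgid]
    # get_close_matches returns the most similar keys scoring above the cutoff.
    close = difflib.get_close_matches(msgid, list(catalog), n=1, cutoff=cutoff)
    if close:
        return catalog[close[0]]
    return msgid


# Hypothetical catalog; "<xx: ...>" is a placeholder for translated text.
FUZZY_CATALOG = {"Click OK to continue": "<xx: Click OK to continue>"}

slightly_changed = lookup_with_fuzzy("Click OK to continue.", FUZZY_CATALOG)
unrelated = lookup_with_fuzzy("Reconcile the account.", FUZZY_CATALOG)
```

Here the slightly changed string still resolves to the existing translation, while the unrelated string falls back to English. The risk is that a small edit that changes the meaning would keep showing an outdated translation.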

2. The presence of English text in partially translated documentation
We could again wrap around gettext and filter out untranslated parts. That may
give odd results though so I wouldn't recommend it.

Or we could leverage automated translation, for example via Google Translate.
We would have to add a note that the translation may be less reliable in that
case. But perhaps a poor translation is better than no translation at all?

That's as far as I got for now. More input and other ideas are welcome.

Regards,

Geert


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Re: [GNC-dev] How to manage documentation translations

John Ralls-2


> On Sep 11, 2018, at 1:45 PM, Geert Janssens <[hidden email]> wrote:
>
> Somewhere in the long thread about future documentation directions the issue
> of translating documentation was raised. Rightfully so, because this is
> currently a very challenging task.

FWIW searching "translate docbook documentation" produced a lot of pages involving gettext and none involving our whole-document approach.

Regards,
John Ralls
