OpenXcom Forum

Contributions => Translations => Topic started by: karvanit on November 22, 2012, 08:20:24 am

Title: Translator feedback required
Post by: karvanit on November 22, 2012, 08:20:24 am
I am attempting to create a more language-agnostic OpenXcom, by making more full sentences part of the language strings instead of building them from parts.
In the process string markers will be used, as well as plural forms.
So you have IDs like (for plural forms):
Code: [Select]
STR_DAYS_LEFT_0
No days left.
STR_DAYS_LEFT_1
Just {N} day left.
STR_DAYS_LEFT_2
Only {N} days left.

And argument substitution:
Code: [Select]
// English (US)
STR_FAKE_craftweapon_MESSAGE_crafttype_EXAMPLE
This is a {1} for {2}.
// English (UK)
STR_FAKE_craftweapon_MESSAGE_crafttype_EXAMPLE
Only {2} can use {1}.

Plural forms CAN be combined with argument substitution, but only like this:
Code: [Select]
STR_FAKE_crafttype_WARNING_0
No {1} in base
STR_FAKE_crafttype_WARNING_1
{N} {1} left in base
STR_FAKE_crafttype_WARNING_2
{1} in base: {N}
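
For illustration, here is a minimal Python sketch of how such placeholders could be filled in. It is not the actual OpenXcom code (which is C++); the function name `substitute` and the sample values are made up for this example.

```python
import re


def substitute(template, *args, n=None):
    """Replace {1}, {2}, ... with positional args and {N} with the count n."""
    def repl(match):
        token = match.group(1)
        if token == "N":
            # Leave {N} untouched when no count was supplied.
            return str(n) if n is not None else match.group(0)
        index = int(token) - 1  # placeholders are 1-based
        # Leave unknown placeholders untouched instead of raising.
        return str(args[index]) if 0 <= index < len(args) else match.group(0)
    return re.sub(r"\{(N|[0-9]+)\}", repl, template)


# The same arguments work for both word orders shown above.
print(substitute("This is a {1} for {2}.", "cannon", "Interceptor"))
print(substitute("Only {2} can use {1}.", "cannon", "Interceptor"))
# Plural form combined with an argument.
print(substitute("{1} in base: {N}", "Interceptor", n=3))
```

Because the arguments are numbered, each translation is free to reorder them, which is the point of moving whole sentences into the language file.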

The plural forms work for 0 (_0) and for language-specific cases.
If the _0 case is not found, the language rules (e.g. plural for English, singular for French) are used.
Can the translators (or other speakers) please provide me with the different cases (number of forms and the numbers they refer to, not actual text needed)?
I currently have the rules for:

All other languages (now) use the English rules.
Title: Re: Translator feedback required
Post by: Fenyő on November 22, 2012, 08:57:37 am
Quote
Can the translators (or other speakers) please provide me with the different cases (number of forms and the numbers they refer to, not actual text needed)?
I'm not sure what you are asking.
"1st, 2nd, 3rd, 4th ..."
Is this what you want in other languages?
If not, could you please be a little more specific?
Title: Re: Translator feedback required
Post by: karvanit on November 22, 2012, 09:51:33 am
No, something more basic.
It's better to use an example:
In English, Greek and French, words can be either singular (1 of something) or plural (more than one).
English and Greek use plural for 0 as well, French uses singular.

So:
English, Greek: 2 forms, n == 1, n != 1
French: 2 forms, n < 2, n > 1

So English and Greek language files would contain ID strings with the suffixes _0 (optional, used when n == 0), _1 (required, used when n == 1) and _2 (required, used when n != 1), while French would use _0 (optional, used when n == 0), _1 (required, used when n < 2) and _2 (required, used when n > 1).

What I want is exactly these rules for other languages.
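
Written as code, the two rules above could look like the following Python sketch. The function names are hypothetical, and the optional _0 override is deliberately left out; only the language-specific part is shown.

```python
def suffix_english(n):
    # English and Greek: _1 for exactly one, _2 for everything else (including 0).
    return "_1" if n == 1 else "_2"


def suffix_french(n):
    # French: _1 for 0 and 1, _2 for two or more.
    return "_1" if n < 2 else "_2"


# Print the suffix each rule picks for a few counts.
for n in (0, 1, 2, 5):
    print(n, suffix_english(n), suffix_french(n))
```

The only count where the two rules disagree is 0, which is why the zero fallback differs between the two languages.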
Title: Re: Translator feedback required
Post by: kkmic on November 22, 2012, 10:07:38 am
I could have helped you with Romanian, but I see the seat is taken :)
Title: Re: Translator feedback required
Post by: Fenyő on November 22, 2012, 02:26:32 pm
Quote
No, something more basic.
It's better to use an example:
In English, Greek and French, words can be either singular (1 of something) or plural (more than one).
English and Greek use plural for 0 as well, French uses singular.
So, like this:
0 apples.
1 apple.
2 apples.

If that's it, then it's in Hungarian:
They are all the same, SINGULAR:
0 alma.
1 alma.
2 alma.

(alma=apple, almák=apples)

BTW, why do you even bother with every language on your own?
Why not just modify the language files, appending with the thing, and let the translators "fix" the strings? (and they send it within a PR)
Title: Re: Translator feedback required
Post by: karvanit on November 22, 2012, 03:03:25 pm
Quote
If that's it, then it's in Hungarian:
They are all the same, SINGULAR:
0 alma.
1 alma.
2 alma.

(alma=apple, almák=apples)
If you have alma and almák, then it's not all singular, is it?
I don't want the actual words, I want the different cases, as in "Sentences (nouns, verbs, whatever) will take one of X forms, where form 1 is for value A-B, form 2 for values C-F", etc.

Quote
BTW, why do you even bother with every language on your own?
Why not just modify the language files, appending with the thing, and let the translators "fix" the strings? (and they send it within a PR)

The thing requires coding support, because I implement it by looking for differently suffixed keys (e.g. STR_1, STR_2). So I need to know how many keys there are, and when to use each suffix. Then I'll be done with basic coding support, and the hard work of replacing fragment sentence assembly with proper placeholders will begin.
Hopefully, the basic support will be accepted upstream, and we'll change the texts incrementally, perhaps after prodding by translators for the worst examples (e.g. see the post about laser rifle manufacture).
Title: Re: Translator feedback required
Post by: radius75 on November 22, 2012, 03:56:41 pm
I have met a similar solution in UFO:AI (.po and .pot language files).
This method is OK and gives translators more possibilities.
I do not see any problem here :)

example from ufo:ai .po file

Quote
msgid "Scientist"
msgid_plural "Scientists"
msgstr[0] "Naukowiec"
msgstr[1] "Naukowców"
msgstr[2] "Naukowców"


msgid "Construction time:\t%i day\n"
msgid_plural "Construction time:\t%i days\n"
msgstr[0] "Czas budowy:\t%i dzień\n"
msgstr[1] "Czas budowy:\t%i dni\n"
msgstr[2] "Czas budowy:\t%i dni\n"


msgid "%s was transfered to %s."
msgstr "Przesłano %s do %s."


msgid "Recovered %s from the battlefield. UFO is being transported to %s."
msgstr "Pozyskano %s z pola bitwy. UFO jest transportowane do %s."
Title: Re: Translator feedback required
Post by: karvanit on November 22, 2012, 03:58:54 pm
That's the idea, although the code side is closer to Qt's QTranslator.
Too bad gettext requires LOTS of additional libraries in Windows...
Title: Re: Translator feedback required
Post by: radius75 on November 22, 2012, 04:10:15 pm
or example from pioneerspacesim

Quote
JETTISONED_1T_OF_X
    Jettisoned 1 tonne of %commodity
JETTISONED_1T_OF_X
    %commodity: 1 tona wyrzucona za burtę
   
   
FUEL_SCOOP_ACTIVE_N_TONNES_H_COLLECTED
    Fuel scoop active. You now have %quantity tonnes of hydrogen.
FUEL_SCOOP_ACTIVE_N_TONNES_H_COLLECTED
    Dren paliwowy włączony. Masz %quantity ton wodoru.
   
Title: Re: Translator feedback required
Post by: Fenyő on November 22, 2012, 04:55:22 pm
Quote
If you have alma and almák, then it's not all singular, is it?
When used as specifying how many of them, they're all singular.
We use the plural only when no number is specified:
Examples:
There are apples on the tree.   ->   A fán almák vannak.
2 apples left.   ->   2 alma maradt.
1 apple left.   ->   1 alma maradt.
0 apples left.   ->   0 alma maradt.
There are no apples left.   ->   Nem maradt alma.
Title: Re: Translator feedback required
Post by: karvanit on November 22, 2012, 05:54:35 pm
Thank you, now Hungarian support is also in my PR.
Title: Re: Translator feedback required
Post by: daggerstab on November 24, 2012, 08:52:11 pm
Why don't you use gettext? It has automatic support for multiple plural forms:
https://www.gnu.org/software/gettext/manual/gettext.html#Plural-forms

It's originally a POSIX/Linux library, but it also has ports for Windows and MacOS.
https://gnuwin32.sourceforge.net/packages/gettext.htm
Title: Re: Translator feedback required
Post by: karvanit on November 24, 2012, 09:10:23 pm
Because OpenXcom design calls for translation to be based on string IDs, not the full text.
Title: Re: Translator feedback required
Post by: daggerstab on November 24, 2012, 09:18:14 pm
Well, "you" was in the more general meaning of "the OpenXcom project", and perhaps this was the wrong place to ask. :)
Title: Re: Translator feedback required
Post by: grzegorj on January 20, 2013, 12:36:24 pm
Karvanit, is this topic closed with no results? :)

Summing up, zero is not the only problem. AFAIK, in order to use really correct language forms, you need more than the three forms you mentioned at the start. What other games use is also not correct. You have assumed incorrectly that the only possible difference is between a singular and a plural form. This is not true in languages with a dual (Slovene has different forms used with 2) or with cases (most Slavic languages, for example), as various case forms may be used with one numeral or another. As a native Polish speaker who also knows Russian well enough, I can share my observations with you.

Let's take units and soldiers. In order to make fully correct forms, we should have at least the following variants:
so at minimum 5 different variants. This list of 5 forms should be universal for now (and not only Polish-oriented), but it quite possibly would not be enough if we added Slovene, which uses separate dual forms.

I have separated 21, 31, 41 ... from 5, 6, 7 etc. with Russian in mind: in that language the numerals 21, 31, ... 91 behave just like 1 (so they take the singular, unlike in Polish). Be aware of 11, 12, 13 and 14: they behave differently from 21, 22, 23 and 24 (in both Polish and Russian).

However, as you can see, there are words used in 3 different forms (sztuka : sztuki : sztuk) and words with only two forms (żołnierz : żołnierzy). As of now, STR_DAY_1, STR_DAY_2, STR_DAY_3 are enough for fully correct Polish, as "day" has only 2 forms here. But it is still not enough for Russian! As I can see in the language file, the "general plural" form is used there now, but this form is, frankly speaking, incorrect with any numeral ({N} Дни instead of 0 дней, 2-4 дня, 5-20 дней, 21 день, 22-24 дня, 25-30 дней and so on, with only 1 день fully correct).
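
The numeric groupings described here can be sketched in Python as below. This is only an illustration of the grouping: the function names are made up, and the numbering of the forms (1, 2, 3) is an assumption, not necessarily the order used in the OpenXcom source.

```python
def form_russian(n):
    """Return 1, 2 or 3: which of three Russian forms goes with the count n."""
    if n % 10 == 1 and n % 100 != 11:
        return 1  # 1, 21, 31, ... (день)
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 2  # 2-4, 22-24, ... (дня)
    return 3      # 0, 5-20, 25-30, ... (дней)


def form_polish(n):
    """Return 1, 2 or 3: which of three Polish forms goes with the count n."""
    if n == 1:
        return 1  # only 1 itself; 21, 31, ... do not behave like 1 in Polish
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 2  # 2-4, 22-24, ...
    return 3      # 0, 5-21, 25-31, ...


# 11-14 fall into the last group in both languages; 21 differs between them.
for n in (1, 3, 11, 12, 21, 22):
    print(n, form_russian(n), form_polish(n))
```

A word with fewer distinct forms (like Polish "day") would simply repeat the same text under two of the three suffixes.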

What is more, if you want to translate "Units: {N}", you do not need numerous forms, as one general form may be used with all quantities. However, in order to translate "{N} units" correctly, you will have to use the 5 universal strings listed above.

To sum up: I do not think there is a need to create many variants for every string with a quantity. Perhaps you should rely on translators' requests and, when there is such a need, make variants of that string only, as many as needed.
Title: Re: Translator feedback required
Post by: karvanit on January 20, 2013, 01:03:23 pm
This is exactly why this thread exists. In the code I use the rules for each language to look up special strings when a number is involved.

I have a way to specify the rules in code (and the rules for Russian are already there), exactly as you mention them.

When special treatment is requested for a key based on a number n (e.g. STR_SOLDIERS_IN_CRAFT), the following are tried:
If n is 0, STR_SOLDIERS_IN_CRAFT_0 is selected, if it exists. This happens for all languages.
Otherwise, one of the keys STR_SOLDIERS_IN_CRAFT_1 to _K is selected according to the rules of the language. _1, _2 and the rest may mean different things in each language; they are NOT the number n. The suffix is simply the k-th form in the rules for the specific language. So, while English and French both use _1 and _2, for English _1 is used only for singular and _2 is used for plural and for zero (if _0 is not found). For French _1 is used for singular and as the zero fallback, while _2 is used for plural.

Please check the rules in the source code (file src/Engine/Language.cpp); the various "getSuffix" functions for the different languages should be easy enough to understand.
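
As a rough illustration of the lookup order just described, here is a Python sketch. It is not the code in Language.cpp: `translate_plural` and the rule functions are hypothetical names, and falling back to the bare ID when nothing matches is an assumption made so that a missing string is easy to spot.

```python
def translate_plural(strings, rule, key, n):
    """Pick the string for count n from strings, using the language's rule."""
    if n == 0 and key + "_0" in strings:
        chosen = key + "_0"      # explicit zero form, tried for every language
    else:
        chosen = key + rule(n)   # rule returns "_1", "_2", ... for this language
    # Assumption: return the bare ID when the chosen key is missing.
    text = strings.get(chosen, key)
    return text.replace("{N}", str(n))


def english_rule(n):
    return "_1" if n == 1 else "_2"


def french_rule(n):
    return "_1" if n < 2 else "_2"


KEY = "STR_THERE_ARE_N_SOLDIERS_IN_CRAFT"
ENGLISH = {
    KEY + "_0": "No soldiers in craft",
    KEY + "_1": "A single soldier in craft",
    KEY + "_2": "{N} soldiers in craft",
}

for n in (0, 1, 4):
    print(translate_plural(ENGLISH, english_rule, KEY, n))

# Without a _0 entry, the language rule decides what zero looks like.
no_zero = {k: v for k, v in ENGLISH.items() if not k.endswith("_0")}
print(translate_plural(no_zero, english_rule, KEY, 0))
print(translate_plural(no_zero, french_rule, KEY, 0))
```

The last two calls reuse the English texts with both rules only to show which suffix each rule falls back to for zero.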

The problem is that a lot of the strings are created from smaller pieces (e.g. words), and thus the actual game code needs to change to use the new translation facility and make sentences that the translators can modify. As a fictional example:
Now the code is like:
Code: [Select]
message = translate("STR_THERE_ARE") + n + translate(n == 1 ? "STR_SOLDIER" : "STR_SOLDIERS") + translate("STR_IN_THE_CRAFT")
and the translators only see the "STR_" strings.

It should be changed to
Code: [Select]
message = translate("STR_THERE_ARE_N_SOLDIERS_IN_CRAFT", n)

and then the translator for (Russian?) will see:
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_0
Some Russian for żołnierzy or sztuk
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_1
maybe {N} sztuka or {N} żołnierz
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_2
maybe some {N} sztuki or {N} żołnierzy
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_3
maybe some {N} sztuk or {N} żołnierzy

while the translators for French will see:
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_0
French for no soldiers in craft
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_1
French for a single soldier in craft
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_2
French for {N} soldiers in craft

And the English text will be:
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_0
No soldiers in craft
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_1
A single soldier in craft
STR_THERE_ARE_N_SOLDIERS_IN_CRAFT_2
{N} soldiers in craft

And if the _0 form is missing in English, the _2 form will be tried, but for French the _1 form will be tried.
Title: Re: Translator feedback required
Post by: Fenyő on February 11, 2013, 07:57:44 pm
@karvanit: Could you please take a look at the Hungarian plurality thing?
Something is wrong. When I land with a craft on a UFO and immediately press Abort,
the Mission Abort screen shows "STR_n_UNITS_IN_EXIT_AREA". :(
But it should show the correct string.
I've tried to debug it myself, but I had no luck.
In the language file I have both STR_n_UNITS_IN_EXIT_AREA_0 and STR_n_UNITS_IN_EXIT_AREA_1.
I really don't know what the problem is here...
Title: Re: Translator feedback required
Post by: karvanit on February 11, 2013, 08:03:46 pm
Are you SURE the language file with the variant strings is in the proper place?
Do you see any line mentioning STR_n_UNITS_IN_EXIT_AREA in the logs?
Can you also add a STR_n_UNITS_IN_EXIT_AREA_2 with a dummy value (eg "This should not appear EVER!") and check if you get that value?

I'm pretty much drowning at work right now, I won't be able to take a look before the weekend.
Title: Re: Translator feedback required
Post by: Fenyő on February 11, 2013, 08:13:39 pm
Of course I tried _2. It had no effect. :(
This part of the language file is:
Code: [Select]
STR_SELECT_SQUAD_FOR_craftname
Válassz osztagot: {1}
STR_n_UNITS_IN_EXIT_AREA_0
{N} egység elhagyásra kész
STR_n_UNITS_IN_EXIT_AREA_1
{N} egység elhagyásra kész
STR_n_UNITS_OUTSIDE_EXIT_AREA_0
{N} egység elhagyásra nem kész
STR_n_UNITS_OUTSIDE_EXIT_AREA_1
{N} egység elhagyásra nem kész
STR_ABANDON_GAME_QUESTION
ABBAHAGYOD A JÁTÉKOT?


EDIT:
Quote
Do you see any line mentioning STR_n_UNITS_IN_EXIT_AREA in the logs?
Of course:
Code: [Select]
[11-02-2013 19:15:21] [WARN] STR_n_UNITS_IN_EXIT_AREA not found in Hungarian
Title: Re: Translator feedback required
Post by: karvanit on February 11, 2013, 08:21:07 pm
The string matching is verbatim, so can you re-type the ID strings in the language file, AFTER deleting the lines that contain them?
Like so:
1. Delete the line that reads "STR_n_UNITS_IN_EXIT_AREA_0".
2. Retype the STR_n_UNITS_IN_EXIT_AREA_0 in a new line at the proper place.
3 and 4. Do 1 and 2 for STR_n_UNITS_IN_EXIT_AREA_1.

If this works then some special characters got in the language file when I edited it. I don't really think it probable, but the code looks ok.
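
A quick way to check for stray characters without retyping is sketched below in Python. It assumes that ID lines should contain only ASCII letters, digits and underscores (true of the IDs shown in this thread); the function name is made up.

```python
import unicodedata


def suspicious_characters(line):
    """Return (index, name) pairs for characters other than A-Z, a-z, 0-9 and _."""
    found = []
    for i, ch in enumerate(line):
        if not (ch.isascii() and (ch.isalnum() or ch == "_")):
            # Control characters have no Unicode name, so fall back to the code point.
            found.append((i, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return found


print(suspicious_characters("STR_n_UNITS_IN_EXIT_AREA_0"))
print(suspicious_characters("\ufeffSTR_n_UNITS_IN_EXIT_AREA_0"))
print(suspicious_characters("STR_n_UNITS_IN_EXIT_AREA_0\r"))
```

An empty list for every ID line means the file is clean in this respect and the cause has to be looked for elsewhere.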
Title: Re: Translator feedback required
Post by: Fenyő on February 11, 2013, 09:43:24 pm
I've tried replacing the whole section (copy-paste) from English.lng and leaving the English strings, but the result is the same!
Title: Re: Translator feedback required
Post by: karvanit on February 22, 2013, 09:24:00 pm
OK, the code is fixed and the PR is sent. Thank you for spotting this.
The error was in the actual translation code, not in the language files.
Title: Re: Translator feedback required
Post by: Fenyő on February 23, 2013, 04:03:07 am
Thank you!