Autopackage - Easy Linux Software Installation

Translators guide

  1. Introduction
  2. Rosetta
  3. Getting started
  4. Translating
    1. Fuzzy Strings
    2. Advanced Strings
    3. Plural Strings
  5. Some things to think about
  6. Testing the translation
  7. Links to other resources
  8. Translation related tools

Introduction

To be really user friendly, an application should be displayed in the user's native language. This is accomplished through a system called gettext, which is used by most projects in the free software world. The basic concept is that all English strings that are displayed to the user are isolated and extracted so that they can be translated by someone who knows the target language.

Translating a program is not a difficult task, nor does it require any technical skills. All that is necessary is some patience and a basic understanding of English. If you are interested, keep on reading.

Rosetta

The translations in Rosetta are currently very outdated. I have tried to upload new templates for the 1.2 release, but for some reason, they don't show up. The only way to translate 1.2 (recommended!) is to download the templates and work locally on your computer.

Rosetta is a web-based tool for easy translation. It can be used directly from your web browser without having to download any files or extra programs. However, Rosetta is fairly new, and there are some problems that you may encounter if you decide to use it. For instance, we've had reports of Rosetta not allowing you to translate the first ten strings. This is not a huge problem, since those strings can be fixed manually after the translation is finished.

If you decide to use Rosetta, you must create an account on the Ubuntu web page before you can start translating. Autopackage's Rosetta translation page can be found at: https://launchpad.ubuntu.com/rosetta/products/autopackage.

Most of this guide applies to Rosetta translation as well. So even though there are some technical details that can be ignored or may be different on Rosetta, you will probably find some useful information here.

Getting started

Language code

Every language has a two-letter language-code which is defined by the ISO-639 standard. In this guide, I will refer to this language code with the string "xx". When you see that, you should replace it with your code.

Some languages are different depending on the country. For instance, there is a difference between British English and American English. In these situations, the xx language code should be suffixed by the country code as defined by ISO-3166.

For instance, American English would have en_US, while British English would have en_GB.

Note: It is almost always a good idea not to include the country code in the translation name. This is because the translation system will choose the best match it can find. So, if there is a de_DE translation available, it will be used for people in Germany, but not for people living in Austria, even though they speak mostly the same language. Both Germany and Austria will, however, use a translation named only de.
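
The fallback described in the note can be sketched in a few lines of Python. This is only an illustration of the idea, not gettext's actual lookup code: the function names candidate_catalogs and pick_catalog are made up for this example, and the real lookup handles character sets and modifiers in more detail.

```python
def candidate_catalogs(locale_name):
    """Return the translation names to try, most specific first.

    Simplified illustration; not gettext's real lookup code.
    """
    # Drop a codeset or modifier suffix such as ".UTF-8" or "@euro".
    base = locale_name.split(".")[0].split("@")[0]
    candidates = [base]
    if "_" in base:
        # Fall back to the bare language code, e.g. "de_AT" -> "de".
        candidates.append(base.split("_")[0])
    return candidates


def pick_catalog(locale_name, available):
    """Return the first candidate that has a translation, else None."""
    for name in candidate_catalogs(locale_name):
        if name in available:
            return name
    return None


# An Austrian user is served by "de", but not by "de_DE".
print(pick_catalog("de_AT.UTF-8", {"de", "sv"}))     # de
print(pick_catalog("de_AT.UTF-8", {"de_DE", "sv"}))  # None
```

With only a de_DE translation installed, the Austrian user falls back to English, which is why a plain de translation is usually the better choice.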

Getting the files

First of all, you need to get the files that contain the English strings. These files are called pot-files. In Autopackage, there are two pot-files that should be translated, one for the main program and one for the graphical frontend.

The translations and templates can be found on the translation status page. That page also contains links to older versions of Autopackage and its translations.

Download the templates (or translations if you are continuing a previous translation) you want to work on and place them in separate directories. This is important since all translations are named xx.po regardless of what program they are translated for. The next section will describe how to create the xx.po file from the templates.

Now, you need to find a suitable tool for translation. Any text editor with UTF-8 support is sufficient, but there are also dedicated translation editors. See the tools section for a short listing. The important thing is that all translations must be in UTF-8 format! See the manual for your specific tool on how to do this. Note that ASCII is a subset of UTF-8, so if your language only uses the letters a-z, a plain ASCII file is already valid UTF-8.

Verify by running:
$ file xx.po
xx.po: UTF-8 Unicode English text
or
$ file xx.po
xx.po: ASCII English text
You can convert files between different character sets using the iconv program:
iconv --from-code ISO-8859-1 --to-code UTF-8 -o new.po old.po
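
If the file and iconv programs are not at hand, the same check and conversion can be sketched in Python. This is a minimal illustration, not part of Autopackage; the function names are invented for this example, and ISO-8859-1 is only assumed as the source encoding because it is the one used in the iconv example.

```python
def is_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(data, source_encoding="iso-8859-1"):
    """Re-encode bytes from source_encoding (an assumption) to UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# A Swedish string saved in ISO-8859-1 is not valid UTF-8 ...
latin1_bytes = "Åtgärd avbruten".encode("iso-8859-1")
print(is_utf8(latin1_bytes))                   # False
# ... but it is after conversion.
print(is_utf8(convert_to_utf8(latin1_bytes)))  # True
```

To check a real translation, read it in binary mode, for instance with open("xx.po", "rb").read(), and pass the result to is_utf8.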

Translating

In this section, I will assume that a normal text editor is used. If you are using a dedicated translation app, some things may look different. The basic principle is the same though.

First of all, you must create a .po-file from the downloaded .pot-files. This can be done with a program called msginit, which you should run in the same directory as your .pot-file. It will create the header by guessing some fields based on your environment variables and create a file called xx.po for you. (xx is your language code)

If msginit doesn't work for you, just copy the autopackage.pot to xx.po (xx again being your language code) and fill in the header manually. autopackage.pot contains the following header by default:

msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2004-12-30 10:52-0500\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: 8bit\n"
These should be filled in with the appropriate values. For instance, I have the following header in the Swedish translation:
msgid ""
msgstr ""
"Project-Id-Version: Autopackage 0.7\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2004-12-29 23:04+0100\n"
"PO-Revision-Date: 2004-12-29 15:19+0100\n"
"Last-Translator: Isak Savo <my-email>\n"
"Language-Team: Swedish <sv@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

After you've created the header, you may start translating. All strings are listed in pairs of msgid and msgstr. The first is the original English [1] string, while the latter is where you should enter the translated string.

Let's begin with a simple example. In autopackage.pot, we have the following:

#: apkg-defs:38
msgid "Action aborted."
msgstr ""
The first line is a comment that provides some basic info about that string. In this example it says where the original string is located in the sources: the file "apkg-defs", line 38.
The next line is the msgid line, with the original string. Translate that string and enter it into the msgstr field so that it looks something like:
#: apkg-defs:38
msgid "Action aborted."
msgstr "Åtgärd avbruten"
Some longer strings may be distributed across several lines. They can look like this:
#: apkg-defs:90
msgid ""
"If you give the --prefix option, you need to specify a prefix as the next "
"argument"
msgstr ""
"Om du angivit alternativet --prefix så måste nästföljande argument vara en "
"katalog"
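As you see, the strings start on a separate line after the msgid or msgstr keyword and can span multiple lines. The quoted pieces are simply joined together, so remember the space at the end of a line that continues. Just mimic the original string when you're in doubt.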
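
How the quoted pieces of a multi-line string are joined can be sketched in Python. This is a rough illustration and not a real po parser: the helper name join_po_string is made up, and reading each piece as a Python string literal is a simplification that only holds for ordinary strings like the ones in this guide.

```python
import ast


def join_po_string(lines):
    """Concatenate the quoted pieces of one msgid or msgstr."""
    parts = []
    for line in lines:
        line = line.strip()
        # Drop a leading keyword such as msgid or msgstr.
        if not line.startswith('"'):
            line = line.split(None, 1)[1]
        # Simplification: read the quoted piece as a Python string literal.
        parts.append(ast.literal_eval(line))
    return "".join(parts)


entry = [
    'msgid ""',
    '"If you give the --prefix option, you need to specify a prefix as the next "',
    '"argument"',
]
# Prints the message as one line, with a single space before "argument".
print(join_po_string(entry))
```

Note that the space between "next" and "argument" comes from inside the first quoted piece. Nothing is added between the lines.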

[1] Actually, it doesn't have to be English, but the common practice is to use English as the "base" language.

Fuzzy translations

The term "fuzzy translation" means that there was a change somewhere (a string has been added, removed or modified) and the gettext tools (msgmerge, which merges new templates into existing translations) made a guess at what the translation should be. This guess is most likely not entirely correct, but it is often very close. Your initial translation will not contain any fuzzy strings, but as the autopackage code changes and you update your translation, this will most likely happen.

A fuzzy translation is fixed by simply finding the offending translation, which is marked as fuzzy, and correcting it.

Example:
#: apkg-defs:38
#, fuzzy
msgid "Action aborted."
msgstr "Some fuzzy translation"

Check the translation, fix it if needed and remove the fuzzy tag from the comments. (You can remove the entire comment-line if it only says 'fuzzy'):

#: apkg-defs:38
msgid "Action aborted."
msgstr "No longer fuzzy translation"
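
In a long file it can be handy to list the entries that still carry the fuzzy flag. The following Python sketch shows one way to do that. It is a simplified illustration (the function name fuzzy_msgids is invented here, and for multi-line strings it only reports the first msgid line), so treat msgfmt as the authority.

```python
def fuzzy_msgids(po_text):
    """Return the msgid lines of entries whose flag comment says fuzzy."""
    found = []
    is_fuzzy = False
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith("#,"):
            # A flag comment may hold several flags: "#, fuzzy, sh-format".
            flags = [flag.strip() for flag in line[2:].split(",")]
            is_fuzzy = "fuzzy" in flags
        elif line.startswith("msgid "):
            if is_fuzzy:
                found.append(line)
            is_fuzzy = False
    return found


sample = '''#: apkg-defs:38
#, fuzzy
msgid "Action aborted."
msgstr "Some fuzzy translation"

#: apkg-defs:57
#, sh-format
msgid "Installing $SHORTNAME data files..."
msgstr "Installerar datafiler för $SHORTNAME..."
'''
print(fuzzy_msgids(sample))  # ['msgid "Action aborted."']
```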

Advanced strings

In some cases, you will come across strings that contain certain characters that are of special meaning to the program. It is very important that you pay attention to these and translate them accordingly.

The most common special characters are the dollar character ($) and percentage sign (%). Let's take an example of the dollar character first:

#: apkg-defs:57
#, sh-format
msgid "Installing $SHORTNAME data files..."
msgstr "Installerar datafiler för $SHORTNAME..."
As you see, it has a comment, sh-format, that indicates that there is something special about this string. $SHORTNAME in this example is a variable that will expand to the name of a package. Often, the name of the variable gives a hint about what will be printed. Your translation must contain all the variables, but their order and position are up to you as a translator to decide.
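
A quick way to catch a forgotten variable is to compare the variable names in the original and in the translation. The Python sketch below does this for one pair of strings. It is an illustration only: the name missing_variables is made up, and the pattern assumes variables are written as $NAME or ${NAME}.

```python
import re

# Assumption: shell variables look like $NAME or ${NAME}.
SHELL_VAR = re.compile(r"\$\{?([A-Za-z_][A-Za-z0-9_]*)\}?")


def missing_variables(msgid, msgstr):
    """Return variables used in msgid that do not appear in msgstr."""
    wanted = set(SHELL_VAR.findall(msgid))
    present = set(SHELL_VAR.findall(msgstr))
    return sorted(wanted - present)


print(missing_variables(
    "Installing $SHORTNAME data files...",
    "Installerar datafiler för $SHORTNAME...",
))  # []
print(missing_variables(
    "Installing $SHORTNAME data files...",
    "Installerar datafiler...",
))  # ['SHORTNAME']
```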

The percentage sign is used by the function printf in both C and shell mode. It is always followed by a letter or a combination of letters and numbers. These are called conversion characters.
Two common conversion characters, both used in the examples in this section, are %s (a string) and %d (an integer number).

An example:
#: apkg-defs:71
msgid "Linking from %s to %s"
msgstr "Skapar länk från %s till %s"
Just make sure that you have the same number of %'s as the original and you should do fine. Please note that the order of the %-signs cannot be changed unless some extra information is added:

Assume we have the English string "%d Mb available on %s", which could become "10 Mb available on /dev/hda". This is correct English, but so is the string "%s has only %d Mb left" (note the change of argument order). This, unfortunately, is not a correct translation as it stands. Why? you ask. Well, if we put it in context:

printf ("%d Mb available on %s", 10, "/dev/hda")
is ok, but
printf ("%s has only %d Mb left", 10, "/dev/hda")
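will not work, because the arguments are still passed in the original order: %s would receive the number 10 and %d would receive the string "/dev/hda". We need to tell the program to swap the arguments.

This is accomplished by using the special position reorder conversion, where %1$ refers to the first argument, %2$ to the second, and so on. In our example, the correct translation would be: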

msgid "%d Mb available on %s"
msgstr "%2$s has only %1$d Mb left"

I'm using English here so that you all will understand the concept. Of course, the msgstr should contain the string in your language!
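
To see what the reordering does, here is a small Python emulation of the %N$s and %N$d conversions. Python's own % formatting does not support this syntax, so the sketch expands it by hand. It only covers the two conversions used above, and the function name expand_positional is invented for this example.

```python
import re

# Matches %1$s, %2$d and so on; only s and d are handled in this sketch.
POSITIONAL = re.compile(r"%(\d+)\$([sd])")


def expand_positional(fmt, *args):
    """Expand %N$s and %N$d by picking the N-th argument (1-based)."""
    def replace(match):
        value = args[int(match.group(1)) - 1]
        if match.group(2) == "d":
            return str(int(value))
        return str(value)
    return POSITIONAL.sub(replace, fmt)


# The arguments are passed in the original order: 10 first, then the device.
print(expand_positional("%2$s has only %1$d Mb left", 10, "/dev/hda"))
# /dev/hda has only 10 Mb left
```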

There may be some other special characters that I have forgotten to mention, but these are the most common. You can always send questions to the mailing list.

Plural Strings

In almost all languages, words in a sentence change depending on the number of items referred to in that sentence. A simple example is the English sentences "I have one cat" and "I have five cats". In English (and many other Western European languages), there are two plural forms: singular and plural, where singular is 1 (one) item and plural is anything else. This is not true for all languages; some have 3, 4 or even more plural forms.

This is where the plural string handling comes in. First of all, you need to tell gettext how many forms you have and the algorithm for deciding what form to use. This is set in the po header and might look something like this:

"Plural-Forms: nplurals=2; plural=(n != 1);\n"

The above line is for Swedish (and also English), and it basically says: "There are two plural forms, and any number that is not one (1) is plural; the rest is singular". (For those who know the C language, the plural form number is determined by evaluating the '(n != 1)' expression, where 'n' is the number of items the string is referring to.) The expression will of course be as advanced as the language requires. The following expression is for Polish, which has 3 forms:

"Plural-Forms: nplurals=3; plural=(n==1 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"

It is left as an exercise to the reader to decipher the above formula. ;-)
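
For those who would rather not do the exercise by hand, here is my reading of the two expressions above, transcribed into Python. The function names are made up for this sketch, and n is assumed to be a non-negative count.

```python
def swedish_plural_index(n):
    """Transcription of plural=(n != 1): 0 is singular, 1 is plural."""
    return int(n != 1)


def polish_plural_index(n):
    """Transcription of the Polish Plural-Forms expression.

    Form 0: exactly one item.
    Form 1: numbers ending in 2, 3 or 4, except those ending in 12, 13 or 14.
    Form 2: everything else.
    """
    if n == 1:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


for n in (1, 2, 5, 12, 22, 112):
    print(n, polish_plural_index(n))
# 1 0
# 2 1
# 5 2
# 12 2
# 22 1
# 112 2
```

The returned number is the index of the msgstr[...] line that gettext picks for that count.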

Don't worry too much about writing down your formula; there are people who have done that for almost all languages already. Sometimes, msgfmt -cv xx.po -o /dev/null will tell you the correct format - otherwise, try to find another project's translation in your language and look at that. Google will probably also be able to help you.

After you have put your formula in the po-header, you are ready to start translating plural strings. Let's take a look at a fictional example:

msgid "I have %d cat"
msgid_plural "I have %d cats"
msgstr[0] "Jag har %d katt"
msgstr[1] "Jag har %d katter"

Just fill in as many plural forms as your language has. msgstr[0] is the first form ("singular" in English), msgstr[1] the 2nd form ("plural" in English), msgstr[2] the 3rd form, and so on.

When done

Before sending in your translation, you must make sure that its spelling and grammar are correct. Run it through your spell check program. If you have the possibility, send it for review to a couple of friends who speak your language. Many language teams have dedicated mailing lists for reviewing and commenting on translation work. The list xx@li.org, where xx is your language code, is probably a good start. Do some googling on the issue!

Remember that the developers have no way of verifying the correctness of your translation.

Some things to think about

To sum up the previous section, there are some things to think of:

Testing your translation

It is always good practice to test your translation before you send it to us. At a minimum, you should use the msgfmt program on your translation file:
msgfmt --statistics -c xx.po -o /dev/null

It will print out some information about your translation, such as the number of translated messages, the number of fuzzy messages, any errors, etc. If all went well, you should see something like:

167 translated messages.

If you also want to try your translation in action, you need to compile and install the translation file. This is done by the same msgfmt program as we used above to verify the syntax. First, you need to find out where autopackage stores its translation files.

Run the command which package in the terminal and get the prefix. Most likely, this will print:
/usr/bin/package

From this, we know that /usr is the prefix, and your translation should be in the /usr/share/locale/xx/LC_MESSAGES directory.
Now run, as root if needed:

Autopackage Main:
msgfmt xx.po -o /usr/share/locale/xx/LC_MESSAGES/autopackage.mo

Autopackage GTK+ Frontend:
msgfmt xx.po -o /usr/share/locale/xx/LC_MESSAGES/autopackage-gtk.mo

Autopackage QT Frontend:
msgfmt xx.po -o /usr/share/locale/xx/LC_MESSAGES/autopackage-qt.mo

If you have some other location for the package command, you need to replace "/usr" with that prefix instead. For instance, if you have ~/.local/bin/package, then the prefix is ~/.local and the translation should go into ~/.local/share/locale/xx/LC_MESSAGES.
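
The path arithmetic above can be written down as a short Python sketch. It is only an illustration of the rule "drop bin/package, then add share/locale/xx/LC_MESSAGES". The function name mo_path is invented here, and the default file name autopackage.mo is the one used for the main program above.

```python
from pathlib import PurePosixPath


def mo_path(package_path, lang, domain="autopackage"):
    """Derive the .mo location from the path printed by `which package`."""
    # /usr/bin/package -> prefix /usr (drop "package", then "bin").
    prefix = PurePosixPath(package_path).parent.parent
    target = prefix / "share" / "locale" / lang / "LC_MESSAGES" / (domain + ".mo")
    return str(target)


print(mo_path("/usr/bin/package", "sv"))
# /usr/share/locale/sv/LC_MESSAGES/autopackage.mo
print(mo_path("~/.local/bin/package", "sv", domain="autopackage-gtk"))
# ~/.local/share/locale/sv/LC_MESSAGES/autopackage-gtk.mo
```

The result is what you would pass to the -o option of msgfmt.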

Links to other resources

There are plenty of translation resources available on the Internet. Here are a couple of them:

Translation Related Tools

There are a couple of tools that are good to use with translations. Of course, any text editor that supports UTF-8 can be used, but many tools contain special features for translations, such as syntax highlighting or other convenient functions.

Some tools, in no particular order:

There are probably many more tools and editors out there. These are the ones I know of. If none of them works for you, do a Google search and tell us about what you find.


This guide is written by Isak Savo <isak savo () gmail com> (isaks on IRC).
© 2005 The Autopackage Project