Gettext - An internationalization package.

Revision as of 20:32, 21 July 2006 by Tmster (Talk | contribs)

Gettext

Introduction

The gettext application makes it possible to internationalize your application. For example, if you have a Web application where you want to present information in different languages, you can accomplish that with the gettext application.

The name gettext comes from the GNU package of the same name. Note, however, that the only thing they have in common is the format of the PO-files, i.e. the files containing the text that can be translated. A PO file contains the Original String and the Translated String.

Example of a PO entry generated by gettext:

 #: esmb_gettext.erl:13 
 msgid ""
 "Hello World"
 msgstr ""
 "Hej Värld"

The string(s) following the msgid tag form the Key, i.e. the Original String. The string(s) following the msgstr tag form the Value, i.e. the Translated String that will be presented in place of the Original String.

Using gettext you can create an initial PO-file containing all the strings of your application that should be possible to translate. By translating the strings into some other language and loading the new PO-file into the gettext DB you can adapt your application for different languages.

Note that the very first entry of the PO file is special, since it contains meta-information. The charset information is especially important: make sure it matches the encoding actually used for the translated strings. If you store the strings as UTF-8, for example, state that in the PO-file.

Example of the charset info in the PO-file.

 ...removed some lines here...
 "MIME-Version: 1.0\n"
 "Content-Type: text/plain; charset=utf-8\n"
 "Content-Transfer-Encoding: 8bit\n"

Installing gettext

The easiest way is to install it using erlmerge. Just run the command erlmerge -i gettext (as root if your Erlang installation was done as root).

If you are not using erlmerge (why not?), then you will find gettext in the jungerl repository on SourceForge.

Importing from jungerl

An alternative to using erlmerge is to import gettext into your own source code repository. Here follows a suggestion for how to do this.

First, if you have write access to jungerl's CVS repository, set a tag in jungerl to make it easier for you to diff any future changes:

 cd .../lib/gettext
 cvs tag gettext-2006-July-21

Next, create another directory where you export gettext from your tag:

 mkdir aaa
 cd aaa
 cvs -z3 -d ${CVSROOT} export -r gettext-2006-July-21 jungerl
 cd jungerl/lib/gettext

Now, import the code into your own source code repository:

 cvs -d ${MYCVSROOT} import ${MYPROJ}/lib/gettext jungerl gettext-2006-July-21

You're done!

If you don't have write access to jungerl, then you have to import without setting a tag. Just check out jungerl anonymously, remove the CVS directories, and import.

 cvs -z3 -d :pserver:anonymous@jungerl.cvs.sourceforge.net:/cvsroot/jungerl \
    co -d ${HOME}/jungerl jungerl
 cd ${HOME}/jungerl/lib/gettext
 rm -rf CVS */CVS
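 cvs -d ${MYCVSROOT} import ${MYPROJ}/lib/gettext jungerl ${TODAYS_DATE}

Here ${TODAYS_DATE} is a placeholder for a release tag that names today's date, e.g. in the style of gettext-2006-July-21. With the date in the release tag of the import, you should be able to diff against future jungerl changes using the date information.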

Using gettext

In your Erlang or Yaws file, whenever you have a string that should be possible to present in a different language, you wrap it with one of the gettext macros ?TXT/1 or ?TXT2/2.

Example of how to use the TXT macros.

 -include(".../gettext/include/gettext.hrl").
 
 hello(LangCode) ->
   ?TXT2("Hello World", LangCode).

The ?TXT2 macro will be substituted with a call to gettext:key2str("Hello World", LangCode). This function will try to look up the string "Hello World" in the database (DB) for LangCode. If no such DB can be found, the string is returned as is. This way no string will ever disappear, since it always falls back to at least the original string.
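The fallback behaviour described above can be modelled in a few lines. This is only a sketch of the lookup semantics, not the gettext implementation: the module name key2str_sketch and the map-based Db argument are assumptions made for illustration, standing in for the real database.

```erlang
-module(key2str_sketch).
-export([key2str/3]).

%% Db is a map from {Key, LangCode} to a translated string.
%% It is a hypothetical stand-in for the real gettext database.
%% When no translation is stored, the original Key is returned.
key2str(Key, LangCode, Db) ->
    maps:get({Key, LangCode}, Db, Key).
```

With Db = #{{"Hello World", "swe"} => "Hej Värld"}, a lookup for "swe" yields the translation, while a lookup for any other language code yields "Hello World" unchanged.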

To make it extra convenient, you can use the ?TXT macro, which takes just one argument and expands to gettext:key2str("Hello World", get(gettext_language)). As you can see, it assumes that you have already put the language code into the process dictionary under the key gettext_language. This is useful, for example, in a Yaws application where you get the preferred language code from the headers of the HTTP request. Just put the language code into the process dictionary and you are done!
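As a rough sketch of that idea, the module below picks the first language tag out of an Accept-Language header value and stores it under the gettext_language key. The module and function names are hypothetical, quality values are ignored, and it assumes that the tags sent by the browser match the language codes used in your gettext database; if they do not, you need your own mapping.

```erlang
-module(lang_from_header_sketch).
-export([preferred/1, set_language/1]).

%% Return the first language tag of an Accept-Language value,
%% so that "sv,en;q=0.8" gives "sv". Quality values are ignored.
preferred(HeaderValue) ->
    case string:lexemes(HeaderValue, ",") of
        [First | _] ->
            case string:lexemes(First, ";") of
                [Tag | _] -> string:trim(Tag);
                [] -> undefined
            end;
        [] ->
            undefined
    end.

%% Store the tag under the key that the ?TXT macro reads.
%% Nothing is stored when the header carries no usable tag.
set_language(HeaderValue) ->
    case preferred(HeaderValue) of
        undefined ->
            undefined;
        Tag ->
            put(gettext_language, Tag),
            Tag
    end.
```

Calling lang_from_header_sketch:set_language/1 at the start of request handling is enough for later ?TXT calls in the same process to pick up the language.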

The gettext database

There are two ways to control where the gettext database is located. The first is to set the environment variable GETTEXT_DIR to point to the directory where you want gettext to store its data. The second is to provide a callback function named gettext_dir/0. You specify the callback module either by setting the environment variable GETTEXT_CBMOD or by passing it as an argument in the supervisor code that sets up the gettext_server (see the example in gettext_sup.erl).

Example, the callback module.

 M:gettext_dir/0 ==> "/tmp/gettext_DB"
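A callback module of this kind could be as small as the sketch below. The module name my_gettext_cb is made up for illustration; only the function name gettext_dir/0 and the example return value come from the text above.

```erlang
-module(my_gettext_cb).
-export([gettext_dir/0]).

%% Tell gettext where to keep its PO-files and dets database.
%% The path is the example value used above; pick any directory
%% your Erlang node can write to.
gettext_dir() ->
    "/tmp/gettext_DB".
```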

The directory structure looks as shown below:

The GETTEXT_DIR directory structure.

 $(GETTEXT_DIR)/lang/gettext_db.dets
 $(GETTEXT_DIR)/lang/default/$(GETTEXT_DEF_LANG)/gettext.po
 $(GETTEXT_DIR)/lang/custom/$(LANG)/gettext.po

The dets file contains the actual lookup database. The default directory contains only one subdirectory, named after your default language (e.g. "en" for English). The custom directory contains one subdirectory for each language you have, each holding a translated gettext.po file.
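If your own tooling needs to locate these files, the layout can be expressed with filename:join/1. The helper below is a hypothetical convenience module, not part of gettext; it only mirrors the three paths listed above.

```erlang
-module(gettext_paths_sketch).
-export([db_file/1, default_po/2, custom_po/2]).

%% Path of the dets lookup database below the gettext directory.
db_file(Dir) ->
    filename:join([Dir, "lang", "gettext_db.dets"]).

%% Path of the initial PO-file for the default language.
default_po(Dir, DefLang) ->
    filename:join([Dir, "lang", "default", DefLang, "gettext.po"]).

%% Path of the translated PO-file for one language.
custom_po(Dir, Lang) ->
    filename:join([Dir, "lang", "custom", Lang, "gettext.po"]).
```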

If you want to add a translated PO-file to your DB, call the function gettext_server:store_pofile(LanguageCode, BinPOfile). See also the example session below.
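A small wrapper can read the file from disk and hand the binary over in one step. The sketch below is an assumption about how you might structure this, with a hypothetical module name; the three-argument variant takes the storing function as a parameter so that the wrapper can be tried without a running gettext_server.

```erlang
-module(pofile_loader_sketch).
-export([load_pofile/2, load_pofile/3]).

%% Read a translated PO-file and store it via gettext_server.
load_pofile(LangCode, Path) ->
    load_pofile(LangCode, Path, fun gettext_server:store_pofile/2).

%% Same as above, but Store is any fun taking (LangCode, Binary).
%% A read error is returned together with the offending path.
load_pofile(LangCode, Path, Store) ->
    case file:read_file(Path) of
        {ok, Bin} ->
            Store(LangCode, Bin);
        {error, Reason} ->
            {error, {Path, Reason}}
    end.
```

Unlike matching on {_, Bin}, this keeps a failed read from passing an error reason on as if it were file content.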

How to create the 'initial' PO-file.

This step requires you to set up some Makefile support in your build environment.

The jungerl version of gettext has a simple example prepared for you to look at. The top gettext Makefile contains a target jungerl_example which will run make using the Makefile.gettext makefile.

The Makefile.gettext file:

 ##
 ## Add the directories here that should be gettext'ified
 ##
 TXT_DIRS = ../esmb  # ../eldap
 
 ##
 ## Set this to a directory where we have write access.
 ## This directory will hold all po-files and the dets DB file.
 ##
 GETTEXT_DIR=/tmp/gettext_example
 
 ##
 ## Set the language code of the default language, i.e the
 ## language you are using in the string arguments to the
 ## ?TXT macros.
 ##
 GETTEXT_DEF_LANG=en
 
 ##
 ## Set this to an arbitrary name (or leave it as is :-)
 ## It will create a subdirectory to $(GETTEXT_DIR) where
 ## the intermediary files of this example will end up.
 ##
 GETTEXT_TMP_NAME=tmp
 
 ##
 ## We setup some dependencies here so that we only
 ## perform any processing iff anything has changed.
 ##
 DEP_FILES1=$(TXT_DIRS:%=%/priv/docroot/*.{yaws,js})
 DEP_FILES2=$(TXT_DIRS:%=%/src/*.{erl,hrl})
 DEP_FILES=$(shell grep -s -l TXT $(DEP_FILES1) $(DEP_FILES2))
 
 gettext_example: $(GETTEXT_DIR)/lang/$(GETTEXT_TMP_NAME)/epot.dets 
 
 $(GETTEXT_DIR)/lang/$(GETTEXT_TMP_NAME)/epot.dets: $(DEP_FILES)
 	@(export gettext_tmp_name=$(GETTEXT_TMP_NAME); \
         export gettext_dir=$(GETTEXT_DIR); \
         export gettext_def_lang=$(GETTEXT_DEF_LANG); \
         export ERL_COMPILER_OPTIONS="[gettext]"; \
         rm -f $(GETTEXT_DIR)/lang/$(GETTEXT_TMP_NAME)/epot.dets; \
         for xdir in $(TXT_DIRS); do \
               ( cd $$xdir; $(MAKE) gettext ) \
         done; \
         erl -noshell -s gettext_compile epot2po; \
         install -D $(GETTEXT_DIR)/lang/$(GETTEXT_TMP_NAME)/$(GETTEXT_DEF_LANG)/gettext.po \
                 $(GETTEXT_DIR)/lang/default/$(GETTEXT_DEF_LANG)/gettext.po; \
         rm -rf $(GETTEXT_DIR)/lang/$(GETTEXT_TMP_NAME))
 
 gettext_clean:
 	@(for xdir in $(TXT_DIRS); do \
               ( cd $$xdir; $(MAKE) gettext_clean ) \
           done)

Take a look at the Makefile.gettext file. At the top you specify the directories that should be processed, i.e. where there is code containing ?TXT macros. As you can see from the example, we have specified that the esmb application should be processed. We also specify where the 'gettext' data directory is located.

Then, for each directory to be processed, we run make with a special target named gettext. If you look into the esmb application you can see that its top Makefile has such a target. All this target needs to do is to remove any dependency files (e.g. beam files) and then run make to compile them again.

All source files that use the ?TXT macro also include the gettext/include/gettext.hrl file, which contains a parse-transform. This parse-transform stores all strings that were wrapped with the ?TXT macro in a temporary database.

Going back to the Makefile.gettext file, you can see that when all processing is finished we run a last generation step, in which we extract all data from the temporary DB and generate the 'initial' PO-file.

Example session.

Here follows an example session showing how to use gettext. We start by creating our database directory and generate the initial PO-file.

Create the database:

 # cd jungerl/lib/gettext
 
 # make jungerl_example
   (compilation printouts removed here)
 
 # cat /tmp/gettext_example/lang/default/en/gettext.po
   (look at the nice PO-file we have got)

Before we start our Erlang system, we set up an environment variable pointing to our gettext database directory:

An Erlang session:

 # export GETTEXT_DIR=/tmp/gettext_example
 
 # erl -pa ../esmb/ebin
 
 %% This will create and populate the dets file
 1> gettext_server:start().
 {ok,<0.35.0>}
 
 %% We are calling our example application
 2> esmb_gettext:start().
 "Hello World"
 
 %% Changing to an unknown (at the moment) language code
 3> put(gettext_language, "swe").
 undefined
 
 %% Falling back to the original string
 4> esmb_gettext:start().        
 "Hello World"
 
 %% Now read in a Swedish translation
 5> {ok, Bin} = file:read_file("swedish.po").
 {ok,<<35,32,83,79,77,69,32,68,69,...>>}
 
 %% Store the translation
 6> gettext_server:store_pofile("swe", Bin).
 ok
 
 %% Now look at that...nice !!
 7> esmb_gettext:start().                   
 "Hej Värld"
 
 %% Change to another language
 8> put(gettext_language, "en").            
 "swe"
 
 %% Perfect !!
 9> esmb_gettext:start().       
 "Hello World"
 
 %% Get the character set used for a language.
 10> gettext_server:lang2cset("swe").
 {ok,"iso-8859-1"}

Final remarks.

You can easily write some code on top of this to make it possible to export the initial PO-file and to import translated PO-files. Also, take a look at the iso639.erl file which can be helpful if you want to present some standardized language codes and their full language names.

If you run a Web application, it is important that you tell the Web browser what character set you are using. To support this, you can use the function gettext_server:lang2cset(LanguageCode). This way you can make sure that the browser displays your pages correctly. It is even possible to convert from the character set you have to what the Web browser wants if you make use of the 'iconv' library that exists in jungerl. But that is another story...
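As a sketch of how the result might be turned into a header value, the function below takes whatever lang2cset returned and builds a Content-Type string. The module name is hypothetical, and since only the {ok, Charset} return shape appears in the session above, every other return value is treated here as "charset unknown", which is an assumption.

```erlang
-module(charset_header_sketch).
-export([content_type/1]).

%% Build a Content-Type value from the result of
%% gettext_server:lang2cset/1. Only {ok, Charset} is known from
%% the example session; anything else omits the charset.
content_type({ok, Charset}) ->
    "text/html; charset=" ++ Charset;
content_type(_Other) ->
    "text/html".
```

A call such as charset_header_sketch:content_type(gettext_server:lang2cset("swe")) would then give the value to send along with the page.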
