PApp::I18n - internationalization support for PApp


   use PApp::I18n;
   # nothing exported by default
   my $translator = PApp::I18n::open_translator("/libdir/i18n/myapp", "de");
   my $table = $translator->get_table("uk,de,en"); # will return de translator
   print $table->gettext("yeah"); # better define __ and N_ functions


This module provides basic translation services, .po reader and writer support, and text and database scanners that identify tagged strings.
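
The synopsis recommends defining __ and N_ functions rather than calling gettext directly. A minimal sketch of such wrappers follows, assuming the usual gettext convention that __ translates at run time while N_ only marks a string for the scanners. My::StubTable is a hypothetical stand-in for a real table object, used only so the sketch runs without a translation directory.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a PApp::I18n table object. It only mimics the
# documented gettext fallback: return the original string when no
# translation exists.
package My::StubTable;
sub new     { my ($class, %map) = @_; return bless { %map }, $class }
sub gettext { my ($self, $msgid) = @_; return exists $self->{$msgid} ? $self->{$msgid} : $msgid }

package main;

# In a real program this would come from $translator->get_table(...).
our $table = My::StubTable->new(yeah => "ja");

# __ translates at run time through the current table.
sub __ { return $table->gettext($_[0]) }

# N_ only marks a string for extraction; it returns its argument unchanged.
sub N_ { return $_[0] }

my @labels = (N_("yeah"), N_("nope"));   # marked now, translated later
print join(", ", map { __($_) } @labels), "\n";
```

Keeping the table in a single variable means that switching the output language only requires assigning a different table to $table.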


A ``language'' can be designated either by a free-form string (one that does not match the following formal definition) or by a language-region code that must match the following regex:

 /^ ([a-z][a-z][a-z]?) (?:[-_] ([a-z][a-z][a-z]?))? (?:\.(\S+))? $/ix
     ^                  ^  ^    ^
    "two or three letter code"
                       "optionally followed by"
                          "- or _ as separator"
                               "two or three letter code"
                                                    "optionally followed by"
                                                       ". as separator"
                                                         "character encoding"

There is no charset indicator, as only utf-8 is supported currently. The first part must be a two- or three-letter code from ISO 639-2/T (alpha2 or alpha3), optionally followed by the two- or three-letter country/region code from ISO 3166-1 and -2. Numeric region codes might be supported one day.
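
To make the formal definition concrete, here is a small sketch that applies the regex above to a few strings. The function name parse_langid is hypothetical and not part of this module; it only shows which strings count as language-region codes and which fall back to free-form names.

```perl
use strict;
use warnings;

# Split a language id into (language, region, charset) using the regex
# from the text. Returns an empty list for free-form strings.
sub parse_langid {
    my ($langid) = @_;
    return unless $langid =~
        /^ ([a-z][a-z][a-z]?) (?:[-_] ([a-z][a-z][a-z]?))? (?:\.(\S+))? $/ix;
    return ($1, $2, $3);
}

for my $id ("de", "en-GB", "ja_JP.UTF-8", "Klingon (informal)") {
    my @parts = parse_langid($id);
    if (@parts) {
        # Missing optional parts come back as undef.
        printf "%-20s language=%s region=%s charset=%s\n",
            $id, map { defined $_ ? $_ : "(none)" } @parts;
    } else {
        print "$id: free-form language name\n";
    }
}
```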

set_base $path
Set the default i18n directory to $path. This must be done before any call to translate_langid and before any relative i18n path is used.

normalize_langid $langid
Normalize the language and country id into its three-letter form, if possible. This requires a grep through a few kB of text, but the result is cached. The special language code ``*'' is translated to ``mul''.

translate_langid $langid[, $langid]
Decode the first langid into a description of itself and translate it into the language specified by the second langid (the latter does not work yet). The output of this function also gets cached.

locale_charsets $locale
Returns a list of character sets that might be good to use for this locale. This definition is necessarily imprecise ;)

The charsets returned should be considered to be in priority order, i.e. the first charset is the best. The intention of this function is to provide a list of character sets to try when outputting html text (you can output any html text in any encoding supporting html's active characters, so this is indeed a matter of taste).

If the locale contains a character set it will be the first in the returned list. The other charsets are taken from a list (see the source of this module for details).

Here are some examples of what you might expect:

   de          => iso-8859-1 iso-8859-15 cp1252 utf-8
   rus_ukr     => koi8-u iso-8859-5 cp1251 iso-ir-111 cp866 koi8-r iso-8859-5
                  cp1251 iso-ir-111 cp866 koi8-u utf-8
   ja_JP.UTF-8 => utf-8 euc-jp sjis iso-2022-jp jis7 utf-8

This function can be slow and does NOT cache any results.
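
As a usage sketch for the priority order described above, the following picks the first charset in a list that can represent a given text completely. It relies only on the core Encode module. The helper name pick_charset is hypothetical, and the list is copied by hand from the ``de'' example rather than obtained from locale_charsets.

```perl
use strict;
use warnings;
use Encode ();

# Return the first charset from @charsets that can represent $text
# completely, falling back to utf-8. The list is assumed to be in the
# priority order that locale_charsets is described as returning.
sub pick_charset {
    my ($text, @charsets) = @_;
    for my $charset (@charsets) {
        my $enc = Encode::find_encoding($charset) or next;   # unknown to Encode
        my $copy = $text;   # encode() may modify its argument when CHECK is set
        return $charset if eval { $enc->encode($copy, Encode::FB_CROAK); 1 };
    }
    return "utf-8";
}

# Charset list copied from the "de" example above.
my @de = qw(iso-8859-1 iso-8859-15 cp1252 utf-8);
print pick_charset("Gr\x{fc}\x{df}e", @de), "\n";    # plain Latin-1 text
print pick_charset("5 \x{20ac}", @de), "\n";         # contains a euro sign
print pick_charset("\x{65e5}\x{672c}", @de), "\n";   # Japanese text
```

Since locale_charsets does not cache its results, a caller that needs the list repeatedly may want to store it per locale.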


open_translator $path, lang1, lang2....
Open an existing translation directory. A translation directory can contain any number of language translation tables with filenames of the form ``language.dpo''. Since the translator cannot guess in which language the source has been written, you have to specify this by passing additional language names.

Return all languages supported by this translator (in normalized form). Can be used to create language-selectors, for example.

$table = $translator->get_table($languages)
Find and return a translator table for the language that best matches $languages. This function always succeeds by returning a dummy table if no (physical) table can be found. This function is very fast in the general case.
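
The matching rule itself is not spelled out here. As a rough illustration only, the sketch below assumes the simplest possible rule: take the first entry of the comma-separated preference string that is available, and signal the dummy-table case with undef. The helper first_available is hypothetical and is not the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first entry of a comma-separated
# preference string that is available, or undef to signal that a dummy
# (pass-through) table would be used.
sub first_available {
    my ($languages, @available) = @_;
    my %have = map { lc($_) => 1 } @available;
    for my $lang (split /\s*,\s*/, $languages) {
        return lc($lang) if $have{ lc($lang) };
    }
    return undef;
}

my $choice = first_available("uk,de,en", "de");
print defined $choice ? "table: $choice\n" : "dummy table\n";
```

With only a ``de'' table available, the preference string from the synopsis selects ``de'', which agrees with the comment there.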

$translation = $table->gettext($msgid)
Find the translation for $msgid, or return the original string if no translation is found. If the msgid starts with the two characters ``\'' and ``{'', then these characters and all remaining characters up to and including the closing ``}'' are skipped before a translation is attempted. To include these two characters literally at the beginning of the string, use the sequence ``\{\{''. The skipped part can be used to pass additional arguments to some translation steps (such as the language used). Here are some examples:

  string      =>    translation
  \{\string   =>    \translation
  \{\{string  =>    \{translation
  \{}string   =>    translation

To ensure that the string is translated ``as is'' just prefix it with ``\{}''.

Flush the translation table cache. This is rarely necessary, because translation hash files are not written to. It can be used to ensure that new calls to get_table return the updated tables instead of ones that are already open.


As of yet undocumented

\%trans = fuzzy_translation $string, [$domain]
Try to find a translation for the given string in the given domain (or globally) by finding the most similar string already in the database and returning its translation(s).

scan_init $domain, $languages
scan_str $prefix, $string, $lang
scan_field $dsn, $field, $style, $lang
export_dpo $domain, $path, [$userid, $groupid, $attr]
Export translation domain $domain in binary hash format to directory $path, creating it if necessary.


CLASS PApp::I18n::PO_Reader

This class can be used to read serially through a .po file (where ``po file'' is about the same thing as a standard ``Portable Object'' file from the NLS standard developed by Uniforum).

$po = new PApp::I18n::PO_Reader $pathname
Opens the given file for reading.

($msgid, $msgstr, @comments) = $po->next;
Read the next entry. Returns nothing on end-of-file.

CLASS PApp::I18n::PO_Writer

This class can be used to write a new .po file (where ``po file'' is about the same thing as a standard ``Portable Object'' file from the NLS standard developed by Uniforum).

$po = new PApp::I18n::PO_Writer $pathname
Opens the given file for writing.

$po->add($msgid, $msgstr, @comments);
Write another entry to the po file. See PO_Reader's next method.


 Marc Lehmann