l10n_intro(5)							l10n_intro(5)



NAME

  l10n_intro, l10n, locales, LOCPATH - Introduction to localization (L10N)

DESCRIPTION

  Localization refers to the process of establishing information within a
  computer system that is specific to each supported combination of language,
  cultural data, and coded character set (codeset).  Each such combination
  gives rise to the definition of one locale. The abbreviation L10N is often
  used to stand for localization because there are 10 characters between the
  beginning "L" and the ending "N" of that word.

  See i18n_intro(5) for	introductory information about internationalization
  and how to use system	commands to set	a locale. For information about
  creating locales, refer to localedef(1), charmap(4), and locale(4). For
  information about creating locales and writing applications that use
  locales, refer to Writing Software for the International Market.

  The current release of the operating system supports the following
  languages with locales. Each language	is discussed separately	in its own
  reference page:

       Catalan
       Chinese (Simplified and Traditional)
       Czech
       Dutch
       Finnish
       French
       German
       Greek
       Hebrew
       Hungarian
       Icelandic
       Italian
       Japanese
       Korean
       Lithuanian
       Norwegian
       Polish
       Portuguese
       Russian
       Slovak
       Slovene
       Spanish
       Swedish
       Thai
       Turkish

  For some of the languages, more than one codeset and country or territory
  are supported. Hence,	multiple locales are supported for certain languages.
  The following	list names and describes all the supported locales. For
  information about the	character encoding used	by a particular	locale,	refer
  to the reference page	for the	codeset	specified in the last part of the
  locale name or, for those that end in	.UTF-8,	to Unicode(5).

  ca_ES.ISO8859-1
	  Catalan locale for Spain (uses the Latin-1 codeset)

  ca_ES.ISO8859-15
	  Catalan locale for Spain (uses the Latin-9 codeset)

  ca_ES.UTF-8
	  Catalan locale for Spain (uses the UTF-8 codeset)

  cs_CZ.ISO8859-2
	  Czech	locale for Czech Republic (uses	the Latin-2 codeset)

  da_DK.ISO8859-1
	  Danish locale	for Denmark (uses the Latin-1 codeset)

  da_DK.ISO8859-15
	  Danish locale	for Denmark (uses the Latin-9 codeset)

  da_DK.UTF-8
	  Danish locale	for Denmark (uses the UTF-8 codeset)

  de_CH.ISO8859-1
	  German locale	for Switzerland	(uses the Latin-1 codeset)

  de_CH.ISO8859-15
	  German locale	for Switzerland	(uses the Latin-9 codeset)

  de_CH.UTF-8
	  German locale	for Switzerland	(uses the UTF-8	codeset)

  de_DE.ISO8859-1
	  German locale	for Germany (uses the Latin-1 codeset)

  de_DE.ISO8859-15
	  German locale	for Germany (uses the Latin-9 codeset)

  de_DE.UTF-8
	  German locale	for Germany (uses the UTF-8 codeset)

  el_GR.ISO8859-7
	  Greek	locale for Greece (uses	the ISO	Greek codeset)

  el_GR.UTF-8
	  Greek	locale for Greece (uses	the UTF-8 codeset)

  en_GB.ISO8859-1
	  English locale for Great Britain (uses the Latin-1 codeset)

  en_GB.ISO8859-15
	  English locale for Great Britain (uses the Latin-9 codeset)

  en_GB.UTF-8
	  English locale for Great Britain (uses the UTF-8 codeset)

  en_EU.UTF-8@euro
	  English locale that includes the euro	character (uses	the UTF-8
	  codeset)

	  This locale both supports the euro character and defines the
	  decimal point as a comma (,) and the thousands separator as a
	  period (.). Therefore, this locale is useful in many European
	  countries, not just those for which English is the native language,
	  when assigned only to the LC_MONETARY locale category or
	  environment variable.

  en_US.ISO8859-1
	  English locale for the U.S. (uses the Latin-1 codeset)

  en_US.ISO8859-15
	  English locale for the U.S. (uses the	Latin-9	codeset)

  en_US.cp850
	  English locale for the U.S. (uses cp850 encoding)

	  Use this locale with data that contains accented characters and
	  that was generated on	a PC using the cp850 code page for character
	  encoding. This character encoding is usually the default for the
	  DOS and Windows operating systems in Europe. The en_US.ISO8859-1
	  and en_US.cp850 locales encode English characters the	same way but
	  use different	values for accented and	other non-English characters
	  in the Latin-1 character set.

  en_US.UTF-8

  en_US.UTF-8@euro
	  English locales for the U.S. (use the	UTF-8 codeset)

	  The @euro variant defines the	local currency sign to be the euro
	  character and	the international currency sign	to be EUR. See also
	  en_EU.UTF-8@euro.

  es_ES.ISO8859-1
	  Spanish locale for Spain (uses the Latin-1 codeset)

  es_ES.ISO8859-15
	  Spanish locale for Spain (uses the Latin-9 codeset)

  es_ES.UTF-8
	  Spanish locale for Spain (uses the UTF-8 codeset)

  fi_FI.ISO8859-1
	  Finnish locale for Finland (uses the Latin-1 codeset)

  fi_FI.ISO8859-15
	  Finnish locale for Finland (uses the Latin-9 codeset)

  fi_FI.UTF-8
	  Finnish locale for Finland (uses the UTF-8 codeset)

  fr_BE.ISO8859-1
	  French locale	for Belgium (uses the Latin-1 codeset)

  fr_BE.ISO8859-15
	  French locale	for Belgium (uses the Latin-9 codeset)

  fr_BE.UTF-8
	  French locale	for Belgium (uses the UTF-8 codeset)

  fr_CA.ISO8859-1
	  French locale	for Canada (uses the Latin-1 codeset)

  fr_CA.ISO8859-15
	  French locale	for Canada (uses the Latin-9 codeset)

  fr_CA.UTF-8
	  French locale	for Canada (uses the UTF-8 codeset)

  fr_CH.ISO8859-1
	  French locale	for Switzerland	(uses the Latin-1 codeset)

  fr_CH.ISO8859-15
	  French locale	for Switzerland	(uses the Latin-9 codeset)

  fr_CH.UTF-8
	  French locale	for Switzerland	(uses the UTF-8	codeset)

  fr_FR.ISO8859-1
	  French locale	for France (uses the Latin-1 codeset)

  fr_FR.ISO8859-15
	  French locale	for France (uses the Latin-9 codeset)

  fr_FR.UTF-8
	  French locale	for France (uses the UTF-8 codeset)

  he_IL.ISO8859-8
	  Hebrew locale	for Israel (uses the ISO Hebrew	codeset)

  hu_HU.ISO8859-2
	  Hungarian locale for Hungary (uses the Latin-2 codeset)

  is_IS.ISO8859-1
	  Icelandic locale for Iceland (uses the Latin-1 codeset)

  is_IS.ISO8859-15
	  Icelandic locale for Iceland (uses the Latin-9 codeset)

  is_IS.UTF-8
	  Icelandic locale for Iceland (uses the UTF-8 codeset)

  it_IT.ISO8859-1
	  Italian locale for Italy (uses the Latin-1 codeset)

  it_IT.ISO8859-15
	  Italian locale for Italy (uses the Latin-9 codeset)

  it_IT.UTF-8
	  Italian locale for Italy (uses the UTF-8 codeset)

  iw_IL.ISO8859-8
	  Hebrew locale	for Israel (uses the ISO Hebrew	codeset)

	  This locale name is supported	for backward compatibility. The
	  recommended name to use for the ISO Hebrew locale is
	  he_IL.ISO8859-8.

  ja_JP.SJIS
	  Japanese locale for Japan (uses the Shift JIS	codeset)

  ja_JP.deckanji
	  Japanese locale for Japan (uses the DEC Kanji	codeset)

  ja_JP.eucJP
	  Japanese locale for Japan (uses the Japanese EUC codeset)

  ja_JP.sdeckanji
	  Japanese locale for Japan (uses the Super DEC	Kanji codeset)

  ja_JP.UTF-8
	  Japanese locale for Japan (uses the UTF-8 codeset)

  ko_KR.deckorean
	  Korean locale	for Korea (uses	the DEC	Korean codeset)

  ko_KR.eucKR
	  Korean locale	for Korea (uses	the Korean EUC codeset)

  ko_KR.UTF-8
	  Korean locale	for Korea (uses	the UTF-8 codeset)

  lt_LT.ISO8859-4
	  Lithuanian locale for	Lithuania (uses	the Latin-4 codeset)

  nl_BE.ISO8859-1
	  Flemish locale for Belgium (uses the Latin-1 codeset)

  nl_BE.ISO8859-15
	  Flemish locale for Belgium (uses the Latin-9 codeset)

  nl_BE.UTF-8
	  Flemish locale for Belgium (uses the UTF-8 codeset)

  nl_NL.ISO8859-1
	  Dutch locale for the Netherlands (uses the Latin-1 codeset)

  nl_NL.ISO8859-15
	  Dutch locale for the Netherlands (uses the Latin-9 codeset)

  nl_NL.UTF-8
	  Dutch locale for the Netherlands (uses the UTF-8 codeset)

  no_NO.ISO8859-1
	  Norwegian locale for Norway (uses the Latin-1 codeset)

  no_NO.ISO8859-15
	  Norwegian locale for Norway (uses the	Latin-9	codeset)

  no_NO.UTF-8
	  Norwegian locale for Norway (uses the UTF-8 codeset)

  pl_PL.ISO8859-2
	  Polish locale	for Poland (uses the Latin-2 codeset)

  pt_PT.ISO8859-1
	  Portuguese locale for	Portugal (uses the Latin-1 codeset)

  pt_PT.ISO8859-15
	  Portuguese locale for	Portugal (uses the Latin-9 codeset)

  pt_PT.UTF-8
	  Portuguese locale for	Portugal (uses the UTF-8 codeset)

  ru_RU.ISO8859-5
	  Russian locale for Russia (uses the ISO Cyrillic codeset)

  sk_SK.ISO8859-2
	  Slovak locale	for Slovakia (uses the Latin-2 codeset)

  sl_SI.ISO8859-2
	  Slovene locale for Slovenia (uses the	Latin-2	codeset)

  sv_SE.ISO8859-1
	  Swedish locale for Sweden (uses the Latin-1 codeset)

  sv_SE.ISO8859-15
	  Swedish locale for Sweden (uses the Latin-9 codeset)

  sv_SE.UTF-8
	  Swedish locale for Sweden (uses the UTF-8 codeset)

  th_TH.TACTIS
	  Thai locale for Thailand (uses the TACTIS codeset)

  tr_TR.ISO8859-9
	  Turkish locale for Turkey (uses the Latin-5 codeset)

  zh_CN.dechanzi
	  Simplified Chinese locale for	the People's Republic of China (uses
	  the DEC Hanzi	codeset)

  zh_CN.GBK
	  Simplified Chinese locale for	the People's Republic of China (uses
	  the GBK codeset, an extension	of the GB 2312-80 codeset)

  zh_CN.GB18030
	  Simplified Chinese locale for	the People's Republic of China (uses
	  the GB18030 codeset, which extends GBK by means of 4-byte encoding)

  zh_CN.UTF-8
	  Simplified Chinese locale for	the People's Republic of China (uses
	  the UTF-8 codeset)

  zh_HK.big5
	  Traditional Chinese locale for Hong Kong (uses the BIG-5 codeset)

  zh_HK.dechanyu
	  Traditional Chinese locale for Hong Kong (uses the DEC Hanyu
	  codeset)

  zh_HK.dechanzi
	  Simplified Chinese locale for Hong Kong (uses the DEC Hanzi codeset)

  zh_HK.eucTW
	  Traditional Chinese locale for Hong Kong (uses the Taiwanese EUC
	  codeset)

  zh_HK.UTF-8
	  Traditional Chinese locale for Hong Kong (uses the UTF-8 codeset)

  zh_TW.big5
	  Traditional Chinese locale for Taiwan	(uses the BIG-5	codeset)

  zh_TW.dechanyu
	  Traditional Chinese locale for Taiwan	(uses the DEC Hanyu codeset)

  zh_TW.eucTW
	  Traditional Chinese locale for Taiwan	(uses the Taiwanese EUC
	  codeset)

  zh_TW.UTF-8
	  Traditional Chinese locale for Taiwan	(uses the UTF-8	codeset)

	  This locale supports Simplified Chinese as well as Traditional
	  Chinese.

  For the zh_CN.dechanzi locale, the @pinyin, @radical, and @stroke variants
  are available for sorting by pinyin, radical, and stroke, respectively. For
  the zh_TW.big5, zh_TW.dechanyu, and zh_TW.eucTW locales, the @chuyin,
  @radical, and @stroke variants are available for sorting by chuyin,
  radical, and stroke, respectively.  These variant locale names (those
  including the @collation_modifier suffix) are available for assignment to
  the LC_COLLATE variable.
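
  The locale names listed above follow the pattern
  language_territory.codeset@modifier. As a minimal sketch (the function and
  type names are illustrative and are not part of the operating system), a
  name can be split into those parts as follows. The sketch assumes that the
  "@", ".", and "_" separators each appear at most once in the positions shown
  in the list above:

```python
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    """Parts of a locale name; hypothetical helper type for illustration."""
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


def parse_locale_name(name: str) -> LocaleName:
    """Split language[_territory][.codeset][@modifier] into its parts."""
    # Peel the parts off from right to left: modifier, codeset, territory.
    base, _, modifier = name.partition("@")
    base, _, codeset = base.partition(".")
    language, _, territory = base.partition("_")
    return LocaleName(language, territory or None, codeset or None, modifier or None)


for example in ("zh_TW.big5@stroke", "en_EU.UTF-8@euro", "ca_ES.ISO8859-15", "POSIX"):
    print(example, "->", parse_locale_name(example))
```

  The codeset field is the part to look up in the corresponding codeset
  reference page, and the modifier field carries collation variants such as
  @stroke.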

  The locales whose names end in .UTF-8 support file code and internal
  process code according to the ISO 10646 and Unicode standards.  The
  universal.UTF-8 locale is also available (for use by applications rather
  than end users) and supports the complete set of characters in the
  Universal Character Set (UCS). For .UTF-8 locales, file code may include
  characters encoded in more than one byte, so these locales should not be
  used by applications that do not use wide-character functions for data
  manipulation.
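
  The difference between counting bytes and counting characters can be
  sketched as follows. This example uses Python's built-in codecs as
  stand-ins for the system codesets, which is an assumption; the helper name
  encoded_length is hypothetical. It is meant only to show why byte-oriented
  code cannot assume one byte per character in a .UTF-8 locale:

```python
from typing import Optional


def encoded_length(text: str, codec: str) -> Optional[int]:
    """Return how many bytes text occupies in codec, or None if it cannot be encoded."""
    try:
        return len(text.encode(codec))
    except UnicodeEncodeError:
        return None


# Seven characters: "caf", e-acute, a space, the euro sign, and "5".
sample = "caf\u00e9 \u20ac5"

for codec in ("iso8859-1", "iso8859-15", "utf-8"):
    # Character count stays the same; the byte count depends on the codeset.
    print(codec, "characters:", len(sample), "bytes:", encoded_length(sample, codec))
```

  In this sketch the Latin-1 codec cannot represent the euro sign at all,
  the Latin-9 codec uses one byte per character, and UTF-8 needs more bytes
  than there are characters.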

  For some locales that use traditional UNIX and proprietary codesets, there
  are also corresponding @ucs4 locale variants available for use by
  applications that require internal process code to be in UCS-4 format while
  file code remains in the format of the traditional UNIX or proprietary
  codeset. In other words, both UTF-8 and @ucs4 locales use UCS-4 format for
  internal process code, but differ in terms of file code support. Refer to
  Unicode(5) for more information about encoding formats of the @ucs4 and
  .UTF-8 locales.
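
  The split between file code and process code can be illustrated with a
  short sketch. It is an analogy only: Python's euc_jp codec is assumed here
  as a stand-in for the eucJP codeset, and a list of code point values stands
  in for UCS-4 process code. The function name to_ucs4 is hypothetical:

```python
from typing import List


def to_ucs4(file_code: bytes, codeset: str) -> List[int]:
    """Decode bytes in the given file codeset into a list of UCS-4 code points."""
    return [ord(ch) for ch in file_code.decode(codeset)]


# Three kanji as they might be stored in a file that uses Japanese EUC.
file_code = "\u65e5\u672c\u8a9e".encode("euc_jp")

code_points = to_ucs4(file_code, "euc_jp")
print("bytes in file code:", len(file_code))
print("code points in process code:", [hex(cp) for cp in code_points])
```

  The file keeps its traditional multibyte encoding, while the program works
  with one fixed-width value per character.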

  The .UTF-8 and .ISO8859-15 locales are the only locales that include the
  euro (€) monetary sign in the coded character set. The *.UTF-8@euro
  locales also define the local	currency sign to be the	euro character and
  the international currency sign to be	EUR. See euro(5) for more information
  about	the euro character and how it is supported.

  You can use the -a option with the locale command to list all the locales
  available on the system. Note that the POSIX (or C) locale is always
  available because it must exist on all systems that conform to The Open
  Group's UNIX specifications. The POSIX locale is the default locale when
  locale variables are not set.
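
  The fallback to the POSIX locale can be sketched as follows. The lookup
  order shown (LC_ALL, then the category variable, then LANG) is the
  precedence described by the POSIX standard rather than by this reference
  page, and the function name effective_locale is hypothetical. The sketch
  also selects the C locale explicitly, which should succeed on any
  conforming system:

```python
import locale
import os
from typing import Mapping


def effective_locale(category: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the locale name that a category variable such as LC_MONETARY resolves to."""
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:  # unset and empty variables are treated alike
            return value
    return "POSIX"  # default when no locale variables are set


print(effective_locale("LC_MONETARY"))

# The C (POSIX) locale is always available, so this call is not expected to fail.
locale.setlocale(locale.LC_ALL, "C")
print(repr(locale.localeconv()["decimal_point"]))
```

  Passing a plain dictionary as env makes it possible to check how a given
  combination of variables would resolve without changing the environment of
  the running process.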

				     Note

       The dxterm terminal emulator does not support locales based on the
       following codesets:

	 +  Unicode (UTF-8)

	 +  Latin-9 (ISO8859-15)

  Use dtterm, the default terminal emulator for the Common Desktop
  Environment (CDE), with locales based on the Latin-9 and UTF-8 codesets.

  Environment Variables	Related	to Localization


  The following system environment variables can be set (usually only by
  installed applications or by programmers who are testing applications or
  converters under development) to override the default search path for
  certain kinds of localized files:

  LOCPATH
      Specifies the search path for locales and codeset converters.  Note
      that this environment variable is not defined by current industry
      standards. For more information, refer to the iconv_intro(5),
      iconv_open(3), and setlocale(3) reference pages.

      Because the LOCPATH variable is not defined by standards, it is
      recommended for use only when testing locales or converters under
      development and not as a systemwide method for finding installed
      converters or locales.  When you set LOCPATH, make sure that the search
      path is valid for both locales and converters. Otherwise, application
      and system software will be able to find only locales or only
      converters in environments where both kinds of files are required.

  NLSPATH
      Specifies	the search path	for message catalogs, which contain
      translated text for programs. This variable is used primarily by the
      catopen()	function. Refer	to the catopen(3) reference page for detailed
      information on NLSPATH.
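
  As a rough illustration of how an NLSPATH search path is interpreted, the
  following sketch expands the substitution fields described in the X/Open
  specification of catopen() (%N, %L, %l, %t, %c, and %%). These fields are
  not described on this reference page, so treat them as an assumption and
  check catopen(3) for the authoritative rules. The function name and the
  example paths are hypothetical:

```python
from typing import List


def expand_nlspath(nlspath: str, name: str, lc_messages: str) -> List[str]:
    """Expand each colon-separated NLSPATH template into a candidate catalog path."""
    base, _, _modifier = lc_messages.partition("@")
    base, _, codeset = base.partition(".")
    language, _, territory = base.partition("_")
    fields = {"N": name, "L": lc_messages, "l": language,
              "t": territory, "c": codeset, "%": "%"}
    paths = []
    for template in nlspath.split(":"):
        if not template:  # an empty entry is treated like %N
            template = "%N"
        out, i = [], 0
        while i < len(template):
            ch = template[i]
            if ch == "%" and i + 1 < len(template) and template[i + 1] in fields:
                out.append(fields[template[i + 1]])  # substitute the field
                i += 2
            else:
                out.append(ch)  # copy ordinary characters unchanged
                i += 1
        paths.append("".join(out))
    return paths


# Hypothetical search path and catalog name, for illustration only.
for path in expand_nlspath("/tmp/msg/%L/%N:/tmp/msg/%l_%t/%N", "demo.cat", "fr_FR.ISO8859-1"):
    print(path)
```

  The sketch only builds candidate path names; it does not open catalogs or
  consult the file system.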

  Customizing Locales


  Partial source files,	along with an associated Makefile, are available for
  many locales in the /usr/lib/nls/loc/src directory. By editing one of	these
  source files and using the Makefile to rebuild the locale (make
  locale_name),	you can	customize one or more of the following features:

    +  The format of affirmative and negative responses	(LC_MESSAGES section)

    +  Rules and symbols for formatting	monetary numeric information
       (LC_MONETARY section)

    +  Rules and symbols for formatting	nonmonetary numeric information
       (LC_NUMERIC section)

    +  Rules and symbols for formatting	date and time information (LC_TIME
       section)

  The LC_CTYPE and LC_COLLATE sections of these locale sources are not
  customizable. This means that you cannot use one of these sources to change
  how characters are classified or collated.  By implication, this also means
  that you cannot add a new character to a locale that does not already
  support it.  For example, you cannot add the European monetary character
  (euro) to a locale that does not already support that character.  However,
  you can edit the LC_MONETARY section to define a string identifier for euro
  by using characters that the locale does support.  For example, you could
  replace the existing monetary symbol with EUR.
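
  The effect of such a customization on formatted output can be sketched
  without touching the locale database. The function below is hypothetical
  and does not read any locale source file; it simply applies the kind of
  settings an LC_MONETARY section controls (currency string, decimal point,
  and thousands separator). Placing the symbol after the amount is an
  assumption made for the example:

```python
def format_money(amount: float, symbol: str = "EUR",
                 decimal_point: str = ",", thousands_sep: str = ".") -> str:
    """Format amount with the given separators and a trailing currency string."""
    grouped = f"{amount:,.2f}"  # grouped with "," and "." before substitution
    # Swap both separators in a single pass so they do not overwrite each other.
    table = str.maketrans({",": thousands_sep, ".": decimal_point})
    return f"{grouped.translate(table)} {symbol}"


print(format_money(1234567.5))
print(format_money(1234567.5, symbol="USD", decimal_point=".", thousands_sep=","))
```

  Because EUR consists only of ASCII letters, a string like this can be
  displayed even in a codeset that has no euro character.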

  For more information on a locale source file,	see locale(4).

				    Caution

       Customized versions of locales that are provided with the operating
       system are not preserved when the operating system is reinstalled,
       even when an update installation procedure is used. Therefore, it is
       important to back up files for customized locales and their sources
       before reinstalling the operating system. After the reinstallation is
       complete, you will need to restore your customized locales to the
       system. If the newly installed sources have revisions when compared to
       the old sources, it might be preferable to apply your customizations
       to the newly installed sources and rebuild your customized locales.

SEE ALSO

  Commands: locale(1), localedef(1)

  Functions: catopen(3)

  Files: charmap(4), locale(4)

  Others: Catalan(5), Chinese(5), Czech(5), dechanyu(5), dechanzi(5),
  deckanji(5), deckorean(5), Dutch(5), eucJP(5), eucKR(5), eucTW(5), euro(5),
  Finnish(5), French(5), GB18030(5), GBK(5), German(5), Greek(5), Hebrew(5),
  Hungarian(5),	i18n_intro(5), i18n_printing(5), Icelandic(5),
  iconv_intro(5), iso2022(5), iso2022jp(5), iso8859-1(5), iso8859-2(5),
  iso8859-4(5),	iso8859-5(5), iso8859-7(5), iso8859-8(5), iso8859-9(5),
  iso8859-15(5), Italian(5), Japanese(5), jiskanji(5), Korean(5),
  Lithuanian(5), Norwegian(5), Polish(5), Portuguese(5), Russian(5),
  sbig5(5), sdeckanji(5), shiftjis(5), Slovak(5), Slovene(5), Spanish(5),
  Swedish(5), TACTIS(5), telecode(5), Thai(5), Turkish(5), Unicode(5)

  Writing Software for the International Market