i18n_intro(5)							i18n_intro(5)



NAME

  i18n_intro, i18n, LANG, LC_ALL, LC_COLLATE, LC_CTYPE,	LC_MESSAGES,
  LC_MONETARY, LC_NUMERIC, LC_TIME - Introduction to internationalization
  (I18N)

DESCRIPTION

  Internationalization refers to the process of	developing programs without
  prior	knowledge of the language, cultural data, or character-encoding
  schemes that the programs are	expected to handle. In other words, interna-
  tionalization	refers to the availability and use of interfaces that let
  programs modify their	behavior at run	time for operation in a	specific
  language environment.  The abbreviation I18N is often used for
  internationalization because there are 18 letters between the initial "I"
  and the final "N" of that word.

  The I18N interfaces and utilities provided in	Tru64 UNIX conform to Issue 4
  of X/Open CAE	specifications.

  A concept related to internationalization is localization (L10N), which
  refers to the	process	of establishing	information within a computer system
  for each combination of native language, cultural data, and coded character
  set (codeset). A locale is a database	that provides information for a
  unique combination of	these three components.	However, locales do not	solve
  all of the problems that localization	must address. Many native languages
  require additional support in	the form of language-specific print filters,
  fonts, codeset converters, character input methods, and other	kinds of spe-
  cialized software.

  For additional introductory information on topics related to international-
  ization, refer to the	following reference pages:

  l10n_intro(5)
	  For more information on localization and locales

  iconv_intro(5)
	  For an introduction to codeset conversion

  i18n_printing(5)
	  For a	summary	of printer support for native languages

  Characters, Character	Sets, and Codesets


  A character is a member of a set of elements used for	the organization,
  control, or representation of	data.

  A character set is a set of alphabetic or other characters used to
  construct the words and other elementary units of a native language or
  computer language.  A character set specifies only which characters are
  included in the set.  ASCII, CNS 11643, and DTSCS are examples of
  character sets.

  A coded character set (codeset) is a set of unambiguous rules that supports
  one or more character sets and establishes a one-to-one relationship
  between each character and its bit representation. In other words, a
  codeset consists of the code points for characters in	one or more character
  sets.	For example, DEC Hanyu (dechanyu) is a codeset for Chinese and
  contains code	points for characters in the ASCII, CNS	11643-1986 (plane 1
  and plane 2),	and DTSCS character sets.
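
  As a rough illustration of the relationship between a character and its
  bit representation, the following shell sketch prints the byte values that
  encode a string. The function name bytes_of is hypothetical and is not part
  of Tru64 UNIX; the sketch assumes that od(1) and tr(1) are available. Only
  an ASCII character is shown, because the bytes produced for other
  characters depend on the codeset in which the input is encoded.

```shell
# bytes_of (hypothetical helper): print the bytes of a string as
# hexadecimal digits, with od's spacing removed.
bytes_of() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# The ASCII code point for "A" is hexadecimal 41.
bytes_of 'A'; echo
```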

  Language Announcement	(Setting Locale)


  Language announcement	is the mechanism by which language, cultural data,
  and codeset requirements are set either for the system as a whole or by
  individual users. An application can also set	these requirements, although
  it is	more common for	an internationalized application to use	the setting
  in effect for	the user who runs the program. Refer to	the System Adminis-
  tration manual for information about setting systemwide defaults for
  shells. Refer	to setlocale(3)	and Writing Software for the International
  Market for information on how	applications query or set locale requirements
  at run time.

  Language announcement	is performed by	setting	one or more reserved environ-
  ment variables to the	name of	an installed locale. Each locale has associ-
  ated with it collating sequences, character conversion tables, character
  classification tables, formats for different kinds of	data, and message
  catalogs. If the same	locale meets user requirements in all these
  categories, set only the LANG	environment variable to	the locale name. A
  locale name usually has the following	format:

  language_territory.codeset[@modifier]

  The following Korn shell example sets LANG to a locale supporting the
  English language, United States cultural data, and the ISO8859-1 codeset.
  The variable is exported so that commands started from the shell inherit
  the setting:

       $ export LANG=en_US.ISO8859-1

  The following	C shell	example	sets LANG to a locale supporting the Tradi-
  tional Chinese language, Hong	Kong cultural data, and	the DEC	Hanyu
  codeset:

       % setenv	LANG zh_HK.dechanyu
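
  The parts of a locale name in the format shown above can be separated with
  shell parameter expansion. The following is a minimal sketch, assuming the
  name follows the language_territory.codeset[@modifier] pattern; the
  function name parse_locale is hypothetical, and names that follow another
  vendor's format may not split cleanly.

```shell
# parse_locale (hypothetical helper): split a locale name of the form
# language_territory.codeset[@modifier] into its parts.
parse_locale() {
    name=$1
    modifier= codeset= territory=
    # Peel off the optional parts from right to left.
    case $name in *@*) modifier=${name#*@}; name=${name%%@*} ;; esac
    case $name in *.*) codeset=${name#*.}; name=${name%%.*} ;; esac
    case $name in *_*) territory=${name#*_}; name=${name%%_*} ;; esac
    language=$name
    printf 'language=%s territory=%s codeset=%s modifier=%s\n' \
        "$language" "$territory" "$codeset" "$modifier"
}

parse_locale zh_HK.dechanyu
parse_locale zh_TW.eucTW@stroke
```

  Parts that are absent from the name are printed as empty values.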

  Note that locale name	formats	can vary from vendor to	vendor.	Use the
  locale -a command to display the names of locales installed on your system.
  Refer	to the l10n_intro(5) reference page for	a list of the locales pro-
  vided	with the Tru64 UNIX product.
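
  Before setting a locale variable, a script might check that the name is
  actually installed. The following sketch compares a name against the
  output of locale -a; the function name locale_installed is hypothetical.
  It assumes an exact, whole-line match, so it can report a locale as
  missing on systems that list names with a different spelling from the one
  supplied.

```shell
# locale_installed (hypothetical helper): succeed if the given name appears
# as a complete line in the output of "locale -a".
locale_installed() {
    locale -a 2>/dev/null | grep -qxF -- "$1"
}

if locale_installed "${LANG:-C}"; then
    echo "locale ${LANG:-C} is installed"
else
    echo "locale ${LANG:-C} is not listed by locale -a"
fi
```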

  An alternative way to	set locale requirements	for all	locale categories is
  to set the LC_ALL environment	variable. The difference between the LANG and
  LC_ALL variables is that LC_ALL is a high-precedence variable	that over-
  rides	all other locale variables, including LANG. The	LANG variable, on the
  other	hand, is a low-precedence variable.  When used by itself, the LANG
  variable implicitly sets all locale categories to the	specified locale just
  as LC_ALL does. However, the LANG variable can be used together with vari-
  ables	for specific locale categories to create a multilocale environment.
  The category-specific	locale variables and what they control follow:

  LC_COLLATE
	  String collation

  LC_CTYPE
	  Character classification

  LC_MESSAGES
	  Translations for messages and	valid strings for "yes"	and "no"
	  responses

  LC_MONETARY
	  The currency symbol and the format of	monetary values

  LC_NUMERIC
	  The format of	numeric	values

  LC_TIME
          The format of date and time values

	  A locale can support only one	set of date and	time formats; how-
	  ever,	there can be several sets of date and time formats in use for
	  a particular language	and territory. See the l10n_intro(5) refer-
	  ence page for	information about creating a site-specific version of
	  a locale to support date and time formats different from those sup-
	  ported by an installed locale.
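
  The precedence among LC_ALL, the category-specific variables, and LANG can
  be sketched as a small shell function. The function name resolve_locale is
  hypothetical. It takes the three candidate values as arguments rather than
  reading the environment, and it assumes that the C locale applies when
  none of the variables is set; the actual default is determined by the
  system.

```shell
# resolve_locale (hypothetical helper): print the locale that applies to
# one category.
#   $1 = value of LC_ALL
#   $2 = value of the category variable (for example, LC_COLLATE)
#   $3 = value of LANG
resolve_locale() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"      # LC_ALL overrides everything else
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"      # then the category-specific variable
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"      # then LANG
    else
        printf '%s\n' "C"       # assumed fallback; system dependent
    fi
}

# Effective collation locale for the current environment:
resolve_locale "${LC_ALL-}" "${LC_COLLATE-}" "${LANG-}"
```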

  Some locale names have one or	more @modifier suffixes. A locale with the
  suffix @ucs4 is for use by applications that require internal	process	code
  to be	in UCS-4 format. See Unicode(5)	for more information about UCS-4.
  Other	@modifier suffixes indicate locale variants that support alternative
  rules	for collation in Asian languages. Use locales with these suffixes
  only when setting LC_COLLATE.	For example, there are three different sets
  of collation rules (chuyin, radical, and stroke) that	can be used with the
  locale supporting the	Chinese	language, Taiwanese cultural data, and the
  Taiwanese EUC	codeset. If Korn shell users want to use this locale, they
  might	make the following settings:

       $ export LANG=zh_TW.eucTW
       $ export LC_COLLATE=zh_TW.eucTW@stroke

  The preceding example implicitly sets all locale categories to
  zh_TW.eucTW, except for the collation category, which LC_COLLATE sets to
  zh_TW.eucTW@stroke. The following locale command displays the settings
  after these assignments. Values shown in quotation marks are implied by
  LANG rather than set explicitly:

       $ locale
       LANG=zh_TW.eucTW
       LC_COLLATE=zh_TW.eucTW@stroke
       LC_CTYPE="zh_TW.eucTW"
       LC_MONETARY="zh_TW.eucTW"
       LC_NUMERIC="zh_TW.eucTW"
       LC_TIME="zh_TW.eucTW"
       LC_MESSAGES="zh_TW.eucTW"
       LC_ALL=
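
  Because LC_ALL overrides the other locale variables, it can also be set
  for a single command without changing the settings of the session. The
  following sketch sorts its input under the C locale regardless of the
  current LANG and LC_COLLATE values; the function name sort_c is
  hypothetical.

```shell
# sort_c (hypothetical helper): sort standard input using the C locale.
# The LC_ALL assignment applies only to this one sort command.
sort_c() {
    LC_ALL=C sort
}

# In the C locale, uppercase ASCII letters collate before lowercase ones,
# so this prints A, B, a, b on separate lines.
printf 'b\nA\na\nB\n' | sort_c
```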

SEE ALSO

  Commands: locale(1)

  Functions: setlocale(3)

  Others: i18n_printing(5), iconv_intro(5), l10n_intro(5), Unicode(5)

  Writing Software for the International Market

  System Administration