charmap - symbolic translation file for localedef scripts
localedef -f charmap locale_name
Invoking the localedef command with the -f option causes symbolic
names in the locale description file to be translated into the
encodings given in the charmap file (see localedef(1M)). It is
recommended that a locale description file be written entirely
with symbolic names.
The charmap file has two sections: a declarations section and a
character definition section.
Declarations, if present, precede the character definitions.
Each declaration consists of the symbol (including the surrounding
angle brackets), followed by one or more blanks (tab or space
characters), followed by the value of the symbol.
Certain declarations are required for multibyte character codesets.
For single-byte codesets, all are optional.
Following is a list of possible declarations:
<code_set_name> Used to declare the name of the coded character set for
which the charmap file is defined. This keyword is required for
multibyte character codesets. For the HP15 encoding scheme, HP15
must be part of the name. For the EUC encoding scheme, EUC must be
part of the name.
<cswidth> Used to declare the cswidth parameter of the coded character
set for which the charmap file is defined (see eucset(1)).
<mb_cur_max> Used to declare the maximum number of bytes in a multibyte
character. Defaults to 1 if not given. For multibyte character
codesets, this keyword must be specified.
<mb_cur_min> Used to declare the minimum number of bytes in a character
for the encoded character set. The value must be less than or equal
to <mb_cur_max>. If not given, the default is equal to <mb_cur_max>.
<escape_char> Used to declare the escape character, which is used to
escape characters that otherwise would have special meaning. If not
given, the default is backslash (\).
<comment_char> Used to declare the comment character, which is used to
begin comments and should be placed in column one of the charmap
file. If not given, the default is the # character.
Hewlett-Packard Company - 1 - HP-UX Release 11i: November 2000
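The declaration syntax and defaults described above can be sketched in
Python. This is an illustration only, not how localedef itself reads the
file: the function name parse_declarations is hypothetical, the keyword
spellings follow the POSIX charmap keywords, and values are kept as raw
strings without further validation.

```python
import re

# Defaults stated in the text; <mb_cur_min> falls back to <mb_cur_max>.
DEFAULTS = {"mb_cur_max": "1", "escape_char": "\\", "comment_char": "#"}

def parse_declarations(lines):
    """Collect '<symbol> value' lines that appear before the CHARMAP line."""
    decls = dict(DEFAULTS)
    for line in lines:
        line = line.rstrip("\n")
        if line.strip() == "CHARMAP":
            break  # the character definition section starts here
        # Symbol in angle brackets, one or more blanks, then the value.
        m = re.match(r"<([^>]+)>[ \t]+(.*\S)", line)
        if m:
            decls[m.group(1)] = m.group(2)
    # Assumption of this sketch: values stay as strings, not integers.
    decls.setdefault("mb_cur_min", decls["mb_cur_max"])
    return decls
```

Called on a header that sets only <mb_cur_max> to 2, the sketch reports
<mb_cur_min> as 2 as well and leaves the escape and comment characters
at their defaults.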
Character Definition Section
The character-set mapping definitions immediately follow an identifier
line containing the string CHARMAP and precede a trailer line
consisting of the string END CHARMAP. (Empty lines and lines
beginning with the comment character are ignored.)
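As a rough sketch of the framing just described, the following Python
function returns only the definition lines between the CHARMAP and
END CHARMAP lines, skipping empty lines and comment lines. The name
charmap_body and its comment_char parameter are hypothetical, chosen
for illustration.

```python
def charmap_body(lines, comment_char="#"):
    """Return the definition lines between CHARMAP and END CHARMAP."""
    body, inside = [], False
    for line in lines:
        text = line.rstrip("\n")
        if not inside:
            # Everything before the identifier line is skipped.
            inside = text.strip() == "CHARMAP"
            continue
        if text.strip() == "END CHARMAP":
            break
        # Empty lines and lines beginning with the comment character
        # are ignored.
        if not text.strip() or text.startswith(comment_char):
            continue
        body.append(text)
    return body
```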
The character definitions are of two forms.
The first form defines a single character and its encoding:
<symbolic_name> encoding
A symbolic_name is one or more visible characters from the portable
character set as specified by XPG, enclosed in angle brackets.
Metacharacters such as angle brackets, escape characters, or comment
characters must be escaped if they are used in the name. Two or more
symbolic names can be given for the same encoding.
The encoding is a character constant in one of three forms:
decimal An escape character followed by the letter d,
followed by one to three decimal digits.
octal An escape character followed by one to three octal
digits.
hexadecimal An escape character followed by an x, followed by
two hexadecimal digits.
Multibyte characters are represented by the concatenation of character
constants. All constants used in the encoding of a multibyte
character must be of the same form.
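The three constant forms and the same-form rule for multibyte
characters can be illustrated with a short Python sketch. The function
name parse_encoding is hypothetical. The check that each constant fits
in one byte is an assumption of the sketch, not something the text
above states.

```python
import re

def parse_encoding(text, escape_char="\\"):
    """Convert a charmap encoding field into the bytes it denotes."""
    esc = re.escape(escape_char)
    forms = {
        "decimal": (esc + r"d([0-9]{1,3})", 10),
        "octal": (esc + r"([0-7]{1,3})", 8),
        "hexadecimal": (esc + r"x([0-9A-Fa-f]{2})", 16),
    }
    for name, (pattern, base) in forms.items():
        # The whole field must be a run of constants of one form.
        if re.fullmatch("(?:" + pattern + ")+", text):
            values = [int(d, base) for d in re.findall(pattern, text)]
            # Assumption: every constant encodes exactly one byte.
            if any(v > 255 for v in values):
                raise ValueError("constant does not fit in one byte: " + text)
            return bytes(values)
    raise ValueError("not a valid encoding: " + text)
```

For instance, parse_encoding(r"\d129\d254") yields the two bytes 129
and 254, while a field that mixes forms, such as \x8e\d129, is rejected.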
The second form defines a range of characters consisting of all
characters from the first symbolic name to the second, inclusive:
<symbolic_name>...<symbolic_name> encoding
The symbolic name must consist of one or more nonnumeric characters
followed by an integer formed of one or more decimal digits. The
integer part of the second symbolic name must be larger than that of
the first. The range is then interpreted as a list of symbolic names
consisting of the same character portion and successive integer values
from the first through the last. These names are assigned successive
encodings starting with the one given.
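The expansion rule can be sketched in Python as follows. The function
name expand_range is hypothetical. Two points are assumptions of the
sketch rather than statements from the text: the zero padding of the
first name is preserved in the generated names, and "successive
encodings" is taken to mean incrementing the encoding as a big-endian
integer of fixed length.

```python
import re

def expand_range(first, last, encoding):
    """Expand a '<first>...<last>' definition into (name, bytes) pairs."""
    pattern = r"<(\D+)(\d+)>"  # nonnumeric part, then a decimal integer
    m1, m2 = re.fullmatch(pattern, first), re.fullmatch(pattern, last)
    if not m1 or not m2 or m1.group(1) != m2.group(1):
        raise ValueError("names must share the same nonnumeric part")
    lo, hi = int(m1.group(2)), int(m2.group(2))
    if hi <= lo:
        raise ValueError("second integer must be larger than the first")
    prefix = m1.group(1)
    width = len(m1.group(2))  # assumption: keep the first name's padding
    start = int.from_bytes(encoding, "big")
    result = []
    for offset, n in enumerate(range(lo, hi + 1)):
        name = f"<{prefix}{n:0{width}d}>"
        # Assumption: the next encoding is the previous one plus one.
        value = (start + offset).to_bytes(len(encoding), "big")
        result.append((name, value))
    return result
```

With the single byte 129 as the starting encoding, the range from <C4>
to <C6> expands to <C4>, <C5>, and <C6> with encodings 129, 130, and
131.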
For example, a character definition line such as
<C4>...<C6> \d129
is equivalent to:
<C4> \d129
<C5> \d130
<C6> \d131
For examples, see any of the files under /usr/lib/nls/loc/charmaps.
localedef POSIX.2, XPG4.