Manual: (OSF1-V5.1-alpha)

charmap(4)							   charmap(4)



NAME

  charmap - Defines character symbols as character encodings

DESCRIPTION

  The character	set description	(charmap) file defines character symbols as
  character encodings. This file is the	source file for	a coded	character
  set, or codeset.  All	supported codesets have	the Portable Character Set
  (PCS)	as a proper subset.  The PCS consists of the following character sym-
  bols (listed by their	standardized symbolic names) and hexadecimal encod-
  ings:

  __________________________________________
  Symbol Name		Hexadecimal Encoding
  __________________________________________
  <NUL>			\x00
  <SOH>			\x01
  <STX>			\x02
  <ETX>			\x03
  <EOT>			\x04
  <ENQ>			\x05
  <ACK>			\x06
  <alert>		\x07
  <backspace>		\x08
  <tab>			\x09
  <newline>		\x0A
  <vertical-tab>	\x0B
  <form-feed>		\x0C
  <carriage-return>	\x0D
  <SO>			\x0E
  <SI>			\x0F
  <DLE>			\x10
  <DC1>			\x11
  <DC2>			\x12
  <DC3>			\x13
  <DC4>			\x14
  <NAK>			\x15
  <SYN>			\x16
  <ETB>			\x17
  <CAN>			\x18
  <EM>			\x19
  <SUB>			\x1A
  <ESC>			\x1B
  <IS4>			\x1C
  <IS3>			\x1D
  <IS2>			\x1E
  <IS1>			\x1F
  <space>		\x20
  <exclamation-mark>	\x21
  <quotation-mark>	\x22
  <number-sign>		\x23
  <dollar-sign>		\x24
  <percent>		\x25
  <ampersand>		\x26
  <apostrophe>		\x27
  <left-parenthesis>	\x28
  <right-parenthesis>	\x29
  <asterisk>		\x2A
  <plus-sign>		\x2B
  <comma>		\x2C
  <hyphen>		\x2D
  <period>		\x2E
  <slash>		\x2F
  <zero>		\x30
  <one>			\x31
  <two>			\x32
  <three>		\x33
  <four>		\x34
  <five>		\x35
  <six>			\x36
  <seven>		\x37
  <eight>		\x38
  <nine>		\x39
  <colon>		\x3A
  <semi-colon>		\x3B
  <less-than>		\x3C
  <equal-sign>		\x3D
  <greater-than>	\x3E
  <question-mark>	\x3F
  <commercial-at>	\x40
  <A>			\x41
  <B>			\x42
  <C>			\x43
  <D>			\x44
  <E>			\x45
  <F>			\x46
  <G>			\x47
  <H>			\x48
  <I>			\x49
  <J>			\x4A
  <K>			\x4B
  <L>			\x4C
  <M>			\x4D
  <N>			\x4E
  <O>			\x4F
  <P>			\x50
  <Q>			\x51
  <R>			\x52
  <S>			\x53
  <T>			\x54
  <U>			\x55
  <V>			\x56
  <W>			\x57
  <X>			\x58
  <Y>			\x59
  <Z>			\x5A
  <left-bracket>	\x5B
  <backslash>		\x5C
  <right-bracket>	\x5D
  <circumflex>		\x5E
  <underscore>		\x5F
  <grave-accent>	\x60
  <a>			\x61
  <b>			\x62
  <c>			\x63
  <d>			\x64
  <e>			\x65
  <f>			\x66
  <g>			\x67
  <h>			\x68
  <i>			\x69
  <j>			\x6A
  <k>			\x6B
  <l>			\x6C
  <m>			\x6D
  <n>			\x6E
  <o>			\x6F
  <p>			\x70
  <q>			\x71
  <r>			\x72
  <s>			\x73
  <t>			\x74
  <u>			\x75
  <v>			\x76
  <w>			\x77
  <x>			\x78
  <y>			\x79
  <z>			\x7A
  <left-brace>		\x7B
  <vertical-line>	\x7C
  <right-brace>		\x7D
  <tilde>		\x7E
  <DEL>			\x7F
  __________________________________________

  The charmap file has the following components:

    +  An optional special symbolic name declarations section

       Each declaration	in this	section	consists of a special symbolic name,
       followed	by one or more space or	tab characters,	and a value.  The
       following list describes	the special symbolic names that	you can
       include in the declarations section:

       <code_set_name>
	   Specifies the name of the codeset for which the charmap file is
	   defined.  This value determines the value returned by the
	   nl_langinfo(CODESET) subroutine.  If <code_set_name> is not
	   declared, the name for the Portable Character Set is used.

       <mb_cur_max>
	   Specifies the maximum number	of bytes in a character	for the
	   codeset.  Valid values are 1	to 4.  The default value is 1.

       <mb_cur_min>
	   Specifies the minimum number	of bytes in a character	for the
	   codeset.  Since all supported codesets have the Portable Character
	   Set as a proper subset, this	value must be 1.

       <escape_char>
	   Specifies the escape	character that indicates encodings in hexade-
	   cimal or octal notation.  The default value is a \ (backslash).

       <comment_char>
	   Specifies the character used	to indicate a comment within a char-
	   map file.  The default value	is a # (number sign).

    +  The CHARMAP section header

       This header marks the beginning of the section that associates charac-
       ter symbols with	encodings.

    +  Mapping statements for characters in the	codeset

       Each statement lists a symbolic name for	a character and	its associ-
       ated encoding.  The format of a mapping statement is:


	    <char_symbol> encoding

       A symbolic name begins with the < (left-angle bracket) character and
       ends with the > (right-angle bracket) character.  The characters for
       char_symbol (between < and >) can be any characters from the Portable
       Character Set, except for control and space characters.  A right-angle
       bracket (>) can also occur within char_symbol, not only in the last
       position of the name, but you must precede every > character except
       the last one with the escape character (as specified by the
       <escape_char> special symbolic name).

       An encoding is specified	as one or more character constants, with the
       maximum number of character constants specified by the <mb_cur_max>
       special symbolic	name.  The encoding may	be listed as decimal, octal,
       or hexadecimal constants	with the following formats:

       Hexadecimal constant
		  \xhh, where h is a hexadecimal digit

       Octal constant
		  \ooo or \oo, where o is an octal digit

       Decimal constant
		  \dnnn or \dnn, where n is a decimal digit

       Some examples of	character symbol definitions are the following:


	    <A>	       \d65	   #decimal constant
	    <B>	       \x42	   #hexadecimal	constant
	    <j10101>   \x81\xA1	   #multiple hexadecimal constants

       A range of symbolic names and corresponding encoded values may also be
       defined,	where the nonnumeric prefix for	each symbolic name is common,
       and the numeric portion of the second symbolic name  is equal to	or
       greater than the	numeric	portion	of the first symbolic name.  In	this
       format, a symbolic name value consists of zero or more nonnumeric
       characters followed by an integer of one	or more	decimal	 digits.
       This format defines a series of symbolic	names.	For example, the
       string <j0101>...<j0104> is interpreted as the <j0101>, <j0102>,
       <j0103>, and <j0104> symbolic names, in that order.

       In statements defining ranges of	symbolic names,	the encoded value
       listed is the value for the first symbolic name in the range.  Subse-
       quent symbolic names have encoded values	in increasing order.  For
       example:


	    <j0101>...<j0104>	     \d129\d254

       The preceding statement is interpreted as follows:


	    <j0101> \d129\d254
	    <j0102> \d129\d255
	    <j0103> \d130\d0
	    <j0104> \d130\d1

       Although	you cannot assign multiple encodings to	one symbolic name,
       you can create multiple names for one encoded value.  This is allowed
       because some characters have several common names.  For example,	the
       "." character is	called a period	in some	parts of the world, and	a
       full stop in others.  Both names	may appear in the charmap.  For
       example:


	    <period>	    \x2e
	    <full-stop>	    \x2e

       If used,	comments must begin with the character specified by the
       <comment_char> special symbolic name.  When an entire line is a
       comment, the comment character must appear in the first column of
       the line.

    +  The END CHARMAP trailer

       This entry denotes the end of character map statements.
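
  The constant formats and the range rule described above can be sketched in
  Python.  This is a minimal illustration, not the localedef implementation:
  parse_encoding and expand_range are hypothetical helper names, the escape
  character is assumed to be the default backslash, and a multibyte encoding
  is assumed to increase like a big-endian integer.  A decimal constant with
  a single digit is accepted because the range example above produces values
  such as \d0.

```python
import re

# Assumption: the escape character is the default backslash.
# One character constant: \xhh (hex), \dnnn (decimal), or \ooo (octal).
_CONSTANT = re.compile(
    r"\\(?:x(?P<hex>[0-9A-Fa-f]{2})|d(?P<dec>[0-9]{1,3})|(?P<oct>[0-7]{2,3}))"
)

# A range endpoint: zero or more nonnumeric characters, then decimal digits.
_RANGE_NAME = re.compile(r"<(?P<prefix>[^0-9>]*)(?P<number>[0-9]+)>")


def parse_encoding(text):
    """Convert an encoding such as r'\\x81\\xA1' into a bytes object."""
    values = []
    pos = 0
    while pos < len(text):
        match = _CONSTANT.match(text, pos)
        if match is None:
            raise ValueError(f"bad character constant at offset {pos}: {text!r}")
        if match.group("hex") is not None:
            values.append(int(match.group("hex"), 16))
        elif match.group("dec") is not None:
            values.append(int(match.group("dec"), 10))
        else:
            values.append(int(match.group("oct"), 8))
        pos = match.end()
    if not values:
        raise ValueError("empty encoding")
    return bytes(values)  # raises ValueError if a constant exceeds 255


def expand_range(first, last, encoding):
    """Expand '<first>...<last> encoding' into (name, bytes) pairs."""
    a = _RANGE_NAME.fullmatch(first)
    b = _RANGE_NAME.fullmatch(last)
    if a is None or b is None or a.group("prefix") != b.group("prefix"):
        raise ValueError("range names must share a nonnumeric prefix")
    start, stop = int(a.group("number")), int(b.group("number"))
    if stop < start:
        raise ValueError("second name must not be lower than the first")
    width = len(a.group("number"))  # keep leading zeros, as in <j0101>
    base = parse_encoding(encoding)
    base_value = int.from_bytes(base, "big")
    pairs = []
    for offset in range(stop - start + 1):
        name = f"<{a.group('prefix')}{start + offset:0{width}d}>"
        code = (base_value + offset).to_bytes(len(base), "big")
        pairs.append((name, code))
    return pairs
```

  Under these assumptions, expand_range("<j0101>", "<j0104>", r"\d129\d254")
  returns the four name and encoding pairs from the range example, with
  <j0103> mapped to the bytes 130 and 0.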

  The following	example	is a portion of	a possible charmap file:

       <code_set_name>	       "ISO8859-1"
       <mb_cur_max>	       1
       <mb_cur_min>	       1
       <escape_char>	       \
       <comment_char>	       #

       CHARMAP
       <NUL>		       \x00
       <SOH>		       \x01
       <STX>		       \x02
       <ETX>		       \x03
       <EOT>		       \x04
       <ENQ>		       \x05
       <ACK>		       \x06
       <alert>		       \x07
       <backspace>	       \x08
       <tab>		       \x09
       <newline>	       \x0a
       <vertical-tab>	       \x0b
       <form-feed>	       \x0c
       <carriage-return>       \x0d
       END CHARMAP
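
  As a rough illustration of how the components fit together, the following
  Python sketch reads charmap source text into its declarations and mappings.
  The names read_charmap, decode_constants, and SPECIAL_NAMES are
  hypothetical and belong to no system library.  The sketch accepts
  declarations on either side of the CHARMAP header, does not expand
  <a>...<b> ranges, and does not handle escaped > characters inside symbolic
  names.

```python
import re

# Special symbolic names from the declarations section.
SPECIAL_NAMES = {"<code_set_name>", "<mb_cur_max>", "<mb_cur_min>",
                 "<escape_char>", "<comment_char>"}


def decode_constants(encoding, escape="\\"):
    """Decode \\xhh, \\dnnn, and \\ooo constants into a bytes object."""
    pattern = re.compile(
        re.escape(escape) + r"(?:x([0-9A-Fa-f]{2})|d([0-9]{1,3})|([0-7]{2,3}))"
    )
    out = []
    pos = 0
    while pos < len(encoding):
        match = pattern.match(encoding, pos)
        if match is None:
            raise ValueError(f"bad character constant in {encoding!r}")
        hex_, dec, oct_ = match.groups()
        if hex_ is not None:
            out.append(int(hex_, 16))
        elif dec is not None:
            out.append(int(dec, 10))
        else:
            out.append(int(oct_, 8))
        pos = match.end()
    return bytes(out)


def read_charmap(text):
    """Return (declarations, mappings) parsed from charmap source text."""
    # Defaults documented for the special symbolic names.
    declarations = {"<mb_cur_max>": "1", "<mb_cur_min>": "1",
                    "<escape_char>": "\\", "<comment_char>": "#"}
    mappings = {}
    for raw_line in text.splitlines():
        # A whole-line comment has the comment character in the first column.
        if raw_line.startswith(declarations["<comment_char>"]):
            continue
        line = raw_line.strip()
        if not line or line == "CHARMAP":
            continue
        if line == "END CHARMAP":
            break
        fields = line.split(None, 1)
        if len(fields) != 2:
            raise ValueError(f"missing value: {line!r}")
        name, value = fields
        if name in SPECIAL_NAMES:
            declarations[name] = value.strip('"')
            continue
        # Only the first token is the encoding; a trailing comment is ignored.
        encoding = decode_constants(value.split()[0],
                                    declarations["<escape_char>"])
        if len(encoding) > int(declarations["<mb_cur_max>"]):
            raise ValueError(f"{name} exceeds <mb_cur_max>")
        mappings[name] = encoding
    return declarations, mappings


# Small sample in the style of the example above.
SAMPLE = r"""
<code_set_name>    "ISO8859-1"
<mb_cur_max>       1
<escape_char>      \
<comment_char>     #
CHARMAP
# control characters
<NUL>              \x00
<alert>            \x07
<backspace>        \x08
END CHARMAP
"""
```

  Calling read_charmap(SAMPLE) gives a declarations dictionary whose
  <code_set_name> entry is ISO8859-1 and a mappings dictionary with one
  single-byte entry for each of the three symbolic names.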

FILES

  /usr/lib/nls/loc/charmaps/*
		Character set description (charmap) source files for sup-
		ported locales.	 The /usr/lib/nls/loc/charmaps directory does
		not exist when source files for	installed locales are not
		provided.

RELATED	INFORMATION

  Commands:  locale(1),	localedef(1).

  Files:  locale(4).