CHRTBL(8)                   System Manager's Manual                  CHRTBL(8)



NAME
       chrtbl - generate character classification table

SYNOPSIS
       /usr/etc/chrtbl [ filename ]

DESCRIPTION
       chrtbl  converts a source description of a character classification ta-
       ble into a form that can be used by the character classification  func-
       tions and multibyte functions (see ctype(3V) and mblen(3)).  The source
       description is found in filename.  If filename is not  given,  or  just
       given  as  `-',  chrtbl  reads its source description from the standard
       input.

       chrtbl creates one or two output files; the second file is created only
       if  the  model token is specified.  By default, these files are created
       in the current  working  directory.   The  first  file,  named  by  the
       chrclass token, is always produced and contains the character classifi-
       cation information for all  single-byte  (7-bit  and  8-bit)  character
       code-sets  described by one setting of the LC_CTYPE category of locale.
       The second file, created if the  model  token  is  specified,  contains
       information  relating  to  details  of width and structure of the coded
       character set currently under definition.  The second file is named  by
       appending `.ci' to the value specified by the chrclass token.

       The first output file contains a binary form of the character classifi-
       cation information described in filename.  It is structured in  such  a
       way  that  it  can be used at run-time to replace the active version of
       the ctype[] array in the C-library.  For it to be understood at
       run-time, the output file must be moved to the
       /usr/share/lib/locale/LC_CTYPE or /etc/locale directory (see FILES
       below) by the super-user or a member of group bin.  This file must be
       readable by user, group, and other; no other permission should be set.
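
       The installation step above can be sketched as follows.  This is an
       illustration only, not part of chrtbl: the helper name install_table
       and its arguments are hypothetical, and the destination directory is
       passed in by the caller rather than assumed.

```python
import os
import shutil
import stat


def install_table(table_path, locale_dir):
    """Copy a chrtbl output file into locale_dir and make it read-only.

    Both arguments are supplied by the caller; nothing here is a chrtbl
    interface.
    """
    dest = os.path.join(locale_dir, os.path.basename(table_path))
    shutil.copyfile(table_path, dest)
    # Readable by user, group, and other; no other permission bits set.
    os.chmod(dest, stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return dest
```

       A call such as install_table("ascii", "/usr/share/lib/locale/LC_CTYPE")
       would need to be run by a user allowed to write to that directory.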

       filename contains a sequence of tokens in any order after the  chrclass
       token,  each  separated  by  one  or more NEWLINE characters or comment
       lines.  The tokens recognized by chrtbl are as follows:

              chrclass  name
                          name is the filename or pathname  of  the  character
                          classification file.  This is a mandatory token.  It
                          must be the first token to be defined, and  is  usu-
                          ally  given the name that relates to a valid setting
                          of the LC_CTYPE category of locale.

              model name,args
                          This optional token chooses the  type  of  character
                          code-set  announcement mechanism associated with the
                          character classification table generated by  chrtbl.
                          The  name  of  the file created by this token is the
                          name specified by the chrclass  token,  concatenated
                          with a `.ci'.  The arguments to model must be one of
                          the following:

                          euc x,y,z
                                 The model file contains information  describ-
                                 ing  the  required  setting  for the Extended
                                 Unix code-set announcement mechanism.   x,y,z
                                 relate  to  the  storage widths (in bytes) of
                                 EUC code-sets 1, 2 and 3 respectively.

                          xccs   The model file contains information  describ-
                                 ing   the   Xerox   Character  Code  Standard
                                 (XC1-3-3-0)  announcement  mechanism.   There
                                 are no additional arguments required.

                          iso2022 g0,g1,g2,g3 x
                                 The  model file contains information describ-
                                 ing a generative version of the ISO-2022 code
                                 set  announcement  mechanism.   The multibyte
                                 functions driven by this model are capable of
                                 handling the standard one or more byte escape
                                 sequences as well  as  all  of  the  standard
                                 shift    functions.    The   four   arguments
                                 g0,g1,g2,g3  define  the  default  width  (in
                                 bytes) of the four designations (respectively)
                                 available under ISO-2022.  The maximum
                                 integer value of any of these arguments is 2.
                                 The final argument x is mandatory and must be
                                 set  to either 7 or 8. It selects the default
                                 bit-width of each byte on  input  and  output
                                 to/from the multibyte functions.

                          If  the  model  token is declared without arguments,
                          then it is assumed that there  is  a  set  of  user-
                          defined  rules  for character code-set announcement.
                          This is noted in the output file and will  be  later
                          used to fold in user-defined code into the multibyte
                          functions in the C-library (see mblen(3)).

              isupper     Character codes to be classified as upper-case  let-
                          ters.

              islower     Character  codes to be classified as lower-case let-
                          ters.

              isdigit     Character codes to be classified as numeric.

              isspace     Character  codes  to  be  classified  as  a  spacing
                          (delimiter) character.

              ispunct     Character  codes  to  be classified as a punctuation
                          character.

              iscntrl     Character codes to be classified as a control  char-
                          acter.

              isblank     Character code for the space character.

              isxdigit    Character codes to be classified as hexadecimal dig-
                          its.

              ul          Relationship between upper- and  lower-case  charac-
                          ters.
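
       The argument constraints stated for the model token above can be
       summarized in a small checker.  This is a sketch under assumptions:
       the function check_model is hypothetical, it takes arguments that are
       already split into integers (the exact separator syntax accepted by
       chrtbl is not reproduced here), and it enforces only the limits the
       text states.  No lower bound is given for the widths, so none is
       checked.

```python
def check_model(name=None, args=()):
    """Return True if args meet the limits described for the model name."""
    if name is None:
        # model declared without arguments: user-defined announcement rules.
        return len(args) == 0
    if name == "xccs":
        # No additional arguments are required.
        return len(args) == 0
    if name == "euc":
        # x, y, z: storage widths in bytes of EUC code-sets 1, 2 and 3.
        return len(args) == 3 and all(isinstance(a, int) for a in args)
    if name == "iso2022":
        if len(args) != 5:
            return False
        widths, bits = args[:4], args[4]
        # Each of g0..g3 may be at most 2; x must be 7 or 8.
        return all(isinstance(w, int) and w <= 2 for w in widths) \
            and bits in (7, 8)
    return False
```

       For instance, check_model("iso2022", (1, 1, 2, 2, 8)) is accepted,
       while a width of 3 or a bit-width of 9 is rejected.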

       Any  lines  with the number sign (#) in the first column are treated as
       comments and are ignored.  Blank lines are also ignored.

       A character can be represented as a hexadecimal or octal constant  (for
       example, the letter a can be represented as 0x61 in hexadecimal or 0141
       in octal).  Hexadecimal and octal constants may be separated by one  or
       more space or tab characters.

       The  dash  (-)  may be used to indicate a range of consecutive numbers.
       Zero or more space characters may be used for separating the dash char-
       acter from the numbers.
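
       To make the constant and range notation concrete, here is a minimal
       sketch of how a token's value could be expanded into character codes.
       It is not chrtbl's own parser; the names parse_constant and
       parse_codes are hypothetical, and treating every constant without a
       0x prefix as octal is an assumption.

```python
import re


def parse_constant(text):
    """Convert a hexadecimal (0x61) or octal (0141) constant to an integer."""
    if text.lower().startswith("0x"):
        return int(text, 16)
    # Assumption: anything without a 0x prefix is read as octal.
    return int(text, 8)


def parse_codes(spec):
    """Expand a value such as '0x20 0x9 - 0xd' into a sorted list of codes."""
    # Glue each dash to its neighbours so that every range is one field.
    fields = re.sub(r"\s*-\s*", "-", spec).split()
    codes = set()
    for field in fields:
        low, _, high = field.partition("-")
        first = parse_constant(low)
        last = parse_constant(high) if high else first
        codes.update(range(first, last + 1))
    return sorted(codes)
```

       With these definitions, parse_codes("0x20 0x9 - 0xd") yields the codes
       9 through 13 followed by 32.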

       The  backslash  character  (\)  is  used for line continuation.  Only a
       RETURN is permitted after the backslash character.
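
       The comment, blank-line, and continuation rules can be sketched as a
       pre-processing pass.  The function logical_lines is hypothetical and
       only illustrates the rules as described; how chrtbl treats a comment
       line in the middle of a continued token is not stated, so skipping it
       here is an assumption.

```python
def logical_lines(text):
    """Yield token lines with comments dropped and continuations joined."""
    pending = ""
    for raw in text.splitlines():
        if raw.startswith("#"):
            # Number sign in the first column: comment line, ignored.
            continue
        if raw.endswith("\\"):
            # Backslash immediately before the newline continues the line.
            pending += raw[:-1] + " "
            continue
        line = " ".join((pending + raw).split())
        pending = ""
        if line:
            # Blank lines are ignored.
            yield line
    leftover = " ".join(pending.split())
    if leftover:
        yield leftover
```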

       The  relationship  between  upper-  and  lower-case  letters  (ul)   is
       expressed as ordered pairs of octal or hexadecimal constants:

              <upper-case_character lower-case_character>

       These  two  constants may be separated by one or more space characters.
       Zero or more space characters may be  used  for  separating  the  angle
       brackets (<>) from the numbers.
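
       The ordered-pair notation can be read into an upper-to-lower mapping
       along the following lines.  This is a sketch rather than chrtbl's
       implementation; parse_ul and to_int are hypothetical names, and
       constants without a 0x prefix are assumed to be octal.

```python
import re

# One <upper lower> pair; spaces inside the angle brackets are optional.
PAIR = re.compile(r"<\s*([0-9a-fA-FxX]+)\s+([0-9a-fA-FxX]+)\s*>")


def to_int(text):
    """Convert a hexadecimal (0x41) or octal (0101) constant to an integer."""
    return int(text, 16) if text.lower().startswith("0x") else int(text, 8)


def parse_ul(spec):
    """Return a dict mapping each upper-case code to its lower-case code."""
    return {to_int(upper): to_int(lower)
            for upper, lower in PAIR.findall(spec)}
```

       For example, parse_ul("<0x41 0x61> < 0102 0142 >") maps 65 to 97 and
       66 to 98.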

EXAMPLES
       The following is an example of an input file used to create the ASCII
       code set definition table in a file named ascii.
              chrclass    ascii
              isupper     0x41 - 0x5a
              islower     0x61 - 0x7a
              isdigit     0x30 - 0x39
              isspace     0x20 0x9 - 0xd
              ispunct     0x21 - 0x2f 0x3a - 0x40  \
                          0x5b - 0x60 0x7b - 0x7e
              iscntrl     0x0 - 0x1f 0x7f
              isblank     0x20
              isxdigit    0x30 - 0x39 0x61 - 0x66  \
                          0x41 - 0x46
              ul          <0x41 0x61> <0x42 0x62> <0x43 0x63>  \
                          <0x44 0x64> <0x45 0x65> <0x46 0x66>  \
                          <0x47 0x67> <0x48 0x68> <0x49 0x69>  \
                          <0x4a 0x6a> <0x4b 0x6b> <0x4c 0x6c>  \
                          <0x4d 0x6d> <0x4e 0x6e> <0x4f 0x6f>  \
                          <0x50 0x70> <0x51 0x71> <0x52 0x72>  \
                          <0x53 0x73> <0x54 0x74> <0x55 0x75>  \
                          <0x56 0x76> <0x57 0x77> <0x58 0x78>  \
                          <0x59 0x79> <0x5a 0x7a>
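
       The 26 pairs of the ul token in this example follow a regular pattern,
       so they need not be typed by hand.  The sketch below generates them;
       the helper names are hypothetical, and the layout (three pairs per
       line, 12-column indent) is simply chosen to resemble the example.

```python
def ascii_ul_pairs():
    """Return the 26 <upper lower> pairs for the ASCII letters as strings."""
    # In ASCII each lower-case letter is its upper-case code plus 0x20.
    return ["<0x%02x 0x%02x>" % (code, code + 0x20)
            for code in range(0x41, 0x5b)]


def format_ul(pairs, per_line=3):
    """Lay the pairs out as one ul token, continued with backslashes."""
    rows = [" ".join(pairs[i:i + per_line])
            for i in range(0, len(pairs), per_line)]
    indent = " " * 12
    # The backslash is followed directly by the newline, as required.
    return "ul" + " " * 10 + (" \\\n" + indent).join(rows)


if __name__ == "__main__":
    print(format_ul(ascii_ul_pairs()))
```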

FILES
       /usr/share/lib/locale/LC_CTYPE/*   run-time location of  the  character
                                          classification  tables  generated by
                                          chrtbl
       /etc/locale/LC_CTYPE/*             location for private versions of the
                                          classification  tables  generated by
                                          chrtbl

SEE ALSO
       ctype(3V), environ(5V)

DIAGNOSTICS
       The error messages produced by chrtbl are intended to be  self-explana-
       tory.   They  indicate  input  errors  in the command line or syntactic
       errors encountered within the input file.



                                2 February 1990                      CHRTBL(8)