eucJP(5)							     eucJP(5)



NAME

  eucJP	- A character encoding system (codeset)	for Japanese

DESCRIPTION

  The Japanese EUC (Extended UNIX Code), or eucJP, codeset consists of the
  following character sets:

    +  CS0 (ASCII or JIS Roman)

    +  CS1 (JIS	X0208)

    +  CS2 (JIS	Katakana)

    +  CS3 (JIS	X0212)

  CS0 is the primary character set. CS1, CS2, and CS3 are supplementary
  character sets. The MSB (Most Significant Bit) of the byte that represents
  a character in CS0 is clear (0), whereas the MSB of the bytes that
  represent characters in CS1, CS2, and CS3 is set (1).

  Japanese EUC Encoding


  The representation of	ASCII/JIS Roman	and JIS	X0208 characters in the
  Japanese EUC codeset is similar to how those characters are represented in
  the DEC Kanji	codeset	(refer to deckanji(5)).	The two	additional character
  sets,	JIS Katakana and JIS X0212, are	encoded	in the Japanese	EUC codeset
  by making use	of the SS2 (Single Shift 2) and	SS3 (Single Shift 3) control
  characters.
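
  The byte layout described above can be sketched as follows. This is an
  illustration only, not part of the operating system: the function name
  split_eucjp is hypothetical, and the byte values it relies on (SS2 = 0x8E,
  SS3 = 0x8F, graphic bytes in the range A1-FE) are the conventional EUC
  values rather than values stated on this page.

```python
SS2 = 0x8E  # Single Shift 2: introduces one CS2 (JIS Katakana) byte
SS3 = 0x8F  # Single Shift 3: introduces two CS3 (JIS X0212) bytes


def split_eucjp(data: bytes):
    """Split eucJP bytes into (character_set, raw_bytes) pairs."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:              # MSB clear: CS0, a single byte
            name, size = "CS0", 1
        elif lead == SS2:            # SS2 + 1 byte
            name, size = "CS2", 2
        elif lead == SS3:            # SS3 + 2 bytes
            name, size = "CS3", 3
        elif 0xA1 <= lead <= 0xFE:   # 2 bytes, both with the MSB set
            name, size = "CS1", 2
        else:
            raise ValueError(f"unexpected lead byte 0x{lead:02X} at offset {i}")
        chunk = data[i:i + size]
        # Every byte after the lead byte must lie in the range A1-FE.
        if len(chunk) != size or any(not 0xA1 <= b <= 0xFE for b in chunk[1:]):
            raise ValueError(f"malformed {name} sequence at offset {i}")
        out.append((name, chunk))
        i += size
    return out
```

  For simplicity, the sketch accepts any trailing byte in A1-FE after SS2,
  which is looser than the range actually assigned to JIS Katakana.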

  The Japanese EUC codeset provides the	following two areas for	representa-
  tion of user-defined characters (UDC):

  __________________________________________________________
  Area Usage   Row Range   Number of         Code Range
                           Characters
  __________________________________________________________
  JIS X0208    85-94       940               F5A1-FEFE
  JIS X0212    78-94       1598              SS3 [EEA1-FEFE]
  __________________________________________________________

  UDCs on these two code planes are represented in the same way as the
  standard characters that occupy the same planes. Only the code range
  distinguishes a UDC from a standard JIS X0208 or JIS X0212 character.
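
  The relationship between the row ranges, character counts, and code ranges
  in the UDC table can be checked with a short sketch. The helper names
  row_range_to_codes and is_udc are hypothetical. The sketch assumes the
  usual EUC mapping, in which row r has lead byte A0 + r and each row holds
  94 cells (second byte A1-FE).

```python
CELLS_PER_ROW = 94  # second byte runs from A1 to FE


def row_range_to_codes(first_row: int, last_row: int):
    """Return (lowest code, highest code, character count) for a row range."""
    low = ((0xA0 + first_row) << 8) | 0xA1
    high = ((0xA0 + last_row) << 8) | 0xFE
    count = (last_row - first_row + 1) * CELLS_PER_ROW
    return low, high, count


def is_udc(code: bytes) -> bool:
    """Report whether one eucJP character falls in a UDC area."""
    if len(code) == 2:                        # JIS X0208 plane
        first_row, pair = 85, code
    elif len(code) == 3 and code[0] == 0x8F:  # SS3 + JIS X0212 plane
        first_row, pair = 78, code[1:]
    else:
        return False
    row, cell = pair[0] - 0xA0, pair[1] - 0xA0
    return first_row <= row <= 94 and 1 <= cell <= 94


# Rows 85-94 give F5A1-FEFE (940 characters);
# rows 78-94 give EEA1-FEFE (1598 characters).
x0208_udc = row_range_to_codes(85, 94)
x0212_udc = row_range_to_codes(78, 94)
```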

  Currently, the operating system does not support JIS X0212 (JIS
  Supplementary) characters.




  Codeset Conversion


  The following codeset converter pairs are available for converting Japanese
  characters between eucJP and other encoding formats. Refer to
  iconv_intro(5) for an introduction to codeset conversion. For more
  information about the codeset on the other side of each conversion, see
  the reference page named in the list item.

    +  deckanji_eucJP, eucJP_deckanji

       Converting from and to the DEC Kanji codeset: deckanji(5).

    +  ISO-2022-JP_eucJP, eucJP_ISO-2022-JP

       Converting from and to the ISO 2022 Japanese codeset: iso2022jp(5).

    +  ISO-2022-JPext_eucJP, eucJP_ISO-2022-JPext

       Converting from and to the ISO 2022 Japanese Extended codeset:
       iso2022jp(5).

    +  JIS7_eucJP, eucJP_JIS7

       Converting from and to the JIS7 codeset:	jiskanji(5).

    +  SJIS_eucJP, eucJP_SJIS

       Converting from and to the Shift	JIS codeset: SJIS(5).

       Shift JIS encoding is identical to the encoding used in the Microsoft
       PC code page for	Japanese. You can therefore use	these converters to
       convert Japanese	text from and to Japanese code-page format. See
       code_page(5) for	more information about how the operating system	sup-
       ports PC	code pages.

    +  sdeckanji_eucJP,	eucJP_sdeckanji

       Converting from and to the Super	DEC Kanji codeset: sdeckanji(5).

    +  UCS-2_eucJP, eucJP_UCS-2

       Converting from and to UCS-2 format: Unicode(5).

    +  UCS-4_eucJP, eucJP_UCS-4

       Converting from and to UCS-4 format: Unicode(5).

    +  UTF-8_eucJP, eucJP_UTF-8

       Converting from and to UTF-8 format: Unicode(5).
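
  As a rough illustration of what such a converter pair does, the sketch
  below converts text between eucJP and three other encodings. It does not
  use the converters listed above. It assumes Python's built-in codecs
  (euc_jp, shift_jis, iso2022_jp, utf-8) as stand-ins, and those codecs may
  map some characters differently from the operating system's converters.
  The helper name convert is hypothetical.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Convert encoded text by decoding to Unicode and re-encoding."""
    return data.decode(source).encode(target)


text = "日本語"
euc = text.encode("euc_jp")                 # two bytes per character here

sjis = convert(euc, "euc_jp", "shift_jis")  # comparable to eucJP_SJIS
jis = convert(euc, "euc_jp", "iso2022_jp")  # comparable to eucJP_ISO-2022-JP
utf8 = convert(euc, "euc_jp", "utf-8")      # comparable to eucJP_UTF-8

# Converting back recovers the original eucJP bytes for this text.
round_trip = convert(sjis, "shift_jis", "euc_jp")
```

  Each converter pair in the list corresponds to one direction of such a
  conversion, for example eucJP_SJIS and SJIS_eucJP.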

  Japanese EUC Fonts


  For display devices, the operating system supports Japanese EUC characters
  by converting	Japanese EUC code to DEC Kanji code and	then using the fonts
  for DEC Kanji. Because the CS3 character set is not supported	by the DEC
  Kanji	codeset, CS3 characters	cannot be displayed.

  The operating system does not provide PostScript fonts for Japanese EUC.
  Some printers support Japanese with printer-resident fonts, and print
  filters perform any codeset conversion required for the encoding used in
  the file submitted to the print job. For some other printers, you can set
  up a print filter to convert Japanese bitmap fonts to PostScript. Refer to
  i18n_printing(5) for introductory information about your printing options.






SEE ALSO

  Commands: locale(1)

  Others: ascii(5), code_page(5), i18n_intro(5), i18n_printing(5),
  iconv_intro(5), deckanji(5), iso2022jp(5), Japanese(5), jiskanji(5),
  l10n_intro(5), sdeckanji(5), shiftjis(5), Unicode(5)