iconv_JEF(5)							 iconv_JEF(5)



NAME

  iconv_JEF - Specification for	controlling conversion between Fujitsu JEF
  and Tru64 UNIX Japanese codesets

DESCRIPTION

  The iconv utility supports the ability to convert the	encoding of charac-
  ters between Fujitsu JEF (Japanese processing	Extended Feature) code and
  one of the following Tru64 UNIX codesets: DEC	Kanji, Super DEC Kanji,
  Japanese EUC,	or Shift JIS. You choose the type of conversion	by specifying
  the appropriate values for the utility's from-code and to-code parameters,
  as follows:

  _______________________________________________
  Type of Code Conversion   from-code	to-code
  _______________________________________________
  JEF to DEC Kanji	    JEF		deckanji
  JEF to Super DEC Kanji    JEF		sdeckanji
  JEF to Japanese EUC	    JEF		eucJP
  JEF to Shift JIS	    JEF		SJIS
  DEC Kanji to JEF	    deckanji	JEF
  Super	DEC Kanji to JEF    sdeckanji	JEF
  Japanese EUC to JEF	    eucJP	JEF
  Shift	JIS to JEF	    SJIS	JEF
  _______________________________________________
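
  As a minimal sketch, the table above can be read as a rule that exactly one
  side of a conversion is JEF and the other is one of the four Tru64 UNIX
  codesets. The function name is_supported_pair is hypothetical and is not
  part of the iconv utility.

```python
# Codeset names as given for the from-code and to-code parameters.
TRU64_CODESETS = {"deckanji", "sdeckanji", "eucJP", "SJIS"}


def is_supported_pair(from_code, to_code):
    """Return True if the pair appears in the conversion table above."""
    if from_code == "JEF":
        return to_code in TRU64_CODESETS
    if to_code == "JEF":
        return from_code in TRU64_CODESETS
    return False
```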

  Conversion behavior for the following items is affected by the definition
  of environment variables or profile entries in the user's environment. For
  more information, see the "Environment Variables" and "Profile File" sections.

    +  The UDC (User-Defined Character)	mapping	table that is used for UDC
       conversion

       This table must be an ASCII text	file that contains UDC mapping infor-
       mation.	The table affects conversion of	user-defined characters
       between the codesets.

    +  The EBCDIC to/from ISO code (ASCII, JIS Roman characters) mapping
       table that is used for conversion

       This table must be an ASCII text file that contains information on how
       to map characters between EBCDIC and ISO code.

    +  The K-shift code

       This is a one- or two-byte hexadecimal code that	marks the beginning
       of Kanji	mode.

    +  The A-shift code

       This is a one- or two-byte hexadecimal code that	marks the beginning
       of EBCDIC mode.

    +  The status of the initial mode (Kanji or EBCDIC) at the time the iconv
       command starts, or the first time the iconv() function is called after
       the iconv_open() function has initialized the converter in a program

       The status keywords are either kanji_mode or ebcdic_mode.

    +  How to treat undefined characters when these are	detected in Kanji
       mode

       Specify this action by using one	of the following keywords:

       abort   Stop codeset conversion.

       pass    Output the undefined characters without any processing and
	       continue	codeset	conversion.

       replace Output padding characters instead of the	undefined characters
	       and continue codeset conversion.

       dismiss Ignore the undefined characters and continue codeset conver-
	       sion.

    +  The two-byte padding character used in Kanji mode

       This value is meaningful	when replace is	chosen for the processing of
       undefined characters in Kanji mode. Specify the padding character by
       its hexadecimal value.

    +  How to treat undefined characters when these are	detected in EBCDIC
       mode

       Specify this action by using one	of the following keywords:

       abort
	   Stop	codeset	conversion.

       pass
	   Output the undefined	characters without any processing and con-
	   tinue codeset conversion.

       replace
	   Output padding characters instead of	the undefined characters and
	   continue codeset conversion.

       dismiss
	   Ignore the undefined	characters and continue	codeset	conversion.

    +  The one-byte padding character used in EBCDIC mode

       This value is meaningful	when replace is	chosen for the processing of
       undefined characters in EBCDIC mode. Specify the	padding	character by
       its hexadecimal value.
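
  The four keywords for undefined characters can be illustrated with a short
  sketch for EBCDIC mode. This is not the iconv implementation; the helper
  name convert_ebcdic, the dictionary-based table, and the use of ValueError
  for abort are assumptions made only for illustration.

```python
def convert_ebcdic(data, table, except_proc="pass", padding=0x20):
    """Convert one-byte codes through table.

    Codes without a table entry are treated as undefined and handled
    according to except_proc (abort, pass, replace, or dismiss).
    """
    out = bytearray()
    for byte in data:
        if byte in table:
            out.append(table[byte])
        elif except_proc == "abort":
            raise ValueError("undefined character 0x%02x" % byte)
        elif except_proc == "pass":
            out.append(byte)        # output the character unchanged
        elif except_proc == "replace":
            out.append(padding)     # output the padding character
        elif except_proc == "dismiss":
            continue                # ignore the character
        else:
            raise ValueError("unknown keyword: %s" % except_proc)
    return bytes(out)
```

  The Kanji mode case would follow the same pattern with two-byte codes and a
  two-byte padding character.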

  When the to-code parameter for the conversion	is JEF,	you can	also specify
  the following	items for conversion behavior:

    +  Whether the initial shift code is output	at the start of	conversion if
       the status of the initial mode (Kanji or	EBCDIC)	is different from the
       mode of the first input character

       The start of conversion is the time the iconv utility starts process-
       ing, or when the	iconv()	function is called just	after opening the
       converter with iconv_open(). Keyword values for this item are yes or
       no.

    +  Whether or not the utility outputs the last shift code when iconv() is
       called with a zero-length input string, and the current mode (Kanji or
       EBCDIC) is different from the mode specified by the last status

       Keyword values for this item are	yes or no.

    +  The last	status (Kanji mode or EBCDIC mode)

       Specify kanji_mode or ebcdic_mode for this value. It is meaningful
       only when yes is	the setting for	whether	the utility outputs the	last
       shift code.
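
  The following sketch shows one possible reading of how these three items
  interact when the to-code is JEF. It assumes the default one-byte shift
  codes, assumes that a setting of no simply omits the shift code, and uses a
  hypothetical helper named emit_jef; the actual converter may behave
  differently in details.

```python
K_SHIFT = b"\x28"   # default K-shift code: marks the beginning of Kanji mode
A_SHIFT = b"\x29"   # default A-shift code: marks the beginning of EBCDIC mode
SHIFT = {"kanji_mode": K_SHIFT, "ebcdic_mode": A_SHIFT}


def emit_jef(segments, initial_state="ebcdic_mode",
             output_initial_shift_code="yes",
             output_trailer_shift_code="yes",
             last_state="ebcdic_mode"):
    """Join (mode, bytes) segments into a JEF byte string with shift codes."""
    out = bytearray()
    mode = initial_state
    for index, (seg_mode, data) in enumerate(segments):
        if seg_mode != mode:
            # The switch before the first segment is controlled by
            # output_initial_shift_code; later switches are always marked.
            if index > 0 or output_initial_shift_code == "yes":
                out += SHIFT[seg_mode]
            mode = seg_mode
        out += data
    # At the end of input, return to last_state if that was requested.
    if output_trailer_shift_code == "yes" and mode != last_state:
        out += SHIFT[last_state]
    return bytes(out)
```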

  If the items that control conversion behavior	are specified by both
  environment variables	and the	profile	file, values set by environment	vari-
  ables	override values	set by comparable entries in the profile. Note that
  values for all conversion control items are case-sensitive, whether they
  are set by environment variables or in the profile. The following table
  contains the default values for each conversion control item:

  ___________________________________________________
  Conversion Control Item		Default	Value
  ___________________________________________________
  UDC mapping table			None
  K shift code				0x28
  A shift code				0x29
  Initial state				ebcdic_mode
  Processing for undefined characters
  in Kanji mode				abort
  Processing for undefined characters
  in EBCDIC mode			pass
  ___________________________________________________
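
  The precedence rule (environment variables over profile entries over
  defaults) can be sketched as three layers merged in order. The function
  name resolve is hypothetical, and the dictionary keys borrow the profile
  entry names listed in the "Profile File" section purely for convenience.

```python
# Default values from the table above.
DEFAULTS = {
    "udc_mapping_table": None,
    "k_shift_code": "0x28",
    "a_shift_code": "0x29",
    "initial_state": "ebcdic_mode",
    "kanji_except_proc": "abort",
    "ebcdic_except_proc": "pass",
}


def resolve(profile_values, env_values):
    """Merge settings so that later layers override earlier ones."""
    settings = dict(DEFAULTS)
    settings.update(profile_values)   # profile overrides defaults
    settings.update(env_values)       # environment overrides profile
    return settings
```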

  The default padding characters are white spaces, whose code values for each
  destination codeset are noted	in the following table.	These padding charac-
  ters are output when you specify replace for processing of undefined char-
  acters and do	not explicitly specify the padding character.

  __________________________________________________
  Mode		Default	Value	Destination Codeset
  __________________________________________________
  Kanji	mode	0x4040		JEF
		0xa1a1		deckanji, sdeckanji,
				or eucJP
		0x8140		SJIS
  EBCDIC mode	0x40		JEF
		0x20		deckanji, sdeckanji,
				eucJP, or SJIS
  __________________________________________________
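
  As an illustration only, the table above can be written as a lookup keyed
  by mode and destination codeset. The names DEFAULT_PADDING and padding_for
  are hypothetical.

```python
DEFAULT_PADDING = {
    "kanji_mode": {"JEF": 0x4040, "deckanji": 0xa1a1, "sdeckanji": 0xa1a1,
                   "eucJP": 0xa1a1, "SJIS": 0x8140},
    "ebcdic_mode": {"JEF": 0x40, "deckanji": 0x20, "sdeckanji": 0x20,
                    "eucJP": 0x20, "SJIS": 0x20},
}


def padding_for(mode, to_code, explicit=None):
    """Return the explicit padding character if set, else the default."""
    if explicit is not None:
        return explicit
    return DEFAULT_PADDING[mode][to_code]
```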

  The default EBCDIC-ISO mapping tables are as follows:

    +  For conversion from JEF to other	codesets:
       /usr/lib/nls/loc/iconv/data/kana_ebcdic.tbl

    +  For conversion from other codesets to JEF:
       /usr/lib/nls/loc/iconv/data/kana_ebcdic.tbl

  These	mapping	tables map both	EBCDIC and ISO code, which includes JIS	Roman
  characters. The kana_ebcdic.tbl mapping table	also maps ISO lowercase	char-
  acters to EBCDIC uppercase characters.

  The following	default	values for conversion control items are	meaningful
  when the iconv utility's to-code conversion parameter	is JEF:

  ____________________________________________
  Conversion Control Item	   Default
  ____________________________________________
  Output the initial shift code?   yes
  Output the last shift	code?	   yes
  Last status                      ebcdic_mode
  ____________________________________________

  Environment Variables


  This section discusses the environment variables that	you can	set to con-
  trol conversion behavior. The	names for these	variables adhere to the	fol-
  lowing format:

       fromcode_tocode_controlitem

  The name segments for fromcode or tocode can be one of the following
  keywords:

  ___________________________
  For Codeset:	    Use:
  ___________________________
  Fujitsu JEF	    JEF
  DEC Kanji	    DECKANJI
  Super	DEC Kanji   SDECKANJI
  Japanese EUC	    EUCJP
  Shift	JIS	    SJIS
  ___________________________

  The name segments for	controlitem can	be one of the following	keywords:

  _______________________________________________________
  For Control Item:		       Use:
  _______________________________________________________
  UDC mapping table		       UDC_TABLE
  EBCDIC-ISO mapping table	       EBCDIC_TABLE
  K shift code			       K_SHIFT_CODE
  A shift code			       A_SHIFT_CODE
  Initial state			       INITIAL_STATE
  Processing of	undefined characters
  in Kanji mode			       KANJI_EXCEPT_PROC
  Processing of	undefined characters
  in EBCDIC mode		       EBCDIC_EXCEPT_PROC
  Padding characters
  in Kanji mode			       PADDING_2BYTE_CHAR
  Padding characters
  in EBCDIC mode		       PADDING_1BYTE_CHAR
  Output initial
  shift	code			       INITIAL_SHIFT_CODE
  Output last
  shift	code			       TRAILER_SHIFT_CODE
  Last status			       LAST_STATE
  File path of the profile	       PROFILE
  _______________________________________________________

  Following are	examples of using the setenv C shell command to	define
  environment variables	to control conversion behavior.	In these examples,
  the fromcode name segment indicates Japanese EUC and the tocode name seg-
  ment indicates JEF:

       setenv EUCJP_JEF_UDC_TABLE eucjp_jef_udc.tbl
       setenv EUCJP_JEF_EBCDIC_TABLE ebcdic_kana.tbl
       setenv EUCJP_JEF_K_SHIFT_CODE 0x28
       setenv EUCJP_JEF_A_SHIFT_CODE 0x29
       setenv EUCJP_JEF_INITIAL_STATE ebcdic_mode
       setenv EUCJP_JEF_KANJI_EXCEPT_PROC replace
       setenv EUCJP_JEF_EBCDIC_EXCEPT_PROC replace
       setenv EUCJP_JEF_PADDING_2BYTE_CHAR 0x4040
       setenv EUCJP_JEF_PADDING_1BYTE_CHAR 0x40
       setenv EUCJP_JEF_INITIAL_SHIFT_CODE yes
       setenv EUCJP_JEF_TRAILER_SHIFT_CODE yes
       setenv EUCJP_JEF_LAST_STATE ebcdic_mode
       setenv EUCJP_JEF_PROFILE	.eucjp_jef_profile
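
  The naming format can be sketched as a small helper that builds a variable
  name from the codeset names used for from-code and to-code. The function
  names control_variable and read_control are hypothetical; the sketch
  assumes the name segments are simply the uppercase forms of the codeset
  names, which matches the tables above.

```python
import os


def control_variable(from_code, to_code, control_item):
    """Build a name of the form fromcode_tocode_controlitem."""
    return "%s_%s_%s" % (from_code.upper(), to_code.upper(), control_item)


def read_control(from_code, to_code, control_item, environ=None):
    """Return the value of a control variable, or None if it is not set."""
    environ = os.environ if environ is None else environ
    return environ.get(control_variable(from_code, to_code, control_item))
```

  For example, control_variable("eucJP", "JEF", "K_SHIFT_CODE") yields
  EUCJP_JEF_K_SHIFT_CODE.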

  Directory Search Path


  When you specify a file name without a directory, the	iconv utility
  searches the following directories and uses the first	file found:

   1.  Current directory

   2.  Home directory

   3.  The iconv/data subdirectory of the directory specified by the environ-
       ment variable LOCPATH

   4.  /usr/lib/nls/loc/iconv/data

   5.  /usr/i18n/lib/nls/loc/iconv/data

  If you specify a relative directory path for a file, the utility searches
  these	same directories in the	same order and uses the	first file found.
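
  The search order above can be sketched as follows. The function name
  find_file and its keyword parameters are hypothetical and exist only so the
  example can be exercised outside a real login environment. How the utility
  treats absolute paths is not stated here; this sketch assumes they are used
  as given.

```python
import os


def find_file(name, cwd=None, home=None, locpath=None):
    """Return the first existing match for name, or None if none exists."""
    directories = [
        cwd or os.getcwd(),                    # 1. current directory
        home or os.path.expanduser("~"),       # 2. home directory
    ]
    if locpath is None:
        locpath = os.environ.get("LOCPATH")
    if locpath:
        # 3. iconv/data under the directory named by LOCPATH
        directories.append(os.path.join(locpath, "iconv", "data"))
    directories.append("/usr/lib/nls/loc/iconv/data")        # 4.
    directories.append("/usr/i18n/lib/nls/loc/iconv/data")   # 5.
    for directory in directories:
        # os.path.join returns name unchanged when name is absolute.
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None
```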

  Profile File


  Entry	lines in the profile file adhere to the	following format:

       entry_name	 string_value

  The entry_name and string_value fields are separated by spaces or tabs. Do
  not append a colon (:) after entry_name. The file can	also include blank
  lines	and comment entries, which begin with the # character.

  Following are	the entry_name values for different conversion control items:

  ___________________________________________________________
  Conversion Control Item	    entry_name
  ___________________________________________________________
  UDC mapping table		    udc_mapping_table
  EBCDIC-ISO mapping table	    ebcdic_mapping_table
  K shift code			    k_shift_code
  A shift code			    a_shift_code
  Initial state			    initial_state
  Processing undefined characters
  in Kanji mode			    kanji_except_proc
  Processing undefined characters
  in EBCDIC mode		    ebcdic_except_proc
  Padding character
  in Kanji mode			    padding_2byte_char
  Padding character
  in EBCDIC mode		    padding_1byte_char
  Output initial
  shift	code			    output_initial_shift_code
  Output last
  shift	code			    output_trailer_shift_code
  Last state			    last_state
  ___________________________________________________________

  Following is a sample profile for converting from Japanese EUC to Fujitsu
  JEF:

       #
       #  sample profile for eucJP_JEF
       #
       udc_mapping_table	       eucjp_jef_udc.tbl
       ebcdic_mapping_table	       kana_ebcdic.tbl
       k_shift_code		       0x28	       # ebcdic	-> kanji
       a_shift_code		       0x29	       # kanji -> ebcdic
       initial_state		       ebcdic_mode
       kanji_except_proc	       replace
       ebcdic_except_proc	       replace
       padding_2byte_char	       0x4040	       # kanji mode
       padding_1byte_char	       0x40	       # ebcdic	mode
       output_initial_shift_code       yes
       output_trailer_shift_code       yes
       last_state		       ebcdic_mode
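
  A profile in this format can be read with a few lines of code. The sketch
  below is not the parser used by iconv; the function name parse_profile is
  hypothetical, and treating everything after a # as a comment (as in the
  trailing comments of the sample) is an assumption.

```python
def parse_profile(text):
    """Return a dictionary of entry_name to string_value."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue                           # blank or comment-only line
        fields = line.split(None, 1)           # split on spaces or tabs
        if len(fields) != 2:
            raise ValueError("missing value for entry: %s" % fields[0])
        entries[fields[0]] = fields[1].strip()
    return entries
```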

  The default file names for the profile are as follows:

  _______________________________________________
  Code Conversion	   Default Profile Name
  _______________________________________________
  JEF to DEC Kanji	   .jef_deckanji_profile
  JEF to Super DEC Kanji   .jef_sdeckanji_profile
  JEF to Shift JIS	   .jef_sjis_profile
  JEF to Japanese EUC	   .jef_eucjp_profile
  DEC Kanji to JEF	   .deckanji_jef_profile
  Super	DEC Kanji to JEF   .sdeckanji_jef_profile
  Shift	JIS to JEF	   .sjis_jef_profile
  Japanese EUC to JEF	   .eucjp_jef_profile
  _______________________________________________

  By default, the iconv	utility	checks the directory search path mentioned in
  the "Directory Search	Path" section and uses the first profile it finds.
  However, you can also	specify	an arbitrary file path for your	profile
  instead of the default names by defining the following environment vari-
  ables:

  __________________________________________________________
  Code Conversion	   Profile Path	Environment Variable
  __________________________________________________________
  JEF to DEC Kanji	   JEF_DECKANJI_PROFILE
  JEF to Super DEC Kanji   JEF_SDECKANJI_PROFILE
  JEF to Shift JIS	   JEF_SJIS_PROFILE
  JEF to Japanese EUC	   JEF_EUCJP_PROFILE
  DEC Kanji to JEF	   DECKANJI_JEF_PROFILE
  Super	DEC Kanji to JEF   SDECKANJI_JEF_PROFILE
  Shift	JIS to JEF	   SJIS_JEF_PROFILE
  Japanese EUC to JEF	   EUCJP_JEF_PROFILE
  __________________________________________________________
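
  Taken together, the two tables above suggest a simple rule: use the value
  of the profile path environment variable if it is defined, otherwise fall
  back to the default profile name. The sketch below uses a hypothetical
  function named profile_path and derives both names from the codeset names,
  which is consistent with every row of the tables.

```python
def profile_path(from_code, to_code, environ):
    """Return the profile file name or path for a conversion."""
    pair = "%s_%s" % (from_code, to_code)
    variable = pair.upper() + "_PROFILE"         # e.g. EUCJP_JEF_PROFILE
    default = "." + pair.lower() + "_profile"    # e.g. .eucjp_jef_profile
    return environ.get(variable, default)
```

  A name without a directory is then located by using the directory search
  path.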

  UDC Mapping Table


  Entries in a UDC mapping table adhere	to the following format:

       fromcode	     tocode

  Each of these	values is a two-byte hexadecimal number. In the	case of	Super
  DEC Kanji and	Japanese EUC, three-byte hexadecimal values that begin with
  SS3 (0x8f), such as 0x8fxxxx,	are also valid.

  You can specify ranges of UDC	from and to values in the same file entry by
  using	a hyphen to separate the codes that start and end each range:

       start_fromcode-end_fromcode   start_tocode-end_tocode

  When specifying entries that include ranges of values, the number of codes
  in the from range must always	equal the number of codes in the to range. A
  UDC mapping table can	also include blank lines and comment lines, which
  begin	with the # character. Following	is an example of a UDC mapping table:

       # JEF		       eucJP

       0x80a1-0x89fe	       0xf5a1-0xfefe	       # udc
       0x8aa1-0x93fe	       0x8ff5a1-0x8ffefe       # udc
       0x94a1-0x99fe	       0x8feea1-0x8ff3fe       # udc
       0x9aa1-0x9afe	       0x8ff4a1-0x8ff4fe       # udc

  The first entry in this file specifies a range of JEF	values from 0x80a1 to
  0x89fe that are mapped to Japanese EUC code values in	the range 0xf5a1 to
  0xfefe. You can find additional sample UDC mapping table files in the
  /usr/i18n/examples/iconv/data	directory.
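
  The range entries in the example can be expanded as sketched below. The
  sketch assumes that the last byte of each code cycles through 0xa1 to 0xfe
  before the preceding byte is incremented; this is inferred from the sample
  entries (each side of the first entry then covers 940 codes) and is not
  stated by this reference page. The function names are hypothetical.

```python
def expand_range(start, end):
    """List the codes from start to end, with the low byte in 0xa1-0xfe."""
    codes = []
    code = start
    while code <= end:
        codes.append(code)
        if (code & 0xff) == 0xfe:
            code = (((code >> 8) + 1) << 8) | 0xa1   # first cell of next row
        else:
            code += 1
    return codes


def parse_udc_entry(line):
    """Return a from-code to to-code dictionary for one table entry."""
    fields = line.split("#", 1)[0].split()
    if len(fields) != 2:
        raise ValueError("expected a from field and a to field")
    sides = []
    for field in fields:
        bounds = [int(part, 16) for part in field.split("-")]
        sides.append(expand_range(bounds[0], bounds[-1]))
    if len(sides[0]) != len(sides[1]):
        raise ValueError("from and to ranges differ in length")
    return dict(zip(sides[0], sides[1]))
```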

  EBCDIC-ISO Mapping Table


  Entries in an	EBCDIC-ISO mapping table adhere	to the following format:

       fromcode	      tocode

  Each code is a one-byte hexadecimal number. You can specify a	range of
  character codes as follows:

       start_fromcode-end_fromcode     start_tocode-end_tocode

  When using the range format, the number of hex values	in the from range
  must be the same as the number of hex	values in the to range.

  The EBCDIC-ISO mapping table can also include blank lines and comment
  entries, which begin with the # character.

  Following is an example of an EBCDIC-ISO code mapping table:

       # EBCDIC		       Kana

       0x40		       0x20	       # space
       0x4f		       0x21	       # '!'
       0x7f		       0x22	       # '"'
	 .			 .
	 .			 .
	 .			 .
       0xc1-0xc9	       0x41-0x49       # 'A' - 'I'
       0xd1-0xd9	       0x4a-0x52       # 'J' - 'R'
       0xe2-0xe9	       0x53-0x5a       # 'S' - 'Z'
	 .			 .
	 .			 .
	 .			 .

  In this example, the values in the first column are from codes and those in
  the second column are to codes.  The first three value entry lines specify
  mapping for single characters, whereas the last three	value entry lines
  specify mapping for ranges of	characters.  You can find additional sample
  EBCDIC-ISO mapping tables in the /usr/i18n/lib/nls/loc/iconv/data direc-
  tory.
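
  As a rough illustration, the entries shown in the example can be loaded
  into a lookup table and applied to a byte string. The function names
  build_table and translate are hypothetical, and the sketch covers only the
  six entries shown, not a complete mapping table.

```python
def build_table(lines):
    """Build a from-code to to-code dictionary from one-byte entries."""
    table = {}
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if not fields:
            continue                        # blank or comment line
        if len(fields) != 2:
            raise ValueError("expected a from field and a to field")
        sides = []
        for field in fields:
            bounds = [int(part, 16) for part in field.split("-")]
            sides.append(list(range(bounds[0], bounds[-1] + 1)))
        if len(sides[0]) != len(sides[1]):
            raise ValueError("from and to ranges differ in length")
        table.update(zip(sides[0], sides[1]))
    return table


def translate(data, table):
    """Map each byte of data through table."""
    return bytes(table[byte] for byte in data)


EXAMPLE = [
    "# EBCDIC    Kana",
    "0x40        0x20        # space",
    "0x4f        0x21        # '!'",
    "0x7f        0x22        # '\"'",
    "0xc1-0xc9   0x41-0x49   # 'A' - 'I'",
    "0xd1-0xd9   0x4a-0x52   # 'J' - 'R'",
    "0xe2-0xe9   0x53-0x5a   # 'S' - 'Z'",
]
```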

NOTES

  This reference page contains code conversion specifications that apply only
  to conversion	between	Fujitsu	JEF code and the DEC Kanji, Super DEC Kanji,
  Japanese EUC,	and Shift JIS codesets.	Refer to iconv_ibmkanji(5) for code
  conversion specifications between IBM	Kanji System characters	and the	DEC
  Kanji, Super DEC Kanji, Japanese EUC,	and Shift JIS codesets.	Refer to
  iconv_KEIS(5)	for code conversion specifications between Hitachi KEIS	char-
  acters and the DEC Kanji, Super DEC Kanji, Japanese EUC, and Shift JIS
  codesets.  Refer to iconv_intro(5) for information about conversion between
  DEC Kanji, Super DEC Kanji, Japanese EUC, Shift JIS, and other Tru64 UNIX
  codesets.

SEE ALSO

  Commands: iconv(1)

  Functions: iconv(3), iconv_close(3), iconv_open(3)

  Others: deckanji(5), eucJP(5), iconv_ibmkanji(5), iconv_intro(5),
  iconv_KEIS(5), Japanese(5), sdeckanji(5), SJIS(5)