Manual: (OSF1-V5.1-alpha)



tr(1)									tr(1)



NAME

  tr - Translates characters

SYNOPSIS

  tr [-Acs] string1 string2

  tr -ds  [-Ac]	string1	string2

  tr -d	 [-Ac] string1

  tr -s	 [-Ac] string1

  The tr command copies	characters from	the standard input to the standard
  output with substitution or deletion of selected characters.

STANDARDS

  Interfaces documented	on this	reference page conform to industry standards
  as follows:

  tr: XCU5.0

  Refer	to the standards(5) reference page for more information	about indus-
  try standards	and associated tags.

OPTIONS

  -A  [Tru64 UNIX]  Translates on a byte-by-byte basis.	 When you specify
      this option, tr does not support extended	characters.

  -c  Complements (inverts) the	set of characters in string1, which is the
      set of all characters in the current character set, as defined by	the
      current setting of LC_CTYPE, except for those actually specified in the
      string1 argument.	These characters are placed in the array in ascending
      collation	sequence, as defined by	the current setting of LC_COLLATE.

  -d  Deletes all occurrences of input characters or collating elements	found
      in the array specified in	string1.

  -s  Replaces each sequence of two or more repetitions of a character
      specified in string1 with a single instance of that character or, when
      string2 is given, of the corresponding character in string2.
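
  The options above can be tried on inline input.  The following is a
  minimal sketch, not part of the original reference page; the sample
  strings are made up, and LC_ALL=C is assumed so that the character set
  is the POSIX one:

```shell
# Sample input is supplied with printf so that no files are needed.
# LC_ALL=C pins the character set and collation for predictable results.

# -d: delete every character listed in string1.
deleted=$(printf 'a1b22c333\n' | LC_ALL=C tr -d '0123456789')

# -s: squeeze each run of a character listed in string1 to one instance.
squeezed=$(printf 'aaa  bbb   c\n' | LC_ALL=C tr -s ' ')

# -c: complement string1; every character that is NOT a digit is
# translated to an underscore (no trailing newline is sent here,
# because it would be translated as well).
complemented=$(printf 'a1b22c' | LC_ALL=C tr -c '0123456789' '[_*]')

printf '%s\n' "$deleted" "$squeezed" "$complemented"
```

  The three lines printed should be abc, aaa bbb c, and _1_22_.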

OPERANDS

  string1

  string2
      Translation control strings as explained in the DESCRIPTION section.



DESCRIPTION

  Input	characters from	string1	are replaced with the corresponding charac-
  ters in string2.  The	tr command cannot handle an ASCII NUL (\000) in
  string1 or string2; it always	deletes	NUL from the input.

  [Tru64 UNIX]	The trbsd command is a BSD compatible version of tr.

  The following	constructs can be used to specify characters or	single-
  character collating elements.	If any of these	constructs result in mul-
  ticharacter collating	elements, tr excludes those elements from the result-
  ing array without issuing a diagnostic.

  A character
      Represents itself	when not described by one of the other conventions in
      this list.

  \octal-sequence. . .
      Represents a character by	using its octal	value. An octal	sequence con-
      sists of a backslash followed by the longest sequence of one-, two-, or
      three-octal-digit	characters (01234567). The sequence causes the char-
      acter whose encoding is represented by the one-, two-, or	three-digit
      octal value to be	placed in the string.

  \\, \a, \b, \f, \n, \r, \t, \v
      Represent	standard backslash-escape sequences. No	results	are defined
      by the Single UNIX Specification for specifying characters after a
      backslash	other than the ones listed here. In portable applications, a
      backslash	should be followed only	by an octal sequence, another
      backslash, or the	lowercase letter a, b, f, n, r,	t, or v.

      [Tru64 UNIX]  On UNIX systems, you can enclose string operands in	quo-
      tation marks or specify a	backslash before some characters, such as *
      (an asterisk), to	remove the special meaning of those characters to the
      shell.

  c1-c2
      Represents a range of collating elements between the specified range
      endpoints, inclusive, as defined by the current locale setting of	the
      LC_COLLATE category. The starting	element, c1, must precede the ending
      element, c2, in the current collation order. The characters or collat-
      ing elements in the range	are placed in the associated string in
      ascending	collation sequence.  Note that the collation sequence for
      ASCII characters,	such as	letters	in the English alphabet, may vary
      among locales. In	the POSIX locale, for example, a-z produces a string
      with all English lowercase letters in English alphabetical order.	How-
      ever, when LC_COLLATE is set to a	different locale, English lowercase
      letters may be subject to	a different collation order. Therefore,	a-z
      may produce a different result for locales other than the	POSIX locale.

  [c*number]
      Stands for number	repetitions of the character c.	 The number is con-
      sidered to be in decimal unless the first	digit of number	is 0; then it
      is considered to be in octal.  This format is valid only as string2.

  [=equiv=]
      Represents all characters	or collating elements belonging	to the
      equivalence class	specified by equiv, as defined by the LC_COLLATE
      locale category.  An equivalence class expression can be used in
      string1, or in string2 only when the -d and -s options are specified
      together.  (For more information, see the locale(4) reference page.)

  [:class:]
      Represents all characters	belonging to the defined character class, as
      defined by the current setting of	the LC_CTYPE locale category.  The
      following	character class	names are accepted when	specified in string1:


	   alnum   cntrl   lower   space
	   alpha   digit   print   upper
	   blank   graph   punct   xdigit

      If the current locale defines additional keywords (by including
      additional charclass definitions in the LC_CTYPE category), the tr
      command also recognizes those keywords as class values.

      When the -d and -s options are specified together, any of	the character
      class names are accepted in string2; otherwise, only character class
      names lower or upper are accepted	in string2 and then only if the	class
      complement, (upper or lower, respectively) is specified in the same
      relative position	in string1.  Such a specification is interpreted as a
      request for case conversion.

      When [:lower:] appears in	string1	and [:upper:] appears in string2, the
      arrays contain the characters from the toupper mapping in	the LC_CTYPE
      category of the current locale.  When [:upper:] appears in string1 and
      [:lower:]	appears	in string2, the	arrays contain the characters from
      the tolower mapping in the LC_CTYPE category of the current locale.

      The first	character from each mapping pair is in the array for string1
      and the second character from each mapping pair is in the	array for
      string2 in the same relative position.
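
  The constructs listed above can be illustrated with one short command
  each.  This is a sketch under the assumption of the POSIX locale
  (LC_ALL=C); the input strings are invented for the illustration:

```shell
# LC_ALL=C selects the POSIX locale so that ranges and classes are predictable.

# Octal sequence: \101 is the ASCII encoding of A.
octal=$(printf 'banana\n' | LC_ALL=C tr 'a' '\101')

# Backslash escapes: translate each tab to a newline.
escaped=$(printf 'one\ttwo\n' | LC_ALL=C tr '\t' '\n')

# Range c1-c2: a-c and x-z each expand to three characters.
ranged=$(printf 'cab\n' | LC_ALL=C tr 'a-c' 'x-z')

# [c*number]: string2 becomes three copies of x.
repeated=$(printf 'abc\n' | LC_ALL=C tr 'abc' '[x*3]')

# Character classes in the same relative position request case conversion.
upper=$(printf 'tru64\n' | LC_ALL=C tr '[:lower:]' '[:upper:]')

printf '%s\n' "$octal" "$escaped" "$ranged" "$repeated" "$upper"
```

  The strings are enclosed in single quotation marks so that the shell
  passes the backslashes and brackets to tr unchanged.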

  [Tru64 UNIX]	When string2 is	shorter	than string1, a	difference results
  between historical System V and BSD systems.	A BSD system pads string2
  with the last	character found	in string2.  Thus, it is possible to do	the
  following:

       tr 0123456789 d

  [Tru64 UNIX]	The preceding command translates all digits to the letter d.
  A portable application cannot	rely on	the BSD	behavior; it would have	to
  code the example in the following way:

       tr 0123456789 '[d*]'

  [Tru64 UNIX]	If a given character appears more than once in string1,	the
  character in string2 corresponding to	its last appearance in string1 will
  be used in the translation.

  If the -c and	-d options are both specified, all characters except those
  specified by string1 are deleted. The	contents of string2 are	ignored,
  unless -s is also specified.	Note, however, that the	same string cannot be
  used for both	the -d and the -s options; when	both options are specified,
  both string1 (used for deletion) and string2 (used for squeezing) are
  required.
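
  The two option combinations described above might be used as follows.
  This is an illustrative sketch with invented input, assuming the POSIX
  locale:

```shell
# -c with -d: delete everything except the characters in string1.
# The newline is listed alongside the digits so that it survives.
digits_only=$(printf 'tel: 555-0100\n' | LC_ALL=C tr -cd '0123456789\n')

# -d with -s: string1 names the characters to delete (periods),
# string2 names the characters to squeeze (commas).
cleaned=$(printf 'a..b,,,,c..d\n' | LC_ALL=C tr -ds '.' ','
)

printf '%s\n' "$digits_only" "$cleaned"
```

  The first command should leave 5550100 and the second ab,cd.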

  If the -d option is not specified, each input	character or collating ele-
  ment found in	the array specified by string1 is replaced by the character
  or collating element in the same relative position in	the array specified
  by string2.

  When the -s option is	specified, if the string2 contains a character class,
  the argument's array contains	all of the characters in that character
  class.  For example:

       tr -s '[:space:]'

  In a case conversion,	however, the string2 array contains only those char-
  acters defined as the	second characters in each of the toupper or tolower
  character pairs, as appropriate. For example:

       tr -s '[:upper:]' '[:lower:]'
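
  The effect of the two preceding commands can be observed on inline
  input.  This is a sketch, assuming the POSIX locale; the sample strings
  are not from the original page:

```shell
# One operand: a run of any single white-space character collapses to one.
spaced=$(printf 'a   b\t\tc\n' | LC_ALL=C tr -s '[:space:]')

# Case conversion with -s: letters are lowercased, then runs of
# lowercase letters are squeezed. The two spaces are not in string2,
# so they are left alone.
lowered=$(printf 'AABB  cc\n' | LC_ALL=C tr -s '[:upper:]' '[:lower:]')

printf '%s\n' "$spaced" "$lowered"
```

  Note that the run cc is squeezed as well, because c belongs to the
  array built from '[:lower:]'.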





  System V Compatibility


  [Tru64 UNIX]	The root of the	directory tree that contains the commands
  modified for SVID 2 compliance is specified in the file /etc/svid2_path.
  You can use /etc/svid2_profile as the	basis for, or to include in, your
  .profile.  The file /etc/svid2_profile reads /etc/svid2_path and sets	the
  first	entries	in the PATH environment	variable so that the modified SVID 2
  commands are found first.

  [Tru64 UNIX]	In the SVID 2 compliant	version	of the tr command, only	char-
  acters in the	octal range of 1 to 377	are complemented when you specify the
  -c option.  This occurs because the -A option is implicitly turned on
  when you specify the -c option.

NOTES

   1.  [Tru64 UNIX]  Specifying	the -A option improves ASCII performance.

   2.  Despite similarities in appearance, the string arguments	used by	tr
       are not regular expressions.

   3.  The tr command correctly processes NUL characters in its input
       stream.  NUL characters can be stripped using the following command:
            tr -d '\000'

   4.  If string1 or string2 is	the empty string, results are undefined	and
       unpredictable.
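
  The command in note 3 can be checked by counting bytes before and after
  the filter.  This is a minimal sketch; the three-byte sample (a, NUL, b)
  is invented, and printf is assumed to emit a NUL byte for \000:

```shell
# Three bytes go in: a, NUL, b.
before=$(printf 'a\000b' | wc -c)

# After tr -d '\000' only a and b remain.
after=$(printf 'a\000b' | LC_ALL=C tr -d '\000' | wc -c)
text=$(printf 'a\000b' | LC_ALL=C tr -d '\000')

# Arithmetic expansion strips any padding that wc adds around the count.
printf '%s bytes before, %s after: %s\n' "$((before))" "$((after))" "$text"
```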

EXIT STATUS

  The following	exit values are	returned:

  0   Successful completion.

  >0  An error occurred.
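
  A script can branch on these values.  The following sketch assumes that
  invoking tr with no operands is treated as a usage error, which the
  SYNOPSIS suggests but this page does not state explicitly:

```shell
# A normal translation should complete with exit status 0.
if printf 'abc\n' | LC_ALL=C tr 'abc' 'xyz' >/dev/null; then
    ok_status=0
else
    ok_status=$?
fi

# No operands at all: assumed to be a usage error. Diagnostics are
# discarded so that only the exit status is examined.
if LC_ALL=C tr </dev/null >/dev/null 2>&1; then
    err_status=0
else
    err_status=$?
fi

printf 'success=%s error=%s\n' "$ok_status" "$err_status"
```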

EXAMPLES

   1.  To translate braces into	parentheses, enter:
            tr '{}' '()' <textfile >newfile

       This translates each { (left brace) to (	(left parenthesis) and each }
       (right brace) to	) (right parenthesis).	All other characters remain
       unchanged.

   2.  In the POSIX locale, to translate lowercase ASCII characters to upper-
       case, you can enter:
            tr 'a-z' 'A-Z' <textfile >newfile

       This command assumes that English letters are collated in English
       alphabetical order, which may not be true for locales other than	the
       POSIX locale. The following command is recommended for case conversion
       for all locales:
            tr '[:lower:]' '[:upper:]' <textfile >newfile

   3.  The two strings can be of different lengths:
            tr '0-9' '#' <textfile >newfile

       This translates each 0 into a # (number sign) but leaves the digits 1
       to 9 unchanged; if the two strings are not the same length, the extra
       characters in the longer one are ignored.

   4.  To translate each digit to a # (number sign), enter:
            tr '0-9' '[#*]' <textfile >newfile

       The * (asterisk)	tells tr to repeat the # (number sign) enough times
       to make the second string as long as the	first one.

   5.  To translate each string	of digits to a single #	(number	sign), enter:
            tr -s '0-9' '[#*]' <textfile >newfile

   6.  In the POSIX locale, to translate all ASCII characters that are not
       specified, enter:
            tr -c '[ -~]' '[A-_]' <textfile >newfile

       This translates each nonprinting	ASCII character	to the next following
       corresponding control key letter	(\001 translates to B, \002 to C, and
       so on).	ASCII DEL (\177), the character	that follows ~ (tilde),
       translates to a ] (right	bracket).  This	command	assumes	that ASCII
       characters are collated in a certain order, which may not be true for
       locales other than the POSIX locale.

   7.  To create a list	of all words in	file1 one per line in file2, where a
       word is taken to	be a maximal string of letters,	enter:
            tr -cs '[:alpha:]' '[\n*]' < file1 > file2

   8.  To use an equivalence class to identify accented	variants of the	base
       character e in file1, which are stripped	of diacritical marks and
       written to file2, enter:
            tr '[=e=]' '[e*]' < file1 > file2

       Equivalence classes are locale dependent. Some locales may not include
       equivalence classes to associate	base letters and their accented	vari-
       ants.
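
  Examples 1, 4, 5, and 7 can also be run without files by piping sample
  text into tr.  This is a sketch only; the input strings are invented,
  and the POSIX locale is assumed:

```shell
# Example 1: braces to parentheses.
braces=$(printf 'f{x}\n' | LC_ALL=C tr '{}' '()')

# Example 4: every digit becomes a number sign.
masked=$(printf 'pin 4711\n' | LC_ALL=C tr '0-9' '[#*]')

# Example 5: each string of digits becomes a single number sign.
collapsed=$(printf 'pin 4711\n' | LC_ALL=C tr -s '0-9' '[#*]')

# Example 7: one word per line, a word being a maximal string of letters.
words=$(printf 'one, two; three\n' | LC_ALL=C tr -cs '[:alpha:]' '[\n*]')

printf '%s\n' "$braces" "$masked" "$collapsed" "$words"
```

  In the last command each run of characters that are not letters is
  translated to newlines and then squeezed to a single newline.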

ENVIRONMENT VARIABLES

  The following	environment variables affect the execution of tr:

  LANG
      Provides a default value for the internationalization variables that
      are unset	or null. If LANG is unset or null, the corresponding value
      from the default locale is used.	If any of the internationalization
      variables	contain	an invalid setting, the	utility	behaves	as if none of
      the variables had	been defined.

  LC_ALL
      If set to	a non-empty string value, overrides the	values of all the
      other internationalization variables.

  LC_COLLATE
      Determines the locale for	the behavior of	range expressions and
      equivalence classes.

  LC_CTYPE
      Determines the locale for	the interpretation of sequences	of bytes of
      text data	as characters (for example, single-byte	as opposed to multi-
      byte characters in arguments) and	the behavior of	character classes.

  LC_MESSAGES
      Determines the locale for	the format and contents	of diagnostic mes-
      sages written to standard	error.

  NLSPATH
      Determines the location of message catalogues for	the processing of
      LC_MESSAGES.
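
  Because LC_ALL overrides the other internationalization variables, it
  can be set for a single invocation to obtain POSIX-locale behavior for
  range expressions.  This is a minimal sketch with an invented sample
  string:

```shell
# The assignment applies to this one tr process only; the locale of
# the calling shell is left unchanged.
upper=$(printf 'tru64 alpha\n' | LC_ALL=C tr 'a-z' 'A-Z')

printf '%s\n' "$upper"
```

  Setting the variable on the command line in this way avoids exporting
  it to later commands in the same script.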



SEE ALSO

  Commands:  ed(1), ksh(1), sed(1), Bourne shell sh(1b), POSIX shell sh(1p),
  trbsd(1)

  Files:  ascii(5)

  Standards:  standards(5)