unixdev.net


Switch to SpeakEasy.net DSL

The Modular Manual Browser

Home Page
Manual: (SunOS-5.10)
Page:
Section:
Apropos / Subsearch:
optional field

awk(1)                           User Commands                          awk(1)



NAME
       awk - pattern scanning and processing language

SYNOPSIS
       /usr/bin/awk  [-f progfile]  [-F c]  [  '  prog  '] [parameters] [file-
       name...]

       /usr/xpg4/bin/awk [-F ERE] [-v assignment...] 'program'  -f progfile...
       [argument...]

DESCRIPTION
       The /usr/xpg4/bin/awk utility is described on the nawk(1) manual page.

       The /usr/bin/awk utility scans each input filename for lines that match
       any of a set of patterns specified in prog. The  prog  string  must  be
       enclosed  in  single quotes ( ') to protect it from the shell. For each
       pattern in prog there may be an associated action performed when a line
       of  a  filename  matches the pattern.  The set of pattern-action state-
       ments may appear literally as prog or in a file specified with  the  -f
       progfile  option. Input files are read in order; if there are no files,
       the standard input is read. The file name '-' means the standard input.

OPTIONS
       The following options are supported:

       -f progfile     awk uses the set of patterns it reads from progfile.



       -Fc             Uses the character c as the field separator (FS)  char-
                       acter.  See the discussion of FS below.



USAGE
   Input Lines
       Each  input  line  is matched against the pattern portion of every pat-
       tern-action statement; the associated  action  is  performed  for  each
       matched  pattern.  Any  filename of the form var=value is treated as an
       assignment, not a filename, and is executed at the time it  would  have
       been  opened  if  it were a filename. Variables assigned in this manner
       are not available inside a BEGIN rule, and are  assigned  after  previ-
       ously specified files have been read.

       An  input line is normally made up of fields separated by white spaces.
       (This default can be changed by using the FS built-in variable  or  the
       -Fc  option.)  The  default is to ignore leading blanks and to separate
       fields by blanks and/or tab characters. However, if FS  is  assigned  a
       value  that  does  not  include  any  of the white spaces, then leading
       blanks are not ignored. The fields are denoted $1, $2, ...;  $0  refers
       to the entire line.

   Pattern-action Statements
       A pattern-action statement has the form:


       pattern { action }


       Either  pattern  or  action  may be omitted. If there is no action, the
       matching line is printed. If there is no pattern, the  action  is  per-
       formed  on every input line. Pattern-action statements are separated by
       newlines or semicolons.

       Patterns are arbitrary Boolean combinations ( !, ||, &&&&, and  parenthe-
       ses)  of  relational  expressions and regular expressions. A relational
       expression is one of the following:


       expression relop expression
       expression matchop regular_expression



       where a relop is any of the  six  relational  operators  in  C,  and  a
       matchop  is either ~ (contains) or !~ (does not contain). An expression
       is an arithmetic  expression,  a  relational  expression,  the  special
       expression


       var in array



       or a Boolean combination of these.

       Regular  expressions  are as in egrep(1). In patterns they must be sur-
       rounded by slashes. Isolated regular expressions in a pattern apply  to
       the  entire  line.  Regular  expressions  may  also occur in relational
       expressions. A pattern may consist  of  two  patterns  separated  by  a
       comma;  in this case, the action is performed for all lines between the
       occurrence of the first pattern to the occurrence of  the  second  pat-
       tern.

       The  special  patterns  BEGIN  and  END  may be used to capture control
       before the first input line has been read and after the last input line
       has  been  read  respectively.  These  keywords do not combine with any
       other patterns.

   Built-in Variables
       Built-in variables include:

       FILENAME        name of the current input file



       FS              input field separator regular expression (default blank
                       and tab)



       NF              number of fields in the current record



       NR              ordinal number of the current record



       OFMT            output format for numbers (default %.6g)



       OFS             output field separator (default blank)



       ORS             output record separator (default new-line)



       RS              input record separator (default new-line)



       An  action  is  a sequence of statements. A statement may be one of the
       following:


       if ( expression ) statement [ else statement ]
       while ( expression ) statement
       do statement while ( expression )
       for ( expression ; expression ; expression ) statement
       for ( var in array ) statement
       break
       continue
       { [ statement ] ... }
       expression      # commonly variable = expression
       print [ expression-list ] [ >expression ]
       printf format [ ,expression-list ] [ >expression ]
       next            # skip remaining patterns on this input line
       exit [expr]     # skip the rest of the input; exit status is expr



       Statements are terminated by semicolons, newlines, or right braces.  An
       empty expression-list stands for the whole input line. Expressions take
       on string or numeric values as appropriate, and  are  built  using  the
       operators  +,  -,  *, /, %, ^ and concatenation (indicated by a blank).
       The operators ++, --, +=, -=, *=, /=, %=, ^=, >&gt;, >&gt;=, <&lt;, <&lt;=, ==, !=, and
       ?:  are  also available in expressions. Variables may be scalars, array
       elements (denoted x[i]), or fields. Variables are  initialized  to  the
       null  string or zero. Array subscripts may be any string, not necessar-
       ily numeric; this allows for a form of associative memory. String  con-
       stants are quoted (""), with the usual C escapes recognized within.

       The  print statement prints its arguments on the standard output, or on
       a file if >&gt;expression is present, or on a pipe if  '|cmd'  is  present.
       The  output resulted from the print statement is terminated by the out-
       put record separator with each argument separated by the current output
       field  separator.  The  printf  statement  formats  its expression list
       according to the format (see printf(3C)).

   Built-in Functions
       The arithmetic functions are as follows:

       cos(x)          Return  cosine  of  x,  where  x  is  in  radians.  (In
                       /usr/xpg4/bin/awk only. See nawk(1).)



       sin(x)          Return   sine   of  x,  where  x  is  in  radians.  (In
                       /usr/xpg4/bin/awk only. See nawk(1).)



       exp(x)          Return the exponential function of x.



       log(x)          Return the natural logarithm of x.



       sqrt(x)         Return the square root of x.



       int(x)          Truncate its argument to an integer. It will  be  trun-
                       cated toward 0 when x > 0.



       The string functions are as follows:

       index(s, t)

           Return  the  position in string s where string t first occurs, or 0
           if it does not occur at all.



       int(s)

           truncates s to an integer value. If s is not specified, $0 is used.



       length(s)

           Return the length of its argument taken as  a  string,  or  of  the
           whole line if there is no argument.



       split(s, a, fs)

           Split  the  string  s into array elements a[1], a[2], ... a[n], and
           returns n. The separation is done with the regular expression fs or
           with the field separator FS if fs is not given.



       sprintf(fmt, expr, expr,...)

           Format  the expressions according to the printf(3C) format given by
           fmt and returns the resulting string.



       substr(s, m, n)

           returns the n-character substring of s that begins at position m.



       The input/output function is as follows:

       getline         Set $0 to the next input record from the current  input
                       file. getline returns 1 for successful input, 0 for end
                       of file, and -1 for an error.



   Large File Behavior
       See largefile(5) for the  description  of  the  behavior  of  awk  when
       encountering files greater than or equal to 2 Gbyte ( 2**31 bytes).

EXAMPLES
       Example 1: Printing lines longer than 72 characters

       example% length >&gt; 72

       Example 2: Printing first two fields in opposite order

       example% { print $2, $1 }

       Example 3: Same, with input fields separated by comma and/or blanks and
       tabs

       example% BEGIN { FS = ",[ \t]*|[ \t]+" }
             { print $2, $1 }

       Example 4: Adding up first column, print sum and average

       example% { s += $1 }
       END  { print "sum is", s, " average is", s/NR }

       Example 5: Printing fields in reverse order

       example% { for (i = NF; i >&gt; 0; --i) print $i }

       Example 6: Printing all lines between start/stop pairs

       example% /start/, /stop/

       Example 7: Printing all lines whose first field is different  from  the
       previous one

       example% $1 != prev { print; prev = $1 }

       Example 8: Printing a file, filling in page numbers starting at 5

       example% /Page/     { $2 = n++; }
                    { print }

       Example 9: Printing a file and numbering its pages, starting at 5

       Assuming  this  program  is in a file named prog, the following command
       line prints the file input numbering its pages starting at 5:

       example% awk -f prog n=5 input

ENVIRONMENT VARIABLES
       See environ(5) for descriptions of the following environment  variables
       that  affect  the execution of awk: LANG, LC_ALL, LC_COLLATE, LC_CTYPE,
       LC_MESSAGES, NLSPATH, and PATH.

       LC_NUMERIC      Determine the radix character  used  when  interpreting
                       numeric  input,  performing conversions between numeric
                       and  string  values  and  formatting  numeric   output.
                       Regardless  of  locale, the period character (the deci-
                       mal-point character of the POSIX locale) is  the  deci-
                       mal-point  character  recognized in processing awk pro-
                       grams  (including  assignments  in  command-line  argu-
                       ments).



ATTRIBUTES
       See attributes(5) for descriptions of the following attributes:

   /usr/bin/awk
       tab()     allbox;     cw(2.750000i)|    cw(2.750000i)    lw(2.750000i)|
       lw(2.750000i).   ATTRIBUTE  TYPEATTRIBUTE   VALUE   AvailabilitySUNWesu
       CSIEnabled


   /usr/xpg4/bin/awk
       tab()     allbox;     cw(2.750000i)|    cw(2.750000i)    lw(2.750000i)|
       lw(2.750000i).   ATTRIBUTE  TYPEATTRIBUTE  VALUE   AvailabilitySUNWxcu4
       CSIEnabled Interface StabilityStandard


SEE ALSO
       egrep(1),  grep(1),  nawk(1),  sed(1), printf(3C), attributes(5), envi-
       ron(5), largefile(5), standards(5)

NOTES
       Input white space is not preserved on output if fields are involved.

       There are no explicit conversions between numbers and strings. To force
       an  expression  to  be  treated  as  a number, add 0 to it. To force an
       expression to be treated as a string, concatenate the null string  ("")
       to it.



SunOS 5.10                        7 Jul 2000                            awk(1)