NAWK(1)                     General Commands Manual                    NAWK(1)



NAME
       nawk - pattern scanning and processing language

SYNOPSIS
       nawk [ -f program-file ] [ -F c ] [ program ] [ variable=value ... ]
       [ filename... ]

DESCRIPTION
       nawk is a new version of awk(1) that provides additional features,
       including dynamic regular expressions, additional built-ins and
       operators, and user-defined functions.  Other implementations refer to
       this command by its original name, awk, choosing to replace the
       original program with the enhanced one.  Since there is a slight
       incompatibility between the two versions (see BUGS below), both
       versions are available in the SunOS environment: the original, awk,
       and the enhanced, nawk.

       nawk scans each input filename for lines that match any of a set of
       patterns specified in program.  The program string must be enclosed in
       single quotes (') to protect it from the shell.  Each pattern in
       program may have an associated action that is performed when a line of
       a filename matches the pattern.  The set of pattern-action statements
       may appear literally as program or in a file specified with the
       -f program-file option.

OPTIONS
       -f program-file
              Specify the contents of program-file as the source for the
              program.

       -F c   Set the input field separator to c.  If the field  separator  is
              longer  than  one character, it is taken to be a regular expres-
              sion, and should be enclosed in single quotes to protect special
              characters from the shell.

       variable=value
              Set  a built-in variable to value before the first record of the
              next filename is read.  See Built-in Variables below for a  com-
              plete list of available variables.

USAGE
   Input Lines
       Input  files  are  read  in  order; if there are no files, the standard
       input is read.  The file name `-' means the standard input.  Each input
       line  is  matched  against  the pattern portion of every pattern-action
       statement; the associated action is performed for each matched pattern.

       An input line is normally made up of fields separated by white space.
       (This default can be changed by using the FS built-in variable or the
       -F c option.)  The fields are denoted $1, $2, ...; $0 refers to the
       entire line.
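
       As a rough illustration of field splitting, the sketch below feeds two
       invented colon-separated records to the program and prints the field
       count with the first and third fields.  The function name fields_demo
       and the sample data are assumptions made for this example, and the
       command is spelled awk on the assumption that a compatible awk is
       installed under that name (on SunOS, substitute nawk).

```shell
# Sketch only: the colon-separated records are invented sample data.
# "awk" is assumed to be a compatible awk; on SunOS substitute nawk.
fields_demo() {
    printf 'root:0:/bin/sh\nguest:100:/bin/csh\n' |
    awk -F: '{ print NF, $1, $3 }'
}
fields_demo
```

       Each record should yield a line of the form "3 name shell", since -F:
       makes the colon the field separator.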

   Pattern-action Statements
       nawk programs contain pattern-action statements of the form:

              pattern { action }

       Either  pattern or action may be omitted.  If there is no action with a
       pattern, the matching line is printed.  If there is no pattern with  an
       action, the action is performed on every input line.

       Patterns are arbitrary Boolean combinations (!, ||, &&, and
       parentheses) of relational expressions and regular expressions.  A
       relational expression is one of the following:

              expression relop expression
              expression matchop regular expression
              expression in array-name
              (expression, expression, ...) in array-name

       where relop is any of the six relational operators in C, and matchop is
       either ~ (contains) or !~ (does not  contain).   An  expression  is  an
       arithmetic expression, a relational expression, the special expression

              var in array,

       or a Boolean combination of these.

       The  special  patterns  BEGIN  and  END  may be used to capture control
       before the first input line has been read and after the last input line
       has been read respectively.  They are the only patterns that require an
       action statement.  These keywords do not combine with  any  other  pat-
       terns.

       Regular  expressions  are as in egrep (see grep(1V)).  In patterns they
       must be surrounded by slashes.  Isolated regular expressions in a  pat-
       tern  apply  to the entire line.  Regular expressions may also occur in
       relational expressions.  A pattern may consist of  two  patterns  sepa-
       rated  by  a comma; in this case, the action is performed for all lines
       between an occurrence of the first pattern and the next  occurrence  of
       the second pattern.
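
       The pattern forms described above can be combined in one program.  The
       following sketch, which assumes invented sample input and a
       hypothetical function name pattern_demo, uses BEGIN and END, a range
       pattern, and a Boolean combination of a relational expression and a
       negated regular expression match.  The command name awk is an
       assumption; on SunOS use nawk.

```shell
# Sketch only: input lines are invented; "awk" stands in for nawk.
pattern_demo() {
    printf 'alpha 1\nstart 2\nbeta 30\nstop 4\ngamma 50\n' |
    awk '
        BEGIN                 { print "begin" }
        /start/, /stop/       { print "range:", $1 }
        $2 > 10 && $1 !~ /^g/ { print "big:", $1 }
        END                   { print "lines:", NR }
    '
}
pattern_demo
```

       The line "gamma 50" passes the numeric test but is rejected by the !~
       match, so only "beta" is reported as big.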

       An  action  is a sequence of statements.  A statement may be one of the
       following:

              if ( expression ) statement [ else statement ]
              while ( expression ) statement
              do statement while ( expression )
              for ( expression ; expression ; expression ) statement
              for ( var in array ) statement
              delete array[subscript]
              break
              continue
              { [ statement ] ... }
              expression     # commonly variable = expression
              print [ expression-list ] [ >expression ]
              printf format [ , expression-list ] [ >expression ]
              next      # skip remaining patterns on this input line
              exit [expr]    # skip the rest of the input; exit status is expr
              return [expr]

       Statements are terminated by semicolons, right braces, or NEWLINE char-
       acters.   An  empty  expression-list  stands  for the whole input line.
       Expressions take on string or numeric values as  appropriate,  and  are
       built  using  the operators +, -, *, /, %, and concatenation (indicated
       by a blank).  The C operators ++, --, +=, -=, *=, /=, and %=  are  also
       available  in  expressions.   Variables  may be scalars, array elements
       (denoted x[i]), or fields.   Variables  are  initialized  to  the  null
       string  or  zero.   Array subscripts may be any string, not necessarily
       numeric; this allows for a form of  associative  memory.   String  con-
       stants are quoted (").

       The  print statement prints its arguments on the standard output, or on
       a file if >expression is present, or on a pipe if `| cmd' is  present.
       The  arguments  are separated by the current output field separator and
       terminated by the output record separator.  The printf  statement  for-
       mats its expression list according to the format (see printf(3V)).
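
       A minimal sketch of string-valued array subscripts together with
       printf output sent through a pipe is given below.  It counts words in
       invented input and pipes the report to sort(1), because the order in
       which for ( var in array ) visits subscripts is not specified.  The
       function name count_demo is hypothetical, and awk is assumed to be a
       compatible awk (nawk on SunOS).

```shell
# Sketch only: word list is invented; "awk" stands in for nawk.
count_demo() {
    printf 'a b a\nc a b\n' |
    awk '
        # Tally every field of every line, using the word as subscript.
        { for (i = 1; i <= NF; i++) count[$i]++ }
        END {
            for (w in count)
                printf "%s %d\n", w, count[w] | "sort"
        }
    '
}
count_demo
```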

   Built-in Variables
       A  regular expression may be used to separate fields by using the -F  c
       option or by assigning the expression to the built-in variable FS.  The
       default  is  to  ignore leading blanks and to separate fields by blanks
       and/or tab characters.  However, if FS is  assigned  a  value,  leading
       blanks are no longer ignored.

       Built-in variables include:

       ARGC      Command line argument count.

       ARGV      Command line argument array.

       FILENAME  Name of the current input file.

       FNR       Ordinal number of the current record in the current file.

       FS        Input field separator regular expression (default blank).

       NF        Number of fields in the current record.

       NR        Ordinal number of the current record.

       OFMT      Output format for numbers (default %.6g).

       OFS       Output field separator (default blank).

       ORS       Output record separator (default NEWLINE).

       RS        Input record separator (default NEWLINE).

       SUBSEP    Separates multiple subscripts (default is octal 034).
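
       The sketch below shows one way several of these variables interact.
       It sets FS and OFS in a BEGIN action, then prints NR, NF and the
       record.  The assignment $1 = $1 is included on the assumption,
       true of common awk implementations, that assigning to a field causes
       $0 to be rebuilt with OFS between fields.  The function name vars_demo
       and the input are invented, and awk stands in for nawk.

```shell
# Sketch only: comma-separated input is invented; "awk" stands in for nawk.
vars_demo() {
    printf 'a,b,c\nd,e\n' |
    awk 'BEGIN { FS = ","; OFS = "-" }
         { $1 = $1; print NR, NF, $0 }'
}
vars_demo
```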

       nawk   has   a  variety  of  built-in  functions:  arithmetic,  string,
       input/output, and general.

       The arithmetic functions are: atan2, cos, exp,  int,  log,  rand,  sin,
       sqrt,  and  srand.   int  truncates  its  argument to an integer.  rand
       returns a random number between 0 and 1.  srand ( expr ) sets the  seed
       value for rand to expr or uses the time of day if expr is omitted.
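
       A short sketch of the arithmetic functions follows.  The seed value 1
       is an arbitrary choice made for the example, and the function name
       arith_demo is hypothetical; awk is assumed to be a compatible awk
       (nawk on SunOS).  Because the value returned by rand depends on the
       implementation, the example only checks that it lies in the expected
       range.

```shell
# Sketch only: "awk" stands in for nawk.
arith_demo() {
    awk 'BEGIN {
        # int truncates toward zero for both signs.
        print int(3.9), int(-3.9), sqrt(16), exp(0), log(1)
        srand(1)                     # arbitrary fixed seed
        r = rand()
        if (r >= 0 && r < 1) print "rand in range"
    }'
}
arith_demo
```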

       The string functions are:

       gsub(for, repl, in)
                 behaves like sub (see below), except that it replaces succes-
                 sive occurrences of  the  regular  expression  (like  the  ed
                 global substitute command).

       index(s,t)
                 returns the position in string s where string t first occurs,
                 or 0 if it does not occur at all.

       int       truncates to an integer value.

       length(s) returns the length of its argument taken as a string,  or  of
                 the whole line if there is no argument.

       match(s, re)
                 returns the position in string s where the regular expression
                 re occurs, or 0 if it does not occur at all.  RSTART  is  set
                 to  the  starting position (which is the same as the returned
                 value), and RLENGTH is set  to  the  length  of  the  matched
                 string.

       rand      random number on (0, 1).

       split(s, a, fs)
                 splits the string s into array elements a[1], a[2], ..., a[n], and
                 returns n.  The separation is done with the  regular  expres-
                 sion fs or with the field separator FS if fs is not given.

       srand     sets the seed for rand.

       sprintf(fmt, expr, expr, ...)
                 formats  the  expressions  according to the printf(3V) format
                 given by fmt and returns the resulting string.

       sub(for, repl, in)
                 substitutes the string repl in place of the first instance of
                 the  regular expression for in string in and returns the num-
                 ber of substitutions.  If in is omitted, nawk substitutes  in
                 the current record ($0).

       substr(s, m, n)
                 returns  the  n-character substring of s that begins at posi-
                 tion m.
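
       The string functions can be exercised without any input file by
       placing the calls in a BEGIN action.  The sketch below is one such
       arrangement; the test strings and the function name string_demo are
       invented, and awk is assumed to be a compatible awk (nawk on SunOS).
       The results of sub and gsub are saved in a variable before printing so
       that the example does not depend on argument evaluation order.

```shell
# Sketch only: strings are invented; "awk" stands in for nawk.
string_demo() {
    awk 'BEGIN {
        s = "hello world"
        print index(s, "o"), length(s), substr(s, 7, 5)
        if (match(s, /wor/)) print RSTART, RLENGTH
        n = split("a:b:c", parts, ":")
        print n, parts[1], parts[3]
        t = s
        n = sub(/o/, "0", t);  print n, t     # first occurrence only
        n = gsub(/o/, "0", s); print n, s     # every occurrence
        print sprintf("%03d|%-4s|", 7, "ab")
    }'
}
string_demo
```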

       The input/output and general functions are:

       close(filename)
                 closes the file or pipe named filename.

       cmd | getline
                 pipes the output of cmd into getline; each successive call to
                 getline returns the next line of output from cmd.

       getline   sets $0 to the next input record from the current input file.

       getline <file
                 sets $0 to the next record from file.

       getline x sets variable x instead.

       getline x <file
                 sets x from the next record of file.

       system(cmd)
                 executes cmd and returns its exit status.

       All  forms of getline return 1 for successful input, 0 for end of file,
       and -1 for an error.
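
       As a hedged illustration of the return values above, the following
       sketch reads the output of a command with cmd | getline until getline
       stops returning 1, closes the pipe, and then reports the exit status
       of a second command run with system.  The command strings and the
       function name io_demo are invented for the example, and awk is
       assumed to be a compatible awk (nawk on SunOS).

```shell
# Sketch only: commands are invented; "awk" stands in for nawk.
io_demo() {
    awk 'BEGIN {
        cmd = "echo one; echo two"
        # Parentheses keep the comparison from binding to the variable.
        while ((cmd | getline line) > 0)
            print "got", line
        close(cmd)
        status = system("exit 3")
        print "status", status
    }'
}
io_demo
```

       Comparing against 0 with > rather than testing for a nonzero value
       keeps the loop from spinning forever when getline returns -1.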

       nawk also provides  user-defined  functions.   Such  functions  may  be
       defined (in the pattern position of a pattern-action statement) as

              function name(args,...) { stmts }
              func name(args,...) { stmts }

       Function  arguments  are  passed by value if scalar and by reference if
       array name.  Argument names are local to the function; all other  vari-
       able  names are global.  Function calls may be nested and functions may
       be recursive.  The return statement may be used to return a value.
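
       The passing rules above can be seen in a small sketch.  Here fact is
       recursive, fill modifies the caller's array because arrays are passed
       by reference, and bump leaves the caller's scalar unchanged because
       scalars are passed by value.  The extra argument i in fill is never
       supplied by the caller; it is declared only so that the loop counter
       is local.  All function names are hypothetical, and awk is assumed to
       be a compatible awk (nawk on SunOS).

```shell
# Sketch only: function names are invented; "awk" stands in for nawk.
func_demo() {
    awk '
        function fact(n) { if (n <= 1) return 1; return n * fact(n - 1) }
        function fill(arr, n,    i) { for (i = 1; i <= n; i++) arr[i] = i * i }
        function bump(x) { x = x + 1; return x }
        BEGIN {
            print fact(5)
            fill(sq, 3); print sq[3]      # array changed by the callee
            v = 1; bump(v); print v       # scalar unchanged in the caller
        }
    '
}
func_demo
```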

EXAMPLES
       Print lines longer than 72 characters:

              length > 72

       Print first two fields in opposite order:

              { print $2, $1 }

       Same, with input fields separated by comma and/or blanks and tabs:

              BEGIN { FS = ",[ \t]*|[ \t]+" }
                     { print $2, $1 }

       Add up first column, print sum and average:

              { s += $1 }
              END  { print "sum is", s, " average is", s/NR }

       Print fields in reverse order:

              { for (i = NF; i > 0; --i) print $i }

       Print all lines between start/stop pairs:

              /start/, /stop/

       Print all lines whose first field is different from previous one:

              $1 != prev { print; prev = $1 }

       Simulate echo(1V):

              BEGIN {
                   for (i = 1; i < ARGC; i++)
                        printf "%s ", ARGV[i]
                   printf "\n"
                   exit
              }

       Print file, filling in page numbers starting at 5:

              /Page/ { $2 = n++; }
                     { print }

              example%  nawk -f program n=5 input

SEE ALSO
       grep(1V), lex(1), sed(1V), printf(3V)

       A. V. Aho, B. W. Kernighan, P. J. Weinberger, The AWK Programming
       Language, Addison-Wesley, 1988.

BUGS
       Input white space is not preserved on output if fields are involved.

       There  are  no  explicit  conversions  between numbers and strings.  To
       force an expression to be treated as a number add 0 to it; to force  it
       to be treated as a string concatenate the null string ("") to it.
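
       A brief sketch of both idioms follows.  String constants compare as
       strings, so "10" sorts before "9" until 0 is added to each side;
       conversely, concatenating "" to numeric values makes them compare as
       strings.  The values and the function name conv_demo are invented for
       the example, and awk is assumed to be a compatible awk (nawk on
       SunOS).

```shell
# Sketch only: values are invented; "awk" stands in for nawk.
conv_demo() {
    awk 'BEGIN {
        a = "10"; b = "9"
        if (a < b)         print "string compare: 10 < 9"
        if (a + 0 > b + 0) print "numeric compare: 10 > 9"
        x = 3; y = 12
        if ((x "") > (y "")) print "forced strings: 3 > 12"
    }'
}
conv_demo
```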

       Pattern-action statements must be separated by either a semicolon or a
       NEWLINE.  This is an incompatibility with the old version of awk.



                                9 October 1989                         NAWK(1)