BPF(4)               BSD Programmer's Manual               BPF(4)


NAME
       bpf - Berkeley Packet Filter

SYNOPSIS
       pseudo-device bpfilter 16

DESCRIPTION
       The  Berkeley  Packet  Filter  provides a raw interface to
       data link layers in a protocol independent  fashion.   All
       packets  on  the  network,  even  those destined for other
       hosts, are accessible through this mechanism.

       The packet filter appears as a character special device,
       /dev/bpf0, /dev/bpf1, etc.  After opening the device, the
       file descriptor must be bound to a specific network
       interface with the BIOCSETIF ioctl.  A given interface can
       be shared by multiple listeners, and the filter underlying
       each descriptor will see an identical packet stream.  The
       total number of open files is limited to the value given
       in the kernel configuration; the example given in the
       SYNOPSIS above sets the limit to 16.

       A separate device file is required for each minor  device.
       If  a file is in use, the open will fail and errno will be
       set to EBUSY.
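
       Because a busy minor device fails with EBUSY, a program
       usually probes the device files in turn.  The following is
       a minimal sketch of that loop, not code from the BPF
       distribution; the function name open_first_free and its
       parameters are hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/*
 * Try prefix0, prefix1, ... and return the first descriptor that opens.
 * Only EBUSY (that minor device is already in use) moves the search on
 * to the next unit; any other error ends it.  Returns -1 on failure.
 */
int
open_first_free(const char *prefix, int max_units)
{
    char path[64];
    int unit, fd;

    for (unit = 0; unit < max_units; unit++) {
        snprintf(path, sizeof(path), "%s%d", prefix, unit);
        fd = open(path, O_RDWR);
        if (fd >= 0)
            return fd;
        if (errno != EBUSY)
            break;
    }
    return -1;
}
```

       With the configuration shown in the SYNOPSIS, a caller
       might use open_first_free("/dev/bpf", 16).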

       Associated with each open instance of  a  bpf  file  is  a
       user-settable   packet   filter.   Whenever  a  packet  is
       received by an interface, all file  descriptors  listening
       on  that  interface  apply  their filter.  Each descriptor
       that accepts the packet receives its own copy.

       Reads from these files return the next  group  of  packets
       that have matched the filter.  To improve performance, the
       buffer passed to read must be the same size as the buffers
       used  internally  by  bpf.   This  size is returned by the
       BIOCGBLEN ioctl (see below), and under  BSD,  can  be  set
       with  BIOCSBLEN.   Note  that  an individual packet larger
       than this size is necessarily truncated.

       The packet filter will support  any  link  level  protocol
       that  has fixed length headers.  Currently, only Ethernet,
       SLIP and PPP drivers have been modified to  interact  with
       bpf.

       Since  packet  data is in network byte order, applications
       should use the byteorder(3n) macros to extract  multi-byte
       values.
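
       As an illustration of the byte order rule, the sketch
       below reads 16-bit and 32-bit fields from raw packet bytes.
       The helper names get_net16 and get_net32 are hypothetical;
       ntohs and ntohl are the standard byte order routines.
       memcpy is used so the access does not depend on the
       alignment of the field.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the 16-bit value stored in network byte order at byte offset
 * `off' of `pkt', converted to host order.
 */
uint16_t
get_net16(const unsigned char *pkt, size_t off)
{
    uint16_t v;

    memcpy(&v, pkt + off, sizeof(v));
    return ntohs(v);
}

/* The same idea for a 32-bit value, such as an IPv4 address. */
uint32_t
get_net32(const unsigned char *pkt, size_t off)
{
    uint32_t v;

    memcpy(&v, pkt + off, sizeof(v));
    return ntohl(v);
}
```

       For an Ethernet frame, get_net16(pkt, 12) would yield the
       type field, the same halfword the filters in EXAMPLES load
       from offset 12.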

       A  packet  can  be sent out on the network by writing to a
       bpf file descriptor.  The writes are  unbuffered,  meaning
       only  one  packet  can be processed per write.  Currently,
       only writes to Ethernets and SLIP links are supported.

IOCTLS
       The ioctl command codes below are defined in  <net/bpf.h>.
       All commands require these includes:

            #include <sys/types.h>
            #include <sys/time.h>
            #include <sys/ioctl.h>
            #include <net/bpf.h>

       Additionally, BIOCGETIF and BIOCSETIF require <net/if.h>.

       In addition to FIONREAD and SIOCGIFADDR, the following
       commands may be applied to any open bpf file.
       (SIOCGIFADDR is obsolete under BSD systems.  SIOCGIFCONF
       should be used to query link-level addresses.)  The (third)
       argument to the ioctl should be a pointer to the type
       indicated.

       BIOCGBLEN (u_int)
                 Returns  the required buffer length for reads on
                 bpf files.

       BIOCSBLEN (u_int)
                 Sets the buffer length for reads on  bpf  files.
                 The  buffer  must  be  set  before  the  file is
                 attached to an interface with BIOCSETIF.  If the
                 requested buffer size cannot be accommodated, the
                 closest allowable size will be set and  returned
                 in the argument.  A read call will result in EIO
                 if it is passed a buffer that is not this  size.

       BIOCGDLT (u_int)
                 Returns the type of the data link layer
                 underlying the attached interface.  EINVAL is
                 returned if no interface has been specified.
                 The device types, prefixed  with  ``DLT_'',  are
                 defined in <net/bpf.h>.

       BIOCPROMISC
                 Forces the interface into promiscuous mode.  All
                 packets, not just those destined for  the  local
                 host,  are  processed.  Since more than one file
                 can be listening on a given  interface,  a  lis-
                 tener    that    opened   its   interface   non-
                 promiscuously may receive packets promiscuously.
                 This problem can be remedied with an appropriate
                 filter.

                 The interface remains in promiscuous mode  until
                 all files listening promiscuously are closed.

       BIOCFLUSH Flushes  the  buffer  of  incoming  packets, and
                 resets  the  statistics  that  are  returned  by
                 BIOCGSTATS.

       BIOCGETIF (struct ifreq)
                 Returns the name of the hardware interface that
                 the file is listening on.  The name is returned
                 in the ifr_name field of the ifreq.  All other
                 fields are undefined.

       BIOCSETIF (struct ifreq)
                 Sets the hardware interface associated with the
                 file.  This command must be performed before any
                 packets can be read.  The device is indicated by
                 name using the ifr_name field of the ifreq.
                 Additionally, performs the actions of BIOCFLUSH.

       BIOCSRTIMEOUT, BIOCGRTIMEOUT (struct timeval)
                 Set  or  get  the  read  timeout parameter.  The
                 timeval specifies the length  of  time  to  wait
                 before  timing  out  on  a  read  request.  This
                 parameter is initialized  to  zero  by  open(2),
                 indicating no timeout.

       BIOCGSTATS (struct bpf_stat)
                 Returns   the   following  structure  of  packet
                 statistics:

                 struct bpf_stat {
                      u_int bs_recv;
                      u_int bs_drop;
                 };

                 The fields are:

                 bs_recv        the number of packets received by
                                the  descriptor  since  opened or
                                reset  (including  any   buffered
                                since the last read call); and

                 bs_drop        the  number of packets which were
                                accepted  by   the   filter   but
                                dropped  by the kernel because of
                                buffer   overflows   (i.e.,   the
                                application's  reads aren't keep-
                                ing up with the packet  traffic).

       BIOCIMMEDIATE (u_int)
                 Enable  or  disable ``immediate mode'', based on
                 the truth value of the argument.  When immediate
                 mode  is  enabled, reads return immediately upon
                 packet reception.  Otherwise, a read will  block
                 until either the kernel buffer becomes full or a
                 timeout occurs.  This  is  useful  for  programs
                 like  rarpd(8c),  which must respond to messages
                 in real time.  The default for  a  new  file  is
                 off.

       BIOCSETF (struct bpf_program)
                 Sets the filter program used by the kernel to
                 discard uninteresting packets.  An array of
                 instructions and its length are passed in using
                 the following structure:

                 struct bpf_program {
                      int bf_len;
                      struct bpf_insn *bf_insns;
                 };

                 The filter program is pointed to by the bf_insns
                 field  while  its  length  in  units  of `struct
                 bpf_insn' is given by the bf_len  field.   Also,
                 the actions of BIOCFLUSH are performed.

                 See section FILTER MACHINE for an explanation of
                 the filter language.

       BIOCVERSION (struct bpf_version)
                 Returns the major and minor version  numbers  of
                 the filter language currently recognized by the
                 kernel.  Before installing  a  filter,  applica-
                 tions  must  check  that  the current version is
                 compatible with  the  running  kernel.   Version
                 numbers  are  compatible  if  the  major numbers
                 match and the application minor is less than  or
                 equal  to  the kernel minor.  The kernel version
                 number is returned in the following structure:

                 struct bpf_version {
                      u_short bv_major;
                      u_short bv_minor;
                 };

                 The  current  version  numbers  are   given   by
                 BPF_MAJOR_VERSION   and  BPF_MINOR_VERSION  from
                 <net/bpf.h>.  An incompatible filter may  result
                 in  undefined  behavior  (most  likely, an error
                 returned  by   ioctl()   or   haphazard   packet
                 matching).
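
       The compatibility rule given under BIOCVERSION can be
       written as a single predicate.  This is a sketch only: the
       structure demo_bpf_version is a local mirror of struct
       bpf_version so that the fragment compiles without
       <net/bpf.h>, and version_compatible is a hypothetical
       name.

```c
/* Local mirror of struct bpf_version (an assumption for this sketch). */
struct demo_bpf_version {
    unsigned short bv_major;
    unsigned short bv_minor;
};

/*
 * Nonzero if a filter built for app_major.app_minor may be installed on
 * a kernel reporting *kv: the major numbers must match and the
 * application minor must not exceed the kernel minor.
 */
int
version_compatible(const struct demo_bpf_version *kv,
    unsigned short app_major, unsigned short app_minor)
{
    return kv->bv_major == app_major && app_minor <= kv->bv_minor;
}
```

       In a real program the structure would be filled in by the
       BIOCVERSION ioctl and compared against BPF_MAJOR_VERSION
       and BPF_MINOR_VERSION.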

BPF HEADER
       The  following  structure  is  prepended  to  each  packet
       returned by read(2):

               struct bpf_hdr {
                    struct timeval bh_tstamp;
                    u_long bh_caplen;
                    u_long bh_datalen;
                    u_short bh_hdrlen;
               };

       The fields, whose values are stored in host order, are:

       bh_tstamp      The  time at which the packet was processed
                      by the packet filter.

       bh_caplen      The length of the captured portion  of  the
                      packet.  This is the minimum of the trunca-
                      tion amount specified by the filter and the
                      length of the packet.

       bh_datalen     The  length  of  the  packet  off the wire.
                      This value is independent of the truncation
                      amount specified by the filter.

       bh_hdrlen      The length of the BPF header, which may not
                      be equal to sizeof(struct bpf_hdr).

       The bh_hdrlen field exists to account for padding between
       the header and the link level protocol.  The purpose here
       is to guarantee proper alignment of the packet data
       structures, which is required on alignment sensitive
       architectures and improves performance on many other
       architectures.  The packet filter ensures that the bpf_hdr
       and the network layer header will be word aligned.
       Suitable precautions must be taken when accessing the link
       layer protocol fields on alignment restricted machines.
       (This isn't a problem on an Ethernet, since the type field
       is a short falling on an even offset, and the addresses
       are probably accessed in a bytewise fashion.)

       Additionally,  individual  packets are padded so that each
       starts on a word boundary.  This requires that an applica-
       tion  has  some  knowledge  of  how  to get from packet to
       packet.  The macro BPF_WORDALIGN is defined in <net/bpf.h>
       to  facilitate this process.  It rounds up its argument to
       the  nearest  word  aligned  value  (where   a   word   is
       BPF_ALIGNMENT bytes wide).

       For example, if `p' is a struct bpf_hdr pointer to the
       start of a packet, this expression will advance it to the
       next packet:

              p = (struct bpf_hdr *)((char *)p +
                  BPF_WORDALIGN(p->bh_hdrlen + p->bh_caplen))

       For the alignment mechanisms to work properly, the  buffer
       passed  to read(2) must itself be word aligned.  malloc(3)
       will always return an aligned buffer.
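
       Putting the header layout and the alignment rule together,
       a reader might walk one buffer returned by read(2) as
       sketched below.  The structure demo_bpf_hdr and the macros
       DEMO_ALIGNMENT and DEMO_WORDALIGN are local stand-ins for
       struct bpf_hdr, BPF_ALIGNMENT and BPF_WORDALIGN; the
       assumption that a word is sizeof(long) bytes should be
       checked against <net/bpf.h>.  The function name
       count_packets is hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <sys/time.h>

/* Local stand-ins for the <net/bpf.h> definitions (assumptions). */
struct demo_bpf_hdr {
    struct timeval bh_tstamp;
    unsigned long  bh_caplen;
    unsigned long  bh_datalen;
    unsigned short bh_hdrlen;
};
#define DEMO_ALIGNMENT    sizeof(long)
#define DEMO_WORDALIGN(x) (((x) + (DEMO_ALIGNMENT - 1)) & ~(DEMO_ALIGNMENT - 1))

/*
 * Count the packets in the first `nread' bytes of `buf'.  Each packet
 * starts with a header; its data begins bh_hdrlen bytes later and is
 * bh_caplen bytes long.  The next header starts at the word-aligned
 * end of that data.
 */
size_t
count_packets(const unsigned char *buf, size_t nread)
{
    struct demo_bpf_hdr h;
    size_t off = 0, n = 0, step;

    while (off + sizeof(h) <= nread) {
        memcpy(&h, buf + off, sizeof(h));
        step = DEMO_WORDALIGN(h.bh_hdrlen + h.bh_caplen);
        if (step == 0)
            break;              /* malformed header: stop */
        /* packet bytes: buf + off + h.bh_hdrlen, length h.bh_caplen */
        n++;
        off += step;
    }
    return n;
}
```

       A real reader would process the packet bytes at the marked
       point instead of only counting.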

FILTER MACHINE
       A filter program is an array  of  instructions,  with  all
       branches   forwardly  directed,  terminated  by  a  return
       instruction.  Each instruction performs some action on the
       pseudo-machine  state,  which  consists of an accumulator,
       index register, scratch memory store, and implicit program
       counter.

       The following structure defines the instruction format:

              struct bpf_insn {
                   u_short   code;
                   u_char    jt;
                   u_char    jf;
                   long k;
              };

       The k field is used in different ways by different
       instructions, and the jt and jf fields are used as offsets
       by the branch instructions.  The opcodes are encoded in a
       semi-hierarchical fashion.  There are eight classes of
       instructions: BPF_LD, BPF_LDX, BPF_ST, BPF_STX, BPF_ALU,
       BPF_JMP, BPF_RET, and BPF_MISC.  Various other mode and
       operator bits are or'd into the class to give the actual
       instructions.  The classes and modes are defined in
       <net/bpf.h>.
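
       To make the semi-hierarchical encoding concrete, the
       sketch below composes an opcode from a class, a size and a
       mode, and then splits it again with masks.  The DEMO_
       constants and the mask values are assumptions meant to
       mirror <net/bpf.h>; verify them against the header before
       relying on them.

```c
/* Stand-in constants; numeric values are assumed to match <net/bpf.h>. */
enum {
    DEMO_LD = 0x00, DEMO_JMP = 0x05, DEMO_RET = 0x06,  /* classes */
    DEMO_H = 0x08,                                     /* halfword size */
    DEMO_ABS = 0x20,                                   /* fixed offset mode */
    DEMO_JEQ = 0x10, DEMO_K = 0x00                     /* operator, source */
};

/* The low three bits select the class. */
unsigned demo_class(unsigned code) { return code & 0x07; }

/* For loads, bits 3-4 select the data size. */
unsigned demo_size(unsigned code)  { return code & 0x18; }

/* For loads, the top three bits select the addressing mode. */
unsigned demo_mode(unsigned code)  { return code & 0xe0; }
```

       Under these assumptions DEMO_LD+DEMO_H+DEMO_ABS is 0x28,
       and the three helpers recover DEMO_LD, DEMO_H and DEMO_ABS
       from it.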

       Below  are the semantics for each defined BPF instruction.
       We use the convention that A is the accumulator, X is  the
       index  register,  P[]  packet data, and M[] scratch memory
       store.  P[i:n] gives the data at byte offset ``i'' in  the
       packet,  interpreted  as  a  word (n=4), unsigned halfword
       (n=2), or unsigned byte (n=1).  M[i] gives the  i'th  word
       in  the  scratch  memory store, which is only addressed in
       word units.   The  memory  store  is  indexed  from  0  to
       BPF_MEMWORDS-1.   k,  jt,  and  jf  are  the corresponding
       fields in the instruction definition.  ``len''  refers  to
       the length of the packet.


       BPF_LD    These instructions copy a value into the accumu-
                 lator.   The  type  of  the  source  operand  is
                 specified by an ``addressing mode'' and can be a
                 constant (BPF_IMM), packet data at a fixed  off-
                 set  (BPF_ABS), packet data at a variable offset
                 (BPF_IND), the packet  length  (BPF_LEN),  or  a
                 word in the scratch memory store (BPF_MEM).  For
                 BPF_IND and BPF_ABS, the data size must be spec-
                 ified  as  a  word (BPF_W), halfword (BPF_H), or
                 byte (BPF_B).  The semantics of all  the  recog-
                 nized BPF_LD instructions follow.


                 BPF_LD+BPF_W+BPF_ABS          A <- P[k:4]

                 BPF_LD+BPF_H+BPF_ABS          A <- P[k:2]

                 BPF_LD+BPF_B+BPF_ABS          A <- P[k:1]

                 BPF_LD+BPF_W+BPF_IND          A <- P[X+k:4]

                 BPF_LD+BPF_H+BPF_IND          A <- P[X+k:2]

                 BPF_LD+BPF_B+BPF_IND          A <- P[X+k:1]

                 BPF_LD+BPF_W+BPF_LEN          A <- len

                 BPF_LD+BPF_IMM                A <- k

                 BPF_LD+BPF_MEM                A <- M[k]


       BPF_LDX   These  instructions  load a value into the index
                 register.  Note that the  addressing  modes  are
                 more restricted than those of the accumulator
                 loads, but they  include  BPF_MSH,  a  hack  for
                 efficiently loading the IP header length.

                 BPF_LDX+BPF_W+BPF_IMM         X <- k

                 BPF_LDX+BPF_W+BPF_MEM         X <- M[k]

                 BPF_LDX+BPF_W+BPF_LEN         X <- len

                 BPF_LDX+BPF_B+BPF_MSH         X <- 4*(P[k:1]&0xf)


       BPF_ST    This instruction stores the accumulator into the
                 scratch  memory.   We  do not need an addressing
                 mode since there is only one possibility for the
                 destination.

                 BPF_ST                        M[k] <- A


       BPF_STX   This  instruction  stores  the index register in
                 the scratch memory store.

                 BPF_STX                       M[k] <- X


       BPF_ALU   The alu instructions perform operations  between
                 the  accumulator and index register or constant,
                 and store the result back  in  the  accumulator.
                 For binary operations, a source mode is required
                 (BPF_K or BPF_X).

                 BPF_ALU+BPF_ADD+BPF_K         A <- A + k

                 BPF_ALU+BPF_SUB+BPF_K         A <- A - k

                 BPF_ALU+BPF_MUL+BPF_K         A <- A * k

                 BPF_ALU+BPF_DIV+BPF_K         A <- A / k

                 BPF_ALU+BPF_AND+BPF_K         A <- A & k

                 BPF_ALU+BPF_OR+BPF_K          A <- A | k

                 BPF_ALU+BPF_LSH+BPF_K         A <- A << k

                 BPF_ALU+BPF_RSH+BPF_K         A <- A >> k

                 BPF_ALU+BPF_ADD+BPF_X         A <- A + X

                 BPF_ALU+BPF_SUB+BPF_X         A <- A - X

                 BPF_ALU+BPF_MUL+BPF_X         A <- A * X

                 BPF_ALU+BPF_DIV+BPF_X         A <- A / X

                 BPF_ALU+BPF_AND+BPF_X         A <- A & X

                 BPF_ALU+BPF_OR+BPF_X          A <- A | X

                 BPF_ALU+BPF_LSH+BPF_X         A <- A << X

                 BPF_ALU+BPF_RSH+BPF_X         A <- A >> X

                 BPF_ALU+BPF_NEG               A <- -A


       BPF_JMP   The jump instructions  alter  flow  of  control.
                 Conditional   jumps   compare   the  accumulator
                 against a constant (BPF_K) or the index register
                 (BPF_X).   If  the result is true (or non-zero),
                 the true branch is taken,  otherwise  the  false
                 branch  is taken.  Jump offsets are encoded in 8
                 bits so the longest jump  is  256  instructions.
                 However,  the  jump  always (BPF_JA) opcode uses
                 the 32 bit k field as the offset, allowing arbi-
                 trarily  distant destinations.  All conditionals
                 use unsigned comparison conventions.

                 BPF_JMP+BPF_JA                pc += k

                 BPF_JMP+BPF_JGT+BPF_K         pc += (A > k) ? jt : jf

                 BPF_JMP+BPF_JGE+BPF_K         pc += (A >= k) ? jt : jf

                 BPF_JMP+BPF_JEQ+BPF_K         pc += (A == k) ? jt : jf

                 BPF_JMP+BPF_JSET+BPF_K        pc += (A & k) ? jt : jf

                 BPF_JMP+BPF_JGT+BPF_X         pc += (A > X) ? jt : jf

                 BPF_JMP+BPF_JGE+BPF_X         pc += (A >= X) ? jt : jf

                 BPF_JMP+BPF_JEQ+BPF_X         pc += (A == X) ? jt : jf

                 BPF_JMP+BPF_JSET+BPF_X        pc += (A & X) ? jt : jf

       BPF_RET   The return  instructions  terminate  the  filter
                 program  and  specify  the  amount  of packet to
                 accept  (i.e.,  they   return   the   truncation
                 amount).   A return value of zero indicates that
                 the packet should be ignored.  The return  value
                 is  either a constant (BPF_K) or the accumulator
                 (BPF_A).

                 BPF_RET+BPF_A                 accept A bytes

                 BPF_RET+BPF_K                 accept k bytes

       BPF_MISC  The miscellaneous category was created for
                 anything that doesn't fit into the above classes,
                 and for any new instructions that might need to
                 be added.  Currently, these are the register
                 transfer instructions that copy the index
                 register to the accumulator or vice versa.

                 BPF_MISC+BPF_TAX              X <- A

                 BPF_MISC+BPF_TXA              A <- X

       The BPF interface provides the following macros to
       facilitate array initializers:

              BPF_STMT(opcode, operand)
              BPF_JUMP(opcode, operand, true_offset, false_offset)
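
       The instruction semantics listed above can be exercised in
       user space with a small interpreter.  The sketch below is
       not the kernel's implementation: it covers only the
       subset of opcodes used by the filters in EXAMPLES, does no
       validation of the program, and uses local stand-ins
       (the D_ constants, struct demo_insn, D_STMT and D_JUMP)
       whose values are assumed to mirror <net/bpf.h>.  It also
       assumes that a load reaching past the end of the packet
       rejects the packet.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the <net/bpf.h> names; numeric values are assumed. */
enum {
    D_LD = 0x00, D_LDX = 0x01, D_JMP = 0x05, D_RET = 0x06,
    D_W = 0x00, D_H = 0x08, D_B = 0x10,
    D_ABS = 0x20, D_IND = 0x40, D_MSH = 0xa0,
    D_JEQ = 0x10, D_JSET = 0x40, D_K = 0x00, D_A = 0x10
};

struct demo_insn {
    uint16_t code;
    uint8_t  jt, jf;
    uint32_t k;
};

#define D_STMT(c, k)         { (uint16_t)(c), 0, 0, (k) }
#define D_JUMP(c, k, jt, jf) { (uint16_t)(c), (jt), (jf), (k) }

/* P[i:n]: n bytes at offset i, most significant first.  0 if out of range. */
static int
load(const uint8_t *p, size_t len, uint32_t i, unsigned n, uint32_t *out)
{
    uint32_t v = 0;
    unsigned j;

    if (i > len || n > len - i)
        return 0;
    for (j = 0; j < n; j++)
        v = (v << 8) | p[i + j];
    *out = v;
    return 1;
}

/* Run the program at pc over packet p; returns the bytes to accept. */
uint32_t
demo_filter(const struct demo_insn *pc, const uint8_t *p, size_t len)
{
    uint32_t A = 0, X = 0, t = 0;
    int ok = 1;

    for (;; pc++) {     /* pc++ supplies the implicit "next instruction" */
        switch (pc->code) {
        case D_LD + D_W + D_ABS: ok = load(p, len, pc->k, 4, &A); break;
        case D_LD + D_H + D_ABS: ok = load(p, len, pc->k, 2, &A); break;
        case D_LD + D_B + D_ABS: ok = load(p, len, pc->k, 1, &A); break;
        case D_LD + D_H + D_IND: ok = load(p, len, X + pc->k, 2, &A); break;
        case D_LDX + D_B + D_MSH:
            ok = load(p, len, pc->k, 1, &t);
            if (ok)
                X = 4 * (t & 0xf);
            break;
        case D_JMP + D_JEQ + D_K:  pc += (A == pc->k) ? pc->jt : pc->jf; break;
        case D_JMP + D_JSET + D_K: pc += (A & pc->k) ? pc->jt : pc->jf; break;
        case D_RET + D_K: return pc->k;
        case D_RET + D_A: return A;
        default: return 0;      /* opcode outside this subset */
        }
        if (!ok)
            return 0;           /* load past end of packet: reject */
    }
}
```

       A program is written as an array of struct demo_insn built
       with D_STMT and D_JUMP, in the same shape as the arrays in
       EXAMPLES, and passed to demo_filter together with the
       packet bytes and their length.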


EXAMPLES
       The following filter is taken from the Reverse ARP Daemon.
       It accepts only Reverse ARP requests.

              struct bpf_insn insns[] = {
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_REVARP, 0, 3),
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, REVARP_REQUEST, 0, 1),
                   BPF_STMT(BPF_RET+BPF_K, sizeof(struct ether_arp) +
                         sizeof(struct ether_header)),
                   BPF_STMT(BPF_RET+BPF_K, 0),
              };

       This filter accepts only IP packets between hosts
       128.3.112.15 and 128.3.112.35.

              struct bpf_insn insns[] = {
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 8),
                   BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 26),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 2),
                   BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 3, 4),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x80037023, 0, 3),
                   BPF_STMT(BPF_LD+BPF_W+BPF_ABS, 30),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 0x8003700f, 0, 1),
                   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
                   BPF_STMT(BPF_RET+BPF_K, 0),
              };
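
       The constants 0x8003700f and 0x80037023 in the filter
       above are the two host addresses written with the first
       octet most significant, which is how a word load
       interprets packet data.  Offsets 26 and 30 are the IP
       source and destination address fields, assuming a 14 byte
       Ethernet header.  A hypothetical helper such as ip4_const
       below can build the constants:

```c
#include <stdint.h>

/* Pack a dotted quad a.b.c.d into the 32-bit value the filter compares. */
uint32_t
ip4_const(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
        ((uint32_t)c << 8) | (uint32_t)d;
}
```

       Here ip4_const(128, 3, 112, 15) gives 0x8003700f and
       ip4_const(128, 3, 112, 35) gives 0x80037023.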

       Finally,  this filter returns only TCP finger packets.  We
       must parse the IP header to reach  the  TCP  header.   The
       BPF_JSET instruction checks that the IP fragment offset is
       0 so we are sure that we have a TCP header.

              struct bpf_insn insns[] = {
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 12),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, ETHERTYPE_IP, 0, 10),
                   BPF_STMT(BPF_LD+BPF_B+BPF_ABS, 23),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, IPPROTO_TCP, 0, 8),
                   BPF_STMT(BPF_LD+BPF_H+BPF_ABS, 20),
                   BPF_JUMP(BPF_JMP+BPF_JSET+BPF_K, 0x1fff, 6, 0),
                   BPF_STMT(BPF_LDX+BPF_B+BPF_MSH, 14),
                   BPF_STMT(BPF_LD+BPF_H+BPF_IND, 14),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 2, 0),
                   BPF_STMT(BPF_LD+BPF_H+BPF_IND, 16),
                   BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, 79, 0, 1),
                   BPF_STMT(BPF_RET+BPF_K, (u_int)-1),
                   BPF_STMT(BPF_RET+BPF_K, 0),
              };

SEE ALSO
       tcpdump(1)

       McCanne, S. and Jacobson, V., The BSD Packet Filter: A New
       Architecture for User-level Packet Capture, Proceedings of
       the 1993 Winter USENIX Technical Conference, pp. 259-269.

FILES
       /dev/bpf0, /dev/bpf1, ...

BUGS
       The read buffer must be of a fixed size (returned  by  the
       BIOCGBLEN ioctl).

       A  file that does not request promiscuous mode may receive
       promiscuously received packets as a side effect of another
       file  requesting this mode on the same hardware interface.
       This could be fixed in the kernel with additional process-
       ing overhead.  However, we favor the model where all files
       must assume that the interface is promiscuous, and  if  so
       desired,  must utilize a filter to reject foreign packets.

       Data link protocols with variable length headers  are  not
       currently supported.

       Under SunOS, if a BPF application reads more than 2^31
       bytes of data, read will fail with EINVAL.  You can either
       fix the bug in SunOS, or lseek to 0 when read fails for
       this reason.
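
       The lseek workaround for the SunOS problem might look like
       the sketch below.  The wrapper name read_with_rewind is
       hypothetical, and the sketch assumes that an EINVAL from
       read(2) is caused by the offset overflow described above;
       it retries only once.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * read(2) wrapper: if the read fails with EINVAL, rewind the file
 * offset to 0 and try once more.  Returns what read(2) returns.
 */
ssize_t
read_with_rewind(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);

    if (n < 0 && errno == EINVAL) {
        if (lseek(fd, (off_t)0, SEEK_SET) == (off_t)-1)
            return -1;
        n = read(fd, buf, len);
    }
    return n;
}
```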

HISTORY
       The Enet packet filter was created in 1980 by Mike Accetta
       and  Rick  Rashid  at Carnegie-Mellon University.  Jeffrey
       Mogul, at Stanford, ported the code to BSD  and  continued
       its  development from 1983 on.  Since then, it has evolved
       into the Ultrix Packet Filter at DEC, a STREAMS NIT module
       under SunOS 4.1, and BPF.

4.4 Berkeley Distribution April 25, 1995