hiprof(1)							    hiprof(1)


  hiprof - CPU-time and	page-fault call-graph profiler for performance


  hiprof [-cycles  | -faults  |	-pthread  | -threads] [hiprof-option...]
  [gprof-option...] program [argument...]

  See the start	of the OPTIONS section below for details of hiprof options
  that may be essential	for the	correct	execution of the program.

  The atom -tool hiprof	interface is still available, for compatibility	with
  earlier releases. However, it	is now undocumented, and it will be retired
  in a future release.


  See prof_intro(1) for	an introduction	to the application performance tuning
  tools	provided with Tru64 UNIX.

  The hiprof command creates an	instrumented version of	a program
  (program.hiprof) that	produces call-graph and	flat profiles of one of	a
  range	of performance statistics:

    +  The CPU time spent in each procedure (or	optionally, each source	line
       or instruction),	measured by sampling the program counter about every
       millisecond (the	default)

    +  The CPU time spent in each procedure and	procedure call,	measured as
       machine cycles, including the effects of	any memory-access delays
       (with the -cycles option)

    +  The number of page faults suffered by each procedure and	procedure
       call (with the -faults option)

  See the limitations of each performance statistic in the RESTRICTIONS	sec-
  tion below.

  If you specify program arguments (argument...) or -run, the instrumented
  program is also executed.

  If you specify -display or any gprof-option, the hiprof command runs the
  instrumented program and then	displays the profile by	running	the gprof
  tool (with any specified gprof-option).

  If you omit the program name,	a usage	message	is printed.
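
  The modes described above can be sketched as follows. This is only an
  illustration: the names myprog and input.dat are hypothetical, and the RUN
  wrapper is an assumption of the sketch (not part of hiprof) that prints
  each command instead of executing it.

```shell
# Dry-run wrapper (an assumption of this sketch, not part of hiprof):
# by default it only prints each command. Set RUN= (empty) to execute.
RUN=${RUN-echo}

# Instrument only: creates myprog.hiprof for later execution.
$RUN hiprof myprog

# Instrument and run: -run is needed because no program arguments are given.
$RUN hiprof -run myprog

# Instrument and run: the program argument implies -run.
$RUN hiprof myprog input.dat

# Instrument, run, and display the profile with gprof's default options.
$RUN hiprof -display myprog input.dat
```

  Printing the commands first makes it easy to check which of the three
  actions a given command line will trigger before spending time on
  instrumentation.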

  The following	example	shows how to instrument, run, and display the profile
  for a	multi-threaded program:

       cc *.c -pthread -L. -g1 -O2 -o program -lapp1 -lapp2
       hiprof -pthread -L. -all	program	data/*

  The -all option requests that all shared libraries be profiled, but
  threads-related system libraries cannot be safely instrumented to count
  procedure calls that are needed to print a call graph. By default, these
  libraries are	still sampled to provide flat CPU-time profiles. The -cycles
  and -faults options cannot be	used with threaded programs, but the
  displayed time or page-fault count for a procedure includes the time or
  count	for any	procedures that	it calls but that were not selected for
  instrumentation--for example,	any procedures in libraries not	selected by
  the -all or -incobj options. This means that time is not lost	from these
  profiles by excluding	shared libraries.


      File name	of a fully linked call-shared or nonshared executable to be
      profiled.	 This program should be	compiled with the -g or	-gn option
      (n>=1) to	obtain more complete profiling information.  If	the default
      symbol table level (-g0) is used,	line number information, static	pro-
      cedure names, and	file names are unavailable. Inlined procedure calls
      are also unavailable. Programs that are stripped or are optimized	by
      spike or cc -om are not supported.

      All arguments following the program name are considered to be arguments
      needed by	the instrumented program to execute the	procedures, lines,
      and instructions of interest. Multiple arguments can be specified. They
      imply -run if any	are specified, and they	can be replaced	by -run	if
      none are needed.


OPTIONS

  Options can be abbreviated to three characters. The gprof-options, which
  are provided as alternatives to the -display option, can be abbreviated to
  one character.

  For options that specify a procedure name (proc), C++	procedures can omit
  the argument type list, though this will match all overloaded	procedures
  with that name. To select a specific procedure, specify the full symbol
  name (as printed by the nm command). Symbol names containing spaces, aster-
  isks,	and so on must be quoted.

  Essential Options

  Some or all of these options may be needed to keep the instrumented
  program from malfunctioning:

      Specify -pthread if the program or any of	its libraries calls
      pthread_create(3)	(for example, if it was	compiled with either the
      -pthread option or the -threads compatibility option). This will make
      the collection of	profile	data thread-safe.

      Specify -fork if the program calls any variant of	fork(2). It is not
      usually needed if	the subprocesses also call any variant of exec(2).
      The -fork	option ensures that forked multi-threaded programs are pro-
      filed in a thread-safe way, and it produces separate profiling data
      files for	the forked subprocesses, including the process id in their
      file names as if -pids was specified. Failure to use -fork might lead
      to deadlock in the forked	child processes.

      For compatibility	with earlier releases, a default level of fork
      support is provided if the executable is non-shared or if	libc.so	is
      instrumented. However, this approach can lead to deadlock	and will be
      retired in a future release, so specifying -fork is recommended.

  -heapbase addr
      By default, the hiprof code running in the program's process allocates
      memory for its own use at	address	38000000000. If	the program needs to
      use memory between 38000000000 and 3ff00000000, specify the address
      that the hiprof code should use.

  -sigdump signal
      Specify -sigdump to force	the instrumented program to write the current
      profile data to its file(s) on receipt of	the named signal.  By
      default, the program writes the profiling	data file(s) only when the
      process terminates, but some processes never terminate normally, so
      this option lets you generate the	file(s)	on demand. After a file	is
      written, the instruction counts of the profile are all set to zero; so
      by sending two signals, any interval of a	test run can be	profiled,
      with the second signal's file(s) overwriting the first. For example, to
      use the default kill pid command to signal the program, specify
      -sigdump TERM. Choose a signal that the program does not use for another
      purpose.
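
      The two-signal technique described above can be sketched as a small
      shell function. The function name profile_interval is hypothetical,
      and the sketch assumes the program was instrumented with
      -sigdump USR1 (USR1 is an assumed choice of signal that the program
      does not otherwise use).

```shell
# Hypothetical helper: profile only an interval of a long-running program
# that was instrumented with "hiprof -sigdump USR1 ..." (assumed signal).
# $1 = process id of the instrumented program, $2 = interval in seconds.
# KILL defaults to the real kill(1); set KILL=echo for a dry run.
profile_interval() {
    pid=$1 secs=$2
    # First signal: write the data collected so far and zero the counts.
    ${KILL:-kill} -USR1 "$pid" || return 1
    sleep "$secs"
    # Second signal: overwrite the file(s) with the interval's data only.
    ${KILL:-kill} -USR1 "$pid"
}
```

      For example, profile_interval 1234 60 would leave a profile of about
      one minute of execution of process 1234 in its data file(s).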

  Profiling Statistics Options

      Profiles CPU time	by counting the	machine	cycles used in each procedure
      call. Use	this option only for non-threaded programs.

      Profiles page faults suffered by each procedure instead of the default
      time spent in each procedure. Use this option only for non-threaded
      programs.

  File Generating Options

      Does not print informational and progress	messages on the	standard
      error stream.

  -v  Prints the command lines used to instrument the program and to execute
      the instrumented program.	 Prints	the names of any procedures that were
      not instrumented.

  -output file
      Names the instrumented program file, overriding the default name.

  -dirname path
      Specifies	the directory to which the instrumented	program	writes the
      profiling data file(s) for each test run. The default is the current
      directory.

      Adds the process-id of the instrumented program's	test run to the	name
      of the profiling data file produced (that	is, program.pid.hiout).	By
      default, the file	is named program.hiout.

      When profiling a threaded	program, specify -threads to produce a
      separate profile for each	pthread	in the program.	The files are named
      program[.pid].sequence.hiout, where sequence is the thread sequence
      number assigned by pthread_create(3). The	-threads option	implies	the
      -pthread option. If -sigdump is needed, -pthread is recommended instead
      of -threads, to avoid possible synchronization problems.
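
      The file-naming rules above can be summarized in a short sketch. The
      function name hiout_name is hypothetical; it only composes the
      expected name program[.pid][.sequence].hiout and does not consult
      hiprof or account for the -dirname directory.

```shell
# Hypothetical helper: print the expected profiling data file name.
# $1 = program name
# $2 = process id (when -pids or -fork is used; otherwise empty)
# $3 = thread sequence number (when -threads is used; otherwise empty)
hiout_name() {
    # ${2:+.$2} expands to ".<pid>" only when $2 is set and non-empty.
    printf '%s%s%s.hiout\n' "$1" "${2:+.$2}" "${3:+.$3}"
}

hiout_name myprog            # prints myprog.hiout
hiout_name myprog 4242       # prints myprog.4242.hiout
hiout_name myprog 4242 3     # prints myprog.4242.3.hiout
```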

  Shared-Library Profiling Options

      Profiles all the shared libraries in addition to the program's
      executable.

  -excobj lib
      If -all was specified, does not profile the shared library lib. Can be
      repeated,	to exclude multiple libraries.

  -incobj lib
      Profiles the shared library lib. Can be repeated to include multiple
      libraries.

      Searches for shared-libraries in the specified directory before search-
      ing the default directories. Can be repeated to make a search path. Use
      the same options that were used when linking the program with ld.

  -E proc
      Does not instrument the procedure	proc. This option can be used to
      exclude procedures that are uninteresting	or that	interfere with the
      instrumentation (such as non-standard assembly code).

  Execution Control Options

      Executes the instrumented	program, even if no arguments are specified.
      By default, the program is just instrumented for later execution.

      Prints the tool's	version	number.

      Executes the instrumented	program, and runs gprof	with default options
      on the resulting .hiout file(s).

      Executes the instrumented	program, and runs gprof	on the resulting
      .hiout file(s). The following gprof options are supported:

	  Profiles each	instruction within selected procedures.

	  Does not report on called procedures.

      -e proc
	  Excludes procedure proc and its descendants from the profile,	but
	  totals all procedures.

      -f proc
	  Includes only	procedure proc and its descendants in the profile,
	  but totals all procedures.

	  Profiles procedures as an indexed call graph (default).

	  Profiles source lines, listing the most heavily used first.

	  Profiles source lines, in order within selected procedures.

      -merge file
	  Merges all .hiout input files	into file.

	  Prints each procedure's starting line	number.

	  Profiles procedures, listing the most	heavily	used first (default).

	  Profiles the whole executable	and any	shared libraries.

	  Reports procedures that were never called.


  If hiprof finds any previously instrumented shared libraries in the working
  directory, it	will reuse them	if they	meet current requirements, to reduce
  re-instrumentation costs.

  Temporary instrumentation files are created in /tmp.	Set the	TMPDIR
  environment variable to a different directory	to create the files else-
  where, for example in	a disk partition with more space.


RESTRICTIONS

  The default sampled profile only estimates the CPU time spent in each
  procedure call; profiles made with the -cycles and -faults options measure
  it.

  When timing a	program's procedures by	measuring machine cycles (with the
  -cycles option), the 32-bit cycle-counting hardware will wrap	if no pro-
  cedure call or return	is executed by the program every few seconds --	for
  example, because of a	long-running loop. If the counter wraps, the profile
  will be incorrect. Using the -all or -incobj options to profile all non-
  system libraries and procedures can help avoid this restriction.
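
  To give a feel for "every few seconds", the wrap interval of a 32-bit
  counter can be estimated from the clock rate. This sketch assumes that the
  counter advances once per processor clock cycle, which this page does not
  state; the function name cycle_wrap_ms is hypothetical, and the shell is
  assumed to use 64-bit arithmetic.

```shell
# Hypothetical helper: estimate how long a 32-bit cycle counter takes to
# wrap, assuming one count per clock cycle.
# $1 = processor clock rate in MHz; prints the interval in milliseconds.
cycle_wrap_ms() {
    # 2^32 cycles / (MHz * 10^6 cycles per second) * 1000 ms per second
    echo $((4294967296 / ($1 * 1000)))
}

cycle_wrap_ms 500     # prints 8589 (about 8.6 seconds at 500 MHz)
```

  Under that assumption, a loop that runs longer than this interval without
  making a call or returning would cause the counter to wrap.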

  The -cycles option generates an inaccurate profile if	the instrumented pro-
  gram is run on a system whose	processors have	different cycle	speeds.	This
  inaccuracy can be avoided by using hiprof's default sampling profiler	or
  the cc -p/-pg	profilers instead, or by running the application on a subset
  of the processors:

    +  Select a	single processor using the runon command.

    +  Check the processor speeds using	the psrinfo -v command and run the
       application in a	processor set comprising only processors that run at
       the same	speed (see processor_sets(4)).

  Approximate performance estimates are as follows, but they vary according
  to the application and the machine's CPU count, type, and clock rate. The
  hiprof instrumentation takes ~2 s per MB of program file on a 500-MHz EV6
  (21264) Alpha system, using ~10 MB of memory plus another ~10 MB per MB of
  the largest file. The instrumented files are ~20% larger than the
  originals, plus ~1 MB of hiprof code. They run ~4 times slower. By default,
  each profile data file is at least the size of the instrumented code (and
  uses this much memory), but these files are very small for the -cycles and
  -faults options.
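
  The estimates above can be turned into rough arithmetic, as in the sketch
  below. The function name hiprof_estimate is hypothetical, sizes are whole
  megabytes, and the figures apply only to the 500-MHz EV6 system quoted
  above. The sketch also assumes that "per megabyte of program file" refers
  to the total size of the files being instrumented.

```shell
# Hypothetical helper: rough instrumentation cost estimates.
# $1 = total size of the files to instrument, in MB
# $2 = size of the largest of those files, in MB
hiprof_estimate() {
    total_mb=$1 largest_mb=$2
    # ~2 s per MB of program file
    echo "instrumentation time: ~$((2 * total_mb)) s"
    # ~10 MB of memory plus ~10 MB per MB of the largest file
    echo "instrumentation memory: ~$((10 + 10 * largest_mb)) MB"
    # ~20% larger than the originals, plus ~1 MB of hiprof code
    echo "instrumented size: ~$((total_mb * 120 / 100 + 1)) MB"
}

hiprof_estimate 5 5
```

  For a single 5-MB executable, this prints estimates of about 10 seconds,
  60 MB of memory, and a 7-MB instrumented file.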

  If a procedure contains interprocedural branches or interprocedural jumps,
  that procedure will not be instrumented with the -cycles or -faults option,
  and no information will be reported about that procedure. Use	the -v option
  to see which procedures were not instrumented. Compilers can optimize
  return statements or non-returning function calls to interprocedural
  branches. To avoid this, recompile with the -O0 or -no_inline	option.


      Instrumented version of program produced by hiprof

      Profile data file	produced by program.hiprof

      Instrumented shared libraries produced by	hiprof

      Temporary	file created and deleted in the	current	and -dirname path


  Introduction:	prof_intro(1)

  atom(1), cc(1), dxprof(1), fork(2), gprof(1),	kill(1), ld(1),	pixie(1),
  processor_sets(4), psrinfo(1), pthread(3), runon(1), uprofile(1).  (dxprof
  is available as an option.)

  Programmer's Guide