volintro(8)							  volintro(8)


  volintro, lsm, LSM - Introduction to Logical Storage Manager (LSM) terms
  and commands


  The following	LSM commands provide a shell-level interface used by the sys-
  tem administrator and	higher-level applications and scripts to query and
  manipulate LSM objects:

  volassist, vold, voldg, voldiskadm, voledit, volencap, volinfo, volinstall,
  voliod, vollogcnvt, volmend, volnotify, volplex, volprint, volrecover, vol-
  reconfig, volrootmir,	volsd, volsetup, volstat, voltrace, volume, volwatch


  The following	list describes LSM terms:

  volume
      A virtual disk device that looks to applications and file systems like
      a	regular	disk partition device. Volumes present block and raw device
      interfaces that are compatible in	their use with disk partition dev-
      ices. However, a volume is a virtual device that can be mirrored,
      striped or spanned across	several	disk drives, and moved to use dif-
      ferent storage, using administrative commands. The configuration of a
      volume can be changed, using LSM commands, without causing disruption
      to applications or file systems that are using the volume.

  plex
      A copy of a volume's logical data address space, also sometimes known
      as a mirror. A volume can	have up	to 32 plexes associated	with it. Each
      plex is, at least	conceptually, a	copy of	the volume that	is maintained
      consistently in the presence of volume I/O and reconfigurations. Plexes
      represent	the primary means of configuring storage for a volume. Plexes
      can have a striped, concatenated,	or RAID5 organization (layout).

  disk
      Disks exist as two entities. One is the physical disk on which all data
      is ultimately stored and which exhibits all the behaviors	of the under-
      lying technology.	The other is the LSM presentation of disks which,
      while mapping one-to-one with the	physical disks,	are just presenta-
      tions of storage devices from which allocations of storage are made. As
      an example, a physical disk presents the image of	a device with a
      definable	geometry with a	definable number of cylinders, heads, and so
      on, whereas a Logical Storage Manager disk is simply a unit of alloca-
      tion with	a name and a size.

  subdisk
      A region of storage allocated on a disk for use with a volume. Subdisks
      are associated to	volumes	through	plexes.	 One or	more subdisks are
      laid out to form plexes based on the plex layout (striped,
      concatenated, or RAID5). Subdisks are defined relative to disk media
      records.

  disk media record
      A	reference to a physical	disk, or possibly a disk partition. This
      record can be thought of as a physical disk identifier for the disk or
      partition. Disk media records are	configuration records that provide a
      name (known as the disk media name or DM name) that an administrator
      can use to reference a particular	disk, independent of its location on
      the system's various disk	controllers. Disk media	records	reference
      particular physical disks	through	a disk ID, which is a unique identif-
      ier that is assigned to a	disk when it is	initialized for	use with LSM.

      Operations are provided to set or	remove the disk	ID stored in a disk
      media record. Such operations have the effect of removing	or replacing
      disks, along with	any associated subdisks.

  disk access record
      A	configuration record that defines a pathway to a disk. Disk access
      records most often name a	unit number.  The list of all disk access
      records stored in	a system is used to find all disks attached to the
      system. Disk access records do not identify particular physical disks.

      Disk access records are identified by their disk access names (also
      known as DA names).

      Through the use of disk IDs, LSM allows disks to be moved	between	con-
      trollers,	or to different	locations on a controller. When	a disk is
      moved, a different disk access record may	be used	when accessing the
      disk, although the disk media record will	continue to track the actual
      physical disk.

      LSM builds a list	of disk	access records automatically, based on the
      list of all devices attached to the system. It is	not necessary to
      define disk access records explicitly. Specialty disks (such as RAM
      disks or floppy disks) may require that disk access records be defined
      by using the voldisk define command.

  disk group
      A	group of disks that share a common configuration database. A confi-
      guration database	is a set of records describing objects (including
      disks, volumes, plexes, and subdisks) that are associated	with one par-
      ticular disk group. Each disk group has an administrator-assigned	name
      that is used to reference	that disk group. Each disk group also has an
      internally defined unique	disk group ID.

      Disk groups provide a method for partitioning the	LSM configuration, so
      that the database	size is	not too	large and so that database modifica-
      tions do not affect too many drives. They	also allow LSM to operate
      with groups of physical disk media that can be moved between systems.

      Disks and	disk groups have a circular relationship: disk groups are
      formed from disks, and disk group	configurations are stored on disks.
      All disks	in a disk group	are stamped with a disk	group ID, which	is a
      unique identifier	for naming disk	groups.	Some or	all disks in a disk
      group also store copies of the configuration database of the disk
      group.

  configuration	database
      A	disk group configuration is a small database that contains all
      volume, plex, subdisk, and disk media records within a single disk
      group. These configurations are replicated onto some or all disks	in
      the disk group, with up to two copies on each disk. Because these	data-
      bases are	stored within disk groups, record associations cannot span
      disk groups. Thus, a subdisk defined on a	disk in	one disk group cannot
      be associated with a volume in another disk group.

  root disk group
      Each system requires one special disk group, named rootdg, which is
      generally	the default for	most utilities.	The configuration database
      for the root disk	group (rootdg) contains	the disk access	records	for
      all disks in rootdg and all LSM disks on the system, whether members of
      another disk group or not (such as unassigned spare disks). Unlike
      other, administrator-created disk groups, the rootdg disk group cannot
      be moved between systems.

  private region
      Most disks used by LSM contain two special regions: a private region
      and a public region. Usually, each region	is formed from a complete
      partition	of the disk; however, the private and public regions can be
      allocated	from the same partition.

      The private region of a disk contains various on-disk structures that
      are used by LSM for various internal purposes. Each private region
      begins with a disk header	which identifies the disk and its disk group.
      Private regions can also contain copies of a disk	group's	configuration
      database and copies of the disk group's kernel log.

  public region
      The public region	of a disk is the space reserved	for allocating sub-
      disks. Subdisks are defined with offsets that are	relative to the
      beginning	of the public region of	a particular disk partition. A sub-
      disk represents a	contiguous region of the disk, and subdisks must be
      contiguous with each other within the public region. Only one
      contiguous region of disk can form the public region for a particular
      disk.

  kernel log
      A log that is kept in the private region of a disk and written by the
      LSM kernel. The log contains records describing the state of volumes in
      the disk group. It lets the kernel persistently register state changes
      so that vold is guaranteed to detect them even after a system failure.

  disk header
      A	block stored in	a private region of a disk and that defines several
      properties of the	disk, including	the size of the	private	region,	the
      location and size	of the public region, the unique disk ID for the
      disk, the	disk group ID and disk group name (if the disk is currently
      associated with a	disk group), and the host ID for a host	that has
      exclusive	use of the disk.

  disk ID
      A	64-byte	universally unique identifier that is assigned to a physical
      disk when	it is initialized for use with LSM.  The disk ID is recorded
      in the disk media	record so that the physical disk can be	related	to
      the disk media record at system startup.

  disk group ID
      A	64-byte	universally unique identifier that is assigned to a disk
      group when the disk group	is created. This identifier is in addition to
      the disk group name, which is assigned by	the administrator.  The	disk
      group ID is used to distinguish between disk groups that have conflict-
      ing administrator-assigned names.

  host ID
      A	name, usually assigned by the administrator, that identifies a par-
      ticular host. Host IDs are used to assign	ownership to particular	phy-
      sical disks. When	a disk is part of a disk group that is in active use
      by a particular host, the	disk is	stamped	with that host's host ID. If
      another system attempts to access	the disk, it will detect that the
      disk has a non-matching host ID and will disallow	access until the
      first system discontinues	use of the disk. In the	event of system
      failures that do not clear the host ID, the voldisk clearimport opera-
      tion can be used to clear	the host ID stored on a	disk.

      If a disk	is a member of a disk group and	has a host ID that matches a
      particular host, then that host will import the disk group as part of
      system startup.

  striped plex
      A	plex that scatters data	evenly across each of its associated sub-
      disks. A plex has	a characteristic number	of stripe columns (consisting
      of associated subdisks) and a characteristic stripe unit size. The
      stripe unit size defines how data	with a particular address is allo-
      cated to one of the associated subdisks. Given a stripe unit size	of
      128 blocks, and two stripe columns, the first group of 128 blocks	would
      be allocated to the first	subdisk, the second group of 128 blocks	would
      be allocated to the second subdisk, the third group to the first sub-
      disk again, and so on.
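
      As a rough illustration of the striping rule above, the following
      Python sketch maps a block address in the plex to a stripe column. The
      function name and the computed offset within the column are assumptions
      made for illustration; the text above only states which subdisk
      receives each group of blocks, and the sketch assumes one subdisk per
      column.

```python
def stripe_location(block, stripe_unit=128, columns=2):
    """Map a plex block address to (column index, block offset in that column)."""
    unit_index = block // stripe_unit     # which stripe unit the block falls in
    column = unit_index % columns         # stripe units rotate across the columns
    row = unit_index // columns           # full stripes that precede this unit
    offset = row * stripe_unit + block % stripe_unit
    return column, offset


# Blocks 0-127 land in column 0, 128-255 in column 1, 256-383 in column 0 again.
for block in (0, 127, 128, 255, 256):
    print(block, stripe_location(block))
```

      Column 0 here corresponds to the first subdisk in the example above,
      and column 1 to the second.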

  concatenated plex
      A	plex whose subdisks are	associated at specific offsets within the
      address range of the plex, and extend into the plex address range	for
      the length of the	subdisk. This layout allows regions of one or more
      disks to create a	plex, rather than a single big region.

  volboot file
      The volboot file is a special file (usually stored as /etc/vol/volboot)
      that is used to bootstrap	the root disk group and	to define a system's
      host ID. The volboot file	may also contain a list	of disk	access names.
      On system	startup, this list of disks is scanned,	in addition to the
      automatically generated list, to find a disk that	has a copy of the
      rootdg configuration database and	also matches the system's host ID.
      The volboot file allows the configuration	to be located on disks not
      detected by system initialization, or to be detected in cases where
      autoconfig is disabled. When such	a disk is found, its configuration is
      read and is used to get a	complete list of disk access records that are
      used as a	second-stage bootstrap of the root disk	group, and to locate
      all other	disk groups.

  plex consistency
      If the plexes of a volume	contain	different data,	then the plexes	are
      said to be inconsistent. This is a problem only if LSM is	unaware	of
      the inconsistencies, as the volume can return differing results for
      consecutive reads. Plex inconsistency is a serious compromise of data
      integrity. This inconsistency can	be caused by write operations that
      start around the time of a system	failure, if parts of the write com-
      plete on one plex	but not	the other. Plexes can also be inconsistent
      after creation of	a mirrored volume, if the plexes are not first syn-
      chronized	to contain the same data. An important part of Logical
      Storage Manager operation	is ensuring that consistent data is returned
      to any application that reads a volume. This may require that plex con-
      sistency of a volume be ``recovered'' by copying data between plexes so
      that they	have the same contents.	Alternatively, the volume can be put
      into a state such	that reads from	one plex are automatically written
      back to the other	plexes,	thus making the	data consistent	for that
      volume offset.


  A number of conventions are available	for LSM	commands to provide a finer
  degree of administration. The	following is a list of such conventions:

  Command Syntax

  Most LSM commands provide more than one operation, with operations grouped
  primarily by object type. Commands that provide multiple operations are
  typically invoked with the following form:

  command [options] [keyword] [operands]

  Here,	command	is the name of the command and keyword is a name that identi-
  fies the specific operation to perform. Any options that are introduced in
  the standard -letter form precede the	operation keyword.

  To aid in normal use,	all of the commands provide an extended	usage message
  that lists all the options and operation keywords it supports. For commands
  that are keyword-based, the extended usage message can be displayed by
  using	a keyword of help. For commands	that use operands for purposes other
  than operation selection, the	extended usage message can be displayed	by
  using	the -H option. The extended usage messages are intended	to serve as
  reminders, and not as	replacements for user documentation.

  Standard Length Numbers

  Many basic properties	of objects that	are managed by LSM require specifica-
  tion of lengths, either as a pure object length or as	an offset relative to
  some other object. LSM supports volume lengths up to 2,147,483,647 disk
  sectors (one terabyte	on most	systems). Typing such large numbers, or	even
  much smaller numbers,	can be annoying	and subject to error. LSM provides a
  uniform syntax for representing such numbers,	which uses suffixes to pro-
  vide convenient multipliers.	Numbers	can be specified in decimal, octal,
  or hexadecimal. Also,	numbers	can be specified as a sum of several numbers,
  as a convenience to avoid using a calculator.

  A hexadecimal (base 16) number is introduced using a prefix of 0x. For
  example, 0xfff is the same as decimal 4095. An octal (base 8) number is
  introduced using a prefix of 0. For example, 0177777 is the same as decimal
  65535.

  A number can be followed by a	suffix character to indicate a multiplier for
  the number. A	length number with no suffix character represents a count of
  standard disk	sectors. The length of a standard disk sector can vary
  between systems; it is commonly 512 bytes. On	systems	where disks can	have
  different sector sizes, one of the sector sizes will be chosen as the
  ``standard'' size. Supported suffix characters are:

  b   Multiply the length by 512 bytes (blocks)

  s   Multiply the length by the standard sector size (default)

  k   Multiply the length by 1024 bytes	(Kilobytes)

  m   Multiply the length by 1,048,576 (1024K) bytes (Megabytes)

  g   Multiply the length by 1,073,741,824 (1024M) bytes (Gigabytes)

  t   Multiply the length by 1,099,511,627,776 (1024G) bytes (Terabytes)

  Numbers are represented internally as an integer number of sectors. As a
  result, if the standard disk sector size is larger than 512 bytes, a
  length that is not a whole multiple of the sector size is rounded down to
  a whole number of sectors. Rounding is always to the next lower multiple
  of the sector size, never to the nearest one.

  Since	the letter b is	a valid	hexadecimal character, there is	a special
  case for the b suffix	where a	single blank character can separate a number
  from the b suffix character. Use of a	blank within a number, when invoking
  commands from	the shell, usually requires enclosing the number in quotes.
  For example:

       /sbin/volassist make vol01 "0x1000 b"

  Numbers can be added or subtracted by	separating two or more numbers by a
  plus or minus	sign, respectively. A plus sign	is optional. As	an example,
  the largest allowed number that can be represented on	a system with a	512
  byte sector size can be entered as:


  The number 2g-1 can be used to represent the largest volume size that	can
  be used with most file systems.

  In output, LSM reports length	numbers	as a simple count of sectors, with no
  suffix character.

  Case is not important	in length specification. Hexadecimal numbers and suf-
  fix characters can be	specified using	any reasonable combination of upper-
  case and lowercase letters.
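
  The length syntax described in this section can be approximated with a
  short Python sketch. This is not LSM's own parser: the function name, the
  regular expression, and the choice to round the total once (rather than
  rounding each term) are assumptions made for illustration. The sketch also
  assumes that every term after the first carries an explicit plus or minus
  sign.

```python
import re

# Byte multipliers for the suffixes listed above; "s" is handled separately.
MULTIPLIERS = {"b": 512, "k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}

# One term: optional sign, a number, and an optional suffix.
# Only the b suffix may be separated from the number by a single blank.
TERM = re.compile(r"\s*([+-]?)\s*(0x[0-9a-f]+|[0-9]+)(?: ?(b)|([skmgt]))?",
                  re.IGNORECASE)


def parse_length(text, sector_size=512):
    """Return the length in whole sectors, rounded down."""
    text = text.strip()
    if not text:
        raise ValueError("empty length")
    pos = total_bytes = 0
    while pos < len(text):
        match = TERM.match(text, pos)
        if match is None or (pos > 0 and not match.group(1)):
            raise ValueError(f"cannot parse length {text!r}")
        sign, digits = match.group(1), match.group(2).lower()
        suffix = (match.group(3) or match.group(4) or "s").lower()
        if digits.startswith("0x"):
            value = int(digits, 16)          # hexadecimal
        elif digits.startswith("0") and len(digits) > 1:
            value = int(digits, 8)           # octal
        else:
            value = int(digits)              # decimal
        unit = sector_size if suffix == "s" else MULTIPLIERS[suffix]
        total_bytes += -value * unit if sign == "-" else value * unit
        pos = match.end()
    return total_bytes // sector_size


print(parse_length("0x1000 b"))   # b is a suffix here because of the blank
print(parse_length("0x1000b"))    # without the blank, b is a hexadecimal digit
print(parse_length("1t-1s"))      # one terabyte less one sector
```

  With 512-byte sectors, the last call yields 2147483647, which matches the
  maximum volume length quoted at the start of this section.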

  Disk group selection

  Most commands	operate	upon only one disk group per invocation. Each disk
  group	has a separate configuration from every	other disk group and it	is
  possible for two disk	groups to contain objects that have the	same name.
  This can happen, in particular, if a disk group is moved from	one system to
  another. However, most utilities make	no attempt to ensure that names
  between disk groups are unique, so name collisions can occur anyway.

  In general you do not	need to	specify	disk groups except when	creating
  objects. You cannot use single-command invocations that reference objects
  in more than one disk	group, but disk	groups will be selected	automati-
  cally, based on objects specified in the command.

  The standard rules that most commands	use for	selecting the disk group for
  a command are	as follows:

    +  Given a particular set of object	names specified	on the command,	look
       for the disk group of each object. If all objects are in	the same disk
       group, use that disk group. If any named	object is not unique between
       all disk	groups,	and if one of those object names is not	in the rootdg
       disk group, then	fail.

    +  To force	use of a particular disk group,	use -g diskgroup to indicate
       the group. Non-unique names do not cause	errors when a disk group is
       specified explicitly. The diskgroup specification is either a disk
       group ID	or a disk group	name.

    +  A special case is provided for the rootdg disk group. Any set of
       objects in the rootdg disk group	can be specified without specifying
       -g rootdg, even if the given object name is used in another disk
       group.

  If a set of object names is given on the command line, and if	some are
  unique but some are not unique, then the command will	still fail according
  to the rules listed above.
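
  The selection rules above can be sketched in Python as follows. The
  function name and the dictionary that stands in for the configuration
  databases are hypothetical; real commands consult vold rather than an
  in-memory table, and this sketch does not check that the objects exist in
  a group that is forced with -g.

```python
def select_disk_group(names, groups, forced=None):
    """Pick the disk group for a command.

    names  -- object names given on the command line
    groups -- dict mapping disk group name to the set of object names in it
    forced -- value of the -g option, if one was given
    """
    if forced is not None:
        return forced                      # -g overrides name lookup
    chosen = set()
    for name in names:
        owners = [dg for dg, objects in groups.items() if name in objects]
        if not owners:
            raise LookupError(f"{name}: no such object")
        if len(owners) == 1:
            chosen.add(owners[0])          # unique name: use its group
        elif "rootdg" in owners:
            chosen.add("rootdg")           # special case for rootdg
        else:
            raise LookupError(f"{name}: not unique, specify -g diskgroup")
    if len(chosen) != 1:
        raise LookupError("objects are in more than one disk group")
    return chosen.pop()


groups = {
    "rootdg": {"vol01", "vol02"},
    "dg1": {"vol01", "data"},
    "dg2": {"data", "logs"},
}
print(select_disk_group(["vol01"], groups))              # name also used in dg1
print(select_disk_group(["data"], groups, forced="dg1"))
```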


  Disk group configurations contain six	types of records: volume records,
  plex records,	subdisk	records, disk media records, disk group	records, and
  disk access records. Each of these record types is described in the
  sections that	follow.	Disk access records are	specific to the	root disk
  group	and are	stored in configurations only because there is no other	con-
  venient place	to store them; otherwise, they are logically separate from
  all disk groups. Since they are specific and meaningful to the local system
  only,	the logical place for their storage is the rootdg since	that is	the
  only disk group guaranteed to	exist on the system.

  Disk Group Records

  Disk group records define several different types of names for a disk
  group. The different types of	names are:

  real name
      The name of the disk group, as defined on	disk. This name	is stored in
      the disk group configuration, and	is also	stored in the disk headers of
      all disks	in the disk group.

  alias	name
      The standard name	that the system	uses when referencing the disk group.
      References to the	disk group name	usually	mean the alias name. Volume
      directories are structured into subdirectories based on the disk group
      alias name. Typically, the disk group's alias name and real name are
      identical. A local alias can be useful for gaining access	to a disk
      group with a name	that conflicts with other disk groups in the system,
      or that conflicts	with records in	the rootdg disk	group.

  disk group ID
      A	64-byte	identifier that	represents the unique ID of the	disk group.
      All disk groups on all systems should have a different disk group	ID,
      even if they have	the same real name. This identifier is stored in the
      disk headers of all disks	in the disk group that have a private region.
      It is used to ensure that	LSM does not confuse two disk groups that
      were created with	the same name.

  Volume Records

  Volume records define	the characteristics of particular volume devices.
  The name of a	volume record defines the node name used for files in the
  /dev/vol and /dev/rvol directories. The block device for a particular
  volume (which can be used as an argument to the mount command; see
  mount(1)) has the path:


  where	groupname is the name assigned by the administrator to the disk	group
  containing the volume. The raw device	for a volume, typically	used for
  application I/O and for issuing I/O control operations (see ioctl(2)), has
  the path:


  For convenience, volumes assigned to the root	disk group are accessible
  under	the rootdg subdirectories of /dev/vol and /dev/rvol, but are also
  under	/dev/vol/volume	and /dev/rvol/volume.

  Reads	to a volume device are directed	to one of the read-write or read-only
  plexes associated with the volume. Writes to the volume are directed to all
  of the enabled read-write and	write-only plexes associated with the volume.

  During a write operation, two	plexes of a volume may become out of sync
  with each other, because writes directed to two disks	can complete at	dif-
  ferent times.	This is	not normally a problem.	However, if the	system were
  to crash or lose power during	a write	operation, the two plexes could	have
  different contents.

  Most applications and	file systems are not designed with the presumption
  that two separate reads of a device can return different contents without
  an intervening write operation. Since	plexes with different contents could
  cause	such a situation, LSM expends considerable effort to guarantee that
  this does not	happen.

  Volumes have the following fundamental attributes:

  usage	type
      Defines a	particular class of rules for operating	on the volume, typi-
      cally based on the expected content of the volume. Several utilities
      can apply	extensions or limitations that apply to	volumes	with a par-
      ticular usage type. Several usage	types are included with	the base
      release of LSM: fsgen, for use with volumes that contain file systems;
      gen, for use with	volumes	that are used as swap devices or for other
      applications that	do not use file	systems; raid5 for use with volumes
      that have	a RAID 5 plex layout, regardless of what the volume is used
      for; and special root and	swap usage types which are specifically	for
      use with the root	file system volume and the primary swap	device.

  usage-type state
      Usage types maintain a private state field related to the	volume that
      records operations that have been	performed on the volume	or failure
      conditions that have been	encountered. This state	field contains a
      string of	up to 14 characters.

  length
      Each volume has a length, which defines the limiting offset of read and
      write operations.	The length is assigned by the administrator, and may
      or may not match the lengths of the associated plexes.

  volume state
      Each volume is either enabled, disabled, or detached. When enabled,
      normal read and write operations are allowed on the volume, and any
      file system residing on the volume can be	mounted, or used in the	usual
      way. When	disabled, no access to the volume or any of its	associated
      plexes is	allowed. When detached,	some ioctls can	be used	by commands
      to operate on the	volume.

      Each volume has between zero and 32 associated plexes.

  read policy
      A	configurable policy for	switching between plexes for volume reads.
      When a volume has	more than one enabled associated plex, LSM can dis-
      tribute reads between the	plexes to distribute the I/O load and thus
      increase total possible bandwidth	of reads through the volume.

      You can set and change the read policy to	one of the following:

      round-robin
          Each read operation is directed to a different plex from the
          previous read operation. Given three plexes, reads cycle through
          each of the three plexes, in order.

      preferred	plex
	  Specifies a particular named plex that is used to satisfy read
	  requests. In the event that a	read request cannot be satisfied by
	  the preferred	plex, this policy changes to round-robin.

	  Adjusts to use an appropriate	read policy based on the set of
	  plexes associated with the volume. If	exactly	one enabled read-
	  write	striped	plex is	associated with	the volume, then that plex is
	  chosen automatically as the preferred	plex; otherwise, the round-
	  robin	policy is used.	If a volume has	one striped plex and one
	  non-striped plex, preferring the striped plex	often yields better
	  throughput. This is the default policy.
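
      The following Python sketch illustrates how the default policy
      described last could choose between the other two policies, and how
      round-robin cycles through the readable plexes. The Plex class, the
      function names, and the mode strings are hypothetical and are not
      taken from LSM.

```python
from dataclasses import dataclass


@dataclass
class Plex:
    name: str
    striped: bool = False
    enabled: bool = True
    mode: str = "read-write"   # or "read-only" / "write-only"


def default_policy(plexes):
    """Return ("prefer", name) or ("round-robin", None) for a volume."""
    candidates = [p for p in plexes
                  if p.enabled and p.mode == "read-write" and p.striped]
    if len(candidates) == 1:
        return ("prefer", candidates[0].name)   # exactly one striped plex
    return ("round-robin", None)


def round_robin(plexes, reads):
    """Return the plex name chosen for each of `reads` consecutive reads."""
    readable = [p for p in plexes
                if p.enabled and p.mode in ("read-write", "read-only")]
    return [readable[i % len(readable)].name for i in range(reads)]


mirror = [Plex("vol01-01"), Plex("vol01-02", striped=True)]
print(default_policy(mirror))
print(round_robin(mirror, 4))
```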

  start	options
      A	string that is organized as a set of usage-type	options	to apply when
      starting (enabling) a volume. See	volume(8) for details.

  log type
      An assignable policy to use for logging changes to the volume. The pol-
      icies are:

	  Does not perform any special actions when writing to the volume.
	  Writes the requested data to all read-write or write-only plexes.

      dirty-region logging (DRL)
	  Maintains a bitmap that represents different regions of the volume.
	  When a write to a particular region occurs, the respective bit is
	  set to on. When the system is	restarted after	a crash, this region
	  bitmap is used to limit the amount of	data copying that is required
	  to recover plex consistency for the volume. The region changes are
	  logged to special log	subdisks associated with the volume. Use of
	  DRL can greatly speed	recovery of a volume, but it may degrade per-
	  formance of the volume under normal operation.
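
      A toy model of the region bitmap, written in Python, may make the
      recovery saving clearer. The class name, the region size, and the
      in-memory list are assumptions; the real log is kept on log subdisks,
      and this sketch never clears a bit once it is set.

```python
class DirtyRegionLog:
    """Toy in-memory region bitmap; a real log lives on log subdisks."""

    def __init__(self, volume_length, region_size):
        self.volume_length = volume_length
        self.region_size = region_size
        count = -(-volume_length // region_size)   # ceiling division
        self.dirty = [False] * count

    def note_write(self, offset, length):
        """Set the bit of every region touched by a write."""
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        for region in range(first, last + 1):
            self.dirty[region] = True

    def regions_to_recover(self):
        """Return (offset, length) of each region that needs plex recovery."""
        result = []
        for region, is_dirty in enumerate(self.dirty):
            if is_dirty:
                start = region * self.region_size
                size = min(self.region_size, self.volume_length - start)
                result.append((start, size))
        return result


log = DirtyRegionLog(volume_length=1000, region_size=256)
log.note_write(offset=250, length=10)      # straddles regions 0 and 1
print(log.regions_to_recover())
```

      After a crash, only the listed regions would need to be copied between
      plexes, rather than all 1000 sectors of the example volume.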

  read/write-back recover mode
      A	mode that applies to the volume	during plex consistency	recovery.
      When this	mode is	enabled, the data read from a plex region is written
      back to the corresponding	region in all other plexes. Plex consistency
      is recovered by reading data from	blocks of one plex and writing that
      data to all other	writable plexes. This ensures that a future read
      operation	covering the same range	of blocks will return the same data.

  write-back-on-read-failure mode
      Can be enabled or	disabled using voledit.	If this	mode is	enabled, a
      read failure for a plex will cause data to be read from an alternate
      plex and then written back to the	plex that had the read failure.	This
      will usually fix the error. Only if the writeback	fails will the plex
      be detached for having an	unrecoverable I/O failure.
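
      The sequence described above can be sketched in Python with an
      in-memory stand-in for a plex. FakePlex and read_with_writeback are
      hypothetical names, and the sketch assumes that a successful rewrite
      repairs the failing block.

```python
class FakePlex:
    """In-memory stand-in for a plex; `bad` holds blocks whose reads fail."""

    def __init__(self, name, blocks, bad=(), writable=True):
        self.name = name
        self.blocks = dict(blocks)
        self.bad = set(bad)
        self.writable = writable
        self.detached = False

    def read(self, block):
        if block in self.bad:
            raise OSError(f"{self.name}: read error at block {block}")
        return self.blocks[block]

    def write(self, block, data):
        if not self.writable:
            raise OSError(f"{self.name}: write error at block {block}")
        self.blocks[block] = data
        self.bad.discard(block)    # assumption: rewriting repairs the block


def read_with_writeback(first, alternate, block):
    """Read a block, repairing `first` from `alternate` on a read failure."""
    try:
        return first.read(block)
    except OSError:
        data = alternate.read(block)     # satisfy the read from another plex
        try:
            first.write(block, data)     # write the good data back
        except OSError:
            first.detached = True        # only a failed write-back detaches
        return data


p1 = FakePlex("p1", {0: b"data"}, bad={0})
p2 = FakePlex("p2", {0: b"data"})
print(read_with_writeback(p1, p2, 0), p1.detached)
```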

  writecopy mode
      Can be enabled or	disabled using voledit.	This mode takes	effect only
      if dirty region logging is in effect. When the operating system passes
      a	write request to the volume driver, the	operating system may continue
      to change	the memory that	is being written to disk. LSM cannot detect
      that the memory is changing, so it can inadvertently leave plexes	with
      inconsistent contents. This is not normally a problem, because the
      operating	system ensures that any	such modified memory is	rewritten to
      the volume before	the volume is closed (such as by a clean system	shut-
      down). However, if the system crashes, plexes may	be inconsistent.
      Because dirty region logging limits recovery to the regions logged as
      dirty, rather than recovering the entire volume, it may not ensure
      that plexes are entirely consistent.

      Turning on the writecopy mode (which is normally set by default) often
      causes LSM to copy the data for a	write request to a new section of
      memory before writing it to disk.	Because	the write is done from the
      copied memory, it	cannot change and so the data written to each plex is
      guaranteed to be the same	if the write completes.

  exception policy
      There are	several	modes that can be set on the volume according to its
      usage type. These	modes affect operation of a volume in the presence of
      I/O failures. Currently only one of these policies, called
      GEN_DET_SPARSE, is ever used. This policy tracks complete and incomplete
      plexes in	a volume (an incomplete	plex does not have a backing subdisk
      for all blocks in	the volume). If	an unrecoverable error occurs on an
      incomplete plex, the plex	is detached (disabled from receiving regular
      volume I/O requests). If an unrecoverable	error occurs on	a complete
      plex, the	plex is	detached unless	it is the last complete	plex, in
      which case any incomplete	plexes that overlap with the error will	be
      detached but the plex with the error will	remain attached.

      This default policy is chosen to ensure that an I/O that fails on	one
      plex will	not, in	the future, be directed	to that	plex again unless
      that plex	is the last complete plex remaining attached to	the volume.
      In that case, the	policy ensures that the	volume will return the error
      consistently, even in the	presence of incomplete plexes.
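
      The detach decision made by this policy can be sketched in Python as
      shown below. The Plex class, its fields, and the function names are
      hypothetical; the sketch only mirrors the rules stated above and is
      not the kernel's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Plex:
    name: str
    complete: bool                                # backs every volume block
    extents: list = field(default_factory=list)   # (start, length) ranges
    attached: bool = True


def overlaps(plex, start, length):
    """True if any backed range of the plex intersects the failed range."""
    return any(s < start + length and start < s + n for s, n in plex.extents)


def plexes_to_detach(plexes, failed, start, length):
    """Return the names of plexes to detach after an unrecoverable error."""
    attached = [p for p in plexes if p.attached]
    if not failed.complete:
        return [failed.name]                      # incomplete plex: detach it
    if any(p.complete and p is not failed for p in attached):
        return [failed.name]                      # another complete plex remains
    # Last complete plex: keep it, detach overlapping incomplete plexes.
    return [p.name for p in attached
            if not p.complete and overlaps(p, start, length)]


a = Plex("a", True, [(0, 1000)])
b = Plex("b", False, [(0, 100)])
print(plexes_to_detach([a, b], a, start=10, length=5))
```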

  comment
      An administrator-assigned string of up to 40 characters that can be set
      and changed using	the voledit command. LSM does not interpret the	com-
      ment field. The comment cannot contain newline characters.

  user,	group, and mode
      The user,	group, and file	permission modes used for the volume device
      nodes. The user and group	are normally root and system. The mode usu-
      ally grants read and write permission to the owner, and no access	by
      other users.

  Plex Records

  Plex records define the characteristics of a particular plex of a volume.
  A plex can be	in either an associated	state or a dissociated state. In the
  dissociated state, the plex is not a part of a volume. A dissociated plex
  cannot be accessed in	any way. An associated plex can	be accessed through
  the volume.

  Plexes have the following fundamental	attributes:

  plex state
      Each plex	is either enabled, disabled, or	detached. When enabled,	nor-
      mal read and write operations from the volume can	be directed to the
      plex. When disabled or detached, no I/O operations will be applied to
      the plex.

      Failures encountered during normal volume	I/O may	change the plex	state
      from enabled to detached.	See the	description of volume exception	poli-
      cies (earlier in this manual page) for more information.

  I/O mode
      Each plex	is in read-write, read-only, or	write-only mode. This mode
      affects read and write operations	directed to the	volume,	if the plex
      is enabled. For read-write and read-only modes, volume read operations
      can be directed to the plex. For read-write and write-only modes,
      volume write operations are directed to the plex.

      Plexes are normally in read-write	mode. Write-only mode is used to
      recover a	plex that failed, and whose contents have thus become out-
      of-date with respect to the volume. It is	also used when attaching a
      new plex to a volume. In write-only mode, writes to the volume will
      update the plex, causing written regions to be up-to-date. Typically, a
      set of special copy operations will be used to update the	remainder of
      the plex.

  layout
      The organization of associated subdisks with respect to the plex
      address space. The layout	is striped, concatenated, or RAID5.

  subdisks
      Each plex has zero or more associated subdisks. Subdisks are associated
      at offsets relative to the beginning of the plex address space. Sub-
      disks for	concatenated plexes may	not cover the entire length of the
      plex, in which case they leave holes in the plex.	A plex that is not as
      long as the volume to which it is	associated is considered to have a
      hole extending from the end of the plex to the end of the	volume.	 A
      plex with a hole is considered incomplete, and is sometimes called a
      sparse plex.

  log subdisk
      Each plex	can have at most one associated	log subdisk. A log subdisk is
      used with	the DRL	feature	to reduce the time required to recover con-
      sistency of a volume after a system failure. If a	plex is	identified as
      a	log subdisk, that plex is a log	plex.

  length
      The length of a plex is the offset of the last subdisk in the plex plus
      the length of that subdisk. In other words, the length of	the plex is
      defined by the last block	in the plex address space that is backed by a
      subdisk. This value may or may not relate	to the length of the volume,
      depending	on whether the plex is completely contiguously allocated.

  contiguous length
      The offset of the	first block in the plex	address	space that is not
      backed by	a subdisk.  If the plex	has no holes, the contiguous length
      matches the plex length. If the contiguous length	is equal to or
      greater than the length of the associated	volume,	the plex is con-
      sidered complete,	otherwise it is	sparse.

  usage-type state
      Volume usage types maintain a private state field	related	to the opera-
      tions that have been performed on	the plex, or to	failure	conditions
      that have	been encountered. This state field contains a string of	up to
      14 characters.

  condition flags
      Various condition	flags are defined for the plex that LSM	sets and
      changes independent of the volume	usage type.  Defined flags are:

      NODAREC
          No physical disk could be found corresponding to the disk ID in the
	  disk media record for	one of the subdisks associated with the	plex.
	  The plex cannot be used until	the condition is fixed or the
	  affected subdisk is dissociated.

      REMOVED
          One of the disk media records was put into the removed state
	  through explicit administrative action.  The plex cannot be used
	  until	the disk is replaced or	the affected subdisk is	dissociated.

      RECOVER
          A disk for one of the disk media records was replaced or was reat-
	  tached too late to prevent the plex from becoming out-of-date	with
	  respect to the volume. The plex requires complete recovery from
	  another plex in the volume to	synchronize the	plex with the correct
	  contents of the volume.

      IOFAIL
          The plex was detached as a result of an I/O failure detected during
	  normal volume	I/O. The plex is out-of-date with respect to the
	  volume, and in need of complete recovery.  However, this condition
	  can also indicate that one of	the disks in the system	should be

  volatile state
      A	plex is	considered to have ``volatile''	contents if the	disk for any
      of the plex's subdisks is	considered to be volatile.  The	contents of a
      volatile disk are	not presumed to	survive	a system reboot.  The con-
      tents of a volatile plex are always considered out-of-date after a
      recovery and in need of complete recovery	from another plex.

  comment
      An administrator-assigned string of up to 40 characters that can be set
      and changed using	the voledit command. LSM does not interpret the	com-
      ment field. The comment cannot contain newline characters.
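
  The I/O mode, length, and contiguous length rules described above can be
  sketched in Python. This is only an illustration of the definitions, not
  LSM code: the function names, the state and mode strings, and the
  representation of subdisks as (plex offset, length) pairs are assumptions
  made for this example.

```python
def plex_io(state, mode):
    """Return (can_read, can_write) for a plex.

    state: "enabled", "disabled", or "detached"
    mode:  "read-write", "read-only", or "write-only"
    """
    if state != "enabled":
        # No I/O is applied to a disabled or detached plex.
        return (False, False)
    can_read = mode in ("read-write", "read-only")
    can_write = mode in ("read-write", "write-only")
    return (can_read, can_write)


def plex_length(subdisks):
    """Offset of the last subdisk plus its length (0 if no subdisks)."""
    return max((off + length for off, length in subdisks), default=0)


def contiguous_length(subdisks):
    """Offset of the first block that is not backed by a subdisk."""
    end = 0
    for off, length in sorted(subdisks):
        if off > end:
            # A hole starts at `end`.
            break
        end = max(end, off + length)
    return end


def is_complete(subdisks, volume_length):
    """A plex is complete if it is contiguous up to the volume length."""
    return contiguous_length(subdisks) >= volume_length
```

  For example, a plex with subdisks at (0, 100) and (150, 50) has a length
  of 200 but a contiguous length of 100, so it is sparse with respect to a
  200-block volume.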

  Subdisk Records

  Subdisk records define a region of disk, allocated from a disk's public
  region. Subdisks have	very little state associated with them,	other than
  the configuration state that defines which region of disk the	subdisk	occu-
  pies.	 Subdisks cannot overlap each other, either in their associations
  with plexes, or in their arrangement on disk public regions.

  Subdisks have	the following fundamental attributes:

  disk media name
      The name of the disk media record	that points to the physical disk.

  disk offset
      The offset from the beginning of the disk's public region	to the start
      of the subdisk.

  plex offset
      For associated subdisks, this is the offset (from	the beginning of the
      plex) of the subdisk association.	For subdisks associated	with striped
      plexes, the plex offset defines relative ordering	of subdisks in the
      plex, rather than	actual offsets within the plex address space.

  length
      The length of the subdisk.

  comment
      An administrator-assigned string of up to 40 characters that can be set
      and changed using the voledit command. LSM does not interpret the
      comment field. The comment cannot contain newline characters.
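
  The rule that subdisks cannot overlap on a disk's public region can be
  checked with a short Python sketch. The tuple layout (subdisk name, disk
  media name, disk offset, length) and the function name are assumptions for
  illustration only; LSM enforces this rule itself.

```python
from collections import defaultdict


def find_disk_overlaps(subdisks):
    """Report subdisks that overlap an earlier subdisk on the same disk.

    subdisks: iterable of (name, disk_media_name, disk_offset, length).
    Returns a list of (earlier_name, overlapping_name) pairs.
    """
    by_disk = defaultdict(list)
    for name, disk, offset, length in subdisks:
        by_disk[disk].append((offset, offset + length, name))

    overlaps = []
    for regions in by_disk.values():
        regions.sort()
        # Track the earlier region that reaches farthest into the disk.
        reach_end, reach_name = regions[0][1], regions[0][2]
        for start, end, name in regions[1:]:
            if start < reach_end:
                overlaps.append((reach_name, name))
            if end > reach_end:
                reach_end, reach_name = end, name
    return overlaps
```

  Subdisks that merely touch (one ends at the block where the next begins)
  are not reported, because they do not share any block.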

  Disk Media Records

  Disk media records define a specific disk within a disk group. The name of
  a disk media record (the disk	media name) is assigned	when a disk is first
  added	to a disk group. Disk media records can	be assigned to specific	phy-
  sical	disks by associating the disk media record with	the current disk
  access record	for the	physical disk.

  Disk media records have the following	fundamental attributes:

  disk ID
      A	64-byte	unique identifier representing the physical disk to which the
      media record is associated. This can be cleared to indicate that the
      disk is considered in the	removed	state. A removed disk has no current
      association with any physical disk.

  disk access name
      The disk access name that	is currently used to access the	physical disk
      referenced by the	disk ID. If the	disk ID	is defined, but	no physical
      disk with	that ID	can be found, the disk access name will	be null. If
      the physical disk	is not found, the disk state is	NODAREC, or inacces-
      sible. A disk can	become inaccessible either because the indicated disk
      is not currently attached	to the system, or because I/O failures on the
      physical disk prevented LSM from identifying or using the physical
      disk.

  A disk media record that has an active association with a physical disk
  (both	the disk ID and	the disk access	name attributes	are defined) inherits
  several properties from the underlying physical disk.	These attributes are
  taken	from the disk header, which is stored in the private region of the
  disk.	 These inherited attributes are:

  public length
      The length of the	region of the physical disk that is available for
      subdisk allocations.

  private length
      The length of the	region of the physical disk that is reserved for
      storing private Logical Storage Manager information.

  atomic I/O size
      The fundamental I/O size for the disk, in bytes, also known as the
      sector size. All I/Os destined for this disk must	be multiples of	this
      size. Currently, LSM requires that all disks have	the same sector	size.
      On Tru64 UNIX systems the	sector size is 512 bytes.

  Disk Access Records

  Disk access records define an	address, or access path, that can be used to
  access a disk. The list of all disk access records defines the list of all
  disk addresses that LSM can use to locate physical disks. Disk access
  records do not define	specific physical disks, since physical	disks can be
  moved	on a system. When a physical disk is moved, a different	disk access
  record may be	necessary to locate it.

  Disk access records are stored in the	rootdg disk group configuration.
  Unlike all other record types, the names of disk access records can con-
  flict	with the names of other	records. For example, a	specialty disk (such
  as a RAM disk) can use the same name for both	the disk access	record and
  the disk media record	that points to it.

  Disk access records can be defined explicitly. Some (sometimes all) disk
  access records may be	configured automatically by LSM, based on available
  information in the operating system. Such automatically-configured disks
  are not stored persistently in the on-disk root disk group configuration,
  but are instead regenerated every time LSM starts up.

  Disk access records have the following fundamental attributes:

  disk access name
      The name of the disk access record is typically a	disk address of	some
      kind. Disk names are usually of the form dsknp, where dsk	is the device
      mnemonic for disk devices, n is the sequence number of the disk, and p
      is the partition identifier (in the range	a to h).

  type
      Each disk access record has a type, which identifies certain key
      characteristics of LSM's interaction with	the disk.  Currently avail-
      able types are: sliced, simple, and nopriv. See voldisk(8) for more
      information on disk types. Typically, most or all	of the disks will be
      of type sliced. It may be	desirable to create specialty disks (such as
      RAM disks) with type nopriv.
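
  The usual dsknp form of a disk access name can be split into its parts
  with a regular expression. The following Python sketch assumes only the
  form described above (the literal dsk, a decimal sequence number, and a
  partition letter from a to h); the function name is hypothetical, and
  names in other forms simply yield None.

```python
import re

# dsk + sequence number + partition letter (a to h)
_DSK_NAME = re.compile(r"dsk(\d+)([a-h])")


def parse_disk_access_name(name):
    """Return (sequence_number, partition) or None if the form differs."""
    match = _DSK_NAME.fullmatch(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)
```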

  If the physical disk represented by the disk access record is	currently
  associated with a disk media record, then the	following fields are defined:

  disk group name
      The name of the disk group containing the	disk media record.

  disk media name
      The name of the disk media record	that points to the physical disk.

  Additional attributes can be added, arbitrarily, by disk types. See
  voldisk(8) for a list of additional attributes defined by the standard disk
  types.


  The usage type of a volume represents	a class	of rules for operating on a
  volume. Each usage type is defined by	a set of executables under the direc-
  tory /sbin/lsm.d/usage_type, where usage_type	is the name given to the
  usage	type. The required executables are: volinfo, volmake, volmend, vol-
  plex,	volsd, and volume. These executables are invoked by LSM	administra-
  tive utilities with the same names. The executables under
  /sbin/lsm.d/usage_type should	not, normally, be executed directly.

  Five usage types are provided	with LSM: gen, fsgen, root, swap, and raid5.
  It is	likely that new	usage types will be added in future releases. It is
  also possible	for third-party	products to install additional usage types.

  The usage types currently provided with LSM store state information in the
  volume and plex usage-type state fields. The state fields defined for
  volumes are:

  EMPTY
      The volume is not yet initialized. This is the initial state for
      volumes created by volmake.

  CLEAN
      The volume has been stopped and the contents for all plexes are
      consistent.

  ACTIVE
      The volume has been started and is running normally, or was running
      normally when the	system was stopped. If the system crashes in this
      state, then the volume may require plex consistency recovery.

  NEEDSYNC
      The volume requires recovery. This is typically set after a system
      failure to indicate that the plexes in the volume	may be inconsistent
      and require recovery (see	the resync operation in	volume(8)).

  SYNC
      Plex consistency recovery is currently being done on the volume.
      volume resync sets this state when it starts to recover plex con-
      sistency on a volume that	was in the NEEDSYNC state.

  The state fields defined for plexes are:

  EMPTY
      The plex is not yet initialized. This state is set when the volume
      state is also EMPTY.

  CLEAN
      The plex was running normally when the volume was stopped. The plex
      will be enabled without requiring	recovery when the volume is started.

  ACTIVE
      The plex is running normally on a started volume. The plex condition
      flags (NODAREC, REMOVED, RECOVER,	and IOFAIL) may	apply if the system
      is rebooted and the volume restarted.

  STALE
      The plex was detached, either by volplex det or by an I/O failure.
      volume start will	change the state for a plex to STALE if	any of the
      plex condition flags are set.  STALE plexes will be reattached automat-
      ically when a volume is started.

  OFFLINE
      The plex was disabled explicitly by the volmend off operation. See vol-
      mend(8) for more information.

  SNAPATT
      Applies to a snapshot plex that is being attached by the volassist
      snapstart	operation. When	the attach is complete,	the state for the
      plex will	be changed to SNAPDONE.	If the system fails before the attach
      completes, the plex and all of its subdisks will be removed.

  SNAPDONE
      Applies to a snapshot plex created by volassist snapstart that is fully
      attached.	 A plex	in this	state can be turned into a snapshot volume
      with volassist snapshot. See volassist(8)	for more information. If the
      system fails before the attach completes,	the plex and all of its	sub-
      disks will be removed.

  SNAPTMP
      Applies to a snapshot plex being attached by the volplex snapstart
      operation.  When the attach is complete, the state for the plex will be
      changed to SNAPDIS. If the system	fails before the attach	completes,
      the plex will be dissociated from	the volume.

  SNAPDIS
      Applies to a snapshot plex created by volplex snapstart that is fully
      attached.	 A plex	in this	state can be turned into a snapshot volume
      with volplex snapshot. See volplex(8) for	more information. If the sys-
      tem fails	before the attach completes, the plex will be dissociated
      from the volume.

  TEMP
      Applies to a plex that is being associated and attached to a volume
      with volplex att. If the system fails before the attach completes, the
      plex will be dissociated from the volume.

  TEMPRM
      Applies to a plex that is being associated and attached to a volume
      with volplex att. If the system fails before the attach completes, the
      plex will be dissociated from the volume and removed. Any subdisks in
      the plex will be kept.

  TEMPRMSD
      Applies to a plex that is being associated and attached to a volume
      with volplex att. If the system fails before the attach completes, the
      plex and its subdisks will be dissociated from the volume and removed.


  The majority of LSM utilities	use a common set of exit codes,	which can be
  used by shell	scripts	or other types of programs to react to specific	prob-
  lems detected	by the utilities. For C	programmers, these exit	status codes
  are defined in the include file volclient.h. The number and macro name for
  each distinct	exit code is described below. Shell script writers must
  directly compare against the numbers specified.

  (0) VEX_OK
      The command is not reporting any error through the exit code.

      Some command line	arguments to the command were invalid.

      A	syntax error occurred in a command or description, or a	specified
      record name is too long or contains invalid characters. This code	is
      returned only by utilities that implement	a command or description
      language.	This code may also be returned for errors in search patterns.

      The volume daemon	does not appear	to be running.

  (4) VEX_IPC
      An unexpected error was encountered while communicating with the volume
      daemon.

      An unexpected error was returned by a system call	or by the C library.
      This can also indicate that the command ran out of memory.

  (6) VEX_LOST
      The status for a commit was lost because the volume daemon was killed
      and restarted during the commit of a transaction,	but after restart the
      volume daemon did	not know whether the commit succeeded or failed.

      The command encountered an error that it should not have encountered.
      This generally implies a condition that the command should have tested
      for but did not, or a condition that results from	the volume daemon
      returning	a value	that did not make sense.

      VEX_UNKNOWN: An unknown or internal error	was encountered.  This code
      may be used, for example,	when the volume	daemon returns an unrecog-
      nized error number.

      The time required	to complete a transaction exceeded 60 seconds, caus-
      ing the transaction locks	to be lost. As most utilities will reattempt
      the transaction at least once if a timeout occurs, this usually implies
      that a transaction timed out two or more times.

  (9) VEX_NODG
      No disk group could be identified	for an operation. This results either
      from specifying a	disk group that	does not exist,	or from	supplying
      names on a command line that are in different disk groups	or in multi-
      ple disk groups.

      A	change made to the database by another process caused the command to
      stop. This code is also returned by a usage-type-dependent command if
      it is given a record that	has a different	usage type.

  (11) VEX_NOENT
      A	requested subdisk, plex, or volume record was not found	in the confi-
      guration database.  This may also	mean that a record was an inappropri-
      ate type.

  (12) VEX_EXIST
      A	name used to create a new configuration	record matches the name	of an
      existing record.

  (13) VEX_BUSY
      A	subdisk, plex, or volume is locked against concurrent access. This
      code is used for inter-transaction locks associated with usage type
      utilities. The code is also used for the dissociated plex	or subdisk
      lock convention, which writes a non-blank	string to the tutil[0] field
      in a plex	or subdisk structure to	indicate that the record is being

      No usage type could be determined for a command that requires a usage
      type.

      An unknown or invalid usage type was specified.

  (16) VEX_ASSOC
      A	plex or	subdisk	is associated, but the operation requires a dissoci-
      ated record.

      A	plex or	subdisk	is dissociated,	but the	operation requires an associ-
      ated record. This	code can also be used to indicate that a subdisk or
      plex is not associated with a specific plex or volume.

  (18) VEX_LAST
      A	plex or	subdisk	was not	dissociated because it was the last record
      associated with a	volume or plex.

      Association of a plex or subdisk would surpass the maximum number	that
      can be associated	to a volume or plex.

  (20) VEX_INVAL
      A	specified operation is invalid within the parameters specified.

  (21) VEX_IOERR
      An I/O error was encountered that caused the command to abort an
      operation.

      A	volume involved	in an operation	did not	have any associated plexes,
      although at least	one was	required.

      A	plex involved in an operation did not have any associated subdisks,
      although at least	one was	required.

      A	volume could not be started by the volume start	operation, because
      the configuration	of the volume and its plexes prevented the operation.

      A	specified volume was already started.

      A	specified volume was not started. For example, this code is returned
      by the volume stop operation if the operation is given a volume that is
      not started.

      A	volume or plex involved	in an operation	is in the detached state,
      thus preventing a	successful operation.

      A	volume or plex involved	in an operation	is in the disabled state,
      thus preventing a	successful operation.

      A	volume or plex involved	in an operation	is in the enabled state, thus
      preventing a successful operation.

      An unrecognized error was	encountered. This code is currently unused.

  (31) VEX_OPEN
      An operation failed because a volume device was open or mounted, or
      because a	subdisk	was associated with an open or mounted volume or

  Exit codes greater than 32 are reserved for use by usage types. Codes
  greater than 64 can be reserved for use by specific utilities.
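
  A script that reacts to these exit codes might classify them as in the
  following Python sketch. The table contains only the number and macro
  pairs that are spelled out above; the function name and the returned
  category strings are assumptions for illustration, and other common codes
  are reported generically rather than guessed.

```python
# Number/macro pairs taken from the list above; other codes are omitted.
VEX_CODES = {
    0: "VEX_OK",
    4: "VEX_IPC",
    6: "VEX_LOST",
    9: "VEX_NODG",
    11: "VEX_NOENT",
    12: "VEX_EXIST",
    13: "VEX_BUSY",
    16: "VEX_ASSOC",
    18: "VEX_LAST",
    20: "VEX_INVAL",
    21: "VEX_IOERR",
    31: "VEX_OPEN",
}


def describe_exit_code(code):
    """Map an LSM utility exit status to a macro name or a category."""
    if code in VEX_CODES:
        return VEX_CODES[code]
    if code > 64:
        return "utility-specific"
    if code > 32:
        return "usage-type-specific"
    return "unlisted common code"
```

  A caller would typically pass the exit status of a finished LSM command to
  describe_exit_code and branch on the result, for example retrying on
  VEX_BUSY and reporting anything else.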


  volassist(8),	vold(8), voldg(8), voldiskadm(8), voledit(8), volencap(8),
  volinfo(8), volinstall(8), voliod(8),	vollogcnvt(8), volmend(8), volno-
  tify(8), volplex(8), volprint(8), volrecover(8), volreconfig(8), volroot-
  mir(8), volsd(8), volsetup(8), volstat(8), voltrace(8), volume(8),