 vxsparecheck(1M)		  VxVM 3.5		    vxsparecheck(1M)
				 1 Jun 2002



 NAME
      vxsparecheck - monitor VERITAS Volume Manager for failure events and
      replace failed disks

 SYNOPSIS
      /etc/vx/bin/vxsparecheck [mail-address...]

 DESCRIPTION
      The vxsparecheck command monitors VERITAS Volume Manager (VxVM) by
      analyzing the output of the vxnotify command, waiting for failures to
      occur.  When a failure occurs, it sends mail via mailx to the logins
      specified on the command line, or (by default) to root, and then
      attempts to replace any failed disks.  After a replacement attempt is
      complete, mail is sent indicating the status of each disk
      replacement.

      The mail notification that is sent when a failure is detected follows
      this format:

	   Failures have been detected by the VERITAS Volume Manager:

	   failed disks:
	   medianame
	     ...
	   failed plexes:
	   plexname
	     ...
	   failed subdisks:
	   subdiskname
	     ...
	   failed volumes:
	   volumename
	     ...

	   The Volume Manager will attempt to find hot-spare disks to replace any
	   failed disks and attempt to reconstruct any data in volumes that have
	   storage on the failed disk.


      The medianame list specifies disks that appear to have completely
      failed. The plexname list shows plexes of mirrored volumes that have
      been detached due to I/O failures experienced while attempting to do
      I/O to subdisks they contain. The subdiskname list specifies subdisks
      in RAID-5 volumes that have been detached due to I/O errors. The
      volumename list shows non-RAID-5 volumes that have become unusable
      because disks in all of their plexes have failed (and are listed in
      the ``failed disks'' list) and shows those RAID-5 volumes that have
      become unusable because of multiple failures.
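
      As a rough illustration of the notification layout above, the
      following Python sketch groups the names listed under each
      ``failed ...:'' heading.  The function name parse_failure_mail and
      the assumption that a blank line ends the lists are illustrative
      only; they are not part of vxsparecheck.

```python
import re

# Matches the four list headings shown in the notification format.
SECTION_RE = re.compile(r"^failed (disks|plexes|subdisks|volumes):$")


def parse_failure_mail(body):
    """Group the names listed under each 'failed ...:' heading."""
    failures = {"disks": [], "plexes": [], "subdisks": [], "volumes": []}
    current = None
    for raw in body.splitlines():
        line = raw.strip()
        match = SECTION_RE.match(line)
        if match:
            # A new heading starts; following lines belong to it.
            current = match.group(1)
        elif not line:
            # Assumption: a blank line ends the current list.
            current = None
        elif current is not None:
            failures[current].append(line)
    return failures
```

      A heading that is absent from the mail simply yields an empty list
      in this sketch.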

      If any volumes appear to have failed, the following paragraph will be
      included in the mail:




				    - 1 -	  Formatted:  August 2, 2006

	   The data in the failed volumes listed above is no longer
	   available. It will need to be restored from backup.


 Replacement Procedure
      After mail has been sent, vxsparecheck finds a hot spare replacement
      for any disks that appear to have failed (that is, those listed in the
      medianame list). This involves finding an appropriate replacement from
      among the eligible hot spares in the same disk group as the failed
      disk. A disk is eligible as a replacement if it is a valid VERITAS
      Volume Manager disk (VM disk), has been marked as a hot-spare disk,
      and contains enough space to hold the data contained in all the
      subdisks on the failed disk.
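
      The eligibility test described above can be sketched in Python as
      follows.  The Disk record, its field names, and the use of a single
      free-space figure are assumptions made for illustration; how
      vxsparecheck actually measures available space is not specified
      here.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    """Hypothetical summary of one disk; field names are illustrative."""
    name: str
    diskgroup: str
    is_vm_disk: bool    # valid VERITAS Volume Manager disk
    is_spare: bool      # marked as a hot-spare disk
    free_blocks: int    # assumed measure of usable space


def eligible_spares(candidates, failed_group, needed_blocks):
    """Return the candidates that satisfy every eligibility condition.

    needed_blocks is assumed to be the total size of all subdisks
    on the failed disk.
    """
    return [
        d for d in candidates
        if d.diskgroup == failed_group
        and d.is_vm_disk
        and d.is_spare
        and d.free_blocks >= needed_blocks
    ]
```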

      To determine which disk from among the eligible hot spares to use,
      vxsparecheck first checks the file /etc/vx/sparelist (see Sparelist
      File below). If this file does not exist or lists no eligible hot
      spares for the failed disk, the disk that is ``closest'' to the failed
      disk is chosen. The value of ``closeness'' depends on the controller,
      target and disk number of the failed disk.  A disk on the same
      controller as the failed disk is closer than a disk on a different
      controller; and a disk under the same target as the failed disk is
      closer than one under a different target.
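
      One way to picture the ``closeness'' ordering is the Python sketch
      below.  It assumes device names of the form c#t#d# and makes two
      further assumptions that are not stated above: the target is compared
      only between disks on the same controller, and remaining ties are
      broken by the difference in disk number.  The function names are
      hypothetical.

```python
import re

# Assumed device-name form: controller, target, disk number (c#t#d#).
DEVICE_RE = re.compile(r"^c(\d+)t(\d+)d(\d+)$")


def parse_device(name):
    """Split a c#t#d# name into (controller, target, disk) integers."""
    match = DEVICE_RE.match(name)
    if match is None:
        raise ValueError("not a c#t#d# device name: %r" % name)
    return tuple(int(group) for group in match.groups())


def closeness_key(failed, candidate):
    """Sort key: smaller tuples are closer to the failed disk."""
    f_ctlr, f_tgt, f_disk = parse_device(failed)
    c_ctlr, c_tgt, c_disk = parse_device(candidate)
    same_controller = c_ctlr == f_ctlr
    # Assumption: a shared target only counts on the same controller.
    same_target = same_controller and c_tgt == f_tgt
    # False sorts before True, so "not same_..." puts matches first.
    return (not same_controller, not same_target, abs(c_disk - f_disk))


def closest_spare(failed, candidates):
    """Return the closest candidate, or None if there are none."""
    return min(candidates, key=lambda c: closeness_key(failed, c),
               default=None)
```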

      If no hot spare disk can be found, the following mail is sent:

	   No hot spare could be found for disk medianame in
	   diskgroup. No replacement has been made and the disk is still
	   unusable.


      The mail then explains the disposition of volumes that had storage on
      the failed disk. The following message lists volumes that had storage
      on the failed disk, but are still usable:

	   The following volumes have storage on medianame:

	   volumename

	   These volumes are still usable, but the redundancy of
	   those volumes is reduced. Any RAID-5 volumes with storage
	   on the failed disk may become unusable in the face of further
	   failures.


      If any non-RAID-5 volumes were made unusable due to the failure of the
      disk, the following message is included:

	   The following volumes:

	   volumename

	   have data on medianame but have no other usable
	   mirrors on other disks. These volumes are now unusable
	   and the data on them is unavailable.


      If any RAID-5 volumes were made unavailable due to the disk failure,
      the following message is included:

	   The following RAID-5 volumes:

	   volumename

	   had storage on medianame and have experienced
	   other failures. These RAID-5 volumes are now unusable
	   and data on them is unavailable.


      If a hot-spare disk was found, a hot-spare replacement is attempted.
      This involves associating the device marked as a hot spare with the
      media record that was associated with the failed disk. If this is
      successful, the vxrecover(1M) command is used in the background to
      recover the contents of any data in volumes that had storage on the
      disk.

      If the hot-spare replacement fails, the following message is sent:

	   Replacement of disk medianame in group diskgroup
	   failed. The error is:

	   error message


      If any volumes (RAID-5 or otherwise) are rendered unusable due to the
      failure, the following message is included:

	   The following volumes:

	   volumename

	   occupy space on the failed disk and have no other available
	   mirrors or have experienced other failures. These volumes are
	   unusable, and the data they contain is unavailable.


      If the hot-spare replacement procedure completed successfully and
      recovery is under way, a final mail message is sent:

	   Replacement of disk medianame in group diskgroup
	   with disk device sparedevice has successfully completed
	   and recovery is under way.


      If any non-RAID-5 volumes were rendered unusable by the failure
      despite the successful hot-spare procedure, the following message is
      included in the mail:

	   The following volumes:

	   volumename

	   occupy space on the replaced disk, but have no other enabled
	   mirrors on other disks from which to perform recovery. These
	   volumes must have their data restored.


      If any RAID-5 volumes were rendered unusable by the failure despite
      the successful hot-spare procedure, the following message is included
      in the mail:

	   The following RAID-5 volumes:

	   volumename

	   have subdisks on the replaced disk and have experienced
	   other failures that prevent recovery. These RAID-5 volumes
	   must have their data restored.


      If any volumes (RAID-5 or otherwise) were rendered unusable, the
      following message is also included:

	   To restore the contents of any volumes listed above, the
	   volume should be started with the command:

		vxvol -f start volumename

	   and the data restored from backup.


 Sparelist File
      The sparelist file is a text file that specifies an ordered list of
      disks to be used as hot spares when a specific disk fails.  The
      system-wide sparelist file is located in /etc/vx/sparelist.  Each line
      in the sparelist file specifies a list of spares for one disk.  Lines
      beginning with the pound (#) character and empty lines are ignored.
      The format for a line in the sparelist file is:

	   [ diskgroup:] diskname : spare1 [ spare2 ... ]


      The diskgroup field, if present, specifies the disk group within which
      the disk and designated spares reside. If not present, rootdg is
      presumed. The diskname specifies the disk for which spares are being
      designated. The spare list after the colon lists the disks to be used
      as hot spares. The list is order dependent; in case of failure of
      diskname, the spares are tried in order. A spare will be used only if
      it is a valid hot spare (see above). If the list is exhausted without
      finding any spares, the default policy of using the closest disk is
      used.
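
      The line format and lookup order described above can be sketched in
      Python as follows.  The function names are hypothetical, and the
      handling of leading white space and of duplicate entries (the later
      entry wins) is an assumption rather than documented behavior.

```python
def parse_sparelist(text):
    """Map (diskgroup, diskname) to the ordered list of spares."""
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # empty lines and comment lines are ignored
        fields = [field.strip() for field in line.split(":")]
        if len(fields) == 2:
            # No disk group given: rootdg is presumed.
            group, disk, spare_field = "rootdg", fields[0], fields[1]
        elif len(fields) == 3:
            group, disk, spare_field = fields
        else:
            continue  # malformed line
        spares = spare_field.split()
        if not group or len(disk.split()) != 1 or not spares:
            continue  # malformed line
        table[(group, disk)] = spares
    return table


def first_listed_spare(listed, eligible):
    """Try the listed spares in order; None means fall back to the
    default policy of using the closest disk."""
    for name in listed:
        if name in eligible:
            return name
    return None
```

      A result of None from first_listed_spare corresponds to the case
      where the list is exhausted without finding a valid hot spare.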

 FILES
      /etc/vx/sparelist		    Specifies a list of disks to serve as
				    hot spares for a disk.

 NOTES
      The sparelist file is not checked in any way for correctness until a
      disk failure occurs. It is possible to inadvertently specify a
      nonexistent or inappropriate disk or disk group. Malformed lines are
      also ignored.

 SEE ALSO
      mailx(1), vxintro(1M), vxnotify(1M), vxrecover(1M), vxrelocd(1M),
      vxunreloc(1M)