Red Hat Cluster Manager allows administrators to connect separate systems (called members or nodes) together to create failover clusters that ensure application availability and data integrity under several failure conditions. Administrators can use Red Hat Cluster Manager with database applications, file sharing services, web servers, and more.
To set up a failover cluster, you must connect the nodes to the cluster hardware, and configure the nodes into the cluster environment. The foundation of a cluster is an advanced host membership algorithm. This algorithm ensures that the cluster maintains complete data integrity by using the following methods of inter-node communication:
Network connections between the cluster systems
A Cluster Configuration System daemon (ccsd) that synchronizes configuration between cluster nodes
To make an application and its data highly available in a cluster, you must configure a cluster service: an application that Red Hat Cluster Manager manages to keep it highly available. A cluster service is made up of cluster resources, components that can be failed over from one node to another, such as an IP address, an application initialization script, or a Red Hat GFS shared partition. Building a cluster using Red Hat Cluster Manager allows transparent client access to cluster services. For example, you can provide clients with access to highly available database applications by building a cluster service that uses Red Hat Cluster Manager to manage service availability and shared Red Hat GFS storage partitions to hold the database data and end-user applications.
You can associate a cluster service with a failover domain, a subset of cluster nodes that are eligible to run a particular cluster service. In general, any eligible, properly-configured node can run the cluster service. However, each cluster service can run on only one cluster node at a time in order to maintain data integrity. You can specify whether or not the nodes in a failover domain are ordered by preference. You can also specify whether or not a cluster service is restricted to run only on nodes of its associated failover domain. (When associated with an unrestricted failover domain, a cluster service can be started on any cluster node in the event no member of the failover domain is available.)
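The ordering and restriction rules described above can be sketched in a few lines of Python. This is a minimal illustration only, not the logic that the cluster software actually implements: the FailoverDomain class, the pick_node function, and the alphabetical tie-break are all hypothetical names and choices introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class FailoverDomain:
    """Hypothetical model of a failover domain (not a real cluster data structure)."""
    members: List[str]        # listed in preference order when ordered is True
    ordered: bool = False
    restricted: bool = False


def pick_node(domain: Optional[FailoverDomain], online: Set[str]) -> Optional[str]:
    """Return the node that should run the service, or None if no node is eligible."""
    if domain is None:
        # No failover domain assigned: any online node is eligible.
        return min(online) if online else None
    candidates = [n for n in domain.members if n in online]
    if candidates:
        # Ordered domains honor the listed preference; otherwise the choice is
        # arbitrary (alphabetical here, purely to keep the example deterministic).
        return candidates[0] if domain.ordered else min(candidates)
    if domain.restricted:
        # Restricted domain: the service never runs outside the domain.
        return None
    # Unrestricted domain with no member available: fall back to any online node.
    return min(online) if online else None
```

For example, with an ordered, restricted domain of node2 and node1, pick_node prefers node2 while it is online, falls back to node1 when node2 is down, and returns None when neither member is available.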
You can set up an active-active configuration in which the members run different cluster services simultaneously, or a hot-standby configuration in which primary members run all the cluster services, and a backup member takes over only if a primary member fails.
If a hardware or software failure occurs, the cluster automatically restarts the failed node's cluster services on a functional node. This cluster-service failover capability ensures that no data is lost and that there is little disruption to users. When the failed node recovers, the cluster can re-balance the cluster services across the nodes.
In addition, you can cleanly stop the cluster services running on a cluster system and then restart them on another system. This cluster-service relocation capability allows you to maintain application and data availability when a cluster node requires maintenance.
Cluster systems deployed with Red Hat Cluster Manager include the following features:
Clusters can include a dual-controller RAID array, multiple bonded network channels, multiple paths between cluster members and storage, and redundant uninterruptible power supply (UPS) systems to ensure that no single failure results in application down time or loss of data.
For information about using dm-multipath with Red Hat Cluster Suite, refer to Appendix C, Multipath-usage.txt File for Red Hat Enterprise Linux 4 Update 3.
Alternatively, a low-cost cluster can be set up to provide less availability than a no-single-point-of-failure cluster. For example, you can set up a cluster with a single-controller RAID array and only a single Ethernet channel.
Certain low-cost alternatives, such as host RAID controllers, software RAID without cluster support, and multi-initiator parallel SCSI configurations, are not compatible with or appropriate for use as shared cluster storage.
Red Hat Cluster Manager allows you to easily configure and administer cluster services to make resources such as applications, server daemons, and shared data highly available. To create a cluster service, you specify the resources used in the cluster service as well as the properties of the cluster service, such as the cluster service name, application initialization (init) scripts, disk partitions, mount points, and the cluster nodes on which you prefer the cluster service to run. After you add a cluster service, the cluster management software stores the information in the XML-based /etc/cluster/cluster.conf configuration file. The Cluster Configuration System (CCS), a daemon installed on each cluster node, then propagates the configuration data to all cluster nodes and allows cluster software to retrieve changes to that file.
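As a rough illustration of how a tool might read cluster service definitions from an XML configuration file such as /etc/cluster/cluster.conf, the following Python sketch parses a small sample document. The sample is illustrative only: the element and attribute names, the addresses, and the script path are assumptions made for this example and do not form a complete or validated configuration. The summarize function is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; element names, attributes, and values are
# assumptions for this example, not a complete or validated cluster.conf.
SAMPLE_CONF = """\
<cluster name="example" config_version="1">
  <clusternodes>
    <clusternode name="node1"/>
    <clusternode name="node2"/>
  </clusternodes>
  <rm>
    <service name="dbservice">
      <ip address="10.0.0.10"/>
      <script name="dbinit" file="/etc/init.d/exampledb"/>
    </service>
  </rm>
</cluster>
"""


def summarize(conf_xml: str) -> dict:
    """Return the cluster name, node names, and the resource types of each service."""
    root = ET.fromstring(conf_xml)
    nodes = [node.get("name") for node in root.iter("clusternode")]
    services = {
        svc.get("name"): [resource.tag for resource in svc]
        for svc in root.iter("service")
    }
    return {"cluster": root.get("name"), "nodes": nodes, "services": services}
```

Calling summarize(SAMPLE_CONF) returns the cluster name, the two node names, and a mapping from the service name to the resource types (an IP address and an init script) that make up the service.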
Red Hat Cluster Manager provides an easy-to-use framework for database applications. For example, a database cluster service serves highly-available data to a database application. The application running on a cluster node provides network access to database client systems, such as Web applications. If the cluster service fails over to another node, the application can still access the shared database data. A network-accessible database cluster service is usually assigned an IP address, which is failed over along with the cluster service to maintain transparent access for clients.
The cluster-service framework can also easily extend to other applications through the use of customized init scripts.
The Red Hat Cluster Suite management graphical user interface (GUI) facilitates the administration and monitoring tasks of cluster resources such as the following: creating, starting, and stopping cluster services; relocating cluster services from one node to another; modifying the cluster service configuration; and monitoring the cluster nodes. The CMAN interface allows administrators to individually control the cluster on a per-node basis.
By assigning a cluster service to a restricted failover domain, you can limit the nodes that are eligible to run a cluster service in the event of a failover. (A cluster service that is assigned to a restricted failover domain cannot be started on a cluster node that is not included in that failover domain.) You can order the nodes in a failover domain by preference to ensure that a particular node runs the cluster service (as long as that node is active). If a cluster service is assigned to an unrestricted failover domain, the cluster service starts on any available cluster node (if none of the nodes of the failover domain are available).
To ensure data integrity, only one node can run a cluster service and access cluster-service data at one time. The use of power switches in the cluster hardware configuration enables a node to power-cycle another node before restarting that node's cluster services during the failover process. This prevents any two systems from simultaneously accessing the same data and corrupting it. It is strongly recommended that fence devices (hardware or software solutions that remotely power, shut down, and reboot cluster nodes) be used to guarantee data integrity under all failure conditions. Watchdog timers are an alternative used to ensure correct operation of cluster service failover.
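The ordering that matters here is that the failed node is fenced before its services are started anywhere else. The following Python sketch shows that ordering under simplified assumptions; the fail_over function, the services dictionary, and the fence callback are hypothetical and stand in for the real fence agents and service manager.

```python
from typing import Callable, Dict, List


def fail_over(failed: str, target: str,
              services: Dict[str, List[str]],
              fence: Callable[[str], bool]) -> List[str]:
    """Move the services of `failed` to `target`, but only after fencing succeeds."""
    if not fence(failed):
        # Without a confirmed fence, the failed node might still be writing to
        # shared storage, so its services must not be started elsewhere.
        raise RuntimeError(f"could not fence {failed}; refusing to start its services")
    moved = services.pop(failed, [])
    services.setdefault(target, []).extend(moved)
    return moved
```

If the fence callback reports failure, the function raises an error and leaves the service assignments untouched, which mirrors the rule that two nodes must never access the same cluster-service data at the same time.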
To monitor the health of the other nodes, each node monitors the health of the remote power switch, if any, and issues heartbeat pings over network channels. With Ethernet channel bonding, multiple Ethernet interfaces are configured to behave as one, reducing the risk of a single-point-of-failure in the typical switched Ethernet connection between systems.
If a hardware or software failure occurs, the cluster takes the appropriate action to maintain application availability and data integrity. For example, if a node completely fails, a healthy node (in the associated failover domain, if used) starts the service or services that the failed node was running prior to failure. Cluster services already running on the healthy node are not significantly disrupted during the failover process.
For Red Hat Cluster Suite 4, node health is monitored through a cluster network heartbeat. In previous versions of Red Hat Cluster Suite, node health was monitored on shared disk. Shared disk is not required for node-health monitoring in Red Hat Cluster Suite 4.
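Heartbeat-based monitoring can be pictured as each node recording when it last heard from every other node and treating a node as failed once it has been silent for too long. The Python sketch below shows only that idea; the find_failed_nodes function, the timestamp bookkeeping, and the timeout value are assumptions for illustration, and the actual cluster membership protocol is more involved.

```python
from typing import Dict, List


def find_failed_nodes(last_seen: Dict[str, float], now: float, timeout: float) -> List[str]:
    """Return the nodes whose most recent heartbeat is older than `timeout` seconds."""
    return sorted(node for node, seen in last_seen.items() if now - seen > timeout)
```

For example, with heartbeats last seen from node1 at time 100.0 and from node2 at time 90.0, a check at time 101.0 with a 5-second timeout reports only node2 as failed.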
When a failed node reboots, it can rejoin the cluster and resume running the cluster service. Depending on how the cluster services are configured, the cluster can re-balance services among the nodes.
In addition to automatic cluster-service failover, a cluster allows you to cleanly stop cluster services on one node and restart them on another node. You can perform planned maintenance on a node system while continuing to provide application and data availability.
To ensure that problems are detected and resolved before they affect cluster-service availability, the cluster daemons log messages by using the conventional Linux syslog subsystem.
The infrastructure in a cluster monitors the state and health of an application. If an application-specific failure occurs, the cluster automatically restarts the application: it first attempts to restart the application on the node on which it was originally running and, failing that, restarts it on another cluster node. You can specify which nodes are eligible to run a cluster service by assigning a failover domain to the cluster service.
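The recovery order described above (the original node first, then another eligible node) can be sketched as follows. This is a simplified illustration; the recover_service function and the start_on callback are hypothetical and do not correspond to a real cluster API.

```python
from typing import Callable, List, Optional


def recover_service(current: str, others: List[str],
                    start_on: Callable[[str], bool]) -> Optional[str]:
    """Try the original node first, then each other eligible node in turn.

    Returns the node on which the service started, or None if every attempt failed.
    """
    for node in [current] + [n for n in others if n != current]:
        if start_on(node):
            return node
    return None
```

Here the others list would contain the remaining nodes that are eligible to run the service, for example the online members of its failover domain.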
Table 1-1 summarizes the Red Hat Cluster Manager software subsystems and their components.
|Subsystem||Component||Description|
|Cluster Configuration Tool||system-config-cluster||Command used to manage cluster configuration in a graphical setting.|
|Cluster Configuration System (CCS)||ccs_tool||Notifies ccsd of an updated cluster.conf file. Also used for upgrading a configuration file from a Red Hat GFS 6.0 (or earlier) cluster to the format of the Red Hat Cluster Suite 4 configuration file.|
| ||ccs_test||Diagnostic and testing command that is used to retrieve information from configuration files through ccsd.|
| ||ccsd||CCS daemon that runs on all cluster nodes and provides configuration file data to cluster software.|
|Resource Group Manager (rgmanager)||clusvcadm||Command used to manually enable, disable, relocate, and restart user services in a cluster.|
| ||clustat||Command used to display the status of the cluster, including node membership and services running.|
| ||clurgmgrd||Daemon used to handle user service requests, including service start, service disable, service relocate, and service restart.|
|Fence||fence_ack_manual||User interface for fence_manual agent.|
| ||fence_apc||Fence agent for APC power switch.|
| ||fence_bladecenter||Fence agent for IBM Bladecenters with Telnet interface.|
| ||fence_brocade||Fence agent for Brocade Fibre Channel switch.|
| ||fence_bullpap||Fence agent for Bull Novascale Platform Administration Processor (PAP) Interface.|
| ||fence_drac||Fence agent for Dell Remote Access Controller/Modular Chassis (DRAC/MC).|
| ||fence_egenera||Fence agent used with Egenera BladeFrame system.|
| ||fence_gnbd||Fence agent used with GNBD storage.|
| ||fence_ilo||Fence agent for HP ILO interfaces (formerly fence_rib).|
| ||fence_ipmilan||Fence agent for Intelligent Platform Management Interface (IPMI).|
| ||fence_manual||Fence agent for manual interaction. Note: Manual fencing is not supported for production environments.|
| ||fence_mcdata||Fence agent for McData Fibre Channel switch.|
| ||fence_node||Command used by lock_gulmd when a fence operation is required. This command takes the name of a node and fences it based on the node's fencing configuration.|
| ||fence_rps10||Fence agent for WTI Remote Power Switch, Model RPS-10 (used only with two-node clusters).|
| ||fence_rsa||Fence agent for IBM Remote Supervisor Adapter II (RSA II).|
| ||fence_sanbox2||Fence agent for SANBox2 Fibre Channel switch.|
| ||fence_vixel||Fence agent for Vixel Fibre Channel switch.|
| ||fence_wti||Fence agent for WTI power switch.|
| ||fenced||The fence daemon. Manages the fence domain.|
|DLM||libdlm.so.1.0.0||Library for Distributed Lock Manager (DLM) support.|
| ||dlm.ko||Kernel module that is installed on cluster nodes for Distributed Lock Manager (DLM) support.|
|LOCK_GULM||lock_gulm.o||Kernel module that is installed on GFS nodes using the LOCK_GULM lock module.|
| ||lock_gulmd||Server/daemon that runs on each node and communicates with all nodes in GFS cluster.|
| ||libgulm.so.xxx||Library for GULM lock manager support.|
| ||gulm_tool||Command that configures and debugs the lock_gulmd server.|
|LOCK_NOLOCK||lock_nolock.o||Kernel module installed on a node using GFS as a local file system.|
|GNBD||gnbd.o||Kernel module that implements the GNBD device driver on clients.|
| ||gnbd_serv.o||Kernel module that implements the GNBD server. It allows a node to export local storage over the network.|
| ||gnbd_export||Command to create, export, and manage GNBDs on a GNBD server.|
| ||gnbd_import||Command to import and manage GNBDs on a GNBD client.|
Table 1-1. Red Hat Cluster Manager Software Subsystem Components
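As an illustration of how the clusvcadm command listed in the table might be driven from a script, the following Python sketch builds the argument list for a few common actions without running anything. The clusvcadm_args helper is hypothetical, and the option letters (-e, -d, -r, -R, and -m) are given as commonly documented for clusvcadm; verify them against the clusvcadm man page for your release before relying on them.

```python
from typing import List, Optional

# Option letters as commonly documented for clusvcadm; treat them as an
# assumption and check the man page for your release.
_ACTION_FLAGS = {"enable": "-e", "disable": "-d", "relocate": "-r", "restart": "-R"}


def clusvcadm_args(action: str, service: str, member: Optional[str] = None) -> List[str]:
    """Build (but do not run) a clusvcadm command line for the given action."""
    if action not in _ACTION_FLAGS:
        raise ValueError(f"unsupported action: {action}")
    args = ["clusvcadm", _ACTION_FLAGS[action], service]
    if member is not None:
        if action not in ("enable", "relocate"):
            raise ValueError("a target member applies only to enable and relocate")
        args += ["-m", member]
    return args
```

The resulting list can be passed to subprocess.run on a cluster node, for example clusvcadm_args("relocate", "dbservice", "node2") to move a service named dbservice to node2 before performing maintenance on its current node.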