2.3. Installing and Configuring Red Hat Enterprise Linux

After setting up the basic cluster hardware, install Red Hat Enterprise Linux on each member and ensure that all systems recognize the connected devices. Follow these steps:

  1. Install Red Hat Enterprise Linux on all cluster members. Refer to the Red Hat Enterprise Linux Installation Guide for instructions.

    In addition, when installing Red Hat Enterprise Linux, it is strongly recommended to do the following:

    • Gather the IP addresses for the members and for the bonded Ethernet ports before installing Red Hat Enterprise Linux. Note that the IP addresses for the bonded Ethernet ports can be private IP addresses (for example, 10.x.x.x).

    • Do not place local file systems (such as /, /etc, /tmp, and /var) on shared disks or on the same SCSI bus as shared disks. This helps prevent the other cluster members from accidentally mounting these file systems, and also reserves the limited number of SCSI identification numbers on a bus for cluster disks.

    • Place /tmp and /var on different file systems. This may improve member performance.

    • When a member boots, be sure that the member detects the disk devices in the same order in which they were detected during the Red Hat Enterprise Linux installation. If the devices are not detected in the same order, the member may not boot.

    • When using RAID storage configured with Logical Unit Numbers (LUNs) greater than zero, it is necessary to enable LUN support by adding the following to /etc/modules.conf:

      options scsi_mod max_scsi_luns=255

      After modifying /etc/modules.conf, it is necessary to rebuild the initial RAM disk using mkinitrd. Refer to the Red Hat Enterprise Linux System Administration Guide for more information about creating RAM disks using mkinitrd.

  2. Reboot the members.

  3. When using a terminal server, configure Red Hat Enterprise Linux to send console messages to the console port.

  4. Edit the /etc/hosts file on each cluster member and include the IP addresses used in the cluster or ensure that the addresses are in DNS. Refer to Section 2.3.1 Editing the /etc/hosts File for more information about performing this task.

  5. Decrease the alternate kernel boot timeout limit to reduce boot time for members. Refer to Section 2.3.2 Decreasing the Kernel Boot Timeout Limit for more information about performing this task.

  6. Ensure that no login (or getty) programs are associated with the serial ports that are being used for the remote power switch connection (if applicable). To perform this task, edit the /etc/inittab file and use a hash symbol (#) to comment out the entries that correspond to the serial ports used for the remote power switch. Then, invoke the init q command.

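  7. Verify that all systems detect all the installed hardware:

    • Use the dmesg command to display the console startup messages. Refer to Section 2.3.3 Displaying Console Startup Messages for more information about performing this task.

    • Use the cat /proc/devices command to display the devices configured in the kernel. Refer to Section 2.3.4 Displaying Devices Configured in the Kernel for more information about performing this task.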

  8. Verify that the members can communicate over all the network interfaces by using the ping command to send test packets from one member to another.

  9. If intending to configure Samba services, verify that the required RPM packages for Samba services are installed.
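The /etc/inittab edit in step 6 can be sketched as a small text transformation. The following Python sketch is an illustration only and is not part of Red Hat Enterprise Linux: the function name comment_out_serial_gettys and the device names passed to it are hypothetical. It assumes the usual id:runlevels:action:process entry format and comments out active entries whose process field names one of the given serial devices.

```python
def comment_out_serial_gettys(inittab_text, serial_ports):
    """Return inittab_text with entries that use serial_ports commented out.

    serial_ports is a set of device names such as {"ttyS0"} (hypothetical
    values; use the ports wired to the remote power switch).
    """
    result = []
    for line in inittab_text.splitlines():
        stripped = line.strip()
        # Assumed entry format: id:runlevels:action:process
        fields = stripped.split(":", 3)
        if stripped and not stripped.startswith("#") and len(fields) == 4:
            # Accept both "ttyS0" and "/dev/ttyS0" in the process field.
            words = [w[len("/dev/"):] if w.startswith("/dev/") else w
                     for w in fields[3].split()]
            if any(w in serial_ports for w in words):
                line = "#" + line
        result.append(line)
    return "".join(line + "\n" for line in result)
```

The function only returns the modified text. The result still has to be written back to /etc/inittab and the init q command invoked, as step 6 describes.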

2.3.1. Editing the /etc/hosts File

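The /etc/hosts file contains the IP address-to-hostname translation table. The /etc/hosts file on each member must contain entries for the following:

  • IP addresses and associated hostnames for all cluster members

  • IP addresses and associated hostnames for the point-to-point Ethernet heartbeat connections (these can be private IP addresses)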

As an alternative to the /etc/hosts file, naming services such as DNS or NIS can be used to define the host names used by a cluster. However, to limit the number of dependencies and optimize availability, it is strongly recommended to use the /etc/hosts file to define IP addresses for cluster network interfaces.

The following is an example of an /etc/hosts file on a member:

127.0.0.1         localhost.localdomain   localhost
192.186.1.81      cluster2.example.com      cluster2
10.0.0.1          ecluster2.example.com     ecluster2
192.186.1.82      cluster3.example.com      cluster3
10.0.0.2          ecluster3.example.com     ecluster3

The previous example shows the IP addresses and hostnames for two members (cluster2 and cluster3), and the private IP addresses and hostnames for the Ethernet interfaces (ecluster2 and ecluster3) used for the point-to-point heartbeat connection on each member.

Verify correct formatting of the local host entry in the /etc/hosts file to ensure that it does not include non-local systems in the entry for the local host. An example of an incorrect local host entry that includes a non-local system (server1) is shown next:

127.0.0.1     localhost.localdomain     localhost server1

An Ethernet connection may not operate properly if the format of the /etc/hosts file is not correct. Check the /etc/hosts file and correct the file format by removing non-local systems from the local host entry, if necessary.
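The local host check described above can be automated. The following Python sketch is an illustration under stated assumptions, not a Red Hat tool: the function name nonlocal_names_on_loopback is hypothetical, and the set of names treated as local (localhost and localhost.localdomain) is taken from the example entries in this section.

```python
def nonlocal_names_on_loopback(hosts_text,
                               local_names=("localhost",
                                            "localhost.localdomain")):
    """Return hostnames on 127.0.0.1 lines that are not in local_names."""
    offenders = []
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "127.0.0.1":
            offenders.extend(name for name in fields[1:]
                             if name not in local_names)
    return offenders
```

Passing the contents of /etc/hosts to this function returns an empty list when the local host entry is correctly formatted, and the names to remove (server1 in the incorrect example) otherwise.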

Note that each network adapter must be configured with the appropriate IP address and netmask.

The following example shows a portion of the output from the /sbin/ifconfig command on a cluster member:

eth0      Link encap:Ethernet  HWaddr 00:00:BC:11:76:93  
          inet addr:192.186.1.81  Bcast:192.186.1.245  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:65508254 errors:225 dropped:0 overruns:2 frame:0
          TX packets:40364135 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:19 Base address:0xfce0

eth1      Link encap:Ethernet  HWaddr 00:00:BC:11:76:92  
          inet addr:10.0.0.1  Bcast:10.0.0.245  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:18 Base address:0xfcc0

The previous example shows two network interfaces on a cluster member: eth0 (the network interface for the member) and eth1 (the network interface for the point-to-point Ethernet connection).
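To compare the configured addresses against the /etc/hosts entries, the address and netmask can be extracted from the command output. The following Python sketch is illustrative only: the function name parse_ifconfig is hypothetical, and the parsing assumes the output format shown in this section (an "inet addr:" and "Mask:" pair per interface). Other versions of ifconfig print these fields differently and would need a different pattern.

```python
import re


def parse_ifconfig(output):
    """Map interface name -> {"addr": ..., "mask": ...} from ifconfig text."""
    interfaces = {}
    current = None
    for line in output.splitlines():
        # An interface stanza starts with the interface name in column 0.
        if line and not line[0].isspace():
            current = line.split()[0]
            interfaces[current] = {}
        match = re.search(r"inet addr:(\S+).*Mask:(\S+)", line)
        if match and current is not None:
            interfaces[current] = {"addr": match.group(1),
                                   "mask": match.group(2)}
    return interfaces
```

An interface that appears in the result with an empty dictionary has no IPv4 address configured, which is worth checking before testing the heartbeat connection.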

You may also add the IP addresses for the cluster members to your DNS server. Refer to the Red Hat Enterprise Linux Reference Guide for information on configuring DNS, or consult your network administrator.

2.3.2. Decreasing the Kernel Boot Timeout Limit

It is possible to reduce the boot time for a member by decreasing the kernel boot timeout limit. During the Red Hat Enterprise Linux boot sequence, the boot loader allows for specifying an alternate kernel to boot. The default timeout limit for specifying a kernel is ten seconds.

To modify the kernel boot timeout limit for a member, edit the appropriate files as follows:

When using the GRUB boot loader, modify the timeout parameter in /boot/grub/grub.conf to specify the appropriate number of seconds. To set this interval to 3 seconds, edit the parameter to the following:

timeout = 3

When using the LILO or ELILO boot loaders, edit the /etc/lilo.conf file (on x86 systems) or the /boot/efi/efi/redhat/elilo.conf file (on Itanium systems) and specify the desired value (in tenths of a second) for the timeout parameter. The following example sets the timeout limit to three seconds:

timeout = 30

To apply any changes made to the /etc/lilo.conf file, invoke the /sbin/lilo command.

On an Itanium system, to apply any changes made to the /boot/efi/efi/redhat/elilo.conf file, invoke the /sbin/elilo command.
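Because GRUB counts the timeout in seconds while LILO and ELILO count it in tenths of a second, it is easy to set a value ten times too large or too small. The following Python sketch shows the unit conversion described in this section. It is an illustration only: the names UNITS_PER_SECOND, timeout_line, and configured_timeout_seconds are hypothetical, and the parser only looks for a simple timeout line like the ones shown above.

```python
import re

# Units per second for the timeout parameter, as described in this section:
# GRUB counts seconds; LILO and ELILO count tenths of a second.
UNITS_PER_SECOND = {"grub": 1, "lilo": 10, "elilo": 10}


def timeout_line(loader, seconds):
    """Return the timeout line for the given boot loader and delay."""
    return "timeout = %d" % (seconds * UNITS_PER_SECOND[loader])


def configured_timeout_seconds(loader, conf_text):
    """Return the configured timeout in seconds, or None if it is not set."""
    match = re.search(
        r"^[ \t]*timeout(?:[ \t]*=[ \t]*|[ \t]+)(\d+)[ \t]*$",
        conf_text,
        re.MULTILINE,
    )
    if match is None:
        return None
    return int(match.group(1)) / UNITS_PER_SECOND[loader]
```

For example, timeout_line("lilo", 3) produces the same line as the LILO example above, and timeout_line("grub", 3) produces the GRUB one.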

2.3.3. Displaying Console Startup Messages

Use the dmesg command to display the console startup messages. Refer to the dmesg(8) man page for more information.

The following example of output from the dmesg command shows that two external SCSI buses and nine disks were detected on the member (lines ending in a backslash display as one line on most screens):

May 22 14:02:10 storage3 kernel: scsi0 : Adaptec AHA274x/284x/294x \
	      (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4 
May 22 14:02:10 storage3 kernel:         
May 22 14:02:10 storage3 kernel: scsi1 : Adaptec AHA274x/284x/294x \
              (EISA/VLB/PCI-Fast SCSI) 5.1.28/3.2.4 
May 22 14:02:10 storage3 kernel:         
May 22 14:02:10 storage3 kernel: scsi : 2 hosts. 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST39236LW         Rev: 0004 
May 22 14:02:11 storage3 kernel: Detected scsi disk sda at scsi0, channel 0, id 0, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdb at scsi1, channel 0, id 0, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdc at scsi1, channel 0, id 1, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdd at scsi1, channel 0, id 2, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sde at scsi1, channel 0, id 3, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdf at scsi1, channel 0, id 8, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdg at scsi1, channel 0, id 9, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdh at scsi1, channel 0, id 10, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: SEAGATE   Model: ST318203LC        Rev: 0001 
May 22 14:02:11 storage3 kernel: Detected scsi disk sdi at scsi1, channel 0, id 11, lun 0 
May 22 14:02:11 storage3 kernel:   Vendor: Dell      Model: 8 BAY U2W CU      Rev: 0205 
May 22 14:02:11 storage3 kernel:   Type:   Processor \
                          ANSI SCSI revision: 03 
May 22 14:02:11 storage3 kernel: scsi1 : channel 0 target 15 lun 1 request sense \
	      failed, performing reset. 
May 22 14:02:11 storage3 kernel: SCSI bus is being reset for host 1 channel 0. 
May 22 14:02:11 storage3 kernel: scsi : detected 9 SCSI disks total.
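When comparing several members, it can help to summarize the detected disks rather than read the messages by eye. The following Python sketch is illustrative only: the names DISK_RE, detected_scsi_disks, and disks_per_host are hypothetical, and the pattern assumes the "Detected scsi disk" message format shown in the example above, which differs between kernel versions.

```python
import re
from collections import Counter

# Matches lines such as:
#   Detected scsi disk sdb at scsi1, channel 0, id 0, lun 0
DISK_RE = re.compile(
    r"Detected scsi disk (\w+) at scsi(\d+), channel (\d+), "
    r"id (\d+), lun (\d+)"
)


def detected_scsi_disks(dmesg_text):
    """Return a list of (disk, host, channel, scsi_id, lun) tuples."""
    return [
        (m.group(1),) + tuple(int(g) for g in m.groups()[1:])
        for m in DISK_RE.finditer(dmesg_text)
    ]


def disks_per_host(dmesg_text):
    """Count the detected disks on each SCSI host adapter number."""
    return Counter(host for _, host, _, _, _ in detected_scsi_disks(dmesg_text))
```

Applied to the example output above, this would report one disk on scsi0 and eight on scsi1, which agrees with the total of nine disks in the last message.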

The following example of the dmesg command output shows that a quad Ethernet card was detected on the member:

May 22 14:02:11 storage3 kernel: 3c59x.c:v0.99H 11/17/98 Donald Becker
May 22 14:02:11 storage3 kernel: tulip.c:v0.91g-ppc 7/16/99 becker@cesdis.gsfc.nasa.gov 
May 22 14:02:11 storage3 kernel: eth0: Digital DS21140 Tulip rev 34 at 0x9800, \
	      00:00:BC:11:76:93, IRQ 5. 
May 22 14:02:12 storage3 kernel: eth1: Digital DS21140 Tulip rev 34 at 0x9400, \
	      00:00:BC:11:76:92, IRQ 9. 
May 22 14:02:12 storage3 kernel: eth2: Digital DS21140 Tulip rev 34 at 0x9000, \
	      00:00:BC:11:76:91, IRQ 11. 
May 22 14:02:12 storage3 kernel: eth3: Digital DS21140 Tulip rev 34 at 0x8800, \
	      00:00:BC:11:76:90, IRQ 10.
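The Ethernet messages can be summarized in the same way once the continuation lines are joined. The following Python sketch is illustrative only: the names ETH_RE, join_continuations, and detected_ethernet are hypothetical, and the pattern assumes the driver message format shown in the example above, which varies by driver and kernel version.

```python
import re

# Matches joined lines such as:
#   eth0: Digital DS21140 Tulip rev 34 at 0x9800, 00:00:BC:11:76:93, IRQ 5.
ETH_RE = re.compile(
    r"(eth\d+): (.+?) at (0x[0-9a-fA-F]+), ([0-9A-Fa-f:]{17}), IRQ (\d+)"
)


def join_continuations(text):
    """Join each line that ends in a backslash with the line that follows."""
    return re.sub(r"\\[ \t]*\n[ \t]*", "", text)


def detected_ethernet(dmesg_text):
    """Map interface name -> {"mac": ..., "irq": ...} from dmesg text."""
    joined = join_continuations(dmesg_text)
    return {
        m.group(1): {"mac": m.group(4), "irq": int(m.group(5))}
        for m in ETH_RE.finditer(joined)
    }
```

The hardware addresses returned here can be compared with the HWaddr fields in the /sbin/ifconfig output to confirm which port carries which interface name.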

2.3.4. Displaying Devices Configured in the Kernel

To verify that the installed devices, including serial and network interfaces, are configured in the kernel, use the cat /proc/devices command on each member. Also use this command to determine whether raw device support is installed on the member. For example:

Character devices:
  1 mem
  2 pty
  3 ttyp
  4 ttyS
  5 cua
  7 vcs
 10 misc
 19 ttyC
 20 cub
128 ptm
136 pts
162 raw

Block devices:
  2 fd
  3 ide0
  8 sd
 65 sd

The previous example shows: