Veritas Cluster Server: Cheatsheet

LLT and GAB

VCS uses two components, LLT and GAB, to share data over the private networks among systems.
These components provide the performance and reliability required by VCS.

LLT : LLT (Low Latency Transport) provides fast, kernel-to-kernel communications and monitors network connections. The system administrator configures LLT by creating a configuration file (llttab) that describes the systems in the cluster and the private network links among them. LLT runs at layer 2 of the network stack.
GAB : GAB (Group Membership and Atomic Broadcast) provides the global message order required to maintain a synchronised state among the systems, and monitors disk communications such as those required by the VCS heartbeat utility. The system administrator configures the GAB driver by creating a configuration file (gabtab).

LLT and GAB files

/etc/llthosts : A database file containing one entry per system, linking the LLT system ID with the host name. The file is identical on every server in the cluster.
/etc/llttab : Contains information derived during installation; used by the lltconfig utility.
/etc/gabtab : Contains the information needed to configure the GAB driver; used by the gabconfig utility.
/etc/VRTSvcs/conf/config/main.cf : The VCS configuration file, containing the information that defines the cluster and its systems.
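As an illustration, minimal versions of the two LLT files on a two-node cluster might look like the following (host names, cluster ID and interface names are assumptions for the example, not values from this document):

```text
# /etc/llthosts -- maps each LLT node ID to a host name (identical on all nodes)
0 node01
1 node02

# /etc/llttab -- LLT configuration on node01 (qfe0/qfe1 are illustrative interfaces)
set-node node01
set-cluster 100
link qfe0 /dev/qfe:0 - ether - -
link qfe1 /dev/qfe:1 - ether - -
```

Each node carries its own llttab (with its own set-node line), while llthosts must be identical across the cluster.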

Gabtab Entries

/sbin/gabdiskconf -i /dev/dsk/c1t2d0s2 -s 16 -S 1123
/sbin/gabdiskconf -i /dev/dsk/c1t2d0s2 -s 144 -S 1124
/sbin/gabdiskhb -a /dev/dsk/c1t2d0s2 -s 16 -p a -S 1123
/sbin/gabdiskhb -a /dev/dsk/c1t2d0s2 -s 144 -p h -S 1124
/sbin/gabconfig -c -n2
gabdiskconf
  -i   Initialise the disk region
  -s   Start block
  -S   Signature

gabdiskhb (heartbeat disks)
  -a   Add a GAB disk heartbeat resource
  -s   Start block
  -p   Port
  -S   Signature

gabconfig
  -c   Configure the driver for use
  -n   Number of systems in the cluster

LLT and GAB Commands

Verify that links are active for LLT : lltstat -n
Verbose output of the lltstat command : lltstat -nvv | more
Open ports for LLT : lltstat -p
Display the values of LLT configuration directives : lltstat -c
List information about each configured LLT link : lltstat -l
List all MAC addresses in the cluster : lltconfig -a list
Stop LLT : lltconfig -U
Start LLT : lltconfig -c
Verify that GAB is operating : gabconfig -a (Note: port a indicates that GAB is communicating; port h indicates that VCS is started)
Stop GAB : gabconfig -U
Start GAB : gabconfig -c -n <number of nodes>
Override the seed values in the gabtab file : gabconfig -c -x

GAB Port Membership

List membership : gabconfig -a
Unregister port f : /opt/VRTS/bin/fsclustadm cfsdeinit
Port functions:
  a   GAB driver
  b   I/O fencing (designed to guarantee data integrity)
  d   ODM (Oracle Disk Manager)
  f   CFS (Cluster File System)
  h   VCS (VERITAS Cluster Server: high availability daemon)
  o   VCSMM driver (kernel module needed for Oracle and VCS interface)
  q   QuickLog daemon
  v   CVM (Cluster Volume Manager)
  w   vxconfigd (module for CVM)
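For reference, gabconfig -a output on a healthy two-node cluster with VCS running typically resembles the following (the generation numbers are illustrative):

```text
GAB Port Memberships
===============================================================
Port a gen a36e0003 membership 01
Port h gen fd570002 membership 01
```

Membership "01" shows that nodes 0 and 1 are both members of the port; a missing port h line would mean VCS itself is not running.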

Cluster daemons

High Availability Daemon : had
Companion Daemon : hashadow
Resource Agent daemon : <resource>Agent
Web Console cluster management daemon : CmdServer

Cluster Log Files

Log directory : /var/VRTSvcs/log
Primary log file (engine log file) : /var/VRTSvcs/log/engine_A.log

Starting and Stopping the cluster

hastart [-stale|-force]
  "-stale" instructs the engine to treat the local configuration as stale
  "-force" instructs the engine to treat a stale configuration as a valid one
Bring the cluster into running mode from a stale state using the configuration file from a particular server : hasys -force <server_name>
Stop the cluster on the local server but leave the application(s) running; do not fail over the application(s) : hastop -local
Stop the cluster on the local server but evacuate (fail over) the application(s) to another node within the cluster : hastop -local -evacuate
Stop the cluster on all nodes but leave the application(s) running : hastop -all -force

Cluster Status

Display cluster summary : hastatus -summary
Continually monitor the cluster : hastatus
Verify the cluster is operating : hasys -display

Cluster Details

Information about a cluster : haclus -display
Value of a specific cluster attribute : haclus -value <attribute>
Modify a cluster attribute : haclus -modify <attribute name> <new value>
Enable LinkMonitoring : haclus -enable LinkMonitoring
Disable LinkMonitoring : haclus -disable LinkMonitoring

Users

Add a user : hauser -add <username>
Modify a user : hauser -update <username>
Delete a user : hauser -delete <username>
Display all users : hauser -display

System Operations

Add a system to the cluster : hasys -add <sys>
Delete a system from the cluster : hasys -delete <sys>
Modify a system attribute : hasys -modify <sys> <modify options>
List a system's state : hasys -state
Force a system to start : hasys -force
Display a system's attributes : hasys -display [-sys]
List all the systems in the cluster : hasys -list
Change the load attribute of a system : hasys -load <system> <value>
Display the value of a system's node ID (/etc/llthosts) : hasys -nodeid
Freeze a system (no offlining of the system, no onlining of groups) : hasys -freeze [-persistent][-evacuate] (Note: main.cf must be in write mode)
Unfreeze a system (re-enable groups and resources to come back online) : hasys -unfreeze [-persistent] (Note: main.cf must be in write mode)

Dynamic Configuration

The VCS configuration must be in read/write mode in order to make changes. While the configuration is in read/write mode it is considered stale, and a .stale file is created in $VCS_CONF/conf/config. When the configuration is put back into read-only mode the .stale file is removed.

Change configuration to read/write mode : haconf -makerw
Change configuration to read-only mode : haconf -dump -makero
Check what mode the cluster is running in : haclus -display | grep -i 'readonly' (0 = write mode, 1 = read-only mode)
Check the configuration file : hacf -verify /etc/VRTSvcs/conf/config (Note: you can point to any directory as long as it contains main.cf and types.cf)
Convert a main.cf file into cluster commands : hacf -cftocmd /etc/VRTSvcs/conf/config -dest /tmp
Convert a command file into a main.cf file : hacf -cmdtocf /tmp -dest /etc/VRTSvcs/conf/config

Service Groups

Add a service group:
  haconf -makerw
  hagrp -add groupw
  hagrp -modify groupw SystemList sun1 1 sun2 2
  hagrp -autoenable groupw -sys sun1
  haconf -dump -makero
Delete a service group:
  haconf -makerw
  hagrp -delete groupw
  haconf -dump -makero
Change a service group:
  haconf -makerw
  hagrp -modify groupw SystemList sun1 1 sun2 2 sun3 3
  haconf -dump -makero
(Note: use "hagrp -display <group>" to list attributes)
List the service groups : hagrp -list
List a group's dependencies : hagrp -dep <group>
List the parameters of a group : hagrp -display <group>
Display a service group's resources : hagrp -resources <group>
Display the current state of the service group : hagrp -state <group>
Clear a faulted non-persistent resource in a specific group : hagrp -clear <group> [-sys] <host> <sys>
Change the system list in a cluster:
  # remove the host
  hagrp -modify grp_zlnrssd SystemList -delete <hostname>
  # add the new host (don't forget to state its position)
  hagrp -modify grp_zlnrssd SystemList -add <hostname> 1
  # update the autostart list
  hagrp -modify grp_zlnrssd AutoStartList <host> <host>
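The hagrp commands above correspond to a service group definition in main.cf. A minimal sketch, reusing the groupw/sun1/sun2 names from the examples in this section:

```text
group groupw (
    SystemList = { sun1 = 1, sun2 = 2 }
    AutoStartList = { sun1 }
    )
```

The numbers in SystemList are failover priorities; AutoStartList controls where the group is brought online when the cluster starts.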

Service Group Operations

Start a service group and bring its resources online : hagrp -online <group> -sys <sys>
Stop a service group and take its resources offline : hagrp -offline <group> -sys <sys>
Switch a service group from one system to another : hagrp -switch <group> -to <sys>
Enable all the resources in a group : hagrp -enableresources <group>
Disable all the resources in a group : hagrp -disableresources <group>
Freeze a service group (disable onlining and offlining) : hagrp -freeze <group> [-persistent] (Note: check with "hagrp -display <group> | grep TFrozen")
Unfreeze a service group (enable onlining and offlining) : hagrp -unfreeze <group> [-persistent] (Note: check with "hagrp -display <group> | grep TFrozen")
Enable a service group (only enabled groups can be brought online):
  haconf -makerw
  hagrp -enable <group> [-sys]
  haconf -dump -makero
(Note: check with "hagrp -display | grep Enabled")
Disable a service group (stop it from being brought online):
  haconf -makerw
  hagrp -disable <group> [-sys]
  haconf -dump -makero
(Note: check with "hagrp -display | grep Enabled")
Flush a service group and enable corrective action : hagrp -flush <group> -sys <system>

Resources

Add a resource:
  haconf -makerw
  hares -add appDG DiskGroup groupw
  hares -modify appDG Enabled 1
  hares -modify appDG DiskGroup appdg
  hares -modify appDG StartVolumes 0
  haconf -dump -makero
Delete a resource:
  haconf -makerw
  hares -delete <resource>
  haconf -dump -makero
Change a resource:
  haconf -makerw
  hares -modify appDG Enabled 1
  haconf -dump -makero
(Note: list parameters with "hares -display <resource>")
Make a resource attribute global (the same value cluster-wide) : hares -global <resource> <attribute> <value>
Make a resource attribute local (a per-system value) : hares -local <resource> <attribute> <value>
List the parameters of a resource : hares -display <resource>
List the resources : hares -list
List the resource dependencies : hares -dep
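The appDG example above corresponds to a resource definition nested inside a service group in main.cf. A minimal sketch, reusing the names from the "add a resource" example (attribute values are illustrative):

```text
group groupw (
    SystemList = { sun1 = 1, sun2 = 2 }
    )

    DiskGroup appDG (
        DiskGroup = appdg
        StartVolumes = 0
        )
```

Note that hares -add takes the resource name, its type, and the group it belongs to; the resource then appears under that group in main.cf.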

Resource Operations

Online a resource : hares -online <resource> [-sys]
Offline a resource : hares -offline <resource> [-sys]
Display the state of a resource (offline, online, etc.) : hares -state
Display the parameters of a resource : hares -display <resource>
Offline a resource and propagate the command to its children : hares -offprop <resource> -sys <sys>
Cause a resource agent to immediately monitor the resource : hares -probe <resource> -sys <sys>
Clear a faulted resource (automatically initiates onlining) : hares -clear <resource> [-sys]

Resource Types

Add a resource type : hatype -add <type>
Remove a resource type : hatype -delete <type>
List all resource types : hatype -list
Display a resource type : hatype -display <type>
List resources of a particular type : hatype -resources <type>
Display the value of a particular resource type attribute : hatype -value <type> <attr>

Resource Agents

Add an agent : pkgadd -d . <agent package>
Remove an agent : pkgrm <agent package>
Change an agent : n/a
List all HA agents : haagent -list
Display an agent's run-time information (i.e. has it started, is it running?) : haagent -display <agent_name>
Display an agent's faults : haagent -display | grep Faults

Resource Agent Operations

Start an agent : haagent -start <agent_name> [-sys]
Stop an agent : haagent -stop <agent_name> [-sys]
