CentOS 8.x Cloudstack 4.15 Setup primary storage

From Notes_Wiki


There are two types of primary storage possible for a Cloudstack setup:

NFS based
For this, refer to CentOS 8.x Cloudstack 4.15 Setup NFS server and create /mnt/primary similar to /mnt/secondary. This export can then be mounted over NFS on the KVM hosts and used as primary storage (a minimal mount sketch is shown after this list).
FC or iSCSI disk based
It is possible that the Cloudstack setup is required with proper dedicated central storage. In this case we need to set up a clustered shared filesystem on the shared disk. Steps for setting up this clustered shared filesystem are described below.
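
For the NFS option, a minimal sketch of the export and the client-side mount is shown below. The server IP 172.31.1.160 and the network range are assumptions for illustration; adjust them to the actual NFS server created in the referred article:

    #On the NFS server, export /mnt/primary (example /etc/exports entry)
    /mnt/primary 172.31.1.0/24(rw,async,no_root_squash)
    exportfs -ra
    #On each KVM host, mount the export (hypothetical server IP)
    mkdir -p /mnt/primary
    mount -t nfs 172.31.1.160:/mnt/primary /mnt/primary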


The below steps do not work. Please refer to CentOS 8.x Distributed Filesystems for other alternatives for setting up a shared /mnt/primary mount point across all KVM hosts. In case you plan to use moosefs, please note that live migration of VMs does not work when SharedMountPoint is used as primary storage in Cloudstack. With moosefs, a VM can be booted from another host only after shutting it down. With glusterfs, however, the live migration feature works properly.



Create disks required for shared storage

To use shared disks, use the following steps:

  1. At least two disks are required. One is the primary disk for VM storage, typically 1TB or larger based on requirement. The other is a smaller disk of around 1GB used as a stonith disk for fencing.
  2. The required disks are assumed to be mapped via central storage (FC/iSCSI) or via CentOS 8.x iSCSI target server setup. In case of iSCSI, refer to CentOS 7.x iSCSI initiator client setup for client connectivity. Either way, validate that both disks are available using:
    fdisk -l
  3. Create the required volume groups and logical volumes on these disks from only one of the hosts:
    #Note: pvcreate tab-based autocomplete does not play well with /dev/disk/by-path devices, so get the full device names by copy-paste
    #Stonith disk
    pvcreate /dev/disk/by-path/<id-of-smaller-1-GB-stonith-disk>
    vgcreate vgfence /dev/disk/by-path/<id-of-smaller-1-GB-stonith-disk>
    lvcreate -L 900M -n lvfence vgfence
    #Primary storage disk
    pvcreate /dev/disk/by-path/<id-of-larger-1-TB-primary-storage-disk>
    vgcreate vgstorage /dev/disk/by-path/<id-of-larger-1-TB-primary-storage-disk>
    lvcreate -l '99%VG' -n lvstorage vgstorage
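
Before proceeding, the layout can be confirmed on the same host with the standard LVM listing commands (a quick verification sketch; the volume groups and logical volumes created above should be listed):

    pvs
    vgs
    lvs vgfence vgstorage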


Install required packages

Install required packages on all hosts using:

  1. Enable HA repository
    dnf config-manager --set-enabled ha
  2. Install packages using
    dnf -y install pcs fence-agents-all
  3. Create the required host entries in /etc/hosts on all hosts. For example:
    172.31.1.161 saurabh-cloudstack-host1
    172.31.1.162 saurabh-cloudstack-host2
  4. Set same password for hacluster user on all nodes using:
    passwd hacluster
  5. Start and enable pcsd service:
    systemctl start pcsd
    systemctl enable pcsd
  6. Stop and disable firewalld
    systemctl stop firewalld
    systemctl disable firewalld
  7. From now on, all pcs commands should be executed from a single node unless otherwise noted. Status should be checked on all nodes.
  8. From any one of the cluster nodes, set up authentication across all nodes using:
    pcs host auth saurabh-cloudstack-host1 saurabh-cloudstack-host2
    Authenticate with username hacluster and password set during previous steps.
  9. Create and start the cluster.
    pcs cluster setup --start pcscluster1 saurabh-cloudstack-host1 saurabh-cloudstack-host2
  10. Start the cluster on all nodes
    pcs cluster start --all
    (Precautionary: the cluster is already running at this point, which can be verified using the 'pcs status' command)
  11. Enable the cluster to start automatically at boot:
    pcs cluster enable --all
    (IMP: Without this, 'pcs status' will show the services under 'Daemon status' as active/disabled instead of active/enabled)
  12. Check the cluster status (check on all nodes)
    pcs cluster status



Configure corosync

The cluster should have quorum. Verify using:

    corosync-quorumtool -s
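
The same quorum information can also be checked through pcs (an alternative check; for this two-node cluster both tools should report 2 expected votes):

    pcs quorum status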



Configure Stonith fence

Most commands are required on one node only. Status commands should be run on all nodes.

  1. Get the iSCSI device ids (failover & fencing) by running the below command:
    ls -l /dev/disk/by-path/
  2. Add a STONITH device (fencing device). In our case this is the 1GB LUN presented to both nodes over iSCSI:
    pcs stonith create pcs-stonith-device fence_scsi pcmk_host_list="saurabh-cloudstack-host1 saurabh-cloudstack-host2" devices="/dev/disk/by-path/<appr-name>" meta provides=unfencing
    pcs stonith config
  3. Check all currently configured STONITH properties
    pcs property list --all|grep stonith
    pcs property list --defaults
    pcs stonith status
  4. Format the sbd device and enable it using:
    pcs stonith sbd device setup device=/dev/disk/by-path/<appr-name>
    pcs cluster stop --all
    pcs stonith sbd enable
    pcs cluster start --all
    Note: in case of VMs, the VM must have a watchdog device for this to work. In case of VMware ESXi VMs the hardware compatibility version should be at least 17, see https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.hostclient.doc/GUID-67FD0C44-C8DA-451D-8DA2-BDC5974433EB.html. If the hardware version is old, you can shut down the VM, right-click on the VM and choose the Compatibility -> Upgrade compatibility option. Refer: https://kb.vmware.com/s/article/1010675
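    After enabling sbd, the device metadata can be inspected directly (a verification sketch, assuming the sbd package that provides the sbd binary is installed on the nodes):
    sbd -d /dev/disk/by-path/<appr-name> dump
    sbd -d /dev/disk/by-path/<appr-name> list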



Configure pacemaker for gfs2

Most commands are required on one node only. Status commands should be run on all nodes.

  1. On all cluster nodes, install the below packages:
    dnf -y install gfs2-utils lvm2 lvm2-lockd
  2. Enable clustered locking for LVM on *all* nodes. On CentOS 8 lvmlockd replaces the older clvmd-based locking, so the old command is kept below only commented out for reference:
    #lvmconf --enable-cluster
  3. Change default stickiness
    pcs resource defaults update resource-stickiness=200
    pcs resource defaults
  4. The DLM needs to run on all nodes, so we will start by creating a resource for it and cloning it. This creates a dlm_cfg xml file which we will modify with the subsequent commands:
    pcs cluster cib dlm_cfg
    pcs -f dlm_cfg resource create dlm ocf:pacemaker:controld op monitor interval=120s on-fail=fence clone interleave=true ordered=true
  5. Set up the LVM lock manager as a cluster resource. On CentOS 8 the ocf:heartbeat:lvmlockd agent is used instead of ocf:heartbeat:clvm (the resource keeps the name clvmd so the later constraints remain unchanged):
    #pcs -f dlm_cfg resource create clvmd ocf:heartbeat:clvm op monitor interval=120s on-fail=fence clone interleave=true ordered=true
    pcs -f dlm_cfg resource create clvmd ocf:heartbeat:lvmlockd op monitor interval=120s on-fail=fence clone interleave=true ordered=true
  6. Set up the clvmd and dlm dependency and start-up order. Create the ordering and colocation constraints so that clvmd starts after dlm and both resources run on the same node:
    pcs -f dlm_cfg constraint order start dlm-clone then clvmd-clone
    pcs -f dlm_cfg constraint colocation add clvmd-clone with dlm-clone
  7. Set the no-quorum-policy of the cluster to freeze so that when quorum is lost, the remaining partition does nothing until quorum is regained; GFS2 requires quorum to operate:
    pcs -f dlm_cfg property set no-quorum-policy=freeze
  8. Check the configuration; since the changes are still only in the dlm_cfg file, they show up only on the current node:
    pcs -f dlm_cfg constraint
    pcs -f dlm_cfg resource show
  9. Commit the changes from the dlm_cfg xml file to the cluster from the current node. The same changes automatically get applied on the other nodes:
    pcs cluster cib-push dlm_cfg
  10. Check the status of the clone resources on all nodes
    pcs constraint
    pcs resource status
    pcs property list no-quorum-policy
    pcs status resources

Unable to get dlm-clone running with the above steps. 'pcs status' shows errors such as the below for each participating node:

  * dlm_monitor_0 on saurabh-cloudstack-host2 'not installed' (5): call=10, status='complete', exitreason='Setup problem: couldn't find command: dlm_controld', last-rc-change='2021-02-11 11:35:05 +05:30', queued=0ms, exec=36ms
  * clvmd_monitor_0 on saurabh-cloudstack-host2 'not installed' (5): call=15, status='complete', exitreason='Setup problem: couldn't find command: dlm_tool', last-rc-change='2021-02-11 11:35:05 +05:30', queued=0ms, exec=23ms
  * gfs2_storage_res_start_0 on saurabh-cloudstack-host2 'error' (1): call=21, status='complete', exitreason='Couldn't mount device [/dev/vgstorage/lvstorage] as /mnt/primary', last-rc-change='2021-02-11 11:41:10 +05:30', queued=0ms, exec=142ms
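
The 'not installed' errors indicate that the dlm_controld and dlm_tool binaries are missing; they are shipped in the dlm package. A hedged troubleshooting sketch, assuming the package is provided by the resilientstorage repository on CentOS 8 (the repository name is an assumption, verify with 'dnf repolist --all'):

    #Check whether the repository and package are available (assumption: repo id is resilientstorage)
    dnf repolist --all | grep -i resilientstorage
    dnf config-manager --set-enabled resilientstorage
    dnf -y install dlm
    #The ocf:pacemaker:controld resource starts dlm_controld itself once the binary is present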


gfs2 file-system creation

  1. Check the current cluster name that was configured while setting up the cluster:
    pcs property list cluster-name
    grep name /etc/corosync/corosync.conf
  2. Create the GFS2 filesystem with the correct cluster name. The lock table is given as <cluster-name>:<fs-name> and -j 2 creates two journals, one per cluster node:
    mkfs.gfs2 -p lock_dlm -t pcscluster1:storage1 -j 2 /dev/vgstorage/lvstorage
    Enter 'y' at the (y/n) continue prompt warning about loss of all data on the selected device.
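
The lock table written into the superblock can be verified afterwards (a verification sketch using tunegfs2 from the gfs2-utils package installed earlier):

    tunegfs2 -l /dev/vgstorage/lvstorage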



Create Pacemaker Filesystem Resource

  1. Do this on all nodes
    mkdir /mnt/primary
  2. We will not use /etc/fstab to specify the mount; instead we will use a Pacemaker-controlled resource:
    pcs resource create gfs2_storage_res Filesystem device="/dev/vgstorage/lvstorage" directory="/mnt/primary" fstype="gfs2" options="noatime,nodiratime,acl,rw,_netdev" op monitor interval=90s on-fail=fence clone interleave=true
  3. This is configured as a clone resource so it will run on both nodes at the same time. Confirm that the mount has succeeded on both nodes
    pcs resource show
    mount | grep gfs2



Create Pacemaker Resource Ordering

  1. Next, create an ordering constraint so that the filesystem resource is started after the clvmd resource, and a colocation constraint so that both start on the same node:
    pcs constraint order start clvmd-clone then gfs2_storage_res-clone
    pcs constraint colocation add gfs2_storage_res-clone with clvmd-clone
    pcs constraint show
    Thus dlm-clone starts first, then clvmd-clone, and finally gfs2_storage_res-clone.
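
Once all three clone resources are running, the shared mount should be visible on every node before it is added in Cloudstack as SharedMountPoint primary storage (a final check sketch, to be run on each node):

    df -h /mnt/primary
    mount | grep /mnt/primary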


