Powering off and powering on a VSAN cluster
From Notes_Wiki
Powering off and powering on a vSAN 7.0 cluster
To power off and power on a vSAN 7.0 cluster, use the following steps:
- Go to the Hosts and Clusters view and select the cluster. Open the Monitor tab and check vSAN health. vSAN should be healthy before the cluster is shut down.
- Power off all virtual machines on the cluster except vCenter.
- Enable retreat mode to power off the vCLS (vSphere Cluster Services) VMs using the following steps:
- Navigate to the cluster on which vCLS has to be disabled. Copy the cluster domain id from the URL of the browser. It should be similar to 'domain-c<number>'.
- For example, when you navigate to the cluster in the vSphere Client, the URL will be similar to this: https://<fqdn-of-vCenter-server>/ui/app/cluster;nav=h/urn:vmomi:ClusterComputeResource:domain-c8:eef257af-fa50-455a-af7a-6899324fabe6/summary. You only need to copy domain-c8 to use in the steps below.
- Navigate to the vCenter Server and then to the Configure tab.
- Click the Advanced Settings section and then the Edit Settings button.
- Modify entry with name = config.vcls.clusters.domain-c<number>.enabled and value = false.
- If the entry does not exist, add it using the 'Add' option in the same popup window.
- Note: True and False are case insensitive, so any case of these two values should be accepted.
- Click Save.
- The vCLS monitoring service then starts cleaning up the vCLS VMs, and VM deletion tasks appear in the task list.
- If this cluster has DRS enabled, DRS will not be functional and an additional warning will be displayed in the cluster summary. DRS stays disabled until vCLS is re-enabled on this cluster.
- Note the host on which vCenter is running and open that ESXi host's web UI.
- Disable HA on the cluster
- Verify that all resynchronization tasks are complete. Click the Monitor tab and select vSAN > Resyncing Objects.
- Shut down the vCenter VM.
- On every ESXi host, run:
- esxcfg-advcfg -s 1 /VSAN/IgnoreClusterMemberListUpdates
- Log in to any host in the cluster other than the witness host.
- Run the command below on that one host only:
- python /usr/lib/vmware/vsan/bin/reboot_helper.py prepare
- The above command fails if the hosts do not have matching time. To configure matching time on the hosts, refer to Configure NTP from cli or UI.
- Run the command below on all ESXi hosts:
- esxcli system maintenanceMode set -e true -m noAction
- After all hosts are in maintenance mode, power off the hosts one by one.
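The cluster domain ID lookup in the retreat mode steps above can be scripted. The following is a minimal sketch, assuming a POSIX shell with sed. The function names cluster_domain_id and vcls_setting_name are hypothetical helpers for illustration and are not part of any VMware tool. It extracts 'domain-c<number>' from a vSphere Client cluster URL and prints the advanced setting name to enter in vCenter.

```shell
#!/bin/sh
# Hypothetical helpers (not part of any VMware tool).

# Print the 'domain-c<number>' part of a vSphere Client cluster URL.
# Prints nothing and returns non-zero if the URL has no such part.
cluster_domain_id() {
    id=$(printf '%s\n' "$1" |
        sed -n 's/.*ClusterComputeResource:\(domain-c[0-9][0-9]*\).*/\1/p')
    [ -n "$id" ] || return 1
    printf '%s\n' "$id"
}

# Print the vCenter advanced setting name that controls vCLS on a cluster.
# $1 is a domain ID such as 'domain-c8'.
vcls_setting_name() {
    printf 'config.vcls.clusters.%s.enabled\n' "$1"
}
```

For example, passing the URL shown in the steps above to cluster_domain_id gives domain-c8, and vcls_setting_name domain-c8 gives config.vcls.clusters.domain-c8.enabled. The value itself (false or true) still has to be set in the vCenter Advanced Settings dialog.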
To power the cluster back on, use the following steps:
- Power on all ESXi hosts one by one
- After all hosts have booted without issue, bring all hosts one by one out of maintenance mode using:
- esxcli system maintenanceMode set -e false
- On only one of the hosts run the following command:
- python /usr/lib/vmware/vsan/bin/reboot_helper.py recover
- Verify that all hosts are in cluster using:
- esxcli vsan cluster get
- Run this command on all hosts:
- esxcfg-advcfg -s 0 /VSAN/IgnoreClusterMemberListUpdates
- Start the vCenter VM on the same host from which it was powered off.
- Disable retreat mode using the same steps as during power-off, except this time set the value of config.vcls.clusters.domain-c<number>.enabled to true in Advanced Settings.
- Again verify that all hosts are participating in cluster using:
- esxcli vsan cluster get
- Start the remaining VMs using vCenter
- Check vSAN health in vCenter and resolve any issues.
- Enable HA again on the cluster.
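The membership check with esxcli vsan cluster get can be automated rather than read by eye. The following is a minimal sketch, assuming the command output contains a line of the form 'Sub-Cluster Member Count: <n>'. The function name vsan_member_count_ok is a hypothetical helper, and the expected host count is something you supply yourself.

```shell
#!/bin/sh
# Hypothetical helper (not part of any VMware tool).
# Reads 'esxcli vsan cluster get' output on stdin and succeeds only if the
# reported member count equals the expected number of members ($1).
# Assumes the output has a line like 'Sub-Cluster Member Count: 4'.
vsan_member_count_ok() {
    expected=$1
    actual=$(sed -n 's/^[[:space:]]*Sub-Cluster Member Count:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
    if [ "$actual" = "$expected" ]; then
        echo "OK: $actual of $expected expected members are in the cluster"
    else
        echo "WARNING: expected $expected members, found '${actual:-none}'" >&2
        return 1
    fi
}

# Usage on an ESXi host (4 is only an example count):
# esxcli vsan cluster get | vsan_member_count_ok 4
```

Reading from stdin keeps the check independent of esxcli itself, so it can also be run against saved output from each host.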
Refer:
- https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vsan-monitoring.doc/GUID-31B4F958-30A9-4BEC-819E-32A18A685688.html
- https://kb.vmware.com/s/article/80472
Powering off and powering on a vSAN 6.7 or earlier cluster
To power off and power on a vSAN 6.7 or earlier cluster, use the following steps:
- Consider increasing the default repair delay as described at https://kb.vmware.com/s/article/2075456. This matters if the cluster will not be powered back on within the default repair delay of 60 minutes.
- Check vSAN health using Retest, and repair any unhealthy components before powering off the cluster (https://kb.vmware.com/s/article/2144650).
- Power off all VMs other than the vCenter and PSC VMs.
- Note the host(s) running the vCenter and PSC VMs, and log in to the corresponding ESXi host(s) using the web console.
- Also SSH to the ESXi host(s) running vCenter and PSC.
- Put all other hosts in maintenance mode with "no action"
- Shut down the vCenter and PSC VMs using the ESXi host web console.
- Put the vCenter and PSC host(s) in maintenance mode using:
- esxcli system maintenanceMode set -e true -m noAction
- Power off all ESXi hosts.
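Before powering off a host, it can help to confirm that it really entered maintenance mode. The following is a minimal sketch to run over SSH on an ESXi host, assuming that esxcli system maintenanceMode get prints either Enabled or Disabled. The function name enter_maintenance_no_action is a hypothetical helper, not a VMware command.

```shell
#!/bin/sh
# Hypothetical helper (not part of any VMware tool); run on an ESXi host.
# Enters maintenance mode with no data migration (unless the host is already
# in maintenance mode), then confirms the resulting state.
# Assumes 'esxcli system maintenanceMode get' prints 'Enabled' or 'Disabled'.
enter_maintenance_no_action() {
    if [ "$(esxcli system maintenanceMode get)" != "Enabled" ]; then
        esxcli system maintenanceMode set -e true -m noAction || return 1
    fi
    state=$(esxcli system maintenanceMode get)
    if [ "$state" = "Enabled" ]; then
        echo "Host is in maintenance mode"
    else
        echo "Host is NOT in maintenance mode (state: $state)" >&2
        return 1
    fi
}

# Usage on an ESXi host:
# enter_maintenance_no_action && echo "continue with power off"
```

The state is checked first because the set command may return an error when the host is already in maintenance mode.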
For power on:
- Power on all ESXi hosts
- SSH to the ESXi host(s) running vCenter and PSC
- Exit maintenance mode using:
- esxcli system maintenanceMode set -e false
- Start the vCenter VM (and the PSC VM, if it is external)
- Check vSAN cluster health
- Exit other hosts from maintenance mode
- Start necessary VMs.