Build VxRail 7.0

From Notes_Wiki


Required VLANs

For building a VxRail cluster we need the following VLANs:

VLAN 3939
This VLAN is used by VxRail nodes to discover and communicate with each other. It is often referred to as the internal management VLAN. IGMP snooping and IPv6 should be enabled at least for this VLAN; see Build_VxRail_4.7#Enabling_IPv6_and_IGMP_snooping.
VLAN for vSAN
For vSAN connectivity
VLAN for vMotion
For vMotion connectivity
VLAN for management
For assigning IPs to vCenter, ESXi hosts, VxRail manager, etc. This is often referred to as the external management VLAN.
VLAN for hardware management
Optionally you can decide to have iDRAC IPs in a different VLAN
One or more VLAN for VMs
These are VLANs for VMs. They can be created and specified later, after the build.


Network setup required

To build the cluster we should trunk all the above VLANs to the ESXi hosts / VxRail nodes that will be used to build the cluster. Modern VxRail nodes use only 10G or 25G ports. There is no point in connecting 1G ports, even if present, except for the iDRAC ports.

While trunking, the management VLAN should be forwarded untagged (native). All the remaining VLANs can be tagged. Trunking the hardware management VLAN is optional.
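As an illustration, the node-facing switch-port configuration could look like the following sketch (Cisco NX-OS style syntax; all VLAN IDs other than 3939 are example values, not from this setup):

```text
interface Ethernet1/1
  description VxRail-node-01 10G/25G port
  switchport mode trunk
  ! management VLAN forwarded untagged (native)
  switchport trunk native vlan 100
  ! internal management (3939), vSAN, vMotion and VM VLANs tagged
  switchport trunk allowed vlan 100,3939,201,202,301-310
  mtu 9216
  no shutdown
```

The same trunk configuration should be applied to every node-facing 10G/25G port; the exact syntax differs per switch vendor.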

After the network configuration we should boot (or reboot) the VxRail nodes. The nodes automatically discover each other, and the VxRail manager VM (present on all nodes by default) boots on one of the nodes. The default IP of this VM is 192.168.10.200, available in the management VLAN (the untagged / native VLAN) being forwarded to the ESXi hosts.

After booting the nodes we should configure another machine in the same (management) VLAN with an additional (secondary) IP of 192.168.10.X/24, where X is anything other than 200. We should then be able to ping 192.168.10.200 from this machine and open the VxRail wizard at http://192.168.10.200.
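On a Linux machine in the management VLAN this can be sketched as follows (the interface name eth0 and the host part 50 are example values):

```shell
# Add a secondary IP in 192.168.10.0/24 (any host part except 200)
ip addr add 192.168.10.50/24 dev eth0

# Verify that the VxRail manager VM is reachable
ping -c 3 192.168.10.200
```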


Build Wizard

During the Build following inputs are required:

  • Language :: English, Get started (Click)
  • License agreement :: Accept check box, Next (Click)
  • Cluster type :: VxRail cluster type: Standard cluster (3 or more nodes), Storage type: Standard vSAN
  • Resources :: Select appropriate ESXi hosts (3 to 6) based on service tag or iDRAC IP
  • Network confirmation :: Select both check boxes
  • Configuration method :: Step by step user input
  • Global settings :: Use
    • TLD :: <AD-or-org-domain-name>
    • vCenter :: VxRail provided
    • DNS :: External - IPs: <various DNS IPs separated by commas>
    • NTP :: Local, or if Internet access is available we can use global servers such as pool.ntp.org, time.google.com, etc.
    • Syslog server :: <leave-empty>
  • VDS Settings :: Use
    • VDS configuration :: predefined
    • NIC configuration :: 2x10 GbE
      Use the following to check NICs on each machine:
  esxcli network nic list
    • Unfortunately the automation assumes the NICs to be vmnic0, vmnic1, vmnic2, vmnic3. If there are 1G ports in between, such that vmnic0-1 and vmnic4-5 are 10G while vmnic2-3 are 1G, then we can't build with 4x10G. We should build with 2x10G and add the additional uplinks later.
    • All these NICs should carry all VLANs in the trunk except the management network, which should be untagged/native. Ideally enable jumbo frames (MTU 9216) on all switches.
  • vCenter Settings :: Use
    • Automatically accept certificate :: Yes
    • vCenter hostname :: Appropriate hostname, e.g. vcenter. This should resolve via AD to the IP given next before the build is started.
    • IP :: <vcenter-IP>
    • Join an existing SSO domain :: No (Choose Yes for secondary / DR servers as appropriate)
    • Same password for all accounts :: Yes
    • vCenter Server Management Username :: admin
    • Passwords :: ABC@1234@ABCD@abcde12
      Be careful about https://www.dell.com/support/kbdoc/en-us/000158231/vxrail-account-and-password-rules-in-vxrail
      Especially note the special characters to avoid; ideally use a 16-character length to avoid rework.
      You can use the password template specified above and replace the letters and numbers with other letters and numbers.
  • Host settings :: Enter ESXi hostname, management username, management and root password, rack name, rack location, IP address etc. for all ESXi hosts
    All these hosts must have an identical hard-disk configuration (SSD, cache, capacity) for the build to work. The hard disks should also be in the correct slots: if the first server has its cache disk in slot-1, then the second server should have a disk of identical capacity, make and model in slot-1. Even with the same number and type of disks, differing slots cause the build (validation) to fail.
    • Example rack names :: dcrack01, dcrack02 and drrack01
    • Example Position :: Use 1 for the lowest server, 2 for the server above it, and so on
    • ESXi management username :: admin
    • ESXi root password :: ABC@1234@ABCD@abcde34
    • ESXi admin password :: ABC@1234@ABCD@abcde56
  • VxRail manager settings :: Use
    • Hostname :: vxrail (or another appropriate hostname). This should resolve via DNS to the IP specified next before starting the build.
    • IP :: <VxRail-manager-IP>
    • Root password :: ABC@1234@ABCD@abcde78
    • mystic password :: ABC@1234@ABCD@abcde90
      This cannot be the same as the manager root password.
  • Virtual network settings :: Use
    • Management subnet mask :: 255.255.255.0 (or other appropriate netmask)
    • GW :: Gateway for management VLAN.
    • VLAN ID :: 0 (We are sending management VLAN untagged / native)
    • port binding :: Ephemeral Binding
    • vSAN :: Use autofill with
    • vSAN Starting IP :: Starting IP to use in vSAN VLAN
    • vSAN ending IP :: Automatically taken based on no. of hosts and starting IP
    • vSAN subnet mask :: 255.255.255.0 (or other appropriate Mask)
    • vSAN VLAN :: vSAN VLAN ID. This should be trunked to all ESXi hosts' 10G or 25G ports.
    • vMotion :: Use autofill with
    • vMotion Starting IP :: Starting IP to use in vMotion VLAN
    • vMotion ending IP :: Automatically taken based on no. of hosts and starting IP
    • vMotion subnet mask :: 255.255.255.0 (or other appropriate Mask)
    • vMotion VLAN :: vMotion VLAN ID. This should be trunked to all ESXi hosts' 10G or 25G ports.
    • Guest networks :: We can create later
    • vCenter port binding :: Ephemeral Binding
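Before running validation it can help to verify NIC numbering and disk uniformity from the ESXi shell of every node. `esxcli network nic list` is the command mentioned above; the disk listing command is an additional suggestion:

```shell
# NICs with their link speed; the automation expects the 10G/25G ports
# to be vmnic0, vmnic1 (and vmnic2, vmnic3 for 4-port profiles)
esxcli network nic list

# Disk inventory; compare make, model, capacity and slot across all nodes,
# since any mismatch fails validation
esxcli storage core device list | grep -E 'Display Name|Size'
```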

After this, validate and build the cluster. Once the cluster is built we can log in at https://<vcenter-fqdn>. Note that there is no separate VxRail UI; VxRail related options are visible in vCenter at the cluster level under Configure -> VxRail.


Adding additional nodes to cluster

To add additional nodes use:

  1. Login into vCenter
  2. Go to cluster -> Configure -> VxRail -> hosts
  3. Click Add
  4. The new node should appear automatically. Select the node and proceed.
  5. Enter vCenter authentication details and proceed
  6. Select a NIC configuration similar to the other existing hosts
  7. Enter hostname, IP address, ESXi management username, and management and root passwords
  8. Enter rack name and position (host location)
  9. Enter vSAN and vMotion IP addresses in the same subnets as the other hosts
  10. Validate configuration
  11. Add node


Troubleshooting VxRail build or add node issues

Password complexity

This is absolutely critical, as getting it wrong leads to considerable time wasted in factory-resetting all the nodes again, and it is not obvious or easy to troubleshoot.

If the password complexity is not correct then VxRail accepts the passwords during the wizard and then fails during the build process with errors such as:

An internal error occurred.  Failed to add exception accounts for hosts

Failed to create vCenter management account vcentermgmt.  Please pick a password that is in compliance with vCenter password policy and try again.

For the proper password complexity rules refer to https://www.dell.com/support/kbdoc/en-us/000158231/vxrail-account-and-password-rules-in-vxrail

To save time, use the template mentioned in the above article after replacing ABCD1234 etc. with other characters or numbers. Do not introduce any new special characters. Do not reduce the length by too much.
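As a rough local sanity check (an illustration only; the authoritative rules are in the Dell KB article above), a candidate password can be tested for length and basic character classes before it is typed into the wizard:

```shell
# Candidate password (example value in the style of this article's template)
pw='ABC@1234@ABCD@abcde12'

ok=yes
[ "${#pw}" -ge 16 ]                 || ok=no   # at least 16 characters
printf '%s' "$pw" | grep -q '[A-Z]' || ok=no   # has an uppercase letter
printf '%s' "$pw" | grep -q '[a-z]' || ok=no   # has a lowercase letter
printf '%s' "$pw" | grep -q '[0-9]' || ok=no   # has a digit
echo "password check: $ok"
```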


Default ESXi root credentials for new VxRail nodes

Default ESXi account:
root:Passw0rd!


Nodes not getting detected

Node detection depends upon VLAN 3939 and the untagged management VLAN being forwarded properly to all ESXi hosts, and also on IPv6. To diagnose further:

On the VxRail manager the below command should list the appliance ID of all nodes discovered so far:

/usr/lib/vmware-loudmouth/bin/loudmouthc query | grep -o applianceID=[A-Z0-9]*

Log in to the VxRail manager using the mystic user; if deployment is not yet done we can log in with root:Passw0rd! using the default IP 192.168.10.200.

On ESXi host:

  • Check the vmkernel port (e.g. vmk2) for the private management network port group using:
    esxcli network ip interface list | less
  • Check the IPv6 address of the private management network vmkernel port:
    esxcli network ip interface ipv6 address list
  • Ping from the ESXi host to the VxRail manager:
    ping6 -I <private-management-vmk> <VxRail-manager-VLAN-3939-IPv6-IP>

You can also ping from the VxRail manager to the ESXi host's IPv6 IP on the private management network port group (VLAN 3939).
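Putting the above together, on an ESXi host that is not being discovered the checks can be sketched as follows (vmk2 is an example name; use whichever vmkernel port sits on the private management / VLAN 3939 port group):

```shell
# Show the IPv6 (link-local) address of the private-management vmkernel port
esxcli network ip interface ipv6 address list | grep vmk2

# Ping the VxRail manager's VLAN 3939 IPv6 address; with link-local
# addresses the outgoing interface must be given with -I
ping6 -I vmk2 <VxRail-manager-VLAN-3939-IPv6-IP>
```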


  • If ping works network-wise, then use the following on the ESXi host that is not getting discovered in the manager:
/etc/init.d/loudmouth restart
/etc/init.d/loudmouth status 

/etc/init.d/vxrail-pservice restart
/etc/init.d/vxrail-pservice status
  • On the network switches we can check whether MLD snooping (the IPv6 counterpart of IGMP snooping) is enabled on VLAN 3939 using:
show ipv6 mld snooping interface vlan 3939



Disk related and disk position related errors

If you get error such as:

   Cache slots 21 has no disk but with capacity disks followed on host GYZTSK3.

removing and reinserting the disk in slot 21 might solve the problem. Note that, as specified earlier, for the build to work all nodes must have disks of identical make, model and capacity. They should also be in the same order / same server slots. If any server has additional disks, the build won't work until the extra disks are removed.

If disks are removed and changed, there might be an iDRAC-level warning / amber light on the server. To solve that, in iDRAC go to Maintenance -> Diagnostics and choose "Restart iDRAC" or "Reset iDRAC" as appropriate.



