CentOS 7.x ctdb cluster with LDAP integration over gfs2 filesystem


Creating disks for the cluster setup along with LVM volumes

Overall, at least two nodes are required for ctdb, samba, etc. (ctdb1, ctdb2). If there is no central storage then a third node is required for the iSCSI target; in this article that node (centos7) also hosts the 389-ds LDAP server.

  1. If there is no central storage, follow CentOS 7.x iSCSI target server setup and create one or more iSCSI disks for gfs2-fence (1GB), ctdb-lock (2GB) and gfs2-storage (5GB+) on the target machine.
    If you create a single disk for all three, LVM logical volumes can be used to get separate partitions for these three purposes.
  2. Make these devices available on ctdb1 and ctdb2 using iscsiadm or another appropriate mechanism.
  3. Create LVM physical volumes on all three disks. Create one or more volume groups from these physical volumes. Create three logical volumes from the created volume groups: one logical volume for gfs2-fence (1GB), one for ctdb-lock (2GB) and the remaining space (5GB+) for gfs2-storage.
    Example commands on ctdb1 when there are three separate iSCSI disks are as follows:
    #For gfs2-fence
    #pvcreate tab-based autocomplete does not play well with /dev/disk/by-path devices,
    #so get full device names by copy-paste
    pvcreate /dev/disk/by-path/<appropriate-device-based-on-iqn>
    vgcreate vgfence /dev/disk/by-path/<appropriate-device-based-on-iqn>
    lvcreate -L 900M -n lvfence vgfence
    #For ctdb-lock
    pvcreate /dev/disk/by-path/<appropriate-device-based-on-iqn>
    vgcreate vgctdb /dev/disk/by-path/<appropriate-device-based-on-iqn>
    lvcreate -L 1.9G -n lvctdb vgctdb
    #For gfs2-storage
    pvcreate /dev/disk/by-path/<appropriate-device-based-on-iqn>
    vgcreate vgstorage /dev/disk/by-path/<appropriate-device-based-on-iqn>
    lvcreate -L 4.9G -n lvstorage vgstorage
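
    As an optional sanity check (not part of the original steps), the created LVM objects can be listed with the standard LVM reporting commands; the expected names below assume the vgfence/vgctdb/vgstorage layout from above:
    #Run on the node where the volumes were created
    pvs
    vgs
    lvs
    #Expected logical volumes: /dev/vgfence/lvfence, /dev/vgctdb/lvctdb, /dev/vgstorage/lvstorage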


Pacemaker cluster and gfs2 setup

Configure cluster

  1. Cluster nodes should be resolvable by FQDN through DNS, LDAP, etc., or add node-name-to-IP-address mappings in /etc/hosts on all nodes (ctdb1, ctdb2) as:
    192.168.122.202 ctdb1
    192.168.122.52 ctdb2
  2. yum -y install pcs fence-agents-all
  3. Set same password for hacluster user on all nodes using:
    passwd hacluster
  4. Start and enable pcsd service:
    systemctl start pcsd
    systemctl enable pcsd
  5. From now on, all pcs commands should be executed from a single node unless otherwise noted. Status should be checked on all nodes.
  6. From any one of the cluster nodes, set up authentication across all nodes using:
    pcs cluster auth ctdb1 ctdb2
    Authenticate with username hacluster and password set during previous steps.
  7. Create and start the cluster.
    pcs cluster setup --start --name pcscluster1 ctdb1 ctdb2
  8. Start the cluster on all nodes
    pcs cluster start --all
    (Precautionary: the cluster should already be running at this point, which can be verified with the 'pcs status' command)
  9. Enable the cluster to start automatically at boot:
    pcs cluster enable --all
    (IMP: without this, 'pcs status' will show the daemons under 'Daemon Status' as active/disabled)
  10. Check the cluster status (check on all nodes)
    pcs cluster status


Configure corosync

The cluster should have quorum; verify with:

    corosync-quorumtool -s
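
As an additional optional check, corosync ring status and membership can also be inspected with:

    #Show local corosync ring status
    corosync-cfgtool -s
    #Show corosync membership as seen by pcs
    pcs status corosync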


Configure Stonith fence

Most commands are required on one node only. Status commands should be run on all nodes

  1. Get the iSCSI device IDs (for failover & fencing) by running the below command:
    ls -l /dev/disk/by-path/
  2. Add a STONITH (fencing) device. In our case this is the 1GB LUN presented to both nodes over iSCSI:
    pcs stonith create pcs-stonith-device fence_scsi pcmk_host_list="ctdb1 ctdb2" devices="/dev/disk/by-path/<appr-name>" meta provides=unfencing
    pcs stonith show pcs-stonith-device
  3. Check all currently configured STONITH properties
    pcs property list --all|grep stonith
    pcs property list --defaults
    pcs stonith show
    pcs stonith show --full
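
    Optionally, since fence_scsi is based on SCSI-3 persistent reservations, the keys registered on the fence device can be inspected with sg_persist from the sg3_utils package (an extra check, not part of the original steps):
    yum -y install sg3_utils
    #One registered key per cluster node is expected after unfencing
    sg_persist --in --read-keys --device=/dev/disk/by-path/<appr-name>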


Configure pacemaker for gfs2

Most commands are required on one node only. Status commands should be run on all nodes

  1. On all cluster nodes, install the below packages:
    yum -y install gfs2-utils lvm2-cluster
  2. Enable clustered locking for LVM on *all* nodes
    lvmconf --enable-cluster
  3. Change default stickiness
    pcs resource defaults resource-stickiness=200
    pcs resource defaults
  4. The DLM needs to run on all nodes, so we will start by creating a resource for it and cloning it. First save the cluster CIB to a dlm_cfg file, which we will modify with the next few commands before pushing it back to the cluster.
    pcs cluster cib dlm_cfg
    pcs -f dlm_cfg resource create dlm ocf:pacemaker:controld op monitor interval=120s on-fail=fence clone interleave=true ordered=true
  5. Set up clvmd as a cluster resource.
    pcs -f dlm_cfg resource create clvmd ocf:heartbeat:clvm op monitor interval=120s on-fail=fence clone interleave=true ordered=true
  6. Set up the clvmd and dlm dependency and start-up order. Create the ordering and colocation constraints so that clvmd starts after dlm and both resources start on the same node.
    pcs -f dlm_cfg constraint order start dlm-clone then clvmd-clone
    pcs -f dlm_cfg constraint colocation add clvmd-clone with dlm-clone
  7. Set the no-quorum-policy of the cluster to freeze so that when quorum is lost, the remaining partition will do nothing until quorum is regained – GFS2 requires quorum to operate
    pcs -f dlm_cfg property set no-quorum-policy=freeze
  8. Check the configuration; these changes exist only in the dlm_cfg file so far and will show up only on the current node:
    pcs -f dlm_cfg constraint
    pcs -f dlm_cfg resource show
  9. Commit the changes from the dlm_cfg file to the cluster from the current node. The same changes will automatically get applied on the other nodes.
    pcs cluster cib-push dlm_cfg
  10. Check the status of the clone resources on all nodes
    pcs constraint
    pcs resource show
    pcs property list no-quorum-policy
    pcs status resources


gfs2 file-system creation

  1. Check the cluster name that was set while configuring the cluster:
    pcs property list cluster-name
    grep name /etc/corosync/corosync.conf
  2. One by one, create the GFS2 filesystems with the correct cluster name:
    mkfs.gfs2 -p lock_dlm -t pcscluster1:storage1 -j 2 /dev/vgstorage/lvstorage
    mkfs.gfs2 -p lock_dlm -t pcscluster1:ctdb1 -j 2 /dev/vgctdb/lvctdb
    Enter 'y' at the (y/n) continue prompt warning about loss of all data on the selected device, in both cases.
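
    Optionally, the lock protocol and lock table name written to each filesystem can be confirmed with tunegfs2 from gfs2-utils (a quick sanity check; output is not reproduced here):
    tunegfs2 -l /dev/vgstorage/lvstorage
    tunegfs2 -l /dev/vgctdb/lvctdb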


Create Pacemaker Filesystem Resource

  1. Do this on all nodes
    mkdir /mnt/storage1
    mkdir /mnt/ctdb1
  2. We will not use /etc/fstab to specify the mounts; rather, we'll use Pacemaker-controlled resources:
    pcs resource create gfs2_storage_res Filesystem device="/dev/vgstorage/lvstorage" directory="/mnt/storage1" fstype="gfs2" options="noatime,nodiratime,acl,rw,_netdev" op monitor interval=90s on-fail=fence clone interleave=true
    pcs resource create gfs2_ctdb_res Filesystem device="/dev/vgctdb/lvctdb" directory="/mnt/ctdb1" fstype="gfs2" options="noatime,nodiratime,acl,rw,_netdev" op monitor interval=90s on-fail=fence clone interleave=true
  3. This is configured as a clone resource so it will run on both nodes at the same time. Confirm that the mount has succeeded on both nodes
    pcs resource show
    mount | grep gfs2


Create Pacemaker Resource Ordering

  1. Next, create an ordering constraint so that the filesystem resource is started after the CLVMD resource, and a colocation constraint so that both start on the same node
    pcs constraint order start clvmd-clone then gfs2_storage_res-clone
    pcs constraint colocation add gfs2_storage_res-clone with clvmd-clone
    pcs constraint show
    pcs constraint order start clvmd-clone then gfs2_ctdb_res-clone
    pcs constraint colocation add gfs2_ctdb_res-clone with clvmd-clone
    pcs constraint show
    Thus dlm-clone starts first, then clvmd-clone, and finally gfs2_ctdb_res-clone and gfs2_storage_res-clone.


Setup 389-ds on the iSCSI target node

  1. All machines should have their FQDN resolvable via DNS or their hostname resolvable via /etc/hosts; for example ctdb1, ctdb2 and centos7 in our case.
  2. Create the /etc/sysctl.d/10-ldap.conf file with the below lines:
    net.ipv4.tcp_keepalive_time = 300
    net.ipv4.ip_local_port_range = 1024 65000
    fs.file-max = 64000
  3. Add the following lines at the bottom of /etc/security/limits.conf
    * soft nofile 8192
    * hard nofile 8192
  4. Add the following line at the end of /etc/profile:
    ulimit -n 8192
  5. Add the following line after the last "session required" line in /etc/pam.d/login:
    session required /lib/security/pam_limits.so
  6. Restart the server (shutdown -r now)
  7. Install the EPEL & Remi repositories:
    yum install -y epel-release
    wget http://rpms.famillecollet.com/enterprise/remi-release-7.rpm
    yum localinstall remi-release-7.rpm
  8. Create a local user account for the directory server to run as:
    useradd ldapadmin
    passwd ldapadmin
  9. Install 389-ds and dependency packages
    yum install -y 389-ds-base openldap-clients idm-console-framework 389-adminutil 389-admin 389-admin-console 389-console 389-ds-console
  10. Configure LDAP server using setup-ds-admin.pl script as follows:
    Would you like to continue with set up? [yes]:
    Would you like to continue? [yes]:
    Choose a setup type [2]:
    Computer name [centos7.rnd.com]:
    System User [dirsrv]: ldapadmin
    System Group [dirsrv]: ldapadmin
    configuration directory server? [no]:
    administrator ID [admin]:
    password:
    Administration Domain [rnd.com]:
    Directory server network port [389]:
    Directory server identifier [centos7]:
    Suffix [dc=rnd,dc=com]:
    Directory Manager DN [cn=Directory Manager]:
    password:
    Administration port [9830]:
    Are you ready to set up your servers? [yes]:
  11. Start & enable 389-ds services
    systemctl enable dirsrv.target
    systemctl enable dirsrv-admin
    systemctl start dirsrv.target
    systemctl start dirsrv-admin
  12. Optionally test the LDAP server by connecting to it from *all nodes*:
    ldapsearch -x -b "dc=rnd,dc=com" -h centos7.rnd.com
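
    Since later steps need an authenticated bind to the directory (ldap admin dn, smbldap-tools), it may also help to verify one from the nodes, for example as the Directory Manager created during setup (an optional check):
    ldapsearch -x -D "cn=Directory Manager" -W -b "dc=rnd,dc=com" -s base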



Setup CTDB cluster, Samba and samba-ldap

Install samba and ctdb. Configure ctdb. Create private network.

  1. On all cluster nodes, install the required packages:
    yum install -y ctdb samba samba-common samba-winbind-clients
  2. On all cluster nodes, edit the CTDB configuration file located at /etc/ctdb/ctdbd.conf. The mandatory fields that must be configured for CTDB operation are as follows:
    CTDB_RECOVERY_LOCK="/mnt/ctdb1/.ctdb.lock"
    CTDB_NODES=/etc/ctdb/nodes
    CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses
    CTDB_MANAGES_SAMBA=yes
    CTDB_MANAGES_WINBIND=yes
    CTDB_LOGGING=file:/var/log/ctdb.log
  3. Create another network between the nodes; in this setup its 192.168.123.0/24 range is used for the CTDB public (floating) addresses configured below. In case of KVM based VMs, run the following on the base host and not on a VM:
    1. Create private1.xml file on base machine with following details:
      <network>
        <name>private1</name>
        <forward mode='nat'>
          <nat>
            <port start='1024' end='65535'/>
          </nat>
        </forward>
        <bridge name='virbr1' stp='on'/>
        <ip address='192.168.123.1' netmask='255.255.255.0'>
          <dhcp>
            <range start='192.168.123.20' end='192.168.123.200'/>
          </dhcp>
        </ip>
      </network>
    2. Define persistent network with xml file
      virsh net-define private1.xml
    3. Start new network
      virsh net-start private1
    4. Enable new network to automatically start on host (base) boot
      virsh net-autostart private1
    5. Verify new network is created for persistence and auto-start. Also verify it has started properly
      virsh net-list --all
      ifconfig virbr1
    6. Update the VM configuration to have an interface in the new IP range. The NIC can be hot-added without requiring a VM reboot, even if there is a pop-up error indicating the changes require a reboot.
  4. On all cluster nodes, enter the cluster nodes' private IP addresses in the /etc/ctdb/nodes file:
    192.168.122.52
    192.168.122.202
  5. On all cluster nodes, enter the cluster public IP addresses in the /etc/ctdb/public_addresses file. Do not bind the public IP addresses to interfaces manually; CTDB automatically assigns them to the interfaces.
    192.168.123.71/24 ens9
    192.168.123.142/24 ens9
  6. On all cluster nodes, disable the samba service (CTDB will manage it):
    systemctl disable smb
    systemctl stop smb


Samba Configuration

  1. On all cluster nodes, edit the Samba configuration file located at /etc/samba/smb.conf as follows:
    [global]
    workgroup = SAMBARND
    security = user
    log file = /var/log/samba/log.%m
    log level = 5
    max log size = 50
    passdb backend = ldapsam:ldap://centos7.rnd.com
    #Must be a normal user added to "Directory Administrators" group
    #ldap admin dn = uid=luser1,ou=People,dc=rnd,dc=com
    ldap admin dn = uid=admin,ou=administrators,ou=topologymanagement,o=netscaperoot
    ldap suffix = dc=rnd,dc=com
    ldap user suffix = ou=People
    ldap machine suffix = ou=Computers
    ldap group suffix = ou=Groups
    ldap ssl = off
    add machine script = /usr/sbin/smbldap-useradd -w "%u"
    admin users = admin root
    wins support = yes
    dns proxy = yes
    [rndshare1]
    comment = RND share1
    path = /mnt/storage1/rndshare1
    read only = no
    inherit acls = yes
  2. On any one node, create the samba share directory:
    mkdir /mnt/storage1/rndshare1
  3. On all cluster nodes, check the samba configuration:
    testparm


LDAP client configuration on nodes

  1. On all cluster nodes, install the pam_ldap module:
    yum install -y pam_ldap
  2. On all cluster nodes, configure the PAM client and /etc/nsswitch.conf by running the following:
    authconfig --enableldap --enableldapauth --disablenis --ldapserver=centos7.rnd.com --ldapbasedn=dc=rnd,dc=com --enablemkhomedir --enablelocauthorize --updateall
  3. Create an example LDAP user (e.g. luser1) and LDAP group (e.g. luser1) and test. The user and group should have POSIX attributes for them to be available on Linux. Test on all nodes (a sample LDIF sketch follows the example output below):
    getent passwd luser1
    getent group luser1
    id luser1
    Example output of these commands:
    [root@ctdb1 ~]# getent passwd luser1
    luser1:*:1001:1001:luser1:/home/luser1:/bin/bash
    [root@ctdb1 ~]# getent group luser1
    luser1:*:1001:
    [root@ctdb1 ~]# id luser1
    uid=1001(luser1) gid=1001(luser1) groups=1001(luser1)
    [root@ctdb1 ~]#
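
    If no LDAP GUI is handy, such a POSIX user and group can be created with ldapadd. The following LDIF is only a minimal sketch; the uidNumber/gidNumber value 1001, the password placeholder and the ou names are illustrative assumptions matching the output above. Create a luser1.ldif file with:
    dn: cn=luser1,ou=Groups,dc=rnd,dc=com
    objectClass: top
    objectClass: posixGroup
    cn: luser1
    gidNumber: 1001

    dn: uid=luser1,ou=People,dc=rnd,dc=com
    objectClass: top
    objectClass: person
    objectClass: posixAccount
    uid: luser1
    cn: luser1
    sn: luser1
    uidNumber: 1001
    gidNumber: 1001
    homeDirectory: /home/luser1
    loginShell: /bin/bash
    userPassword: <user-password>
    Then add it to the directory with:
    ldapadd -x -h centos7.rnd.com -D "cn=Directory Manager" -W -f luser1.ldif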


Samba LDAP schema configuration on LDAP server

  1. Copy the samba schema files from the ctdb samba nodes to the LDAP server:
    cd /usr/share/doc/samba-<ver>/LDAP/
    rsync -va samba.ldif ol-schema-migrate.pl samba.schema root@centos7.rnd.com:/root
  2. On the LDAP server, import the samba schema using:
    #Ignore deprecation warnings for the below commands
    wget https://www.freedomit.co.nz/_downloads/61samba.ldif
    cp 61samba.ldif /etc/dirsrv/slapd-centos7/schema
    perl ol-schema-migrate.pl -b samba.schema > /etc/dirsrv/slapd-centos7/schema/61samba.schema
    cd /etc/dirsrv/slapd-centos7/schema
    chown -R ldapadmin:ldapadmin *
    systemctl restart dirsrv.target
    systemctl restart dirsrv-admin
  3. Open 389-console and go to "Directory Server" -> "Configuration" -> "Schema". You should see various samba classes as part of the schema.
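
    If the graphical console is not available, the loaded schema can also be checked over LDAP (optional; cn=schema is where 389-ds publishes its schema):
    ldapsearch -x -h centos7.rnd.com -b "cn=schema" -s base objectClasses | grep -i samba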


Samba LDAP integration on Samba server

  1. Install epel
    yum -y install epel-release
  2. Install samba tools
    yum -y install smbldap-tools
  3. Enter the correct master DN and master password, and comment out the slave related lines, in /etc/smbldap-tools/smbldap_bind.conf:
    #Should be a normal user with administrator privileges
    #masterDN="uid=luser1,ou=People,dc=rnd,dc=com"
    masterDN="uid=admin,ou=administrators,ou=topologymanagement,o=netscaperoot"
    masterPw="<ldap-admin-password>"
  4. Store the "ldap admin dn" password in Samba's secrets.tdb:
    smbpasswd -W
  5. Retrieve the local SID on one node only:
    net getlocalsid
  6. Enter the below details in /etc/smbldap-tools/smbldap.conf on all nodes. Basically, enter the SID of one node in the configuration files of all nodes so that all nodes use the same SID.
    SID="S-1-5-21-1268041045-2979691576-414051897"
    sambaDomain="SAMBARND"
    masterLDAP="ldap.rnd.com"
    masterPort="389"
    ldapTLS="0"
    verify="none"
    suffix="dc=rnd,dc=com"
    password_hash="CLEARTEXT"
    userSmbHome=""
    userProfile=""
    Without setting password_hash to CLEARTEXT we get errors such as:
    Failed to modify UNIX password: invalid password syntax - passwords with storage scheme are not allowed at /usr/share/perl5/vendor_perl/smbldap_tools.pm line 1498, <STDIN> line 2.
    while using smbldap-populate in next steps.
  7. To fill the Directory Server with the correct entries, run:
    smbldap-populate
  8. We have to create a user account in LDAP and then add the same user to samba on a CTDB cluster node; smbpasswd will update the LDAP user entry with the samba parameters.
    1. Add user in LDAP (Eg user1)
    2. Add same user in samba
      smbpasswd -a user1
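
      Once ctdb has been started (next section), access to the share as this user can be tested from any client with smbclient; the IP below is just one of the example public addresses configured earlier:
      #List shares visible to user1
      smbclient -L //192.168.123.71 -U user1
      #Connect to the share and list its contents
      smbclient //192.168.123.71/rndshare1 -U user1 -c 'ls'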


Start CTDB cluster

  1. On all nodes, the addresses mentioned in /etc/ctdb/public_addresses should not be assigned to any interface. If they are assigned, remove them using:
    ip addr del <ip-address> dev <interface-name>
  2. On all nodes, try the following:
    systemctl start smb
    #Samba should show as active/running in the output of the next command
    systemctl status smb
    systemctl stop smb
    Note that samba should be able to contact LDAP without needing the public addresses specified in the /etc/ctdb/public_addresses file. Samba communication should happen over the private IPs mentioned in the /etc/ctdb/nodes file for the setup to work.
  3. On all nodes, ensure that the pcs cluster and gfs2 filesystems are working properly:
    pcs status
    df -h
  4. Start and enable ctdb on all nodes:
    systemctl start ctdb
    systemctl enable ctdb
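
    After ctdb has started on all nodes, cluster health and public IP assignment can be verified with the standard ctdb tool (run on any node):
    #All nodes should eventually show as OK
    ctdb status
    #Shows which node currently holds each public address
    ctdb ip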


Troubleshooting

Troubleshoot gfs2 not coming online

To troubleshoot gfs2 issues

  1. Unmount the gfs2 filesystem on all nodes. Failure to unmount from all nodes before running fsck will most likely result in filesystem corruption.
    Example commands to fsck gfs2_storage_res filesystem are:
    pcs resource disable --wait=5 gfs2_storage_res
    fsck.gfs2 /dev/vgstorage/lvstorage
    pcs resource enable gfs2_storage_res
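
    Before running fsck.gfs2, it can help to double-check on every node that the filesystem is really unmounted (a simple check using the mount point from above):
    #Run on all nodes; no output means the filesystem is not mounted there
    mount | grep /mnt/storage1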


Troubleshoot 389-ds not starting after samba schema import

  1. If you are unable to access 389-ds, remove the 61samba.ldif file and restart the dirsrv service.
  2. Then copy the data from https://www.freedomit.co.nz/_downloads/61samba.ldif, recreate the 61samba.ldif file in the same location and restart the dirsrv service.


Troubleshooting samba authentication failure with NT_STATUS_INVALID_SID

  1. yum -y update --skip-broken
  2. Add the following to /etc/samba/smb.conf on the machine where authentication is failing:
    ntlm auth = yes
    netbios name = SAMBAGBB
    where the "netbios name" value matches the value of workgroup
  3. smbpasswd -W
  4. smbldap-populate
  5. Restart ctdb
  6. The following command:
    pdbedit -L
    should not show
    smbldap_search_paged: search was successful
    sid S-1-5-21-1676699995-2002923080-1310088020-1001 does not belong to our domain
    Skipping entry uid=luser1,ou=People,dc=rnd,dc=com
    sid S-1-5-21-1676699995-2002923080-1310088020-500 does not belong to our domain
    Skipping entry uid=root,ou=People,dc=rnd,dc=com
    sid S-1-5-21-1676699995-2002923080-1310088020-501 does not belong to our domain
    The ideal output is similar to
    Home server: sambarnd
    Home server: sambarnd
    Finding user root
    Trying _Get_Pwnam(), username as lowercase is root
    Get_Pwnam_internals did find user [root]!
    root:0:root
    Finding user nobody
    Trying _Get_Pwnam(), username as lowercase is nobody
    Get_Pwnam_internals did find user [nobody]!
    Finding user nobody
    Trying _Get_Pwnam(), username as lowercase is nobody
    Get_Pwnam_internals did find user [nobody]!
    nobody:99:Nobody
  7. Log lines in /var/log/samba/smb.log should not show the below error while trying to connect from a samba client:
    pid_to_procid: messaging_dgm_get_unique failed: No such file or directory

