CentOS 8.x Glusterfs basic setup of distributed volume
Terminology and concepts
Glusterfs can be used to combine local storage of various individual servers into a shared multi-write distributed filesystem. This way multiple clients can access this distributed storage for simultaneous read/write. These distributed volumes are useful for any kind of virtualization or big data applications.
If you have central storage, you can present smaller disks (e.g. 1TB) to each of the participating servers. Each server can then provide access to its 1TB disk to the entire cluster. This way, whether you are using FC or iSCSI, storage access is distributed across multiple nodes, providing faster access.
The Gluster term for an individual disk used for storage is 'brick'. Each node in a gluster setup can contribute one or more disks (partitions / filesystems) to the overall storage. These bricks contain a local (non-distributed) filesystem such as ext3 or xfs. Gluster combines the storage of these bricks to create a volume. Volumes are then mounted on clients to access the distributed storage.
Gluster does not require any OS-level clustering the way OCFS2 / GFS2 do. It also does not require a metadata server the way Moosefs does. Instead, data is distributed among nodes based on hashing.
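The idea of placing files by hashing, without a metadata server, can be illustrated with a small sketch. This is only an illustration: it uses cksum as a stand-in hash and a simple modulo, which is an assumption made for the example and not the algorithm GlusterFS itself uses. The function name pick_brick is hypothetical.

```shell
#!/bin/sh
# Illustration only: choose a brick for a file name by hashing the name.
# GlusterFS uses its own hashing logic; cksum is a stand-in here.

# pick_brick FILE_NAME BRICK...  -> prints the brick selected for FILE_NAME
# (at least one brick must be given)
pick_brick() {
    name=$1
    shift                                   # remaining arguments are the bricks
    sum=$(printf '%s' "$name" | cksum | cut -d' ' -f1)
    shift $(( sum % $# ))                   # skip to the selected brick
    printf '%s\n' "$1"
}

# Every client computes the same brick for the same name, so no lookup
# service is needed to find a file.
for f in file1 file2 file3 file4; do
    printf '%s -> %s\n' "$f" "$(pick_brick "$f" glusterfs1:/mnt/brick1/dist_vol glusterfs2:/mnt/brick2/dist_vol)"
done
```

Because the result depends only on the file name and the list of bricks, any client can work out where a file lives on its own.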
While creating volume there are different options:
- Distributed volume
- Data is distributed among bricks. Hence if two files, file1 and file2, are created, it is possible that file1 will get stored on brick1 and file2 on brick2.
- Replication
- The same file can be stored on at least two different bricks with replication set to 2.
- Striping
- A single file can be split into smaller pieces which are distributed among bricks for faster access to a single large file. Note that striped volumes are deprecated in recent GlusterFS releases.
There are also options to allow only certain nodes to mount the volume based on IP or password. There are options to set quotas on directory space usage. There are also options to limit how much storage gluster can use from each brick. For example, for a 1TB brick we can set the limit to 900GB so that gluster does not use more than 900GB of that brick, leaving 100GB free.
Refer to the documentation pages at https://staged-gluster-docs.readthedocs.io/en/release3.7.0beta1/Quick-Start-Guide/Architecture/ for more information.
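The options mentioned above map to gluster volume settings roughly as in the following dry-run sketch, which only prints the commands. The volume name, the address pattern 192.168.1.*, the directory /projects and the sizes are placeholder assumptions, and the function name is hypothetical. Note also that cluster.min-free-disk steers new files away from bricks that are nearly full; it is assumed here as the closest match to the 900GB example, not a hard cap.

```shell
#!/bin/sh
# Dry run: prints the gluster commands instead of executing them.
# All names, addresses and sizes below are placeholder assumptions.

# volume_option_commands VOL_NAME
volume_option_commands() {
    vol=$1
    # Allow only clients matching this address pattern to mount the volume.
    echo "gluster volume set $vol auth.allow 192.168.1.*"
    # Enable quota, then cap the space used by one directory in the volume.
    echo "gluster volume quota $vol enable"
    echo "gluster volume quota $vol limit-usage /projects 100GB"
    # Avoid placing new files on bricks with less than 10% free space.
    echo "gluster volume set $vol cluster.min-free-disk 10%"
}

volume_option_commands distributed_volume
```

Review the printed commands against the documentation of the installed gluster version before running them on a server.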
Server and distributed volume setup steps
To set up glusterfs among multiple machines use the following steps:
- Disable firewalld and selinux
- setenforce 0
- vim /etc/sysconfig/selinux (set SELINUX=permissive or SELINUX=disabled so that the change persists across reboots)
- systemctl stop firewalld
- systemctl disable firewalld
- Alternatively, look up the ports used by the gluster version being deployed and open only those ports in the firewall.
- Configure NTP. Refer to CentOS 8.x chronyc ntp client configuration
- Install glusterfs server side packages using:
- Install glusterfs repo
- dnf install -y centos-release-gluster
- Enable power tools repo
- dnf config-manager --set-enabled powertools
- Install glusterfs
- dnf -y install glusterfs-server
- Enable glusterfs service
- systemctl enable glusterd
- systemctl start glusterd
- systemctl status glusterd
- Create the desired filesystem on the drives / partitions that will be used to store glusterfs data. If three servers participate (provide disk space) in the glusterfs setup, follow the above steps on all three servers, including creation of the appropriate drive / partition. Each of these partitions should be mounted locally on an appropriate path. Ideally the final brick location should be a sub-folder inside the mounted parent folder. This way, if the drive is not mounted, the brick folder is missing and gluster will fail to start instead of writing data to the unmounted path.
- If you want to use glusterfs with FQDN/names instead of IPs, these names must resolve correctly from all servers and glusterfs clients for the gluster volume to function properly.
- Add glusterfs nodes to trusted storage pool
- From the first host, probe the second host to add it to the trusted pool. The syntax is:
- gluster peer probe <peer-ip-or-hostname>
- Check the connected peers on individual cluster nodes
- gluster peer status
- List the hosts in the Gluster Cluster
- gluster pool list
- Setup and start Glusterfs Distributed volume
- Create glusterfs distributed volume
- gluster volume create <vol-name> transport tcp <host1-ip-or-fqdn>:<host1-local-brick-mount-point> <host2-ip-or-fqdn>:<host2-local-brick-mount-point>
- Example
- gluster volume create distributed_volume transport tcp glusterfs1:/mnt/brick1/dist_vol glusterfs2:/mnt/brick2/dist_vol
- Start the created volume
- gluster volume start <vol-name>
- More info on the volume
- gluster volume info <vol-name>
- Check the status of the glusterfs distributed volume
- gluster volume status <vol-name>
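The brick preparation, peer probe and volume creation steps above can be sketched as one dry-run script that prints the commands for review. The device /dev/sdb1 and the function names are hypothetical assumptions; the host names and brick paths follow the example above. Nothing is executed, so the output can be checked first and then run by hand on the appropriate hosts.

```shell
#!/bin/sh
# Dry run: prints the commands for review; nothing is executed.

# brick_prep_commands DEVICE MOUNT_POINT BRICK_DIR_NAME   (run on every server)
brick_prep_commands() {
    echo "mkfs.xfs $1"
    echo "mkdir -p $2"
    echo "mount $1 $2"
    # The brick is a sub-folder of the mount point, not the mount point itself.
    echo "mkdir -p $2/$3"
}

# peer_probe_commands PEER...   (run on the first host only)
peer_probe_commands() {
    for peer in "$@"; do
        echo "gluster peer probe $peer"
    done
}

# volume_commands VOL_NAME HOST:BRICK_PATH...   (run on any one pool member)
volume_commands() {
    vol=$1
    shift
    echo "gluster volume create $vol transport tcp $*"
    echo "gluster volume start $vol"
    echo "gluster volume info $vol"
}

# /dev/sdb1 is a placeholder device name.
brick_prep_commands /dev/sdb1 /mnt/brick1 dist_vol
peer_probe_commands glusterfs2
volume_commands distributed_volume glusterfs1:/mnt/brick1/dist_vol glusterfs2:/mnt/brick2/dist_vol
```

Keeping the brick in a sub-folder such as /mnt/brick1/dist_vol matches the recommendation above: the folder only exists while the drive is mounted.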
Access glusterfs volume from client machine
To access the created glusterfs volume from a client machine use:
- Install glusterfs fuse
- dnf -y install glusterfs-fuse
- Mount the GlusterFS Distributed Volume
- mount -t glusterfs <any-gluster-host-or-ip>:/<vol-name> <mount-point>
- If you are mounting on a host that also participates in the trusted storage pool, the host IP can be the local IP. This way the storage is accessed via local machine networking for higher performance, and no other machine becomes a potential point of failure for the mount.
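To make the client mount persistent, an /etc/fstab entry is commonly used. The sketch below only prints such an entry. The mount point /mnt/gluster and the function name fstab_entry are assumptions for illustration, and the backup-volfile-servers mount option should be checked against the installed glusterfs-fuse version before use.

```shell
#!/bin/sh
# Prints an /etc/fstab line for a glusterfs mount; it does not modify any file.

# fstab_entry SERVER VOL_NAME MOUNT_POINT [BACKUP_SERVER]
fstab_entry() {
    # _netdev delays the mount until networking is available.
    opts="defaults,_netdev"
    if [ -n "${4:-}" ]; then
        # Optional second server to fetch the volume layout from
        # when the first one is unreachable at mount time.
        opts="$opts,backup-volfile-servers=$4"
    fi
    printf '%s:/%s %s glusterfs %s 0 0\n' "$1" "$2" "$3" "$opts"
}

fstab_entry glusterfs1 distributed_volume /mnt/gluster glusterfs2
```

The named server is only needed to fetch the volume layout at mount time; after that the client talks to the bricks directly.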