Check cluster status via systemd service
The cluster status can be checked periodically via a systemd service using the steps below. Note that this has not been tested in production.
- Refer to CentOS 8.x systemd or systemctl for help on using systemd or creating new systemd services
- Set up outgoing email via postfix on the system so that email can be sent using the mail command. Refer to CentOS 8.x postfix send email through relay or smarthost with smtp authentication
- Create a systemd script '/etc/systemd/system/cluster_status_check.service' with:
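    [Unit]
    Description=Check cluster status and send email if not healthy

    [Service]
    Type=oneshot
    Environment="EMAIL_ADDRESS=your_email@example.com"
    ExecStart=/sbin/cluster_status_check_script.sh
    # ExecStopPost also runs when ExecStart fails, and systemd sets SERVICE_RESULT for it.
    # systemd does not expand $(...) in Environment=, so the hostname and IP address are
    # resolved by the shell; "$$" passes a literal "$" through to the shell.
    # Restart=on-failure is dropped: older systemd releases reject Restart= for Type=oneshot units.
    ExecStopPost=/bin/sh -c 'if [ "$$SERVICE_RESULT" != "success" ]; then echo "Cluster status check failed on $$(hostname) ($$(hostname -I | cut -d" " -f1))." | mail -s "Cluster Alert" "$$EMAIL_ADDRESS"; fi'

    [Install]
    WantedBy=multi-user.target
- A [Timer] section is ignored inside a .service file. To run the check every hour, create a separate timer unit '/etc/systemd/system/cluster_status_check.timer' with:
    [Unit]
    Description=Run cluster status check every hour

    [Timer]
    OnUnitActiveSec=1h
    Unit=cluster_status_check.service

    [Install]
    WantedBy=timers.target
- After running 'systemctl daemon-reload', enable the timer with 'systemctl enable --now cluster_status_check.timer'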
- In the service file, replace the EMAIL_ADDRESS value with the address that should receive alerts
- As per the ExecStart path given in the systemd service, create '/sbin/cluster_status_check_script.sh' with:
    #!/bin/bash

    # Run crm status and store output in a variable
    crm_output=$(crm status)

    # Check for any errors or warnings (ignoring case)
    if grep -Eqi 'error|warning' <<< "$crm_output"; then
        echo "Error or warning found in cluster status"
        echo "$crm_output"
        exit 1
    fi

    # Check if all nodes are online.
    # Node names are listed inside the brackets of the "Online: [ ... ]" line.
    num_nodes=$(crm_node -l | wc -l)
    num_online_nodes=$(crm_mon -1 | grep -w "Online:" | sed 's/.*\[\(.*\)\].*/\1/' | wc -w)
    if [[ $num_nodes -ne $num_online_nodes ]]; then
        echo "Not all nodes are online"
        echo "$crm_output"
        exit 1
    fi

    # Check if all resources are started properly.
    # Assumption: each primitive resource is printed by crm_mon as a line such as
    # "vip (ocf::heartbeat:IPaddr2): Started node1". Adjust the pattern if your
    # Pacemaker version formats resource lines differently.
    # -r also lists inactive resources, so stopped ones are counted too.
    resource_lines=$(crm_mon -1 -r | grep -E '\((ocf|stonith|systemd|lsb|service):[^)]*\)')
    num_resources=$(grep -c . <<< "$resource_lines")
    if [[ $num_resources -eq 0 ]]; then
        echo "No resources found in the cluster"
        echo "$crm_output"
        exit 1
    fi
    num_started_resources=$(grep -c "Started" <<< "$resource_lines")
    if [[ $num_resources -ne $num_started_resources ]]; then
        echo "Not all resources are started properly"
        echo "$crm_output"
        exit 1
    fi

    echo "Cluster status is OK"
    exit 0
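The online-node count depends on how the "Online:" line of the crm_mon output is parsed. The following is a minimal sketch for trying that parsing without a cluster. It assumes the line has the form "Online: [ node1 node2 ]"; the sample output and the helper name count_online_nodes are hypothetical and only for illustration.

```shell
#!/bin/bash
# Hypothetical sample of 'crm_mon -1' node output, for illustration only.
sample_output='Node List:
  * Online: [ node1 node2 ]
  * OFFLINE: [ node3 ]'

# Count the node names listed between the brackets of the "Online:" line.
# Reads crm_mon-style text on stdin and prints the number of names.
count_online_nodes() {
    grep -w "Online:" | sed 's/.*\[\(.*\)\].*/\1/' | wc -w
}

num_online=$(printf '%s\n' "$sample_output" | count_online_nodes)
echo "Online nodes: $num_online"
```

On a real cluster node the same helper could be fed with 'crm_mon -1 | count_online_nodes' and compared against the output of 'crm_node -l | wc -l'.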
- Set execute permissions on script and reload, enable, start service via:
- chmod +x /sbin/cluster_status_check_script.sh
- systemctl daemon-reload
- systemctl start cluster_status_check.service
- systemctl enable cluster_status_check.service
- If feasible, stop a resource and verify that the alert email is received. Consider adding a temporary virtual IP resource for this test and removing it afterwards.
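The test with a temporary virtual IP resource could be sketched as below, assuming crmsh is in use and the IPaddr2 resource agent is installed. The resource name test_vip, the address 192.0.2.10 and the function names are placeholders, not values from this setup; choose an unused address in your own subnet before running anything.

```shell
#!/bin/bash
# Placeholder resource name and IP address (assumptions, adjust before use).
TEST_RES="test_vip"
TEST_IP="192.0.2.10"

# Add a temporary virtual IP resource to the cluster.
add_test_resource() {
    crm configure primitive "$TEST_RES" ocf:heartbeat:IPaddr2 \
        params ip="$TEST_IP" op monitor interval=10s
}

# Stop the test resource, run the check once and show its recent log lines.
# The check is expected to report a failure while the resource is stopped.
trigger_alert() {
    crm resource stop "$TEST_RES"
    systemctl start cluster_status_check.service
    journalctl -u cluster_status_check.service -n 20 --no-pager
}

# Remove the temporary resource again (it is stopped first).
remove_test_resource() {
    crm resource stop "$TEST_RES"
    crm configure delete "$TEST_RES"
}
```

A possible sequence is add_test_resource, then trigger_alert, then a look at the mailbox, and finally remove_test_resource. Nothing runs until the functions are called explicitly.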