Ubuntu HPC Slurm DB setup for Slurm accounting



Slurm Accounting DB Server Configuration

This guide provides step-by-step instructions to configure a MariaDB server for Slurm accounting, including secure database setup, integration with `slurmdbd`, and verification of accounting functionality.

1. Install MariaDB in LXC Container

On the infra node, access the Slurm DB container and install MariaDB:

lxc exec slurm-db bash
apt update && apt install mariadb-server -y

2. Secure the MariaDB Installation

Run the secure installation script:

mysql_secure_installation

Respond to the prompts as follows:

  • Enter current password: <Enter>
  • Switch to unix_socket: `n`
  • Change root password: `y` → YourPassword
  • Remove anonymous users: `y`
  • Disallow root login remotely: `n`
  • Remove test database: `y`
  • Reload privilege tables: `y`

3. Configure MariaDB to Listen on Container IP

Edit the following configuration file:

vim /etc/mysql/mariadb.conf.d/50-server.cnf

Set the bind address to the container IP:

bind-address = 192.168.2.7
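The same edit can be made without opening an editor. The following is a minimal sketch, assuming GNU sed (as shipped with Ubuntu) and that the file already contains a `bind-address` line, commented out or not; `set_bind_address` is a hypothetical helper name, not part of MariaDB.

```shell
# Hypothetical helper: point bind-address in a MariaDB config file at one IP.
set_bind_address() {
    conf=$1   # path to the config file, e.g. 50-server.cnf
    ip=$2     # address MariaDB should listen on
    # Rewrite the existing bind-address line; sed keeps a backup as <conf>.bak.
    sed -i.bak -E "s/^[#[:space:]]*bind-address[[:space:]]*=.*/bind-address = ${ip}/" "$conf"
    # sed succeeds even when nothing matched, so confirm the line is now present.
    grep -qxF "bind-address = ${ip}" "$conf"
}

# Usage (as root inside the container):
#   set_bind_address /etc/mysql/mariadb.conf.d/50-server.cnf 192.168.2.7
```

The function returns non-zero when no `bind-address` line was found, so it can be chained with the restart that follows.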

Restart the MariaDB service:

systemctl restart mariadb

Verify that MariaDB is listening:

netstat -tunlp | grep 3306
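`netstat` comes from the `net-tools` package, which may not be installed in a minimal container; `ss` from `iproute2` reports the same information. As a sketch, the check can be scripted by filtering `ss -tln` output. `listening_on` is a hypothetical helper, and it assumes the column layout of `ss -tln`, where the fourth column is the local address and port.

```shell
# Hypothetical helper: succeed if stdin (output of `ss -tln`) shows a TCP
# listener on the given address and port.
listening_on() {
    awk -v want="$1:$2" '$1 == "LISTEN" && $4 == want { found = 1 } END { exit !found }'
}

# Usage:
#   ss -tln | listening_on 192.168.2.7 3306 && echo "MariaDB is listening"
```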

4. Create Slurm Accounting Database and User

Log in to the MariaDB shell:

mysql -u root -p

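Execute the following SQL statements. MariaDB matches an account by the address the client connects from, so the host part of the second grant ('192.168.2.7' here) must be the address of the node that runs `slurmdbd`; replace it if your master node uses a different address: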

CREATE DATABASE slurm_acct_db;
CREATE USER 'slurm'@'localhost' IDENTIFIED BY 'slurmdbpass';
GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost';
GRANT ALL PRIVILEGES ON slurm_acct_db.* TO 'slurm'@'192.168.2.7' IDENTIFIED BY 'slurmdbpass';
FLUSH PRIVILEGES;
EXIT;
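When the `slurm` account must be reachable from more than one host, the statements can be generated rather than typed. This is only a sketch: `slurm_acct_sql` is a hypothetical helper, the `IF NOT EXISTS` forms are used so that it can be re-run, and the password must not contain a single quote because no escaping is done.

```shell
# Hypothetical helper: print SQL that creates the accounting database and
# grants one account access from each listed client host.
slurm_acct_sql() {
    db=$1; user=$2; pass=$3
    shift 3                      # remaining arguments are client hosts
    printf 'CREATE DATABASE IF NOT EXISTS %s;\n' "$db"
    for host in "$@"; do
        printf "CREATE USER IF NOT EXISTS '%s'@'%s' IDENTIFIED BY '%s';\n" "$user" "$host" "$pass"
        printf "GRANT ALL PRIVILEGES ON %s.* TO '%s'@'%s';\n" "$db" "$user" "$host"
    done
    printf 'FLUSH PRIVILEGES;\n'
}

# Usage (review the output first, then pipe it into the MariaDB shell):
#   slurm_acct_sql slurm_acct_db slurm slurmdbpass localhost 192.168.2.7 | mysql -u root -p
```

Printing the SQL instead of executing it keeps the step reviewable before anything touches the server.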

5. Test DB Connection from Slurm Master

On the Slurm master node, install the MySQL client and test the database connection:

apt install mysql-client-core-8.0 -y
mysql -u slurm -p -h slurm-db -P 3306

At the MySQL prompt, confirm that `slurm_acct_db` appears in the list:

SHOW DATABASES;
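The same check can be run non-interactively. In this sketch, `has_database` is a hypothetical helper that reads one database name per line from standard input; the `-N` (no column names), `-B` (batch output), and `-e` (execute) options of the `mysql` client produce that format.

```shell
# Hypothetical helper: succeed if stdin contains a line equal to the given name.
has_database() {
    grep -qxF "$1"
}

# Usage (prompts for the slurm password):
#   mysql -u slurm -p -h slurm-db -P 3306 -N -B -e 'SHOW DATABASES;' | has_database slurm_acct_db \
#       && echo "slurm_acct_db is visible from this node"
```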

6. Install and Configure slurmdbd on Master Node

Install `slurmdbd` on the master node:

apt-get install slurmdbd -y

Create the `slurmdbd.conf` configuration file:

vim /etc/slurm/slurmdbd.conf

Add the following content:

DbdHost=localhost
DbdPort=6819
StorageType=accounting_storage/mysql
StorageHost=slurm-db     # Replace with the actual hostname or IP of your DB server
StorageUser=slurm
StoragePass=slurmdbpass
StorageLoc=slurm_acct_db
LogFile=/var/log/slurm/slurmdbd.log
PidFile=/var/run/slurm/slurmdbd.pid
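Because `slurmdbd.conf` holds `StoragePass` in plain text, the Slurm documentation asks that it be readable only by the user that runs `slurmdbd`, and recent releases may refuse to start otherwise. A minimal sketch follows; `secure_conf` is a hypothetical helper, and the owner `slurm` in the usage line is an assumption (use whichever account runs `slurmdbd` on your system, which is root when `SlurmUser` is not set in this file).

```shell
# Hypothetical helper: restrict a config file that contains a password.
secure_conf() {
    conf=$1
    owner=${2:-}                 # optional; changing the owner normally needs root
    chmod 600 "$conf"            # read/write for the owner only
    if [ -n "$owner" ]; then
        chown "$owner" "$conf"
    fi
}

# Usage (as root on the master node; "slurm" is an assumed account name):
#   secure_conf /etc/slurm/slurmdbd.conf slurm
```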

Restart the `slurmdbd` service:

systemctl restart slurmdbd

7. Configure slurm.conf for Accounting

From the infra node, open the shared `slurm.conf` file:

vim /export/tmp/slurm/slurm.conf

Add or modify the following lines:

AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=slurm-master    # Replace with the actual Slurm master hostname
AccountingStoragePort=6819
AccountingStorageUser=slurm
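"Add or modify" can be scripted so that re-running it does not duplicate lines. This is a sketch under simple assumptions: keys are written exactly as shown (Slurm itself is case-insensitive), each key sits at the start of a line, and the file ends with a newline. `set_conf_key` is a hypothetical helper.

```shell
# Hypothetical helper: set KEY=VALUE in a Slurm-style config file, replacing
# an existing uncommented KEY= line or appending a new one.
set_conf_key() {
    file=$1; key=$2; value=$3
    if grep -q "^${key}=" "$file"; then
        # "|" is the sed delimiter because values such as
        # accounting_storage/slurmdbd contain "/".
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage:
#   conf=/export/tmp/slurm/slurm.conf
#   set_conf_key "$conf" AccountingStorageType accounting_storage/slurmdbd
#   set_conf_key "$conf" AccountingStorageHost slurm-master
#   set_conf_key "$conf" AccountingStoragePort 6819
#   set_conf_key "$conf" AccountingStorageUser slurm
```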

8. Distribute Updated slurm.conf to All Nodes

On Login Node:

cp /export/tmp/slurm/slurm.conf /etc/slurm/slurm.conf

On Compute Nodes:

cp /export/tmp/slurm/slurm.conf /etc/slurm/slurm.conf
systemctl restart slurmd

On Master Node:

cp /export/tmp/slurm/slurm.conf /etc/slurm/slurm.conf
systemctl restart slurmctld
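The copy step is the same on every node; only the daemon to restart differs. One way to sketch it is a helper that copies the file only when it differs and reports what it did, so the caller can decide whether a restart is needed. `sync_conf` is a hypothetical name.

```shell
# Hypothetical helper: copy SRC over DST only if they differ, and print
# "updated" or "unchanged" so the caller knows whether to restart a daemon.
sync_conf() {
    src=$1; dst=$2
    if cmp -s "$src" "$dst" 2>/dev/null; then
        echo "unchanged"
    else
        cp "$src" "$dst" && echo "updated"
    fi
}

# Usage on a compute node:
#   if [ "$(sync_conf /export/tmp/slurm/slurm.conf /etc/slurm/slurm.conf)" = updated ]; then
#       systemctl restart slurmd
#   fi
```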

9. Register the Cluster

On the Slurm master node, register the cluster with the accounting database. The name must match the `ClusterName` value in `slurm.conf`; this example uses test_hpc-cluster:

sacctmgr add cluster test_hpc-cluster

Note: If the cluster is already registered, `sacctmgr` reports that it exists and makes no changes.
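To avoid a mismatch between the registered name and `slurm.conf`, the name can be read from the file instead of typed. This sketch assumes the key is spelled `ClusterName` at the start of a line; `cluster_name` is a hypothetical helper, and `-i` is the `sacctmgr` option that commits without asking for confirmation.

```shell
# Hypothetical helper: print the ClusterName value from a slurm.conf file,
# dropping any trailing "# comment".
cluster_name() {
    awk -F= '/^ClusterName=/ { sub(/[[:space:]]*#.*/, "", $2); print $2; exit }' "$1"
}

# Usage on the master node:
#   sacctmgr -i add cluster "$(cluster_name /etc/slurm/slurm.conf)"
```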

10. Submit Test Jobs and Verify Accounting

Submit a few test jobs, then verify accounting with:

sacct

Example Output

JobID     JobName  Partition  Account  AllocCPUS  State     ExitCode
--------  -------- ---------- -------- ---------- --------- --------
1         bash     gpu                 128        COMPLETED 0:0
2         bash     gpu                 128        FAILED    127:0
...
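For a quick summary rather than a full listing, the job states can be counted. This is a sketch: `count_by_state` is a hypothetical helper that expects `JobID|State` lines, which `sacct` produces with `-n` (no header), `-X` (allocations only), `-P` (pipe-delimited), and `-o JobID,State`.

```shell
# Hypothetical helper: count jobs per state from "JobID|State" lines on stdin.
count_by_state() {
    awk -F'|' 'NF >= 2 { count[$2]++ } END { for (s in count) printf "%s %d\n", s, count[s] }' | sort
}

# Usage:
#   sacct -n -X -P -o JobID,State | count_by_state
```

A non-empty result indicates that `slurmctld` is sending job records to `slurmdbd` and that they are being stored in the database.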
