Difference between revisions of "PBS job submission and execution"
Latest revision as of 06:28, 18 July 2022
Home > CentOS > CentOS 7.x > CentOS 7.x Rocks cluster 7.0 > PBS job submission and execution
Sample PBS stress job for testing
To test a PBS installation, verify that jobs can be submitted and that they execute properly on the various nodes:
- Install stress package on all the nodes
- yum install -y stress
- This assumes that EPEL and similar repositories were enabled as part of the compute/execution node setup
- Create the PBS script 'load.sh' to run workloads on 8 specific nodes using:
#!/bin/bash
#PBS -N jobname
#PBS -q queuename
#PBS -V
#PBS -l nodes=8
/opt/openmpi/bin/mpirun -host compute-0-0,compute-0-1,compute-0-2,compute-0-3,compute-0-4,compute-0-5,compute-0-6,compute-0-7 -n 8 /usr/bin/stress --cpu 20 --vm 5 --vm-bytes 10G --timeout 60s
- Note: This script starts one stress process on each of the 8 nodes. Each process uses 20 CPU workers and around 50G of memory (5 vm workers × 10G each) and runs for one minute
- A better option is to use the following load.sh script instead:
#!/bin/bash
#PBS -N job
#PBS -q testq1
#PBS -V
#PBS -l nodes=2:ppn=2
/opt/openmpi/bin/mpirun -machinefile $PBS_NODEFILE -np 4 /usr/bin/stress --cpu 1 --vm-bytes 10G --timeout 60s
- This allows PBS to pick 2 nodes (nodes=2) and run 2 processes per node (ppn=2) on these nodes, for a total of 4 processes (-np 4). This way we don't need to provide hostnames in the script file. Note that stress applies --vm-bytes only to workers started with --vm, so this command as written runs CPU workers only.
- If submission of jobs as root is not enabled (it is disabled by default, and leaving it disabled is recommended), common users must be created across all nodes for job submission to work.
- For example
- useradd user1
- passwd user1
- rocks sync users
- The users created should also have SSH key-based, password-less SSH access from one machine to the others for the same user. On Rocks this does not work by default unless the following is done (required only once, on the master):
- ls -l /usr/libexec/openssh/ssh-keysign
- chmod u+s /usr/libexec/openssh/ssh-keysign
- ls -l /usr/libexec/openssh/ssh-keysign
- Submit the job to multiple nodes as a normal user (e.g. user1) using:
- qsub load.sh
- This assumes /opt/pbs/bin comes before /opt/gridengine/bin in $PATH
- Check the CPU/RAM utilization on the compute nodes
- top
- Or, for a more graphical view, use:
- yum -y install htop
- htop
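The second load.sh above hard-codes -np 4 to match nodes=2:ppn=2. As a minimal sketch, the process count can instead be derived from the node file. This assumes PBS writes one line per allocated process slot to $PBS_NODEFILE, which is typical for the nodes/ppn syntax but worth verifying on your installation. The count_slots helper is a hypothetical name used only for illustration, and the --vm-bytes option is left out here because no --vm workers are started.

```shell
#!/bin/bash
#PBS -N job
#PBS -q testq1
#PBS -V
#PBS -l nodes=2:ppn=2

# Hypothetical helper: count the non-empty lines in a node file.
# Assumption: PBS lists one line per allocated process slot.
count_slots() {
    grep -c . "$1"
}

# Only launch when running inside a PBS job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    np=$(count_slots "$PBS_NODEFILE")
    /opt/openmpi/bin/mpirun -machinefile "$PBS_NODEFILE" -np "$np" \
        /usr/bin/stress --cpu 1 --timeout 60s
fi
```

With this approach, changing the nodes or ppn values in the resource request does not require editing the mpirun line.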
Email status of PBS job execution
To send mail before or after job execution, Postfix or another SMTP service must be configured on the master server.
Normal emails
The following two parameters should be added to the PBS job script (e.g. load.sh) along with the other parameters.
#PBS -m abe
#PBS -M pavan@gbb.co.in
Explanation of "abe":
- a: Mail is sent when the job is aborted by the batch system.
- b: Mail is sent when the job begins execution.
- e: Mail is sent when the job terminates.
Exception emails
If mail is not required for a particular job, change the parameter in the PBS job script as shown below.
#PBS -m n
Where:
- n: No normal mail is sent. Mail for job cancels and other events outside of normal job processing is still sent.
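Because "n" is meant to be used on its own while a, b and e can be combined, a small check before submission can catch mixed values such as "nb". The following is only a sketch: check_mail_events is a hypothetical helper, and it covers only the letters described on this page (a given PBS version may accept others).

```shell
#!/bin/bash
# Hypothetical pre-submission check for the value passed to "#PBS -m".
# Accepts "n" on its own, or any combination of the letters a, b and e.
check_mail_events() {
    case "$1" in
        n) return 0 ;;
        "" | *[!abe]*) return 1 ;;   # empty, or contains some other letter
        *) return 0 ;;
    esac
}

for opt in abe ae n nb; do
    if check_mail_events "$opt"; then
        echo "$opt: ok"
    else
        echo "$opt: rejected"
    fi
done
```

In this sketch "abe", "ae" and "n" are accepted, while "nb" is rejected because it mixes "n" with another letter.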
Configure from email address
Change the sender ("From") address of the mails by modifying the server attribute below:
qmgr -c "set server mail_from = <user>@<domain>"
More ways to submit jobs
Basic Job submission
qsub <script-name>
For Example
qsub /home/test5/myscript.sh
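On success, qsub prints the identifier of the new job on standard output, so a wrapper script can capture it for later use with qstat. The sketch below assumes the identifier has the usual form of a sequence number followed by the server name (for example 1234.master); job_number is a hypothetical helper, and the script path is the one from the example above.

```shell
#!/bin/bash
# Hypothetical helper: strip the server suffix from a PBS job identifier,
# e.g. "1234.master" becomes "1234".
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Submit only if qsub is available and the script exists.
script=/home/test5/myscript.sh
if command -v qsub >/dev/null 2>&1 && [ -r "$script" ]; then
    if jobid=$(qsub "$script"); then
        echo "Submitted $jobid (number $(job_number "$jobid"))"
    else
        echo "qsub failed for $script" >&2
    fi
fi
```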
Specify job name with -N option while submitting the job
qsub -N <job-name> <script>
For Example:
qsub -N firstJob /home/test5/myscript.sh
Select resources while submitting jobs
qsub -l ncpus=<cpu-count>:mem=<mem-count-in-gb>gb <script>
For example:
qsub -l ncpus=20:mem=40gb /home/test5/myscript.sh
This example job requests 20 CPUs and 40 GB of memory.
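On PBS Professional and OpenPBS, per-chunk resources are commonly written in the "select" form, for example select=1:ncpus=20:mem=40gb. Whether this cluster prefers that form over the one shown above is an assumption to check locally. As a sketch, a hypothetical select_spec helper can assemble the string so the numbers are kept in one place:

```shell
#!/bin/bash
# Hypothetical helper: build a PBS Professional "select" resource string.
# Arguments: number of chunks, CPUs per chunk, memory per chunk in GB.
select_spec() {
    printf 'select=%d:ncpus=%d:mem=%dgb\n' "$1" "$2" "$3"
}

# Print the resulting command instead of running it, so it can be reviewed first.
spec=$(select_spec 1 20 40)
echo "qsub -l $spec /home/test5/myscript.sh"
```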
Select single node while submitting jobs
qsub -l nodes=<nodename1>:ncpus=20 <script>
For Example:
qsub -l nodes=rockscompute1:ncpus=20 /home/test5/myscript.sh
This job will select one node specified with hostname.
Select multiple nodes while submitting jobs
qsub -l nodes=<nodename1>+<nodename2>:ncpus=20 <script>
For Example:
qsub -l nodes=rockscompute1+rockscompute2:ncpus=20 /home/test5/myscript.sh
Submit multiple jobs with same script
qsub -J 1-20 /home/test5/myscript.sh
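With -J 1-20 the same script runs as 20 subjobs, so the script usually needs a way to tell the subjobs apart. A minimal sketch follows, assuming PBS Professional or OpenPBS, which set PBS_ARRAY_INDEX for each subjob. The input_for_index helper and the input_N.dat naming are hypothetical and only illustrate the idea.

```shell
#!/bin/bash
#PBS -N arrayjob
#PBS -J 1-20

# Hypothetical helper: map a subjob index to an input file name.
input_for_index() {
    printf 'input_%s.dat\n' "$1"
}

# PBS_ARRAY_INDEX is set by PBS for each subjob of a job array;
# default to 1 so the script can also be tried outside PBS.
index=${PBS_ARRAY_INDEX:-1}
echo "Subjob $index would process $(input_for_index "$index")"
```

The range can also be given in the script itself, as in the #PBS -J line above, instead of on the qsub command line.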
Check Job status
- To list all queued and running jobs:
- qstat -a
- To include finished jobs in the listing (requires job history to be enabled):
- qstat -x
- To see job attributes:
- qstat -f <job ID>
- To see job attributes when history is enabled use:
- qstat -xf <job-id>
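The full attribute listing is long, and scripts often need only the job state. The following sketch filters the output of qstat -f for that one field. It assumes the listing contains a line of the form "job_state = R"; job_state is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read "qstat -f" output on stdin and print the job state.
# Assumption: the output contains a line of the form "job_state = R".
job_state() {
    awk '$1 == "job_state" && $2 == "=" { print $3; exit }'
}

# Example: pass a job ID as the first argument; runs only if qstat is available.
if [ -n "${1:-}" ] && command -v qstat >/dev/null 2>&1; then
    state=$(qstat -xf "$1" | job_state)
    echo "Job $1 is in state: ${state:-unknown}"
fi
```

The -x flag is used here so that finished jobs are also reported when history is enabled; without history, plain qstat -f can be substituted.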