CentOS 7.x Rocks cluster 7.0 Reinstall OS on compute node
Reinstall OS on one specific compute node
To reinstall the OS on a compute node, use:
rocks list host boot
rocks set host boot <hostname> action=install
ssh <hostname> "shutdown -r now"
This assumes that the boot order on the compute node is set so that it boots from the network.
By default, a /state/partition1 partition is created on compute nodes. This partition is not affected by the reinstall process; any data on it remains unchanged after the reinstallation.
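The single-node procedure above can be wrapped in a small shell function. This is only a minimal sketch: the names reinstall_node and run, and the DRY_RUN switch, are hypothetical and introduced here for illustration. Only the rocks and ssh commands come from this page. With DRY_RUN=1 (the default in this sketch) the commands are printed instead of executed, which allows a check before anything is rebooted.

```shell
#!/bin/sh
# Sketch: reinstall a single compute node.
# DRY_RUN is a hypothetical switch for this example; when it is 1
# (the default here) commands are printed instead of being executed.
DRY_RUN="${DRY_RUN:-1}"

# run: execute a command, or only print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# reinstall_node <hostname>: set the boot action to install, then reboot.
reinstall_node() {
    node="$1"
    if [ -z "$node" ]; then
        echo "usage: reinstall_node <hostname>" >&2
        return 1
    fi
    # Mark the node for installation on its next network boot.
    run rocks set host boot "$node" action=install
    # List the boot actions so the change can be checked.
    run rocks list host boot
    # Reboot the node; it is expected to boot from the network.
    run ssh "$node" "shutdown -r now"
}

# Example (prints the commands only while DRY_RUN=1):
#   reinstall_node compute-0-0
```

Set DRY_RUN=0 on the frontend only after the printed commands look right. The hostname compute-0-0 in the example comment is a placeholder.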
Reinstall OS on all compute nodes
To reinstall the OS on all compute nodes, use the following steps:
- You must have a non-root user. If none exists, create one with useradd.
- Note that SGE jobs cannot be run as the root user.
- The non-root user must have manager privileges. If it does not, grant them via:
- qconf -am <username>
- This is required because jobs with positive priority can be submitted only by managers.
- Edit '/opt/gridengine/examples/jobs/sge-reinstall.sh' and replace the qsub line (which may be split across two lines in the file) with:
- runuser -l <non-root-username> -c "qsub -p 1024 -pe mpi $numprocs -q all.q@$TARGETHOST /opt/gridengine/examples/jobs/reboot.qsub"
- Now run the script to submit a job that sets each node's boot action to install:
- /opt/gridengine/examples/jobs/sge-reinstall.sh
- Validate that the host boot action has been updated properly:
- rocks list host boot
- Restart the nodes using:
- for A in $(rocks list host | cut -f 1 -d ' ' | grep -v HOST | sed 's/.$//' | grep -v <master-hostname>); do ssh $A "shutdown -r now"; done
- Make sure to replace <master-hostname> with the actual hostname of the master, to avoid rebooting the master itself.
- If reinstallation is not desired for one or two nodes, their boot action can always be changed back using:
- rocks set host boot <hostname> action=os
- rocks list host boot
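The validation and restart steps above can be sketched as two small filters. This is a hedged sketch, not part of Rocks. The function names compute_hosts and not_set_to_install are hypothetical. Both assume the output layout implied by the restart pipeline on this page: a header row starting with HOST, and a hostname with a trailing colon in the first column. The second filter additionally assumes that `rocks list host boot` prints the action in the second column. Unlike `grep -v <master-hostname>`, the master is excluded by exact match, so a compute node whose name merely contains the master's name is not skipped.

```shell
#!/bin/sh
# compute_hosts <master-hostname>
# Reads `rocks list host` output on stdin and prints every hostname
# except the master, one per line.
compute_hosts() {
    awk -v master="$1" '
        $1 == "HOST" { next }            # skip the header row
        NF == 0      { next }            # skip blank lines
        {
            host = $1
            sub(/:$/, "", host)          # strip the trailing colon
            if (host != master) print host
        }'
}

# not_set_to_install <master-hostname>
# Reads `rocks list host boot` output on stdin and prints the hosts
# (other than the master) whose action column is not "install".
# Assumes the action is the second column.
not_set_to_install() {
    awk -v master="$1" '
        $1 == "HOST" { next }
        NF < 2       { next }
        {
            host = $1
            sub(/:$/, "", host)
            if (host != master && $2 != "install") print host
        }'
}

# Example use on the frontend (left commented out on purpose):
#   rocks list host boot | not_set_to_install <master-hostname>
#   for node in $(rocks list host | compute_hosts <master-hostname>); do
#       ssh "$node" "shutdown -r now"
#   done
```

An empty result from not_set_to_install would indicate that every compute node is marked for installation. Any hostnames it prints are nodes that would simply reboot into their existing OS.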
- http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/sge-cluster-reinstall.html
- https://docs.oracle.com/cd/E19957-01/820-0698/6ncdvjclp/index.html
- https://stackoverflow.com/questions/37733095/unable-to-run-jobs-on-cfncluster
- https://stackoverflow.com/questions/30645020/what-does-sge-mean-by-positive-submission-priority-requires-operator-privileges