CentOS 7.x Rocks cluster 7.0 Reinstall OS on compute node
Reinstall OS on one specific compute node
To reinstall the OS on a compute node, use:
rocks list host boot
rocks set host boot <hostname> action=install
ssh <hostname> "shutdown -r now"
This assumes that the boot order on the compute node is set so that the node boots from the network.
By default, a /state/partition1 partition is created on compute nodes. The reinstall process does not touch this partition, so any data on it remains intact after the reinstallation.
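The single-node steps above can be wrapped in a small helper. This is only a sketch: the function name reinstall_node and the DRY_RUN variable are illustrative and not part of Rocks. With DRY_RUN=1 the commands are printed instead of executed, which makes it possible to check the hostname before anything is rebooted.

```shell
# Sketch: reinstall one compute node. Run on the frontend as root.
# reinstall_node and DRY_RUN are hypothetical names used for illustration.
reinstall_node() {
    local host="$1"
    if [ -z "$host" ]; then
        echo "usage: reinstall_node <hostname>" >&2
        return 1
    fi

    # With DRY_RUN=1, prefix every command with echo so it is only printed.
    local run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi

    # Mark the node for reinstallation on its next network boot.
    $run rocks set host boot "$host" action=install || return 1
    # Show the boot actions so the change can be checked.
    $run rocks list host boot || return 1
    # Reboot the node; it should then boot from the network and reinstall.
    $run ssh "$host" "shutdown -r now"
}
```

For example, DRY_RUN=1 reinstall_node <hostname> prints the three commands without running them.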
Reinstall OS on all compute nodes
To reinstall the OS on all compute nodes, use the following steps:
- A non-root user must exist, because SGE jobs cannot be run as root. If there is none, create one with useradd.
- The non-root user must have manager privileges, because jobs with a positive priority can be submitted only by managers. If the user is not a manager, add it via:
  - qconf -am <username>
- Edit '/opt/gridengine/examples/jobs/sge-reinstall.sh' and replace the qsub line (it might have been split into two lines) with:
  - runuser -l <non-root-username> -c "qsub -p 1024 -pe mpi $numprocs -q all.q@$TARGETHOST /opt/gridengine/examples/jobs/reboot.qsub"
- Run the script to submit the jobs that set the boot action of each node to install:
  - /opt/gridengine/examples/jobs/sge-reinstall.sh
- Validate that the boot action has been updated properly:
  - rocks list host boot
- Restart the nodes. Replace <master-hostname> with the name of the master so that the master itself is not rebooted:
  - for A in $(rocks list host | cut -f 1 -d ' ' | grep -v HOST | sed 's/.$//' | grep -v <master-hostname>); do ssh $A "shutdown -r now"; done
- If reinstallation is not desired for one or two nodes, change their boot action back to os before they are restarted, then check the result:
  - rocks set host boot <hostname> action=os
  - rocks list host boot
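The validation and restart steps above can also be written as two small filters. This is a sketch under assumptions: the function names list_compute_nodes and nodes_not_installing are illustrative, and the code assumes that 'rocks list host' and 'rocks list host boot' print a header row starting with HOST followed by one row per host, with the host name in the first column ending in a colon and, for 'rocks list host boot', the action in the second column. Unlike grep -v, the master is excluded by exact name, so a node whose name merely contains the master name is not skipped.

```shell
# Sketch: helpers that read Rocks command output on stdin.
# Both function names are hypothetical and not part of Rocks.

# list_compute_nodes MASTER
# Reads `rocks list host` output and prints every host except MASTER.
list_compute_nodes() {
    awk -v master="$1" '
        $1 != "HOST" {
            h = $1
            sub(/:$/, "", h)   # drop the trailing colon after the host name
            if (h != "" && h != master) print h
        }'
}

# nodes_not_installing MASTER
# Reads `rocks list host boot` output and prints every host except MASTER
# whose boot action is not "install".
nodes_not_installing() {
    awk -v master="$1" '
        $1 != "HOST" && NF >= 2 {
            h = $1
            sub(/:$/, "", h)
            if (h != master && $2 != "install") print h
        }'
}

# Example use on the frontend (commented out so nothing reboots by accident):
# rocks list host boot | nodes_not_installing <master-hostname>
# rocks list host | list_compute_nodes <master-hostname> |
#     while read -r node; do ssh -n "$node" "shutdown -r now"; done
```

The ssh -n option keeps ssh from reading the host list that the while loop is consuming on standard input.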
Refer:
- http://central-7-0-x86-64.rocksclusters.org/roll-documentation/base/7.0/sge-cluster-reinstall.html
- https://docs.oracle.com/cd/E19957-01/820-0698/6ncdvjclp/index.html
- https://stackoverflow.com/questions/37733095/unable-to-run-jobs-on-cfncluster
- https://stackoverflow.com/questions/30645020/what-does-sge-mean-by-positive-submission-priority-requires-operator-privileges