This is how I start Slurm:
mkdir -p /tmp/slurmstate/clustername
sudo slurmd
sudo munged -f
/etc/init.d/munge start
sudo slurmdbd
sudo slurmctld -c
Then I list the clusters:
sacctmgr list cluster
   Cluster     ControlHost  ControlPort   RPC     Share GrpJobs       GrpTRES GrpSubmit MaxJobs       MaxTRES MaxSubmit     MaxWall                  QOS   Def QOS
---------- --------------- ------------ ----- --------- ------- ------------- --------- ------- ------------- --------- ----------- -------------------- ---------
   cluster                            0  7936         1                                                                                           normal
Running slurmctld -cD gives me the following error. The ClusterName is reported as a garbled string that I do not recognize. How can I fix it?
> slurmctld -cD
slurmctld: fatal: CLUSTER NAME MISMATCH.
slurmctld has been started with "ClusterName=�����", but read "cluster" from the state files in StateSaveLocation.
Running multiple clusters from a shared StateSaveLocation WILL CAUSE CORRUPTION.
Remove /tmp/slurmstate/clustername to override this safety check if this is intentional (e.g., the ClusterName has changed).
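To narrow down which side of the mismatch is wrong, the two values named in the error can be compared directly. The following is only a rough sketch, not a definitive diagnostic: the function names are made up for illustration, and it assumes the configured name is on a plain `ClusterName=...` line in slurm.conf and that the saved name is the first line of the `clustername` file under StateSaveLocation.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of Slurm).

# Print the ClusterName value from a slurm.conf-style file.
# Parameter names are matched case-insensitively; the last match wins.
conf_cluster_name() {
    grep -i '^[[:space:]]*ClusterName[[:space:]]*=' "$1" \
        | tail -n 1 \
        | sed 's/^[^=]*=[[:space:]]*//; s/[[:space:]#].*$//'
}

# Print the first line of <state dir>/clustername if it is a regular file.
saved_cluster_name() {
    [ -f "$1/clustername" ] && head -n 1 "$1/clustername"
}

# Compare both values and report the result on one line.
# $1 = path to slurm.conf (assumed), $2 = StateSaveLocation directory
check_cluster_name() {
    conf_name=$(conf_cluster_name "$1")
    saved_name=$(saved_cluster_name "$2")
    if [ -z "$conf_name" ]; then
        echo "no ClusterName found in $1"
    elif [ "$conf_name" = "$saved_name" ]; then
        echo "match: $conf_name"
    else
        echo "mismatch: conf=$conf_name saved=$saved_name"
    fi
}
```

It could be called as `check_cluster_name /etc/slurm/slurm.conf /tmp/slurmstate`, where the slurm.conf path is an assumption about the install and /tmp/slurmstate is taken from the error message. If the `conf=` value comes out garbled, the ClusterName line in the config file itself would be the first thing to inspect.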
Note: This problem started occurring after I ran Slurm as the root user and then switched back. I had to reinstall MySQL to fix it.
Thank you for your valuable time and help.