Log onto the server ernest.phys.cmu.edu using secure shell (ssh).
If your executable needs non-standard libraries, you may need to link them
statically into your binary (the -static
compiler option). Commonly used shared libraries
are already available on the compute nodes.
Submit jobs to the compute nodes using the sbatch
command on ernest. See the SLURM documentation for information on how
to write job batch files.
Jobs can be monitored using the text-based squeue
command.
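As a rough illustration of the submit-and-monitor steps, the commands below submit a batch file and then list the queue. The file name myjob.sh is hypothetical, and the check for sbatch is only there so the snippet does nothing harmful when tried on a machine without SLURM.

```shell
# Hypothetical batch file name; replace with your own script.
job_script="myjob.sh"

if ! command -v sbatch >/dev/null 2>&1; then
    status="sbatch not found; run these commands on ernest"
elif job_id=$(sbatch --parsable "$job_script"); then
    # --parsable prints only the job ID (plus ";cluster" on multi-cluster setups)
    job_id=${job_id%%;*}
    status="submitted job $job_id"
    squeue -u "$USER"      # all of your queued and running jobs
    squeue -j "$job_id"    # only the job just submitted
else
    status="submission of $job_script failed"
fi
echo "$status"
```

A job listed by squeue can be removed with scancel followed by its job ID.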
For serial jobs, the script file typically performs the following sequence of tasks:
Issue the necessary SLURM directives (#SBATCH lines). These must come first in the
script file, before any executable command.
Stage in: copy all needed files, including the executable, into the directory
/scratch/slurm_<job_id>,
which SLURM automatically creates on
the local disk of the compute node.
Run the program executable (in the foreground, not in the background).
Stage out: after the program finishes, copy the files you want to keep from the local disk
back into your permanent directory.
Provide cleanup instructions in case the job must be killed
using scancel.
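The sequence above can be sketched as a batch file. This is a minimal illustration under stated assumptions, not an official template for this cluster: the names my_program, input.dat and output.dat, the resource requests, and the use of $SLURM_JOB_ID to build the scratch path are all assumptions. It also assumes that scancel delivers SIGTERM to the batch shell before the job is killed. The snippet writes the batch file to a temporary directory and checks its syntax with bash -n, so it can be tried on any machine; on ernest you would save the script in your own directory and pass it to sbatch.

```shell
# Write a sketch batch file; all file and program names are hypothetical.
workdir=$(mktemp -d)
job_script="$workdir/serial_job.sh"

cat > "$job_script" <<'EOF'
#!/bin/bash
# 1. SLURM directives: these precede any executable command.
#SBATCH --job-name=serial_example
#SBATCH --ntasks=1
#SBATCH --time=01:00:00
#SBATCH --output=serial_example_%j.out

# Assumption: the per-job scratch directory is named after $SLURM_JOB_ID.
scratch="/scratch/slurm_${SLURM_JOB_ID}"
# Directory from which sbatch was invoked.
home_dir="$SLURM_SUBMIT_DIR"

# 5. Clean up: assuming the batch shell receives SIGTERM when the job is
#    cancelled, save whatever output exists so far, then exit.
cleanup() {
    cp "$scratch"/output.dat "$home_dir"/ 2>/dev/null
    exit 1
}
trap cleanup TERM

# 2. Stage in: copy the executable and its input to the local disk.
cp "$home_dir"/my_program "$home_dir"/input.dat "$scratch"/
cd "$scratch" || exit 1

# 3. Run the program in the foreground.
./my_program input.dat > output.dat

# 4. Stage out: copy results back to the permanent directory.
cp output.dat "$home_dir"/
EOF

# Parse the script without executing it.
bash -n "$job_script" && echo "syntax OK"
```

The trap is installed before staging in so that a cancellation at any later point still triggers the cleanup function.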
To run multiple serial programs in a single job, start each executable with
srun in the
background, then issue a single wait command so that the script does not exit
before the programs finish (see above sample script).
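A minimal sketch of this pattern follows. The program names prog_a, prog_b and prog_c and the resource requests are hypothetical, and staging is omitted for brevity. Depending on the SLURM version and site configuration, srun may need further options for the steps to run side by side, so treat the srun lines as an assumption to check against the SLURM documentation. As a convenience, the batch file is written to a temporary directory and only checked for syntax.

```shell
# Write the sketch to a temporary file; program names are hypothetical.
workdir=$(mktemp -d)
job_script="$workdir/multi_serial_job.sh"

cat > "$job_script" <<'EOF'
#!/bin/bash
# Request one task slot per serial program.
#SBATCH --job-name=multi_serial
#SBATCH --ntasks=3
#SBATCH --time=01:00:00

# Start each program as its own job step, in the background.
srun --ntasks=1 ./prog_a > prog_a.log 2>&1 &
srun --ntasks=1 ./prog_b > prog_b.log 2>&1 &
srun --ntasks=1 ./prog_c > prog_c.log 2>&1 &

# Block until all background steps have finished; without this the
# script would exit at once and the job would end.
wait
EOF

# Parse the script without executing it.
bash -n "$job_script" && echo "syntax OK"
```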
To run executables with multiple threads, include OpenMP directives
in your code and compile your program with g++ or gcc using the -fopenmp
option. In your SLURM script, it is advisable to set the number of threads explicitly, for example:
export OMP_NUM_THREADS=3
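A batch file for a threaded job might then look like the sketch below. The --cpus-per-task request, the use of $SLURM_CPUS_PER_TASK to set the thread count, and the executable name my_openmp_program are assumptions made for illustration. When run outside SLURM, the script falls back to 3 threads and simply reports that the executable is missing.

```shell
#!/bin/bash
# One task with three CPUs, so each OpenMP thread gets its own core.
#SBATCH --job-name=omp_example
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=3
#SBATCH --time=01:00:00

# Match the thread count to the CPUs SLURM allocated; fall back to 3
# when the variable is not set (for example, outside a SLURM job).
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-3}"
echo "Running with $OMP_NUM_THREADS threads"

# Hypothetical executable, built beforehand with something like:
#   gcc -fopenmp -o my_openmp_program my_openmp_program.c
prog=./my_openmp_program
if [ -x "$prog" ]; then
    "$prog"
else
    echo "$prog not found" >&2
fi
```

Keeping the thread count tied to the CPU request avoids running more threads than the job has cores.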