A job script contains the definition of a computing job. It is good practice to include everything needed for the job to run in the job script itself, including loading modules, initializing the environment, and setting variables. Details of Slurm job scripts are documented in the following locations:

https://slurm.schedmd.com/quickstart.html

https://slurm.schedmd.com/sbatch.html

CPU job script

This is a simple script for submitting a basic CPU job:

Example CPU job
#!/bin/bash
#SBATCH --job-name=job_name
#SBATCH --time=01:00:00
#SBATCH --account=grantname-cpu
#SBATCH --partition=plgrid

module load python 
srun python myapp.py

The job will be named "job_name", declares a run time of 1 hour, runs under the "grantname-cpu" account, and is submitted to the "plgrid" partition (the default for CPU jobs). The job runs in the directory where the sbatch command was issued, loads a Python module, and executes a Python application. The job's output will be written to a file named slurm-<JOBID>.out in the current directory. Placing srun before the python invocation is good practice, as in more complex cases srun allows for more precise control of the resources assigned to the application.
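For reference, a typical submit-and-check workflow could look like the sketch below, assuming the script above was saved as jobscript.sh (the file name is only an illustration):

Example job submission
# Submit the job script; Slurm prints "Submitted batch job <JOBID>"
sbatch jobscript.sh
# Check the state of your queued and running jobs
squeue -u $USER
# After the job finishes, inspect the output written by Slurm
cat slurm-<JOBID>.out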

A more advanced job script could look like the following example:

Example advanced CPU job
#!/bin/bash
#SBATCH --job-name=job_name
#SBATCH --time=01:00:00
#SBATCH --account=grantname-cpu
#SBATCH --partition=plgrid
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=48
#SBATCH --cpus-per-task=1
#SBATCH --mem=180G
#SBATCH --output="joblog-%j.txt"
#SBATCH --error="joberr-%j.txt"

module load openmpi 
mpiexec myapp.bin

Please note the additional parameters and the MPI-enabled application. This job uses 2 nodes with 48 tasks on each node, and each task uses 1 CPU. 180GB of memory is allocated for the job on each node. The job's stdout and stderr are redirected to the joblog-<JOBID>.txt and joberr-<JOBID>.txt files. In the example, the myapp.bin application uses MPI. The mpiexec command is responsible for spawning the additional application ranks (processes). For most MPI applications, mpiexec's parameters are configured by the system, so there is no need to specify the -np argument explicitly. Note that using mpiexec allows us to omit the srun command, as mpiexec uses it internally.
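As a side note, on clusters where the MPI library is built with Slurm PMI/PMIx support (an assumption here), the same application can usually also be launched with srun directly instead of mpiexec; a minimal sketch:

Example alternative MPI launch
module load openmpi
# srun spawns one rank per task requested with --nodes and --ntasks-per-node
srun ./myapp.bin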

GPU job script

A simple script for submitting GPU jobs:

Example GPU job
#!/bin/bash
#SBATCH --job-name=job_name
#SBATCH --time=01:00:00
#SBATCH --account=grantname-gpu
#SBATCH --partition=plgrid-gpu-v100
#SBATCH --cpus-per-task=4
#SBATCH --mem=40G
#SBATCH --gres=gpu

module load cuda 
srun ./myapp

Please note the specific account name and partition for GPU jobs. The job allocates one GPU with the --gres parameter. The whole GPU is allocated to the job; the --mem parameter refers to the system (host) memory used by the job, not GPU memory. More information on how to use GPUs can be found here: https://slurm.schedmd.com/gres.html
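Inside a GPU job, Slurm exposes the assigned device(s) to the application through the CUDA_VISIBLE_DEVICES variable; a quick sanity-check line such as the sketch below (assuming the nvidia-smi tool is available on the GPU nodes) can be added to the job script:

Example GPU sanity check
# List the GPU(s) visible to this job step
srun bash -c 'echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"; nvidia-smi -L'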
