How to create a Slurm script
Slurm scripts are used to submit and manage jobs in a high-performance computing (HPC) environment that uses the Slurm workload manager. Slurm is a popular open-source resource management and job scheduling application used on many HPC clusters and supercomputers.
A basic example of a Slurm script
#!/bin/bash
#SBATCH --job-name=my_job_name # Job name
#SBATCH --output=output.txt # Standard output file
#SBATCH --error=error.txt # Standard error file
#SBATCH --partition=partition_name # Partition or queue name
#SBATCH --nodes=1 # Number of nodes
#SBATCH --ntasks-per-node=1 # Number of tasks per node
#SBATCH --cpus-per-task=1 # Number of CPU cores per task
#SBATCH --time=1:00:00 # Maximum runtime (HH:MM:SS)
#SBATCH --mail-type=END # Send email at job completion
#SBATCH --mail-user=your@email.com # Email address for notifications
# Load necessary modules (if needed)
# module load module_name
# Your job commands go here
# For example:
# python my_script.py
# Optionally, you can include cleanup commands here (e.g., after the job finishes)
# For example:
# rm some_temp_file.txt
Here’s an explanation of the key Slurm directives in the script:
#SBATCH: These lines are special comments in a Slurm script that specify various options for the job.
--job-name: A name for your job.
--output and --error: The paths to the standard output and error log files.
--partition: The name of the Slurm partition or queue where the job should run.
--nodes: The number of nodes needed for the job.
--ntasks-per-node: The number of tasks (processes) to run per node.
--cpus-per-task: The number of CPU cores allocated to each task.
--time: The maximum runtime for the job.
--mail-type and --mail-user: Email notification settings.
--ntasks: The total number of tasks for the job. On Rockfish there is also no need to set --mem; memory is assigned automatically in proportion to the number of cores, at 4 GB per core.
After the #SBATCH directives, you can load any necessary modules and execute your job's commands. In the example, it's assumed that you will run a Python script named my_script.py; you can replace this with your specific job commands.
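For instance, with the placeholders above filled in, the command section of the script might look like the following minimal sketch (the module name python and the script name my_script.py are examples; run module avail on your cluster to see which modules are actually installed):
module load python        # load a Python module (example name; check "module avail")
python my_script.py       # run your script; replace with your actual commands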
To submit a Slurm job, save the script to a file (e.g., my_job.slurm) and then use the sbatch command to submit it:
[userid@local ~]$ sbatch my_job.slurm
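After submission, sbatch prints the numeric job ID, and you can follow the job with the standard Slurm utilities squeue and sacct (replace <jobid> with the ID that sbatch printed):
[userid@local ~]$ squeue -u $USER
[userid@local ~]$ sacct -j <jobid>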
How to run a Matlab job array
The following script is a Slurm job script, written in Bash, for submitting a job array to a Slurm cluster. Each array task is identified by the environment variable $SLURM_ARRAY_TASK_ID, which ranges from 1 to 20. Here's a breakdown of the script:
#!/bin/bash -l
#SBATCH --job-name=job-array2 # Job name
#SBATCH --time=1:1:0 # Maximum runtime (HH:MM:SS)
#SBATCH --array=1-20 # Defines a job array from task ID 1 to 20
#SBATCH --ntasks=1 # Number of tasks (in this case, one task per array element)
#SBATCH -p shared # Partition or queue name
#SBATCH --reservation=Training # Reservation name
# End of Slurm directives; the job commands follow below
# Run your job
echo "Start Job $SLURM_ARRAY_TASK_ID on $HOSTNAME" # Display job start information
sleep 10 # Sleep for 10 seconds
export alpha=1 # Set an environment variable alpha to 1
export beta=2 # Set an environment variable beta to 2
module load matlab # Load the Matlab module
matlab -nodisplay -singleCompThread -r "myRand($SLURM_ARRAY_TASK_ID, $alpha, $beta), pause(20), exit"
# Run a Matlab script with parameters: $SLURM_ARRAY_TASK_ID, $alpha, and $beta, and then exit
- The script specifies Slurm directives at the beginning of the file. These directives provide instructions to the Slurm scheduler for managing the job array, such as the job name, maximum runtime, array definition, number of tasks, partition, and reservation.
- After the Slurm directives, the script contains actual job commands. It starts by echoing a message indicating the start of the job with the current task ID and the hostname where the job is running.
- It then sleeps for 10 seconds using the sleep command.
- Two environment variables, alpha and beta, are exported with values 1 and 2, respectively.
- The Matlab module is loaded with the module load command.
- Finally, Matlab is invoked with the -nodisplay, -singleCompThread, and -r flags. The myRand Matlab function is called with the current $SLURM_ARRAY_TASK_ID, $alpha, and $beta; the command also includes pause(20) to pause execution for 20 seconds before exiting.
To submit this job array, save the script to a file (e.g., job_array_script.sh) and then submit it using the sbatch command:
[userid@local ~]$ sbatch job_array_script.sh
The scheduler will take care of running the job array with the specified parameters.
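A common pattern with job arrays is to use $SLURM_ARRAY_TASK_ID to select a different input for each task. Here is a minimal sketch of that idea, assuming a plain-text file inputs.txt (hypothetical) with one input file name per line:
# Pick line N of inputs.txt for array task N (inputs.txt is a hypothetical file list).
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID}p" inputs.txt)
echo "Task $SLURM_ARRAY_TASK_ID processing $INPUT"
# your_program "$INPUT"    # replace with the actual command for one input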
How to run a job array with a step size
When using #SBATCH --array=1-100:10, it defines a job array where the task IDs range from 1 to 100 with a step size of 10, so the array contains the task IDs 1, 11, 21, …, 91. This means you will have a total of 10 job instances, each running with a different task ID. (Note that the similar-looking --array=1-100%10 means something else: it creates 100 tasks and limits the number running simultaneously to 10.) Here's an example script using this array configuration:
#!/bin/bash -l
#SBATCH --job-name=job-array-example
#SBATCH --time=1:0:0
#SBATCH --array=1-100:10 # Job array from task ID 1 to 100, with a step size of 10
#SBATCH --ntasks-per-node=1
#SBATCH --partition=shared
#SBATCH --mail-type=end
#SBATCH --mail-user=userid@jhu.edu
#SBATCH --reservation=Training
ml intel/2022.2
# Your executable or script goes here
# Example: Running a Python script
# python my_script.py $SLURM_ARRAY_TASK_ID
# In this example, each job instance will execute the script with a different SLURM_ARRAY_TASK_ID.
- #SBATCH --array=1-100:10 defines a job array with task IDs ranging from 1 to 100 in steps of 10, so you'll have 10 job instances with SLURM_ARRAY_TASK_ID values of 1, 11, 21, …, 91.
- The ml intel/2022.2 line loads the Intel compiler module, which can be used for compilation if your job requires it.
- The actual job commands, such as running an executable or script, should be placed below the comments. In this example, a placeholder comment shows how you might run a Python script with the SLURM_ARRAY_TASK_ID; replace it with the actual commands or scripts you want to execute for your job.

To submit this job array to the Slurm scheduler, save it to a file (e.g., job_array_example.sh) and then submit it using the sbatch command:
[userid@local ~]$ sbatch job_array_example.sh
The scheduler will create 10 job instances, one for each task ID in the specified array configuration.
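With a step size of 10, each task can then handle the block of 10 items that starts at its own task ID. The following is a minimal sketch of that pattern (the echo line and the commented python call are placeholders for your real work):
# Task N handles items N through N+9 (assumes 100 items in total).
START=$SLURM_ARRAY_TASK_ID
END=$((START + 9))
for ITEM in $(seq $START $END); do
    echo "Task $SLURM_ARRAY_TASK_ID handling item $ITEM"
    # python my_script.py $ITEM
done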
How to run an MPI (Message Passing Interface) program
The following is a Slurm job script for running an MPI (Message Passing Interface) program on the Rockfish high-performance computing (HPC) cluster.
Here’s a breakdown of the script:
#!/bin/bash -l
#SBATCH --job-name=mpi-job # Job name
#SBATCH --time=1:0:0 # Maximum runtime (1 hour)
#SBATCH --nodes=1 # Number of nodes requested
#SBATCH --ntasks-per-node=4 # Number of MPI tasks per node
#SBATCH --partition=shared # Partition or queue name
#SBATCH --mail-type=end # Email notification type (end of job)
#SBATCH --mail-user=userid@jhu.edu # Email address for notifications
#SBATCH --reservation=Training # Reservation name
ml intel/2022.2 # Load the Intel compiler module with version 2022.2
# compile
mpiicc -o hello-mpi.x hello-mpi.c # Compile the MPI program from source code
mpirun -np 4 ./hello-mpi.x > my-mpi.log # Run the MPI program with 4 MPI processes, redirecting output to a log file
Here’s what the script does:
- It specifies various Slurm directives at the beginning of the script. These directives provide instructions to the Slurm scheduler for managing the MPI job:
- --job-name Specifies a name for the job.
- --time Sets the maximum runtime for the job to 1 hour.
- --nodes Requests 1 compute node for the job.
- --ntasks-per-node Specifies that there will be 4 MPI tasks per node.
- --partition Specifies the Slurm partition or queue where the job should run (in this case, shared).
- --mail-type Requests email notifications at the end of the job.
- --mail-user Specifies the email address where notifications will be sent.
- --reservation Associates the job with a reservation named "Training."
- The script loads the Intel compiler module with version 2022.2 using the ml command. This is done to ensure that the correct compiler environment is set up for compilation.
- It compiles the MPI program hello-mpi.c using the mpiicc compiler wrapper and generates an executable named hello-mpi.x.
- Finally, it runs the MPI program using the mpirun command with 4 MPI processes. The standard output of the program is redirected to a log file named my-mpi.log.
To submit this MPI job to the Slurm scheduler, save it to a file (e.g., mpi_job_script.sh) and then submit it using the sbatch command:
[userid@local ~]$ sbatch mpi_job_script.sh
The scheduler will allocate resources and run the MPI program with the specified parameters.
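If you later scale the same job up, keep the process count given to mpirun consistent with the total number of tasks requested from Slurm. A minimal sketch for two nodes, reusing the executable from the example above:
#SBATCH --nodes=2               # request 2 compute nodes
#SBATCH --ntasks-per-node=4     # 4 MPI tasks on each node
mpirun -np 8 ./hello-mpi.x > my-mpi.log   # 2 nodes x 4 tasks = 8 MPI processes
Alternatively, if Slurm sets $SLURM_NTASKS in your job environment, mpirun -np $SLURM_NTASKS avoids hard-coding the process count.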
How to run a mixed MPI / OpenMP program
The following is a Slurm job script for running a mixed MPI/OpenMP program on a high-performance computing (HPC) cluster. This script combines message-passing parallelism (MPI) with shared-memory parallelism (OpenMP). Here's a breakdown of the script:
#!/bin/bash -l
#SBATCH --job-name=omp-job # Job name
#SBATCH --time=1:0:0 # Maximum runtime (1 hour)
#SBATCH --nodes=2 # Number of nodes requested
#SBATCH --ntasks-per-node=1 # Number of MPI tasks per node
#SBATCH --cpus-per-task=4 # Number of CPU cores per task
#SBATCH --partition=shared # Partition or queue name
#SBATCH --mail-type=end # Email notification type (end of job)
#SBATCH --mail-user=$USER@jhu.edu # Email address for notifications (using the user's environment variable)
#SBATCH --reservation=Training # Reservation name
ml intel/2022.2 # Load the Intel compiler module with version 2022.2
# Compile the code using Intel with mixed MPI/OpenMP
echo "mpiicc -qopenmp -o hello-mix.x hello-world-mix.c"
# How to compile:
# mpiicc -qopenmp -o hello-mix.x hello-world-mix.c
# Run the code
mpirun -np 2 ./hello-mix.x # Run the mixed MPI/OpenMP program with 2 MPI processes
Here’s what the script does:
- The script specifies various Slurm directives at the beginning of the script. These directives provide instructions to the Slurm scheduler for managing the mixed MPI/OpenMP job:
- --job-name Specifies a name for the job.
- --time Sets the maximum runtime for the job to 1 hour.
- --nodes Requests 2 compute nodes for the job.
- --ntasks-per-node Specifies that there will be 1 MPI task per node.
- --cpus-per-task Specifies that each MPI task will use 4 CPU cores.
- --partition Specifies the Slurm partition or queue where the job should run (in this case, shared).
- --mail-type Requests email notifications at the end of the job.
- --mail-user Uses the $USER environment variable to specify the email address where notifications will be sent, assuming the user's email is in the format username@jhu.edu. Note, however, that Slurm does not expand environment variables inside #SBATCH directives, so in practice you may need to write the address out explicitly (e.g., userid@jhu.edu).
- --reservation Associates the job with a reservation named Training.
- The script loads the Intel compiler module with version 2022.2 using the ml command. This is done to ensure that the correct compiler environment is set up for compilation.
- It echoes the compilation command that would be used (mpiicc -qopenmp -o hello-mix.x hello-world-mix.c). The actual compile line is commented out, so the script does not compile the code itself; you can uncomment it or run it outside the script.
- Finally, it runs the mixed MPI/OpenMP program using the mpirun command with 2 MPI processes. The program is expected to use OpenMP for shared-memory parallelism within each task, as shown in the sketch after this list.
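One detail the example leaves implicit is how many OpenMP threads each MPI task should start. A minimal sketch, assuming you want each task to use exactly the cores requested with --cpus-per-task (whether you must set this explicitly depends on your MPI and OpenMP runtime defaults):
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # 4 threads per task in this example
mpirun -np 2 ./hello-mix.x                    # 2 MPI tasks, each using 4 OpenMP threads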
To submit this mixed MPI/OpenMP job to the Slurm scheduler, save it to a file (e.g., mpi_omp_job_script.sh) and then submit it using the sbatch command:
[userid@local ~]$ sbatch mpi_omp_job_script.sh
The scheduler will allocate resources and run the mixed MPI/OpenMP program with the specified parameters.