Slurm User Guide

Writing Slurm Job Scripts

A Slurm job script is a shell script (typically bash) that contains both Slurm directives and the commands you want to run on the cluster. Let's break down the components and syntax of a Slurm job script:

Basic Structure


#!/bin/bash
#SBATCH [options]
#SBATCH [more options]

# Your commands here

Shebang

The first line of your script should be the shebang:
#!/bin/bash
This tells the system to interpret the script using the bash shell.

Slurm Directives

Slurm directives are special comments that start with `#SBATCH`. They tell Slurm how to set up and run your job. Here are some common directives:
#SBATCH --job-name=my_job        # Name of the job 
#SBATCH --output=output_%j.log   # Standard output log file (%j is replaced by the job ID)
#SBATCH --error=error_%j.log     # Standard error log file
#SBATCH --time=01:00:00          # Time limit (HH:MM:SS)
#SBATCH --ntasks=1               # Number of tasks (processes)
#SBATCH --cpus-per-task=1        # Number of CPU cores per task
#SBATCH --mem=1G                 # Memory limit
#SBATCH --partition=general      # Partition (queue) name
#SBATCH --gres=gpu:2             # Request 2 GPUs

Common Slurm Directives

Here's a more comprehensive list of Slurm directives:
  • `--job-name=`: Set a name for the job
  • `--output=`: Specify the file for standard output
  • `--error=`: Specify the file for standard error
  • `--time=`: Set the time limit for the job (HH:MM:SS)
  • `--ntasks=`: Specify the number of tasks to run
  • `--cpus-per-task=`: Set the number of CPU cores per task
  • `--mem=`: Set the total memory required (e.g., 1G for 1 gigabyte)
  • `--partition=`: Specify the partition to run the job on
  • `--array=`: Create a job array (e.g., --array=1-10 for 10 array jobs)
  • `--mail-type=`: Specify email notification events (e.g., BEGIN, END, FAIL)
  • `--mail-user=`: Set the email address for notifications
  • `--nodes=`: Request a specific number of nodes
  • `--gres=`: Request generic consumable resources (e.g., GPUs)
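As a quick illustration of `--array`, the directive pairs naturally with the `%A`/`%a` filename patterns and the `$SLURM_ARRAY_TASK_ID` variable. A minimal sketch (the `input_N.txt` file names are hypothetical, and the default of 1 simply lets the script run outside Slurm for testing):

```shell
#!/bin/bash
#SBATCH --job-name=array_demo
#SBATCH --output=array_%A_%a.log   # %A = parent job ID, %a = array index
#SBATCH --array=1-10
#SBATCH --time=00:10:00

# Each array task picks its input file from its array index.
# Defaulting to 1 lets the script also run outside Slurm.
TASK_ID=${SLURM_ARRAY_TASK_ID:-1}
echo "Processing input_${TASK_ID}.txt"
```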
Environment Variables

Slurm sets several environment variables that you can use in your script:
  • `$SLURM_JOB_ID`: The ID of the job
  • `$SLURM_ARRAY_TASK_ID`: The array index for job arrays
  • `$SLURM_CPUS_PER_TASK`: Number of CPUs allocated per task
  • `$SLURM_NTASKS`: Total number of tasks in a job
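For example, these variables can give each job (and each array task) its own output location. A minimal sketch, where the `results/` directory layout is an assumption and the defaults of 0 let the script run outside Slurm for testing:

```shell
#!/bin/bash
# Build a per-job, per-task results directory from Slurm's variables.
# The defaults of 0 make the script runnable outside Slurm too.
JOB_ID=${SLURM_JOB_ID:-0}
TASK_ID=${SLURM_ARRAY_TASK_ID:-0}
RESULT_DIR="results/job_${JOB_ID}/task_${TASK_ID}"
mkdir -p "$RESULT_DIR"
echo "Writing results to $RESULT_DIR"
```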
Example Job Script

Here's an example of a more complex Slurm job script:

#!/bin/bash
#SBATCH --job-name=complex_job
#SBATCH --output=output_%A_%a.log
#SBATCH --error=error_%A_%a.log
#SBATCH --array=1-5
#SBATCH --time=02:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --partition=general
#SBATCH --mail-type=BEGIN,END,FAIL
#SBATCH --mail-user=your.email@example.com

# Load any necessary modules
module load python/3.8

# Run the main command
python my_script.py --input-file input.txt --output-file output.txt

# Optional: Run some post-processing
if [ $? -eq 0 ]; then
    echo "Job completed successfully"
    python post_process.py output.txt
else
    echo "Job failed"
fi
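Once the script is saved (say, as `job.sh` — the name is arbitrary), submit it with `sbatch` and monitor it with the standard Slurm commands:

```shell
sbatch job.sh        # submit the script; prints "Submitted batch job <jobid>"
squeue -u $USER      # show your queued and running jobs
scancel <jobid>      # cancel a job by its ID if needed
```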

Understanding --ntasks in Slurm

When you use the `--ntasks` option without other specifications, it's important to understand how Slurm interprets and applies this setting.

When you specify `--ntasks=4` on its own:

#SBATCH --ntasks=4

# This will run your command with 4 tasks
srun ./my_program

In this scenario, Slurm allocates resources for 4 tasks, and `srun` launches 4 copies of `./my_program` in parallel.

Important Considerations

1. CPU Allocation: Without specifying `--cpus-per-task`, each task gets 1 CPU by default.
2. Memory Allocation: The default memory allocation per task depends on the cluster's configuration. It's often a good practice to specify memory requirements explicitly.
3. Node Distribution: Tasks may be distributed across nodes unless you specify `--nodes` or use the `--ntasks-per-node` option.
4. Parallel Execution: This setting is particularly useful for MPI jobs where you want to run multiple parallel processes.

Examples with Additional Specifications

1. Specifying CPUs per Task

#SBATCH --ntasks=4
#SBATCH --cpus-per-task=2

srun ./my_multi_threaded_program

This allocates 4 tasks, each with 2 CPUs, totaling 8 CPUs for the job.
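When each task is itself multi-threaded (for example with OpenMP), a common pattern is to size the thread pool from `$SLURM_CPUS_PER_TASK` rather than hard-coding it. A minimal sketch of lines one could place before the `srun` command (the default of 1 simply keeps the logic safe outside a Slurm allocation):

```shell
# Match the OpenMP thread count to the CPUs Slurm allocated per task;
# fall back to 1 when the variable is unset (e.g. outside Slurm).
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
echo "Each task will run with $OMP_NUM_THREADS OpenMP thread(s)"
```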

2. Constraining to a Single Node

#SBATCH --ntasks=4
#SBATCH --nodes=1

srun ./my_program

This ensures all 4 tasks run on the same node.

3. Specifying Tasks per Node

#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=2

srun ./my_program

This distributes the 4 tasks across 2 nodes, with 2 tasks per node.

Best Practices