Slurm job arrays are a mechanism for submitting and managing collections of similar jobs quickly and easily. Instead of submitting hundreds or thousands of individual jobs, you can use a job array to submit a single job script that will spawn multiple job tasks.
To create a job array, you use the --array option in your Slurm batch script:
#SBATCH --array=1-100
This will create 100 job array tasks, numbered from 1 to 100.
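For example, if this directive is placed in a script called array_job.sh (a hypothetical file name), a single sbatch command submits all 100 tasks at once:
sbatch array_job.sh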
You can also specify a step size:
#SBATCH --array=1-100:10
This creates tasks 1, 11, 21, ..., 91. You can list specific task IDs:
#SBATCH --array=1,5,7,9
Or combine ranges and individual IDs:
#SBATCH --array=1-5,10,20-25
Within your job script, you can use the $SLURM_ARRAY_TASK_ID environment variable to distinguish between tasks:
#!/bin/bash
#SBATCH --array=1-100
echo "Processing file_${SLURM_ARRAY_TASK_ID}.txt"
./my_program input_${SLURM_ARRAY_TASK_ID}.dat output_${SLURM_ARRAY_TASK_ID}.result
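Slurm also sets several related environment variables in each task; a minimal sketch that simply reports them:
#!/bin/bash
#SBATCH --array=1-10
# SLURM_ARRAY_JOB_ID is shared by every task in the array
echo "Array job ${SLURM_ARRAY_JOB_ID}, task ${SLURM_ARRAY_TASK_ID} of ${SLURM_ARRAY_TASK_COUNT}"
echo "Task indices range from ${SLURM_ARRAY_TASK_MIN} to ${SLURM_ARRAY_TASK_MAX}"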
You can limit how many tasks run at the same time by appending a percent sign and a limit to the range:
#SBATCH --array=1-1000%20
This creates 1000 tasks but allows at most 20 of them to run concurrently.
Use %A for the array job ID and %a for the task ID in output file names:
#SBATCH --output=output_%A_%a.log
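For example, task 7 of array job 12345 (a hypothetical job ID) would write to output_12345_7.log.
Here is a complete example script that puts these pieces together: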
#!/bin/bash
#SBATCH --job-name=data_process
#SBATCH --output=output_%A_%a.log
#SBATCH --error=error_%A_%a.log
#SBATCH --array=1-100
#SBATCH --time=01:00:00
#SBATCH --mem=4G
# List of datasets
datasets=(dataset{1..100}.csv)  # brace expansion: dataset1.csv ... dataset100.csv
# Get the dataset for this task
dataset=${datasets[$SLURM_ARRAY_TASK_ID - 1]}
# Run the processing script
python process_data.py "$dataset"
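If your input files do not follow a simple numeric pattern, a common alternative is to read the N-th line of a plain-text list of file names; a minimal sketch, assuming a hypothetical filelist.txt with one input path per line:
#!/bin/bash
#SBATCH --array=1-100
# filelist.txt (hypothetical) lists one input path per line; task N reads line N
input=$(sed -n "${SLURM_ARRAY_TASK_ID}p" filelist.txt)
python process_data.py "$input"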
You can check the status of your array tasks with squeue:
squeue -a
To cancel the entire job array:
scancel [array_job_id]
To cancel a single task:
scancel [array_job_id]_[task_id]
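For example, with a hypothetical array job ID of 12345:
scancel 12345              # cancel the whole array
scancel 12345_7            # cancel only task 7
scancel 12345_[50-100]     # cancel a range of tasks
On most Slurm installations you can also adjust the concurrency limit (the % value) of a running array with scontrol, e.g. scontrol update JobId=12345 ArrayTaskThrottle=10.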
Job arrays are ideal for embarrassingly parallel workloads: processing many input files, running the same analysis over multiple datasets, or sweeping over a range of parameter values.
By using job arrays effectively, you can significantly streamline your workflow, especially when dealing with large numbers of similar computational tasks.