What is Slurm?
Slurm (Simple Linux Utility for Resource Management) is a job scheduler that manages computational resources in a cluster. It allocates resources to jobs, dispatches them, monitors their execution, and cleans up after job completion.
Why use Slurm?
- Resource allocation: Once resources are allocated to your job, they're exclusively yours for the duration of execution, regardless of system load.
- Detached execution: No need to keep an open terminal session.
- Efficient resource use: Jobs start as soon as requested resources are available, even outside working hours.
- Fair scheduling: Jobs are prioritized based on requested resources, the user's system share, and queue time.
 
Slurm Concepts
Before diving into Slurm usage, it's important to understand some key concepts.
Which Partition can I use?
There are three partitions on the cluster.
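The partition names are not repeated here; you can list the partitions available to you directly on the cluster. A minimal sketch, assuming you are logged in to a login node (the format string is just one possible selection of columns):

sinfo
sinfo --format="%P %a %l %D %N"   # partition, availability, time limit, node count, node names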
Basic Usage
Loading Software as Modules
To use software that is not part of the base system, you can load it as a module:
module avail: list all available modules
module load R/4.4.0: load R version 4.4.0
module list: list loaded modules
module unload module_name: unload the loaded module module_name
module purge: unload all loaded modules
Simple Job Submission
Prefix your command with srun:
srun myprogram
Run an interactive bash session:
srun --pty bash
Note: This uses default settings, which may not always be suitable.
Specifying a Partition
Use the -p option with srun:
srun -p partition_name myprogram
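As a sketch combining the options shown so far (partition_name is a placeholder for one of the cluster's partitions, and the resource values are only examples), an interactive session on a specific partition with explicit resources could look like this:

srun -p partition_name --cpus-per-task=2 --mem=4G --time=01:00:00 --pty bash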
Running Detached Jobs (Batch Mode)
- Create a shell script (batch script) containing the following (a minimal sketch is shown below):
  - Slurm directives (lines starting with #SBATCH)
  - Any necessary preparatory steps (e.g., loading modules)
  - Your srun command
- Submit the script using sbatch:
sbatch myscript.sh
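For illustration, a minimal batch script along these lines could look as follows (the job name, partition name, and myprogram are placeholders; the R module is just the example used above):

#!/bin/bash
#SBATCH --job-name=myjob
#SBATCH --partition=partition_name
#SBATCH --cpus-per-task=1

# preparatory step: load the required software
module load R/4.4.0

# run the actual program
srun myprogram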
Using Conda
You can use conda inside your batch script, for example:
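The conda-related part of such a script could look like the snippet below; anaconda3 and myenv are the module and environment names used in the full example at the end of this page, yours may differ:

# Load Conda
module load anaconda3

# Activate your environment
conda activate myenv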
Monitoring Jobs
Checking Job Status
Use squeue to see which jobs are running or queued:
squeue
To see only your jobs:
squeue -u yourusername
Viewing Job Details
Use scontrol:
scontrol show job <jobid>
            Checking Job Output
Slurm captures console output to a file named slurm-<jobid>.out in the submission directory. You can examine this file while the job is running or after it finishes.
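For example, to follow the output of a running job as it is written (12345 stands for the actual job ID):

tail -f slurm-12345.out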
Resource Requests
CPUs
To request multiple CPU threads:
#SBATCH --cpus-per-task=X
srun --cpus-per-task=X myprogram
Note: This argument must be given to both sbatch (via #SBATCH) and srun: the first one sets the job allocation, the second one applies it to the task execution.
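A short sketch of how the two fit together in a batch script (4 threads is just an example value):

#!/bin/bash
#SBATCH --cpus-per-task=4            # allocation: reserve 4 CPU threads for the job

srun --cpus-per-task=4 myprogram     # task: run myprogram on the 4 reserved threads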
Other Resources
Specify them in your batch script using #SBATCH directives:
#SBATCH --mem=8G
#SBATCH --time=02:00:00
#SBATCH --gres=gpu:1
These directives set the memory limit, the time limit, and the number of requested GPUs, respectively.
Example batch script
#!/bin/bash
#SBATCH --job-name=conda_job
#SBATCH --output=output_%j.log
#SBATCH --error=error_%j.log
#SBATCH --time=01:00:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=4G

# Load Conda
module load anaconda3

# Activate your environment
conda activate myenv

# Run your Python script
python my_script.py
This is an example script using conda. Launch it with:
sbatch conda_job.sh
Useful Slurm Commands
squeue: show job queue information
sinfo: display node and partition information
scancel <jobid>: cancel a job
sacct: view accounting data for jobs
scontrol: show detailed information about jobs, nodes, and partitions
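A few example invocations (12345 is a placeholder job ID, and the sacct format fields are just one possible selection):

scancel 12345                                        # cancel job 12345
sacct -j 12345 --format=JobID,State,Elapsed,MaxRSS   # accounting summary for job 12345
scontrol show partition partition_name               # details for one partition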