SLURM 19.04
Basic concepts
Jobs
Partitions
Tasks
Basic commands
Checking the queue
Submitting a program
Cancelling a job
Working with SLURM
Simple usage for soroban
0. Note.
'-p intel' (equivalent long option: '--partition=intel') is required on soroban.
1. Save the following as a text file (e.g. my_first_slurm.sh).
#!/bin/bash
#SBATCH --job-name=example          # Name of the job on the cluster
#SBATCH --partition=intel
#SBATCH --output=example_%j.out
#SBATCH --error=example_%j.err

ls -lh
pwd
2. Submit it as a SLURM job.
sbatch (e.g. sbatch my_first_slurm.sh)
3. Check progress.
squeue
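On success, sbatch prints a line of the form "Submitted batch job <id>"; capturing that id makes it easy to follow up with squeue or scancel. A minimal sketch — the sbatch output is simulated here, since the command only exists on the cluster:

```shell
#!/bin/bash
# sbatch normally prints "Submitted batch job <id>" on success.
# Simulated here so the sketch runs outside the cluster.
submit_output="Submitted batch job 12345"

# Strip everything up to the last space to extract the job id.
jobid=${submit_output##* }
echo "$jobid"

# On the cluster you would then run, e.g.:
#   squeue -j "$jobid"     # check progress of this job only
#   scancel "$jobid"       # cancel it if needed
```

With sbatch's --parsable option the id is printed by itself, so `jobid=$(sbatch --parsable my_first_slurm.sh)` works directly.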
Running an openMPI program, using a base script for SLURM:
#!/bin/bash
#SBATCH --job-name=example          # Name of the job on the cluster
#SBATCH --partition=intel
#SBATCH -n 32                       # Number of processes; must be a multiple of 16
#SBATCH --ntasks-per-node=16        # Maximum per node
#SBATCH --output=example_%j.out
#SBATCH --error=example_%j.err
#SBATCH --mail-user=username@ufrontera.cl   # Email address for notifications
#SBATCH --mail-type=ALL

srun ./mpi_programa
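The comments above require -n to be a multiple of 16; together with --ntasks-per-node=16, the total task count determines how many nodes SLURM allocates. A small sketch of the arithmetic, using the values from the script above:

```shell
#!/bin/bash
# Relationship between -n (total tasks) and --ntasks-per-node.
ntasks=32        # value of -n in the script above
per_node=16      # value of --ntasks-per-node

if (( ntasks % per_node != 0 )); then
  echo "error: -n must be a multiple of $per_node" >&2
  exit 1
fi

nodes=$(( ntasks / per_node ))
echo "SLURM will allocate $nodes node(s)"
```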
4. Basic example 3
This is an example of a script (ejemplo3.sh) with the minimal elements needed to run R-3.6.1 through SLURM:
#!/bin/bash
#SBATCH -J R-NOMBRE-SIMULACION
#SBATCH -a 1-5%3                # 5 array tasks, at most 3 running at once
#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#SBATCH --mem=100G
#SBATCH --partition=intel

module load R/3.6.1

cmds=(
  'sleep 10;echo 10'
  'sleep 20;echo 20'
  'sleep 30;echo 30'
  'sleep 40;echo 40'
  'sleep 50;echo 50'
)
eval ${cmds[$SLURM_ARRAY_TASK_ID - 1]}
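Each array task receives its own value of SLURM_ARRAY_TASK_ID, which the script uses to index into cmds. Outside SLURM that variable is unset, so this sketch simulates the first three task ids to show the selection logic (the sleeps are dropped so it runs instantly):

```shell
#!/bin/bash
# Same selection logic as the array script above, with
# SLURM_ARRAY_TASK_ID simulated by a plain loop.
cmds=(
  'echo 10'
  'echo 20'
  'echo 30'
  'echo 40'
  'echo 50'
)

for SLURM_ARRAY_TASK_ID in 1 2 3; do
  # Task ids start at 1, bash arrays at 0, hence the "- 1".
  eval "${cmds[$SLURM_ARRAY_TASK_ID - 1]}"
done
# prints 10, 20 and 30, one per line
```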
To send this script to SLURM, create a job, and start processing, run the following:
chmod +x ejemplo3.sh
sbatch ejemplo3.sh
List of available clusters and partitions
List of available clusters
List of available partitions
FAQ (Frequently Asked Questions)
Q. What is the difference between a cluster and a partition?
Q. I always use only one cluster. Is there any way to omit --clusters=[cluster_name] when I check/delete jobs by scontrol/scancel?
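One common workaround is a pair of shell functions in ~/.bashrc that always pass the flag. A sketch, assuming bash; "mycluster" is a placeholder for your actual cluster name:

```shell
#!/bin/bash
# Hypothetical wrappers; replace "mycluster" with your cluster's name.
# Put these in ~/.bashrc so they are defined at every login.
myscancel()  { scancel  --clusters=mycluster "$@"; }
myscontrol() { scontrol --clusters=mycluster "$@"; }
```

Recent SLURM versions also read the SLURM_CLUSTERS environment variable, so `export SLURM_CLUSTERS=mycluster` may achieve the same without wrappers.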
Further information
Use the 'man' command after logging in to the servers
man sbatch
SLURM commands
- sacct
- salloc
- sbatch
- scancel
- scontrol
- sinfo
- squeue
- sreport
Useful reference pages
About job array
https://slurm.schedmd.com/job_array.html
https://rcc.uchicago.edu/docs/running-jobs/array/index.html
https://www.accre.vanderbilt.edu/wp-content/uploads/2016/04/UsingArrayJobs.pdf
How to rewrite a PBSPro/SGE script as a SLURM script
Common commands
| | PBS command | SGE command | SLURM command |
|---|---|---|---|
| Job submission | qsub [scriptfile] | qsub [scriptfile] | sbatch [scriptfile] |
| Job deletion | qdel [job_id] | qdel [job_id] | scancel --clusters=[cluster_name] [job_id] |
| Job status (for user) | qstat -u [username] | qstat -u [username] | squeue -u [username] |
| Extended job status | qstat -f [job_id] | qstat -f -j [job_id] | scontrol --clusters=[cluster_name] show jobid=[job_id] |
| Hold a job temporarily | qhold [job_id] | qhold [job_id] | scontrol hold [job_id] |
| Release job hold | qrls [job_id] | qrls [job_id] | scontrol release [job_id] |
| List of usable queues | qstat -Q | qconf -sql | sinfo, squeue |
Resource specification
| | PBS command | SGE command | SLURM command |
|---|---|---|---|
| Queue | #PBS -q [queue] | #$ -q [queue] | #SBATCH -M [queue] / #SBATCH --clusters=[queue] |
| Processors (single host) | #PBS -l select=1:ncpus=[#] | #$ -pe smp [#] | #SBATCH -c [#]? |
| Wall clock limit | #PBS -l walltime=[hh:mm:ss] | #$ -l time=[hh:mm:ss] | #SBATCH -t [#] ("minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" and "days-hours:minutes:seconds") |
| Memory requirement | #PBS -l mem=XXXXmb | #$ -mem [#]G | #SBATCH --mem=[#][unit; K/M/G/T]? |
| Standard output file | #PBS -o [file] | #$ -o [path] | #SBATCH -o [path] |
| Standard error | #PBS -e [file] | #$ -e [path] | #SBATCH -e [path] |
| Array job | #PBS -J [#-#] | #$ -t [#-#] | #SBATCH -a [#-#] |
| Array number variable name | ${PBS_ARRAY_INDEX} | ${SGE_TASK_ID} | ${SLURM_ARRAY_TASK_ID} |
| Max simultaneously running tasks for an array job | n/a? | #$ -tc [#] | #SBATCH -a [#-#]%[#] (e.g. -a 0-15%4) |
| Copy environment | #PBS -V | #$ -V | #SBATCH --get-user-env |
| Notification event | #PBS -m abe | #$ -m abe | #SBATCH --mail-type=[BEGIN/END/FAIL/ALL] |
| Email address | #PBS -M [email] | #$ -M [email] | #SBATCH --mail-user=[email] |
| Job name | #PBS -N [name] | #$ -N [name] | #SBATCH -J [name] |
| Job restart | #PBS -r [y/n] | #$ -r [yes/no] | #SBATCH --requeue / #SBATCH --no-requeue |
| Move current directory | n/a | #$ -cwd | n/a (SLURM jobs start in the submission directory by default) |
| Move working directory | n/a (in the main part of the script, add cd ${PBS_O_WORKDIR}) | #$ -wd [working_dirpath] | #SBATCH -D [working_dirpath] |
| Use BASH | #PBS -S /bin/bash | #$ -S /bin/bash | shebang line (at the first line of the script, add #!/bin/bash) |