


SLURM

Basic concepts

Jobs

Partitions

Task

Basic commands

Querying the queue
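As a quick sketch, the queue is inspected with squeue; these are standard SLURM invocations, and the partition name is the one used elsewhere on this page:

```shell
# Show the whole queue
squeue

# Show only your own jobs
squeue -u $USER

# Show jobs in a given partition, e.g. intel
squeue -p intel
```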

Submitting a program

Cancelling a job
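A minimal sketch of cancelling work with scancel (the job ID below is a made-up example):

```shell
# Cancel one job by its ID (12345 is illustrative; use the ID squeue reports)
scancel 12345

# Cancel all of your own jobs
scancel -u $USER
```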

Simple usage for soroban.

0. Note: '-p intel' (--partition=intel) is required for soroban.

1. Save the script below as a text file (e.g. my_first_slurm.sh).

#!/bin/bash
#SBATCH --job-name=example  # Name for the job on the cluster
#SBATCH --partition=intel
#SBATCH -n 32  # Must be a multiple of 16
#SBATCH --ntasks-per-node=16 # maximum per blade
#SBATCH --output=example_%j.out
#SBATCH --error=example_%j.err

ls -lh
pwd

2. Submit it as a SLURM job: sbatch <filename> (e.g. sbatch my_first_slurm.sh)

3. Check progress: squeue

Running a program with OpenMPI, using a base SLURM script:

#!/bin/bash
#SBATCH --job-name=example  # Name for the job on the cluster
#SBATCH --partition=troquil
#SBATCH -n 32  # Must be a multiple of 16
#SBATCH --ntasks-per-node=16 # maximum per blade
#SBATCH --output=example_%j.out
#SBATCH --error=example_%j.err
#SBATCH --mail-user=username@ufrontera.cl
#SBATCH --mail-type=ALL

srun ./mpi_programa
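Assuming mpi_programa is built from C source with OpenMPI's wrapper compiler, and the script above is saved as mpi_script.sh (both names are illustrative), a typical session would look like:

```shell
# Compile the MPI program with the OpenMPI wrapper compiler
mpicc -O2 -o mpi_programa mpi_programa.c

# Submit the batch script; SLURM replies with the assigned job ID
sbatch mpi_script.sh
```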

See man sbatch for the full list of options.

SLURM commands

  • sacct: report accounting data for jobs and job steps
  • salloc: allocate resources for an interactive session
  • sbatch: submit a batch script
  • scancel: cancel a pending or running job
  • scontrol: view or modify SLURM state (jobs, nodes, partitions)
  • squeue: view the job queue
  • sreport: generate reports from accounting data
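For instance, sacct can summarize a finished job's resource usage (the job ID below is illustrative):

```shell
# Accounting summary for a completed job
sacct -j 12345 --format=JobID,JobName,Partition,Elapsed,State,MaxRSS
```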

About job arrays

Action                 | PBS command                 | SGE command          | SLURM command
Job submission         | qsub [scriptfile]           | qsub [scriptfile]    | sbatch [scriptfile]
Job deletion           | qdel [job_id]               | qdel [job_id]        | scancel --clusters=[cluster_name] [job_id]
Job status (for user)  | qstat -u [username]         | qstat -u [username]  | squeue -u [username]
Extended job status    | qstat -f [job_id]           | qstat -f -j [job_id] | scontrol --clusters=[cluster_name] show jobid=[job_id]
Hold a job temporarily | qhold [job_id]              | qhold [job_id]       | scontrol hold [job_id]
Release job hold       | qrls [job_id]               | qrls [job_id]        | scontrol release [job_id]
List of usable queues  | qstat -Q                    | qconf -sql           | sinfo, squeue

Directive                  | PBS                         | SGE                   | SLURM
Queue                      | #PBS -q [queue]             | #$ -q [queue]         | #SBATCH -M [queue] / #SBATCH --clusters=[queue]
Processors (single host)   | #PBS -l select=1:ncpus=[#]  | #$ -pe smp [#]        | #SBATCH -c [#]
Wall clock limit           | #PBS -l walltime=[hh:mm:ss] | #$ -l time=[hh:mm:ss] | #SBATCH -t [time] ("minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" or "days-hours:minutes:seconds")
Memory requirement         | #PBS -l mem=XXXXmb          | #$ -mem [#]G          | #SBATCH --mem=[#][unit: K/M/G/T]
Standard output file       | #PBS -o [file]              | #$ -o [path]          | #SBATCH -o [path]
Standard error             | #PBS -e [file]              | #$ -e [path]          | #SBATCH -e [path]
Array job                  | #PBS -J [#-#]               | #$ -t [#-#]           | #SBATCH -a [#-#]
Array index variable       | ${PBS_ARRAY_INDEX}          | ${SGE_TASK_ID}        | ${SLURM_ARRAY_TASK_ID}
Max simultaneously running tasks for an array job | n/a? | #$ -tc [#]           | #SBATCH -a [#-#]%[#] (e.g. -a 0-15%4)
Copy environment           | #PBS -V                     | #$ -V                 | #SBATCH --get-user-env
Notification event         | #PBS -m abe                 | #$ -m abe             | #SBATCH --mail-type=[events]
Email address              | #PBS -M [email]             | #$ -M [email]         | #SBATCH --mail-user=[email]
Job name                   | #PBS -N [name]              | #$ -N [name]          | #SBATCH -J [name]
Job restart                | #PBS -r [y/n]               | #$ -r [yes/no]        | #SBATCH --requeue / #SBATCH --no-requeue
Move current directory     | n/a                         | #$ -cwd               | n/a (jobs start in the submission directory by default)
Move working directory     | n/a (in the main part of the script, add cd ${PBS_O_WORKDIR}) | #$ -wd | #SBATCH -D [working_dirpath]
Use bash                   | #PBS -S /bin/bash           | ?                     | shebang line (add #!/bin/bash as the first line of the script)
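The array-job syntax above can be sketched as a minimal SLURM script (the job name and file names are illustrative):

```shell
#!/bin/bash
#SBATCH --job-name=array_example   # illustrative name
#SBATCH --partition=intel
#SBATCH --array=0-15%4             # 16 tasks (indices 0-15), at most 4 running at once
#SBATCH --output=array_%A_%a.out   # %A = array job ID, %a = array task index

# Each task receives its own index in SLURM_ARRAY_TASK_ID
echo "Processing chunk ${SLURM_ARRAY_TASK_ID}"
```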


  • Last modified: 2020/05/19 02:03 by mickey