====== SLURM 19.04 ======
  
Basic concepts

Cancel a job
  
===== Working with SLURM =====

**Simple usage for soroban**

0. Note: '-p intel' (the equivalent long option is '--partition=intel') is required on soroban.
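
The partition can also be given on the command line at submission time instead of inside the script; a minimal example, using the script name from step 1 below:

<code>
sbatch -p intel my_first_slurm.sh    # same effect as '#SBATCH --partition=intel' in the script
</code>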
  
1. Save the following as a text file (e.g. my_first_slurm.sh).

<code>
#!/bin/bash
#SBATCH --job-name=example        # Name for the job on the cluster
#SBATCH --partition=intel         # Required on soroban (see the note above)
#SBATCH --output=example_%j.out   # Standard output file (%j expands to the job id)
#SBATCH --error=example_%j.err    # Standard error file

ls -lh
pwd
</code>

2. Submit it as a SLURM job.

sbatch (e.g. sbatch my_first_slurm.sh)

3. Check progress.

squeue
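
Put together, a minimal submit-and-monitor session looks like the sketch below; the job id is the number that sbatch prints back:

<code>
sbatch my_first_slurm.sh    # prints: Submitted batch job <jobid>
squeue -u $USER             # list only your own jobs
scancel <jobid>             # cancel the job if needed, using the id printed by sbatch
</code>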
  
**Running an openMPI program, using a base SLURM script:**

<code>
#!/bin/bash
#SBATCH --job-name=example                  # Name for the job on the cluster
#SBATCH --partition=intel
#SBATCH -n 32                               # Number of processes; must be a multiple of 16
#SBATCH --ntasks-per-node=16                # Maximum tasks per node
#SBATCH --output=example_%j.out
#SBATCH --error=example_%j.err
#SBATCH --mail-user=username@ufrontera.cl   # Email address for notifications
#SBATCH --mail-type=ALL

srun ./mpi_programa
</code>
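
The script assumes an MPI executable named mpi_programa already exists in the working directory. A minimal sketch of building it with the OpenMPI compiler wrapper, assuming the source file is mpi_programa.c and that an openmpi module exists (check 'module avail' for the exact name):

<code>
module load openmpi                        # assumed module name; adjust to your cluster
mpicc -O2 -o mpi_programa mpi_programa.c   # compile the executable that srun launches above
</code>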


===== 4. Basic example 3 =====

This is an example of a script (ejemplo3.sh) with the minimum elements needed to run the R-3.6.1 program through SLURM:

<code>
#!/bin/bash

#SBATCH -J R-NOMBRE-SIMULACION    # Job name
#SBATCH -a 1-5%3                  # 5 array tasks (one per command below), at most 3 running at once
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=100G
#SBATCH --partition=intel

module load R/3.6.1               # Makes R 3.6.1 available; replace the placeholder commands below with R calls

# One command per array task; task N runs cmds[N-1]
cmds=(
'sleep 10;echo 10'
'sleep 20;echo 20'
'sleep 30;echo 30'
'sleep 40;echo 40'
'sleep 50;echo 50'
)
eval "${cmds[$SLURM_ARRAY_TASK_ID - 1]}"
</code>
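
To check which command a given array task would execute, the index lookup can be tested locally, outside SLURM, by setting the variable by hand; this only prints the selected string, it does not submit anything:

<code>
SLURM_ARRAY_TASK_ID=3
cmds=('sleep 10;echo 10' 'sleep 20;echo 20' 'sleep 30;echo 30' 'sleep 40;echo 40' 'sleep 50;echo 50')
echo "${cmds[$SLURM_ARRAY_TASK_ID - 1]}"   # prints the command that array task 3 would run
</code>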

To submit this script to SLURM, create the job, and start the processing, the following is needed:

<code>
chmod +x ejemplo3.sh
</code>

<code>
sbatch ejemplo3.sh
</code>
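
Once the array is running, its tasks appear in squeue as JOBID_TASKID and can be cancelled individually; the job id 12345 below is only a placeholder:

<code>
squeue -u $USER    # array tasks are listed as JOBID_TASKID, e.g. 12345_2
scancel 12345_2    # cancel a single array task
scancel 12345      # cancel the whole array
</code>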
  
  * scancel
  * scontrol
  * sinfo
  * squeue
  * sreport
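
For example, sinfo gives a quick view of the partitions that the scripts above submit to:

<code>
sinfo              # partitions, node counts and node states
sinfo -p intel     # restrict the view to the intel partition used in the examples
</code>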
  
==== Useful reference pages ====
[[https://www.accre.vanderbilt.edu/wp-content/uploads/2016/04/UsingArrayJobs.pdf|https://www.accre.vanderbilt.edu/wp-content/uploads/2016/04/UsingArrayJobs.pdf]]
  
[[https://help.rc.ufl.edu/doc/SLURM_Job_Arrays|https://help.rc.ufl.edu/doc/SLURM_Job_Arrays]]

===== How to rewrite a PBSPro/SGE script as a SLURM script =====

==== Common commands ====

| |**PBS command** |**SGE command** |**SLURM command** |
|Job submission|qsub [scriptfile]|qsub [scriptfile]|sbatch [scriptfile]|
|Job deletion|qdel [job_id]|qdel [job_id]|scancel --clusters=[cluster_name] [job_id]|
|Job status (for user)|qstat -u [username]|qstat -u [username]|squeue -u [username]|
|Extended job status|qstat -f [job_id]|qstat -f -j [job_id]|scontrol --clusters=[cluster_name] show job [job_id]|
|Hold a job temporarily|qhold [job_id]|qhold [job_id]|scontrol hold [job_id]|
|Release job hold|qrls [job_id]|qrls [job_id]|scontrol release [job_id]|
|List of usable queues|qstat -Q|qconf -sql|sinfo|
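
As a quick illustration of the hold and release rows, applied to a placeholder job id:

<code>
scontrol hold 12345      # keep job 12345 from starting (placeholder job id)
scontrol release 12345   # allow it to run again
</code>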

==== Resource specification ====

|   |**PBS command**   |**SGE command**   |**SLURM command**   |
|Queue|#PBS -q [queue]|#$ -q [queue]|#SBATCH -p [partition] / #SBATCH --partition=[partition]|
|Processors (single host)|#PBS -l select=1:ncpus=[#]|#$ -pe smp [#]|#SBATCH -c [#] / #SBATCH --cpus-per-task=[#]|
|Wall clock limit|#PBS -l walltime=[hh:mm:ss]|#$ -l time=[hh:mm:ss]|#SBATCH -t [time] ("minutes", "minutes:seconds", "hours:minutes:seconds", "days-hours", "days-hours:minutes" or "days-hours:minutes:seconds")|
|Memory requirement|#PBS -l mem=XXXXmb|#$ -mem [#]G|#SBATCH --mem=[#][unit: K/M/G/T]|
|Standard output file|#PBS -o [file]|#$ -o [path]|#SBATCH -o [path]|
|Standard error|#PBS -e [file]|#$ -e [path]|#SBATCH -e [path]|
|Array job|#PBS -J [#-#]|#$ -t [#-#]|#SBATCH -a [#-#] / #SBATCH --array=[#-#]|
|Array index variable name|${PBS_ARRAY_INDEX}|${SGE_TASK_ID}|${SLURM_ARRAY_TASK_ID}|
|Max simultaneously running tasks for an array job|n/a?|#$ -tc [#]|#SBATCH -a [#-#]%[#] (e.g. -a 0-15%4)|
|Copy environment|#PBS -V|#$ -V|#SBATCH --get-user-env / #SBATCH --export=ALL|
|Notification event|#PBS -m abe|#$ -m abe|#SBATCH --mail-type=[BEGIN,END,FAIL,ALL]|
|Email address|#PBS -M [email]|#$ -M [email]|#SBATCH --mail-user=[email]|
|Job name|#PBS -N [name]|#$ -N [name]|#SBATCH -J [name]|
|Job restart|#PBS -r [y/n]|#$ -r [yes/no]|#SBATCH --requeue / #SBATCH --no-requeue|
|Run in current directory|n/a|#$ -cwd|n/a (SLURM starts the job in the submission directory by default)|
|Set working directory|n/a (in the main part of the script, add cd ${PBS_O_WORKDIR})|#$ -wd [path]|#SBATCH -D [working_dirpath]|
|Use BASH|#PBS -S /bin/bash|#$ -S /bin/bash|shebang line (add #!/bin/bash as the first line of the script)|
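
Putting several rows of the table together, a small PBS header and a SLURM header requesting the same resources might look like the sketch below; the queue name and resource values are made up for illustration:

<code>
# PBS version (illustrative values)
#PBS -q workq
#PBS -l select=1:ncpus=8
#PBS -l walltime=02:00:00
#PBS -l mem=4000mb
#PBS -N example
#PBS -o example.out
#PBS -e example.err

# SLURM equivalent, built from the table above
#SBATCH --partition=intel
#SBATCH -c 8
#SBATCH -t 02:00:00
#SBATCH --mem=4G
#SBATCH -J example
#SBATCH -o example.out
#SBATCH -e example.err
</code>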