Compute Clusters
General Information
At the moment the following compute clusters are available:
Cluster | Nodes | CPUs | Clock | Cache | Memory | spec. queue |
---|---|---|---|---|---|---|
WAP/ITP | 43+2 | 2x6 Xeon E5-2640 | 2.5 GHz | 16 MB | 64 GB | |
WAP/STP | 27+2 | 2x6 Xeon E5-2640 | 2.5 GHz | 16 MB | 64 GB | |
SFB | 20+4 | 2x6 Xeon E5-2640 | 2.5 GHz | 16 MB | 64 GB | |
CRC | 9 | 2x10 Xeon E5-2640 v4 | 2.4 GHz | 25 MB | 128 GB | |
BuildMona | 35 | 2x4 Opteron 2376 | 2.3 GHz | 512 kB | 16 GB | |
for877 | 7 | 2x4 Xeon E5430 | 2.7 GHz | 12 MB | 16 GB | |
- The WAP/ITP and WAP/STP compute clusters can be used by any regular ITP member.
- For a user account or scratch data space, contact helpdesk@itp.uni-leipzig.de
See our performance monitoring page.
Usage
The compute cluster resources are managed by SLURM, the Simple Linux Utility for Resource Management queueing system. Please take a look at the SLURM cheat sheet for an overview.
Job Queueing System
Jobs can be submitted from the frontend servers
- kreacher.physik.uni-leipzig.de
- dobby.physik.uni-leipzig.de (CQT)
- emmy.physik.uni-leipzig.de (STP)
- stark.physik.uni-leipzig.de
Partitions
partition | description |
---|---|
default | Routing queue to batch |
batch | common execution queue, limited to 2 days |
bigmem | execution queue for large memory calculations |
gpu | jobs that use GPUs |
student | execution queue for students |
Common Commands
Resource Manager

The following commands are sufficient for basic use of the compute clusters:

command | description |
---|---|
sbatch | submit a new job |
scancel | delete a job |
sinfo | show queue and job status |

Refer to the SLURM online documentation for detailed information.
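A typical session with these commands might look like the following sketch; the job ID shown is illustrative:

```shell
# submit a job script; SLURM prints the assigned job ID
sbatch scalartest.job
# -> Submitted batch job 123456

# show the partitions and node states
sinfo

# list your own pending and running jobs
squeue -u $USER

# cancel a job by its ID
scancel 123456
```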
Job submission
Typically a job is described by a job file. These are simply shell scripts with SLURM directives in the form of comments.
Scalar Jobs
To submit a simple scalar job that may run for up to a day, create
a file scalartest.job:

```shell
#!/bin/bash
# execution queue
#SBATCH -p batch
# one node:
#SBATCH -N 1
# maximum number of tasks per node:
#SBATCH -n 1
# estimated time (hh:mm:ss):
#SBATCH -t 24:00:00
# file for error messages
#SBATCH -e scalar.stderr
# file for normal output
#SBATCH -o scalar.stdout
cd /home/user/scalar
./scalartest
```
and submit it using
sbatch scalartest.job
A line beginning with #SBATCH marks a SLURM directive; see man sbatch for a reference on the available options. Multiple options may be given per line. Here the following is done:

- -p batch selects nodes from that partition for the job
- -N 1 or --nodes 1 selects a single node
- -n 1 or --ntasks 1 selects a single task, which defaults to a single CPU
- -t 24:00:00 or --time 1-00 allocates one day of runtime
- -e <error file> and -o <output file> redirect the standard error and output streams to these files
Without any of these settings the job would still run (though on the
default partition) with the default resources (a single CPU with 2 GB
of RAM for 3 hours; check scontrol show partitions).
This script executes the program scalartest on one node of the
queue batch. In this case, the resource allocation could also be skipped,
as it mostly corresponds to the default resource allocation that any
job gets.
Here, the job needs one node and one processor (per node) on batch
for 24 hours. It is important to note the change into the correct
directory using the cd command. You can also use #SBATCH -D workdir
to change into a working directory named workdir.
Parallel jobs
To start an MPI job the srun command is used. OpenMP jobs can be
run over multiple processors on the same node. To submit such jobs
you have to load the openmpi environment when compiling the job
and when submitting the job module load openmpi/1.10
.
```shell
#!/bin/bash
#SBATCH -p batch
#SBATCH --ntasks=64
#SBATCH -t 24:00:00
#SBATCH -e scalar.stderr
#SBATCH -o scalar.stdout
cd /home/user/mpi
srun -n 64 ./mpitest
```
This job distributes a total of 64 tasks. The argument to srun (-n) is the same total number of processes. The actual program executed on the nodes is mpitest. Note that you can also request the number of nodes and the number of tasks per node explicitly.
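As a sketch, the same 64 tasks could be requested as an explicit layout of 4 nodes with 16 tasks each (adjust the per-node task count to the actual core count of the nodes):

```shell
#!/bin/bash
#SBATCH -p batch
# 4 nodes times 16 tasks per node = 64 MPI tasks in total
#SBATCH -N 4
#SBATCH --ntasks-per-node=16
#SBATCH -t 24:00:00
cd /home/user/mpi
# srun inherits the allocation, so -n 64 need not be repeated
srun ./mpitest
```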
Long jobs
If you want to submit a job which needs more than 2 days, you have to
specify the QOS label long
for that job. Just add

#SBATCH --qos=long

to your job file. But be careful: "long" jobs can only access a smaller part of the cluster. This restriction is necessary because allowing "long" jobs to fill the complete cluster could prevent all other users from running jobs for weeks.
GPU jobs
To use our GPU queue, your SLURM script must contain the following directives:
```shell
#!/bin/bash
# Use gpu queue
#SBATCH -p gpu
# Use specified graphics card
#SBATCH --gres=gpu:GTX1080:1
```
Your script must be put in the "gpu" partition and request GPU resources via the "--gres" directive. Its value consists of: the type of resource (always "gpu" here), the card type (possible values: "K20", "GTX1080", "RTX2080TI", "TitanRTX", in ascending order of performance), and the requested number of GPUs (there are 2 of each available per node, except for the TitanRTX).
Normally, you would choose the "GTX1080" for usual tasks, the "K20" for basic testing, and the upper two for greater requirements. As always, be mindful of your resource usage, as the GPU resources are especially scarce.
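A complete GPU job file might then look like the following sketch; the working directory and the program name (gputest) are placeholders:

```shell
#!/bin/bash
# Use gpu queue
#SBATCH -p gpu
# Request one GTX1080 card
#SBATCH --gres=gpu:GTX1080:1
#SBATCH -t 12:00:00
#SBATCH -e gpu.stderr
#SBATCH -o gpu.stdout
cd /home/user/gpu
./gputest
```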
Useful slurm directives
directive | description |
---|---|
-e | standard error stream |
-o | standard output stream |
-N | number of nodes |
-p | partition of the job (queue) |
-a | specify an array of jobs, using the $SLURM_ARRAY_TASK_ID index variable |
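As a sketch, an array job using the -a directive could look like this; the directory, program, and input-file names are hypothetical. SLURM exports the index of each array task in $SLURM_ARRAY_TASK_ID:

```shell
#!/bin/bash
#SBATCH -p batch
#SBATCH -t 01:00:00
# run ten independent tasks with indices 1..10
#SBATCH -a 1-10
cd /home/user/array
# each array task processes its own input file, e.g. input_3.dat for index 3
./arraytest input_${SLURM_ARRAY_TASK_ID}.dat
```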
Scratch data space
To avoid high load on the main frontend server and to improve I/O performance for users it is highly recommended to write simulation data to scratch space.
In addition to the common workstation scratch spaces, the following scratch spaces are available:
mount point | size | notes |
---|---|---|
/scratch/dobby | 11TB | RAID6 volume |
/scratch/emmy | 7TB | RAID6 volume |
/scratch/shodan | 11TB | RAID6 volume |
/scratch/grawp | 9TB | RAID6 |
For I/O-intensive programs always use cluster-local storage (i.e. the scratch space should be close to the computation in terms of network distance).
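For example, a job script might direct its output to a per-user directory on a scratch volume; /scratch/dobby is taken from the table above, while the per-user subdirectory and the program path are assumptions:

```shell
#!/bin/bash
#SBATCH -p batch
#SBATCH -t 24:00:00
# write results to cluster-local scratch space instead of $HOME
OUT=/scratch/dobby/$USER/run1
mkdir -p "$OUT"
cd "$OUT"
# simulation output goes to the scratch volume, not the frontend's home file system
/home/user/scalar/scalartest > result.dat
```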
Development Tools
Hardware
| batch | grawp | for877 | mona | crc |
---|---|---|---|---|---|
Nodes | 100 | 31 | 7 | 15 | 9 |
CPU | 2x6 Intel E5-2640 0 @ 2.50GHz | 2x2 Opteron 2218 @ 2.6GHz | 2x4 Xeon E5430 @ 2.7GHz | 2x4 Opteron 2376 @ 2.3GHz | 2x10 Intel E5-2640 v4 @ 2.40GHz |
CPU Cache | 2560 kB per core | 1 MB | 12 MB | 512 kB | 25 MB |
CPU arch | Core i7 AVX | AMD K8 | Core2 | AMD K10 | |
Memory | 32 / 64 / 128 / 256 / 384 GB | 4 GB | 16 GB | 16 GB | 128 GB |