Compute Clusters

General Information

The following compute clusters are currently available:

Cluster     Nodes  CPUs                  Clock    Cache   Memory  spec. queue
WAP/ITP     43+2   2x6 Xeon E5-2640      2.5 GHz  16 MB   64 GB
WAP/STP     27+2   2x6 Xeon E5-2640      2.5 GHz  16 MB   64 GB
SFB         20+4   2x6 Xeon E5-2640      2.5 GHz  16 MB   64 GB
CRC         9      2x10 Xeon E5-2640 v4  2.4 GHz  25 MB   128 GB  crc
BuildMona   35     2x4 Opteron 2376      2.3 GHz  512 kB  16 GB   mona
For877      7      2x4 Xeon E5430        2.7 GHz  12 MB   16 GB   for877
ITP/grawp   31     2x2 Opteron 2218      2.6 GHz  1 MB    4 GB    grawp
  • The WAP/ITP, WAP/STP and grawp compute clusters can be used by any regular ITP member.
  • The For877 compute cluster is intended primarily for members of the DFG Forschergruppe 877.
  • The BuildMona cluster is restricted to members of the BuildMona research school.
  • For a user account or scratch data space, please contact helpdesk@itp.uni-leipzig.de

See our performance monitoring page.

Usage

The compute cluster resources are managed by SLURM, the Simple Linux Utility for Resource Management. Please take a look at the SLURM cheat sheet for an overview.

Job Queueing System

Jobs can be submitted from the frontend servers

  • kreacher.physik.uni-leipzig.de
  • dobby.physik.uni-leipzig.de (CQT)
  • emmy.physik.uni-leipzig.de (STP)
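
Log in to one of the frontend servers via SSH and submit your jobs from there. A minimal sketch, where jdoe is a placeholder for your own account name:

ssh jdoe@kreacher.physik.uni-leipzig.de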

Partitions

partition  description
default    routing queue to batch
batch      common execution queue, limited to 2 days
gpu        jobs that use GPUs
mona       execution on the BuildMona cluster
grawp      execution on the grawp cluster
for877     execution queue for the former Forschergruppe 877
all        execution queue for single jobs, may (will) run on different architectures

Common Commands

  • Resource Manager

    The following commands are sufficient for basic use of the compute clusters; usage examples are given below.

    sbatch   submit a new job
    scancel  cancel a job
    sinfo    show the status of partitions and nodes
    squeue   show the status of queued and running jobs

    Refer to the SLURM online documentation for detailed information.
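
    A few usage examples; the job file name and the job ID are placeholders:

    sbatch myjob.job       # submit a job file, prints the assigned job ID
    squeue -u $USER        # list your own pending and running jobs
    sinfo -p grawp         # show the node states of the grawp partition
    scancel 123456         # cancel the job with ID 123456 (placeholder ID)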

Job submission

Typically a job file is used to describe a job. These are simply shell scripts with SLURM directives in the form of comments.

Scalar Jobs

To submit a simple scalar job that can run for a day on grawp, create a file scalartest.job:

#!/bin/bash
# execution queue (partition)
#SBATCH -p grawp
# one node:
#SBATCH -N 1
# total number of tasks:
#SBATCH -n 1
# estimated run time (hh:mm:ss):
#SBATCH -t 24:00:00
# file for error messages
#SBATCH -e scalar.stderr
# file for normal output
#SBATCH -o scalar.stdout
cd /home/user/scalar
./scalartest

and submit it using

sbatch scalartest.job

Lines beginning with #SBATCH mark SLURM directives; see man sbatch for a reference of the available options. Multiple options may be given per line. Here the following is done:

  • -p grawp selects nodes from that partition for the job
  • -N 1 or --nodes 1 selects a single node
  • -n 1 or --ntasks 1 requests a single task, which by default is allocated a single CPU
  • -t 24:00:00 or --time 1-00 allocates one day of runtime
  • -e <error file> and -o <output file> redirect the standard error and standard output streams to these files

Without any of these settings the job would still run (on the default partition, though), with the default resources: a single CPU with 2 GB of RAM for 3 hours (check scontrol show partitions).

This script executes the program scalartest on one node of the grawp queue. In this case the resource allocation could also be skipped, as it mostly corresponds to the default allocation that any job gets.

Here, the job needs one node and one processor (per node) on grawp for 24 hours. It is important to change into the correct directory using the cd command. You can also use #SBATCH -D workdir to change into a working directory named workdir, as in the sketch below.
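
A minimal sketch of the -D variant, reusing the directory and program of the example above:

#!/bin/bash
#SBATCH -p grawp
#SBATCH -N 1 -n 1
#SBATCH -t 24:00:00
# -D sets the working directory, replacing the explicit cd
#SBATCH -D /home/user/scalar
./scalartest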

Parallel jobs

To start an MPI job the srun command is used. OpenMP jobs can be run across multiple processors on the same node. For MPI jobs you have to load the OpenMPI environment both when compiling the program and when submitting the job: module load openmpi/1.10.

#!/bin/bash
#SBATCH -p grawp
#SBATCH --ntasks=64
#SBATCH -t 24:00:00
#SBATCH -e scalar.stderr
#SBATCH -o scalar.stdout
cd /home/user/mpi
srun -n 64 ./mpitest

This job distributes a total of 64 tasks. The argument to srun (-n) is the same total number of processes. The actual program executed on the nodes is mpitest. Note that you can also request the number of nodes and the number of tasks per node explicitly, as in the sketch below.
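
A minimal sketch of the explicit variant; the node count assumes the 4-core grawp nodes listed above, and the program name is taken from the previous example:

#!/bin/bash
#SBATCH -p grawp
# 16 nodes with 4 tasks each, i.e. 64 tasks in total
#SBATCH --nodes=16
#SBATCH --ntasks-per-node=4
#SBATCH -t 24:00:00
cd /home/user/mpi
srun ./mpitest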

Long jobs

If you want to submit a job which needs more than 2 days, you have to specify the QOS label long for that job. Just add

#SBATCH --qos=long

to your job file. But be careful, "long" jobs can only access a smaller part of the cluster. This is necessary since allowing "long" jobs to fill the complete cluster could prevent all other users to run jobs for weeks.

GPU jobs

To use GPU resources, please ask someone who has already done so.
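
For orientation only, a hedged sketch using the generic SLURM --gres syntax; the GPU count, type and any local constraints of the gpu partition are assumptions, so check the details with an experienced user:

#!/bin/bash
#SBATCH -p gpu
# request a single generic GPU resource (adjust to the local setup)
#SBATCH --gres=gpu:1
#SBATCH -N 1 -n 1
#SBATCH -t 24:00:00
cd /home/user/gpu
./gputest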

Useful SLURM directives

-e standard error stream
-o standard output stream
-N number of nodes
-p partition of the job (queue)
-a submit a job array; inside each array task the $SLURM_ARRAY_TASK_ID environment variable holds the array index (see the sketch below)
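
A minimal job-array sketch, assuming a serial program arraytest that takes the array index as its only argument (program name and path are placeholders):

#!/bin/bash
#SBATCH -p batch
#SBATCH -N 1 -n 1
#SBATCH -t 01:00:00
# run ten independent array tasks with indices 1 to 10
#SBATCH -a 1-10
cd /home/user/array
./arraytest $SLURM_ARRAY_TASK_ID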

Scratch data space

To avoid high load on the main frontend server and to improve I/O performance, it is highly recommended to write simulation data to scratch space.

In addition to the common workstation scratch spaces, the following scratch spaces are available:

mount point      size   notes
/scratch/dobby   11 TB  RAID6 volume
/scratch/emmy    7 TB   RAID6 volume
/scratch/shodan  11 TB  RAID6 volume
/scratch/grawp   9 TB   RAID6 volume

For I/O-intensive programs, always use cluster-local storage, i.e. the scratch space should be close to the computation in terms of network distance; a sketch of this pattern is given below.
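
A minimal sketch of a per-job scratch directory, assuming /scratch/grawp is the volume closest to the grawp nodes and is writable for your account (program name, paths and the results directory are placeholders):

#!/bin/bash
#SBATCH -p grawp
#SBATCH -N 1 -n 1
#SBATCH -t 24:00:00
# create a per-job directory on the scratch volume and work there
SCRATCHDIR=/scratch/grawp/$USER/job-$SLURM_JOB_ID
mkdir -p "$SCRATCHDIR"
cd "$SCRATCHDIR"
/home/user/scalar/scalartest
# copy the results back to the home directory afterwards
mkdir -p /home/user/scalar/results
cp -r "$SCRATCHDIR"/. /home/user/scalar/results/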

Development Tools

Hardware

partition  Nodes  CPU                               CPU cache         CPU arch       Memory
batch      100    2x6 Intel E5-2640 @ 2.50 GHz      2560 kB per core  Core i7 (AVX)  32 / 64 / 128 / 256 / 384 GB
grawp      31     2x2 Opteron 2218 @ 2.6 GHz        1 MB              AMD K8         4 GB
for877     7      2x4 Xeon E5430 @ 2.7 GHz          12 MB             Core2          16 GB
mona       15     2x4 Opteron 2376 @ 2.3 GHz        512 kB            AMD K10        16 GB
crc        9      2x10 Intel E5-2640 v4 @ 2.40 GHz  25 MB                            128 GB

Created: 2017-02-02 Thu 13:14
