GALILEO
NEWS
2015/02/23 Job accounting starts today. Users must be authorized by the "Iniziativa Specifica" to which they belong.
2015/01/28 Galileo is open to INFN users for "pre-production" tests. The account for this activity is INFNG_test.
The MIC accelerators are not available yet.
GALILEO login
- In order to become a CINECA user you have to register on the CINECA UserDB ( https://userdb.hpc.cineca.it/user ). The procedure will create a new username associated with your identity (skip this step if you already have a CINECA username).
- Each user must be associated with the account related to their "Iniziativa Specifica" (IS); this is needed for the accounting of the consumed budget. Please contact the person responsible for your IS in order to be enabled.
- At the end of the previous step you can access the Galileo front-end login.galileo.cineca.it via ssh or other standard tools (see the CINECA documentation "Access to the systems").
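For example, assuming <username> is your CINECA username, a standard ssh login looks like:
ssh <username>@login.galileo.cineca.it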
The following command displays the accounts associated with your username and their usage:
login> saldo -b
GALILEO usage
Architecture
- Model: IBM NeXtScale
- Architecture: Linux Infiniband Cluster
- Nodes: 516
- Processors: 8-core Intel Haswell 2.40 GHz (2 per node)
- Cores: 16 cores/node, 8256 cores in total
- Accelerators: 2 Intel Phi 7120p per node on 384 nodes (768 in total)
- RAM: 128 GB/node, 8 GB/core
- Internal Network: Infiniband with 4x QDR switches
- Disk Space: 2,500 TB of local storage
- Peak Performance: xxx TFlop/s (to be defined)
To get on-line details:
login> pbsnodes -a | egrep '(Mom|available.mem|available.cpuspeed|available.nmics|available.ngpus)'
Batch scheduler
The job management facility adopted by CINECA is PBS (CINECA documentation: Batch Scheduler PBS).
Routing Queue "route": This is the default queue. You have only to declare how many resources you need and your job will be directed into the right queue with a right priority. Normal parallel jobs will be routed to the "shared" execution queue. The maximum number of nodes that you can require is 128 with a maximum walltime of 24 hours.
Script example (script.pbs)
#!/bin/bash
#PBS -N prova
#PBS -l walltime=02:00:00
#PBS -l select=16:ncpus=16:mpiprocs=16
#PBS -A INFNG_test
#
module load intel/cs-xe-2015--binary
module load intelmpi/5.0.2--binary
cd working_dir
mpirun executable
Submit your job
qsub script.pbs
Monitor your job
qstat [-u username]
Cancel your job
qdel JOB.id
Interactive example (option -I):
qsub -l select=1:ncpus=16 -A INFNG_test -I
> cat $PBS_NODEFILE
> exit
Default values assigned by the queue manager
- 1 CPU
- 8 GB of memory (each node has 128 GB of RAM)
- Max Walltime: 30 minutes
- MICs : 0
- MPI processes : 1 per node
- core allocation: pack (requested CPUs are packed onto the smallest number of nodes)
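For reference, an interactive request that specifies nothing but the account relies entirely on these defaults:
qsub -A INFNG_test -I     # 1 CPU, 8 GB of memory, 30-minute walltime, no MICs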
More complex requests
qsub -A INFNG_test -I -l ncpus=16,walltime=24:00:00               # ask for 16 CPUs and 1 day of walltime
qsub -A INFNG_test -I -l select=2:ncpus=16:mem=120gb              # ask for 2 chunks of 16 cores each (2 whole nodes)
qsub -A INFNG_test -I -l select=16:ncpus=1,place=scatter          # each chunk is allocated to a separate host (default)
qsub -A INFNG_test -I -l select=16:ncpus=1,place=pack             # all chunks are allocated from vnodes on the same host
qsub -A INFNG_test -I -l select=2:ncpus=16:mem=124gb:nmics=2      # ask for 2 whole nodes including MICs (16 cores, 124 GB and 2 MICs per node)
qsub -A INFNG_test -I -l select=2:ncpus=16:mem=120gb:mpiprocs=1   # PBS_NODEFILE includes 1 instance per node (default)
qsub -A INFNG_test -I -l select=2:ncpus=16:mem=120gb:mpiprocs=16  # PBS_NODEFILE includes 16 instances per node
Storage
CINECA documentation: Galileo Disks and file system
- $HOME (/galileo/home/userexternal/<username>): permanent, backed up
- $CINECA_SCRATCH (/gpfs/scratch/userexternal/<username>): temporary
- $WORK (/gpfs/work/<YOUR_GROUP_ACCOUNT_AREA>)
Use the local command "cindata" to query for disk usage and quota ("cindata -h" for help):
cindata
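In job scripts the same environment variables can be used to address these areas directly; a minimal sketch (the directory name myrun and the file input.dat are only placeholders):
mkdir -p $CINECA_SCRATCH/myrun          # create a run directory in the temporary scratch area
cp input.dat $CINECA_SCRATCH/myrun/     # stage the input there before submitting the job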
Software Environment
- OS: RedHat CentOS release 7, 64 bit
- Compilers, scientific libraries and tools are installed using the software modules mechanism.
CINECA Documentation: Programming environment - Compilers - Debuggers and profilers
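A typical session with the modules mechanism looks like the following (the intel and intelmpi modules are the ones used elsewhere on this page):
module avail                  # list the available software modules
module load intel intelmpi    # load a compiler and an MPI library
module list                   # show the currently loaded modules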
MIC job submission (Work in progress)
Compilation
- log in to a mic-node using the command
qsub -A INFNG_test -I -l select=1:ncpus=16:nmics=2 # select a whole node with 2 mics
- load needed modules and set variables
module load intel intelmpi mkl
source $INTEL_HOME/bin/compilervars.sh intel64
export I_MPI_MIC=enable
- compile your code (see the sketch after this list)
- exit
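As an illustration only, a compilation step could look like this (the source file names hello_offload.c and hello_mpi.c are hypothetical; icc and mpiicc are provided by the intel and intelmpi modules):
icc -o exe-offload.x hello_offload.c          # offload model: the binary runs on the host and offloads marked regions to the MICs
mpiicc -mmic -o exe-native.mic hello_mpi.c    # native model: -mmic produces a binary that runs directly on the coprocessor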
Execution on mic-node
qsub -A INFNG_test -I -l select=1:ncpus=16:nmics=2
module load intel
module load intelmpi
source $INTEL_HOME/bin/compilervars.sh intel64
./exe-offload.x
Execution using PBS from front-end
Example of PBS file
#!/bin/bash
#PBS -l select=1:ncpus=16:nmics=2
#PBS -l walltime=00:20:00
#PBS -A INFNG_test

# load required modules
module load intel intelmpi mkl
source $INTEL_HOME/bin/compilervars.sh intel64
export I_MPI_MIC=enable

# build the hostnames of the two MIC cards from the first host in $PBS_NODEFILE
# (e.g. node123.galileo... becomes node123-mic0 and node123-mic1)
export MIC0=$(head -n 1 $PBS_NODEFILE | sed "s/\([^.]*\).*/\1-mic0/")
export MIC1=$(head -n 1 $PBS_NODEFILE | sed "s/\([^.]*\).*/\1-mic1/")

cd <workdir>

# library search path for the MIC side of the run
export MIC_PATH=
export MIC_PATH=$MIC_PATH:/eurora/prod/compilers/intel/cs-xe-2013/binary/composer_xe_2013/mkl/lib/mic/
export MIC_PATH=$MIC_PATH:/eurora/prod/compilers/intel/cs-xe-2013/binary/composer_xe_2013/lib/mic

mpirun -genv LD_LIBRARY_PATH $MIC_PATH -host ${MIC0},${MIC1} -perhost 1 ./imb/3.2.4/bin/IMB-MPI1.mic pingpong
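The script is then submitted from the front-end as usual (the file name mic_job.pbs is just a placeholder):
qsub mic_job.pbs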