
GALILEO

NEWS

2015/02/23 Job accounting starts today. Users must be authorized by the "Iniziativa Specifica" (IS) to which they belong.

2015/01/28 Galileo is open to INFN users for "pre-production" tests. The account for this activity is INFNG_test.

The MIC accelerators are not available yet.

What is GALILEO

GALILEO login

  • In order to become a CINECA user you have to register on the CINECA UserDB ( https://userdb.hpc.cineca.it/user ). The procedure creates a new username associated with your identity (skip this step if you already have a CINECA username).
  • Each user must be associated with the account related to their "Iniziativa Specifica" (IS); this is needed for the accounting of the consumed budget. Please contact the person responsible for your IS in order to be enabled.
  • At the end of the previous step you can access the Galileo front-end login.galileo.cineca.it via ssh or other standard tools (see the CINECA documentation: Access to the systems).
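
For example, assuming <username> is your CINECA username, the login is a standard ssh connection:

 ssh <username>@login.galileo.cineca.it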

The following command displays the accounts associated with your username and their usage:

login> saldo -b  

GALILEO usage

Architecture

Galileo Architecture

Model: IBM NeXtScale  -  Architecture: Linux Infiniband Cluster
Nodes: 516
Processors: 8-core Intel Haswell 2.40 GHz (2 per node)
Cores: 16 cores/node, 8256 cores in total
Accelerators: 2 Intel Phi 7120p per node on 384 nodes (768 in total)
RAM: 128 GB/node, 8 GB/core
Internal Network: Infiniband with 4x QDR switches
Disk Space: 2,500 TB of local storage
Peak Performance: xxx TFlop/s (to be defined)

To get on-line details:

login> pbsnodes -a | egrep '(Mom|available.mem|available.cpuspeed|available.nmics|available.ngpus)'

Batch scheduler

The job management facility adopted by CINECA is PBS (see the CINECA documentation: Batch Scheduler PBS).

Routing Queue "route": This is the default queue. You have only to declare how many resources you need and your job will be directed into the right queue with a right priority. Normal parallel jobs will be routed to the "shared" execution queue. The maximum number of nodes that you can require is 128 with a maximum walltime of 24 hours.

Script example (script.pbs)

#!/bin/bash
#PBS -N prova                             # job name
#PBS -l walltime=02:00:00                 # requested walltime
#PBS -l select=16:ncpus=16:mpiprocs=16    # 16 chunks (whole nodes), 16 cores and 16 MPI processes each
#PBS -A INFNG_test                        # account (IS) to be charged
#
module load intel/cs-xe-2015--binary
module load intelmpi/5.0.2--binary
cd working_dir                            # replace with your working directory
mpirun executable                         # replace with your executable

Submit your job

 qsub script.pbs

Monitor your job

 qstat [-u username]

Cancel your job

 qdel JOB.id
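
Since qsub prints the job identifier on standard output, you can also capture it in a shell variable and reuse it with qstat and qdel; a minimal sketch:

 JOBID=$(qsub script.pbs)
 qstat $JOBID
 qdel $JOBID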

Interactive example (option -I):

qsub  -l select=1:ncpus=16  -A INFNG_test -I
> cat $PBS_NODEFILE
> exit
Default values assigned by the queue manager:
  • 1 CPU
  • 8 GB of memory (each node has 128 GB of RAM)
  • Max walltime: 30 minutes
  • MICs: 0
  • MPI processes: 1 per node
  • core allocation: pack (the requested CPUs are packed on the smallest number of nodes)

The default walltime is 30 minutes.
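
For example, an interactive request with no explicit resource list receives exactly these defaults:

 qsub -A INFNG_test -I     # 1 CPU, 8 GB of memory, 30 minutes walltime, no MICs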

More complex requests
qsub -A INFNG_test -I  -l ncpus=16,walltime=24:00:00                # ask 16 CPUs and 1 day of walltime
qsub -A INFNG_test -I  -l select=2:ncpus=16:mem=120gb               # ask 2 chunks of 16 CPUs each (2 whole nodes)
qsub -A INFNG_test -I  -l select=16:ncpus=1,place=scatter           # each chunk is allocated on a separate host (default)
qsub -A INFNG_test -I  -l select=16:ncpus=1,place=pack              # all chunks are allocated from vnodes on the same host
qsub -A INFNG_test -I  -l select=2:ncpus=16:mem=124gb:nmics=2       # ask 2 whole nodes including MICs (16 cores, 124 GB and 2 MICs per node)
qsub -A INFNG_test -I  -l select=2:ncpus=16:mem=120gb:mpiprocs=1    # PBS_NODEFILE includes 1 instance per node (default)
qsub -A INFNG_test -I  -l select=2:ncpus=16:mem=120gb:mpiprocs=16   # PBS_NODEFILE includes 16 instances per node

Storage

CINECA documentation: Galileo Disks and file system

$HOME            (/galileo/home/userexternal/<username>)   (permanent, backed up)
$CINECA_SCRATCH  (/gpfs/scratch/userexternal/<username>)   (temporary)
$WORK            (/gpfs/work/<YOUR_GROUP_ACCOUNT_AREA>)    (group account area)

Use the local command "cindata" to query for disk usage and quota ("cindata -h" for help):

cindata
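
A common pattern (the directory and file names below are only illustrative) is to run jobs in the scratch area and copy the results worth keeping to $WORK or $HOME:

 cd $CINECA_SCRATCH/myrun      # temporary area
 qsub script.pbs
 cp results.dat $WORK/         # group account area of your IS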

Software Environment

  • OS: RedHat CentOS release 7, 64 bit
  • Compilers, scientific libraries and tools are installed using the software modules mechanism.

CINECA Documentation: Programming environment - Compilers - Debuggers and profilers
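
A typical module session on the front-end looks like this (the module names are the same used in the job script example above):

 module avail                           # list the available software
 module load intel/cs-xe-2015--binary   # Intel compiler suite
 module load intelmpi/5.0.2--binary     # Intel MPI
 module list                            # show the currently loaded modules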

MIC job submission (Work in progress)

Compilation
  • log in to a MIC node using the command
   qsub -A INFNG_test -I -l select=1:ncpus=16:nmics=2 # select a whole node with 2 mics
  • load needed modules and set variables
   module load intel intelmpi mkl
   source $INTEL_HOME/bin/compilervars.sh intel64
   export I_MPI_MIC=enable
  • compile (see the example after this list)
  • exit
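
A minimal offload-style compilation, assuming a C source file hello-offload.c containing offload pragmas (the file name is only an example), could look like:

 icc -openmp -O2 hello-offload.c -o exe-offload.x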
Execution on a MIC node
   qsub -A INFNG_test -I -l select=1:ncpus=16:nmics=2 
   module load intel
   module load intelmpi
   source $INTEL_HOME/bin/compilervars.sh intel64
   ./exe-offload.x
Execution using PBS from the front-end

Example of PBS file

#!/bin/bash
#PBS -l select=1:ncpus=16:nmics=2
#PBS -l walltime=00:20:00
#PBS -A INFNG_test

# load required modules
module load intel intelmpi mkl
source $INTEL_HOME/bin/compilervars.sh intel64
export I_MPI_MIC=enable
# derive the hostnames of the two MIC cards from the first node in $PBS_NODEFILE
# (each card is reachable as <nodename>-mic0 / <nodename>-mic1; this assumes fully qualified node names)
export MIC0=$(head -n 1 $PBS_NODEFILE | sed "s/\([^.]*\)\./\1-mic0./")
export MIC1=$(head -n 1 $PBS_NODEFILE | sed "s/\([^.]*\)\./\1-mic1./")
cd  <workdir>

# library search path for the MIC side of the run (MKL and compiler runtime)
export MIC_PATH=
export MIC_PATH=$MIC_PATH:/eurora/prod/compilers/intel/cs-xe-2013/binary/composer_xe_2013/mkl/lib/mic/
export MIC_PATH=$MIC_PATH:/eurora/prod/compilers/intel/cs-xe-2013/binary/composer_xe_2013/lib/mic

# run the Intel MPI Benchmarks PingPong test with one process on each MIC card
mpirun -genv LD_LIBRARY_PATH $MIC_PATH -host ${MIC0},${MIC1} -perhost 1 ./imb/3.2.4/bin/IMB-MPI1.mic pingpong
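
Save the file (for instance as script-mic.pbs) and submit it from the front-end as usual:

 qsub script-mic.pbs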

