====== EURORA ======

[[http://www.cineca.it/en/content/eurora | What is EURORA]]

===== EURORA login =====

  * In order to become a CINECA user you have to register yourself on the CINECA UserDB ( https://userdb.hpc.cineca.it/user ). The procedure will create a new **username** associated with your identity (skip this step if you already have a CINECA username).

  * When you receive your username and password, send an e-mail to superc@cineca.it requesting to be enabled on the EURORA system in the framework of the INFN-CINECA agreement. CINECA will associate your username to the **CON13_INFN** account (every account has a budget and a set of usernames associated with it). http://www.hpc.cineca.it/content/accounting-0

  * At the end of the previous step you can access the Eurora front-end **login.eurora.cineca.it** via ssh or other standard tools: [[http://www.hpc.cineca.it/content/access-systems-0 | Access to the systems]]
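For example, once enabled you can log in with any standard ssh client:

  ssh <username>@login.eurora.cineca.it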

To view the usage summary of the accounts associated with your username:

  login> saldo -b

===== Eurora usage =====

[[http://www.hpc.cineca.it/content/general-information-0 | General Information]]
-
[[http://www.hpc.cineca.it/content/stay-tuned | Get in touch]]
-
[[http://www.hpc.cineca.it/content/eurora-user-guide | Eurora User guide]]


==== Architecture ====

[[http://www.hpc.cineca.it/hardware/eurora | Eurora Architecture]]

  * 64 computing nodes: E5-2660 Sandy Bridge, 16 cores each (node001 -> node064)
    * CPU speed: 2 GHz (node001 -> node032) or 3 GHz (node033 -> node064)
    * Memory: 32 GB (node039-044-055-064) or 16 GB (others)
    * GPUs: 2 NVIDIA K20 per node (32 nodes)
    * MICs: 2 MICs (Xeon Phi 5120D, 60 cores x 4 threads each) per node, named nodeXXX-mic0 and nodeXXX-mic1 (32 nodes)
      * Each MIC: 1.053 GHz, 512-bit SIMD (1056 GFlops DP, 2012 GFlops SP peak), 8 GB RAM (352 GB/s peak)
    * InfiniBand QDR: 1.1 us MPI latency, 40 Gb/s (4x) (from 8 Gb/s (1x) to 96 Gb/s (12x))
    * GPUs and MICs are connected to the host via PCIe (gen 2? 8 GB/s (16x))
    * Total peak performance: 150 TFlops

  login> pbsnodes -a | egrep '(Mom|available.mem|available.cpuspeed|available.nmics|available.ngpus)'

==== Batch scheduler ====

The job management facility adopted by CINECA is PBS:
[[http://www.hpc.cineca.it/content/batch-scheduler-pbs-0 | Batch Scheduler PBS]]

Available queues:
  * debug (max 2 nodes, 1/2 hour)
  * parallel (max 44 nodes, 6 hours)
  * longpar (max 22 nodes, 24 hours)

Script example (script.pbs):
<code>
#PBS -q debug
#PBS -l select=2:ncpus=16:mem=15GB:cpuspeed=3GHz
#PBS -A CON13_INFN
...
</code>

Submit your job:
  qsub script.pbs
Monitor your job:
  qstat [-u username]
Cancel your job:
  qdel <job_id>
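
A complete round trip looks like this (the job ID shown is hypothetical; qsub prints the real one at submission):

  login> qsub script.pbs
  123456.node129
  login> qstat -u $USER
  login> qdel 123456.node129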


Interactive example (option -I):

  qsub -q debug -l nodes=node021:ncpus=1 -A CON13_INFN -I
  > cat $PBS_NODEFILE
  > exit

Requesting more memory, e.g. for demanding compilations:
  qsub -q debug -l nodes=node021:ncpus=16:mem=15gb -A CON13_INFN -I


==== Storage ====

[[http://www.hpc.cineca.it/content/data-storage-and-filesystems-0 | Data storage and file systems]]

  $HOME (/eurora/home/userexternal/<username>) (permanent, backed up)
  $CINECA_SCRATCH (/gpfs/scratch/userexternal/<username>) (temporary)
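
A typical pattern (the input file name is hypothetical) is to stage data into the scratch area before a run:

  cp ~/input.dat $CINECA_SCRATCH/
  cd $CINECA_SCRATCH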

Use the local command "cindata" to query disk usage and quota ("cindata -h" for help):

  cindata

==== Software Environment ====

  * OS: RedHat CentOS release 6.3, 64 bit
  * Compilers, scientific libraries and tools are installed using the **software modules** mechanism.

http://www.hpc.cineca.it/content/eurora-user-guide#programming
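
A minimal sketch of the modules workflow (module names as in the examples below):

  module avail            # list the available software modules
  module load intel       # load a module, e.g. the Intel compilers
  module list             # show the currently loaded modules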

NOTE: The MIC system libraries are distributed through the following shared directories:

  * /cineca/prod/compilers/intel/cs-xe-2013/binary/lib/mic
  * /cineca/prod/compilers/intel/cs-xe-2013/binary/mkl/lib/mic
  * /cineca/prod/compilers/intel/cs-xe-2013/binary/impi/4.1.1.036/mic/lib
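
For offload runs, one way to make these visible on the coprocessor side is the Intel runtime's MIC library path variable (a sketch, assuming the directories above):

  export MIC_LD_LIBRARY_PATH=/cineca/prod/compilers/intel/cs-xe-2013/binary/lib/mic:/cineca/prod/compilers/intel/cs-xe-2013/binary/mkl/lib/mic:/cineca/prod/compilers/intel/cs-xe-2013/binary/impi/4.1.1.036/mic/lib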

===== Job submission =====

Basic set of examples for the different programming models (CPU only, CPU+GPU, CPU+MIC).

==== CPU ====

Example of PBS file:
<code>
#!/bin/bash
#PBS -l select=2:mpiprocs=2:ncpus=16:mem=15GB:cpuspeed=3GHz
#PBS -N d2d_bdir-remote
#PBS -l walltime=00:10:00
#PBS -q debug
#PBS -A CON13_INFN
</code>
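
The directives above only reserve the resources; a minimal job body (the executable name is hypothetical) would follow them in the same file:

<code>
# move to the directory the job was submitted from
cd $PBS_O_WORKDIR
module load intel intelmpi

# 4 MPI ranks in total: 2 chunks x 2 mpiprocs
mpirun -np 4 ./myapp.x
</code>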

==== GPU ====

http://www.hpc.cineca.it/content/gpgpu-general-purpose-graphics-processing-unit

==Compilation==

  * log in on a gpu-node using the command

     qsub -A CON13_INFN -I -l select=1:ncpus=16:ngpus=2 -q debug

  * load the necessary modules

     module load gnu/4.6.3
     module load cuda/5.0.35
     .....

  * compile (a sketch is given below)
  * exit
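
For example, a compilation sketch (source and output names hypothetical; -arch=sm_35 matches the K20's compute capability):

  nvcc -arch=sm_35 -o myapp.x myapp.cu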

==Execution==

Example of PBS file:
<code>
#!/bin/bash
#PBS -l select=2:mpiprocs=2:ncpus=16:ngpus=2
#PBS -N d2d_bdir-remote
#PBS -l walltime=00:10:00
#PBS -q debug
#PBS -A CON13_INFN

# load required modules
module load gnu
module load cuda

mpirun .....
</code>



==== MIC ====

[[http://www.hpc.cineca.it/content/quick-guide-intel-mic-usage | CINECA quick guide]]
-
[[http://www.prace-ri.eu/Best-Practice-Guide-Intel-Xeon-Phi-HTML?lang=en | PRACE best practice guide]]

==Compilation==

  * log in on a mic-node using the command

     qsub -A CON13_INFN -I -l select=1:ncpus=16:nmics=1

  * load the needed modules and set the variables

     module load intel intelmpi mkl
     source $INTEL_HOME/bin/compilervars.sh intel64
     export I_MPI_MIC=enable

  * compile (a sketch is given below)
  * exit
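
For example, a sketch of building the offload binary used in the next step (the source file name is hypothetical; icc compiles #pragma offload sections for the MIC automatically):

  icc -O2 -o exe-offload.x offload_test.c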

==Execution on mic-node==

     qsub -A CON13_INFN -I -l select=1:ncpus=16:nmics=2 -q debug
     module load intel
     module load intelmpi
     source $INTEL_HOME/bin/compilervars.sh intel64
     ./exe-offload.x

== Execution using PBS from the front-end ==

Example of PBS file:

<code>
#!/bin/bash
#PBS -l select=1:ncpus=16:nmics=2
#PBS -l walltime=00:20:00
#PBS -q debug
#PBS -A CON13_INFN

# load required modules
module load intel intelmpi mkl
source $INTEL_HOME/bin/compilervars.sh intel64
export I_MPI_MIC=enable

# derive the MIC host names (nodeXXX-mic0, nodeXXX-mic1) from the allocated node
export MIC0=$(head -n 1 $PBS_NODEFILE | sed 's/\([^.]*\)\./\1-mic0./')
export MIC1=$(head -n 1 $PBS_NODEFILE | sed 's/\([^.]*\)\./\1-mic1./')
cd <workdir>

# MIC-side library path, passed to the ranks via LD_LIBRARY_PATH
export MIC_PATH=
export MIC_PATH=$MIC_PATH:/eurora/prod/compilers/intel/cs-xe-2013/binary/composer_xe_2013/mkl/lib/mic/
export MIC_PATH=$MIC_PATH:/eurora/prod/compilers/intel/cs-xe-2013/binary/composer_xe_2013/lib/mic

# one rank on each MIC, running the native IMB pingpong benchmark
mpirun -genv LD_LIBRARY_PATH $MIC_PATH -host ${MIC0},${MIC1} -perhost 1 ./imb/3.2.4/bin/IMB-MPI1.mic pingpong
</code>
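
Submit it from the front-end like any other batch job (the file name is hypothetical):

  qsub mic_job.pbs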


== Network fabrics ==

http://www.prace-ri.eu/Best-Practice-Guide-Intel-Xeon-Phi-HTML?lang=en#id-1.7.3

Network fabrics available for the Intel Xeon Phi coprocessor: **shm, tcp, ofa, dapl**

The Intel MPI library automatically tries to use the best available network fabric detected (usually shm for intra-node communication and InfiniBand (dapl, ofa) for inter-node communication).

The default can be changed by setting the I_MPI_FABRICS environment variable to I_MPI_FABRICS=<fabric> or I_MPI_FABRICS=<intra-node fabric>:<inter-node fabric>.

The availability is checked in the following order: shm:dapl, shm:ofa, shm:tcp.
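
For example, to force shared memory within a node and TCP between nodes (useful if the InfiniBand fabrics are unavailable):

  export I_MPI_FABRICS=shm:tcp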



----

//2013/08/28//
  
