Usage instructions have moved here: https://confluence.infn.it/display/TD/6+-+The+HPC+cluster
OLD STUFF:
To access the cluster you should first obtain an account at CNAF following the procedure you can find at this link.
In the application form specify in the "reason" field that you need to access the HPC cluster.
Please specify "Daniele Cesini" as contact person.
Once the CNAF account has been provided, you can log in to the bastion host.
This is not your user interface!
To access the cluster from the bastion log into:
ui-hpc.cr.cnaf.infn.it
using the same bastion credentials.
Information and Support can be asked to:
hpc-support <_at_> lists.cnaf.infn.it
Your home directory:
/home/HPC/<your_username>/
on the user interface is shared among all the cluster nodes.
No quotas are currently enforced on the home directories, and only about 4 TB are available in the /home partition.
If you need more disk space for data and checkpointing, every user can access the following directory:
/storage/gpfs_maestro/hpc/user/<your_username>/
which resides on shared GPFS storage.
Please do not leave huge unused files in either the home directories or the GPFS storage areas. Quotas will be enforced in the near future.
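To keep an eye on your footprint you can use standard tools from the user interface; a minimal sketch, using the paths given above (replace <your_username> with your account name):
# space used by your home directory and your GPFS area
du -sh /home/HPC/<your_username>/
du -sh /storage/gpfs_maestro/hpc/user/<your_username>/
# overall occupancy of the /home partition
df -h /home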
The cluster is managed and accessible via the LSF (version 9.1.2) batch system.
A detailed LSF user guide can be found at this IBM page.
What follows is a minimal how-to describing the basic operations needed to properly use the CNAF HPC cluster for various job types.
To obtain an overview of the node status:
bhosts -w
To obtain the queues status:
bqueues
Add the option "-l" to obtain detailed information.
Currently four queues have been defined (hpc_inf, hpc_inf_SL7, hpc_gpu and hpc_int; see the examples below).
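For example, a detailed view of a single queue can be obtained by passing its name to bqueues (hpc_inf here, one of the queues used in the examples below):
bqueues -l hpc_inf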
To obtain nodes load information:
lsload
Much more detail is available with:
lsload -l
To restrict the lsload query to numerical fields (e.g. io and r15s) use the "-I" option:
lsload -I io:r15s
To restrict the query to string fields use the "-s" option (e.g. gpu_model0):
lsload -s gpu_model0
Single batch jobs can be submitted via the bsub command.
Use option "-o" and "-e" to redirect standard output and standard error.
Option "-m" selects specific nodes if needed.
E.g.:
bsub -o test.out -e test.err /usr/bin/whoami
bsub -o test.out -e test.err -m 'hpc-200-06-05' /bin/hostname
As previously stated, standard output and standard error can be redirected with the "-o" and "-e" options of the bsub command.
The files generated in this way are available at the end of the job. They are owned by root but can be read and removed by the user; they cannot be edited directly. Should you need to edit them, make a copy with "cp".
To have output files updated in real time, redirect standard output and error using ">" after the executable name, enclosing the whole command in single quotes, e.g.:
bsub -o test.out -e test.err '/usr/bin/whoami > std.out 2>&1'
The single quotes are important, otherwise the output of the bsub command itself will be redirected.
Job status can be queried with the bjobs command.
Use the "-w" option to get Wide format. Displays job information without truncating fields
Use the "-W" and "-l" option detailed information about the job.
Use the job number to get information of a single job.
Use option "-u" to get information of a single user jobs.
"-a" Displays information about jobs in all states, including recently finished jobs.
E.g.:
bjobs -W
bjobs -l <JOBID>
bjobs -a -u <USERNAME>
To kill submitted jobs use the bkill command, e.g.:
bkill <JOBID>
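If you need to remove all of your jobs at once, LSF also accepts the special job ID 0, which kills all jobs you own (this is generic LSF behaviour, not specific to this cluster):
bkill 0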
Currently only OpenMPI jobs have been tested on the HPC cluster.
To submit an OpenMPI job please follow these steps:
[cesinihpc@ui-hpc ~]$ cat .bashrc
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi
export PATH=/usr/lib64/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/usr/lib64/openmpi/lib:$LD_LIBRARY_PATH
[cesinihpc@ui-hpc ~]$ cat .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi
[cesinihpc@ui-hpc ~]$ cat cpmpi_test.sh
#!/bin/sh
#can do initial environment setup here if needed
#export <something if needed>
echo "------------------------------------------------"
/usr/share/lsf/9.1/linux2.6-glibc2.3-x86_64/bin/mpirun.lsf env PSM_SHAREDCONTEXTS_MAX=8 <PATH_TO_YOUR_EXECUTABLE>
PLEASE NOTE: mpirun.lsf has to be used instead of standard mpirun!
PLEASE NOTE: PSM_SHAREDCONTEXTS_MAX=8 has to be used if you are not using whole nodes (i.e. the number of MPI processes is not a multiple of 32, the number of processors per node, which corresponds to ptile in LSF). If you are using whole nodes you can skip this and your job will use the maximum number of shared contexts available on a node (which is 16). If you are not using whole nodes and do not set the PSM_SHAREDCONTEXTS_MAX variable to a number lower than 16, the next job landing on the same node will probably fail.
PLEASE NOTE: do not specify the number of nodes or processes in the mpirun.lsf command; it is set via the bsub command and handled by LSF.
bsub -q <queue_name> -a openmpi -n 32 -R "span[ptile=16]" -o testmpi.out -e testmpi.err /usr/share/lsf/local/hpc/bin/cpmpi_test.sh
The option -R "span[ptile=16]" selects the number of processes per node. The maximum is 32.
If you want to select specific nodes you can use the option "-m", i.e. -m 'hpc-200-06-02 hpc-200-06-03 hpc-200-06-04'.
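For the whole-node case mentioned in the notes above, a sketch of a submission using two full 32-core nodes (queue name and wrapper path as in the previous example; with whole nodes the PSM_SHAREDCONTEXTS_MAX setting can be omitted from the wrapper) would be:
bsub -q <queue_name> -a openmpi -n 64 -R "span[ptile=32]" -o testmpi.out -e testmpi.err /usr/share/lsf/local/hpc/bin/cpmpi_test.sh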
To hide the complexity of the syntax of the submission command you can use this cnaf_launcher.sh script to launch OpenMPI jobs from the User Interface.
Just customize the first lines according to your needs.
(Thanks to S.Sinigardi for sharing it)
It is possible to avoid using mpirun.lsf and dynamically set the mpirun machine file in the following way:
1) Automatically create the machine file to be used by mpirun:
echo $LSB_HOSTS | awk '{split($0,array," ")} END {for (i in array) printf ("%s\n",array[i])}' | awk '{count[$0]++} END {for (word in count) print word,"slots=" count[word]}' > /home/HPC/username/mymachine.txt
2) Use this command to launch mpirun:
mpirun --machinefile /home/HPC/username/mymachine.txt -x PSM_SHAREDCONTEXTS_MAX=8 -np $LSB_DJOB_NUMPROC /home/HPC/username/executablename
A possible bsub submission is:
bsub -q hpc_inf_SL7 -n 16 -R "span[ptile=8]" -o testmpimy.out -e testmpimy.err /home/HPC/username/run_this_example.sh
where the run_this_example.sh script runs the previous commands:
---- run_this_example.sh ----
#!/bin/bash
echo $LSB_HOSTS | awk '{split($0,array," ")} END {for (i in array) printf ("%s\n",array[i])}' | awk '{count[$0]++} END {for (word in count) print word,"slots=" count[word]}' > /home/HPC/username/mymachine.txt
mpirun --machinefile /home/HPC/username/mymachine.txt -x PSM_SHAREDCONTEXTS_MAX=8 -np $LSB_DJOB_NUMPROC /home/HPC/username/executablename
[cesinihpc@ui-hpc ~]$ cat test_2gpu_lsf.sh
#!/bin/sh
export BASE=/usr/local/cuda-5.5/
export PATH=$BASE/bin:$PATH
export C_INCLUDE_PATH=$BASE/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=$BASE/include:$CPLUS_INCLUDE_PATH
export LD_LIBRARY_PATH=$BASE/lib:$BASE/lib64:/usr/local/cuda-5.0/lib64/:$LD_LIBRARY_PATH
#env
#echo "------------------------------------------------"
#now your GPU executable
/home/HPC/cesinihpc/test_2gpu.exe
# if it is a GPU and OpenMPI job:
# /usr/share/lsf/9.1/linux2.6-glibc2.3-x86_64/bin/mpirun.lsf env PSM_SHAREDCONTEXTS_MAX=8 /home/HPC/cesinihpc/test_2gpu.exe
# remember to add options "-a openmpi" and "-n <NP>" in the bsub command
#############
bsub -q hpc_inf -R "select [gpu_model0=='TeslaK20m' && gpu_model1=='TeslaK20m' ] rusage [ngpus_excl_p=2]" -o jtest.out -e jtest.err /home/HPC/cesinihpc/test_2gpu_lsf.sh
The -R option shown in the example selects a node with two Tesla K20 GPUs. Customise it according to your requirements.
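As a sketch of such a customisation (resource names as above; <your_gpu_script.sh> is a placeholder for your own wrapper script), a job needing a single K20 GPU could be submitted as:
bsub -q hpc_inf -R "select [gpu_model0=='TeslaK20m'] rusage [ngpus_excl_p=1]" -o jtest.out -e jtest.err <your_gpu_script.sh>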
If your job does not use many CPU cores and the site is fully occupied by CPU-only jobs, you can submit your GPU job to the hpc_gpu queue to access "extra" cores that are not reachable via the hpc_inf queue.
The hpc_gpu queue can use only 2 cores, and only on the nodes where the GPUs are installed.
The hostgroups gpuk20 and gpuk40 have been defined to simplify the submission command.
E.g.:
bsub -q hpc_gpu -m gpuk40 -R "rusage [ngpus_excl_p=2]" -o jtest.out -e jtest.err /home/HPC/cesinihpc/test_2gpu_lsf.sh
PLEASE NOTE: the "rusage" directive is important - set it to the number of GPUs you need in the node.
LSF will subtract the number of GPUs specified in "rusage" from the number of GPUs available to other jobs on the node.
If your job is also an OpenMPI job add the options -a openmpi and -n <NP> and use the mpirun.lsf launcher in the wrapper as described in the previous section.
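A hedged sketch of such a combined submission, assuming 2 MPI processes on a gpuk40 node (within the 2-core limit of the hpc_gpu queue) and a wrapper modified to call the mpirun.lsf line shown, commented out, in test_2gpu_lsf.sh:
bsub -q hpc_gpu -m gpuk40 -a openmpi -n 2 -R "rusage [ngpus_excl_p=2]" -o jtest.out -e jtest.err /home/HPC/cesinihpc/test_2gpu_lsf.sh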
To allow interactive access to the nodes for debugging, testing and compiling purposes, an interactive shell can be opened on the nodes by submitting a job with the option -Is, e.g.:
bsub -q hpc_int -Is /bin/bash
PLEASE NOTE: after about two hours you will be logged out! Do not use the interactive shell to run real, long-running jobs.
Information and Support can be asked to:
hpc-support <_at_> lists.cnaf.infn.it