Date | Type | Tag |
---|---|---|
03/24 | MC | mc-v09_84_00_01-202403-cnaf-corrsce |
03/24 | DATA | run2-v09_84_00_01-202403-cnaf |
03/24 | MC | mc-v09_84_00_01-202403-cnaf |
02/24 | DATA | data_run2-v09_83_01202402-cnaf |
02/24 | MC | mc_nucosm-v09_83_01202402-cnaf |
12/23 | DATA | run2-v09_72_00_06-122023-variables |
This page details the steps needed to submit and monitor production campaigns (hereafter campaigns) at CNAF. Two main types of campaigns are possible: real data and MC data. In both cases, an initial setup is needed for each new campaign. After that, the campaign is submitted in multiple steps, with each step submitting a batch of jobs. At the end, a final check of the completion of the campaign is required.
Below, details on the initial setup, what to do while on shift and how to check the completion of the campaign are given.
The first step is to download and set up all the needed scripts. This must be done only once per campaign. All the needed batches of jobs for the current campaign will be submitted with the same scripts. Each production request has its own configuration and is associated with a (git) tag (<selected-tag>) used to download the correct version of the scripts. Once the tag has been provided, the shifter has to create a working directory in the default production folder (/storage/gpfs_data/icarus/local/prod) with the same name as the selected tag and access it:
cd /storage/gpfs_data/icarus/local/prod
mkdir <selected-tag>
cd <selected-tag>
From this folder, set up the ICARUS environment,
source /cvmfs/icarus.opensciencegrid.org/products/icarus/setup_icarus.sh
download the correct version of the scripts,
git clone https://baltig.infn.it/icarus/prod-scripts/ --recurse-submodules --branch <selected-tag>
and access the prod-scripts folder from where all steps will be submitted.
cd prod-scripts
The initial setup is now complete. Here is a complete example:
cd /storage/gpfs_data/icarus/local/prod
mkdir run2-v09_72_00_06-122023-variables
cd run2-v09_72_00_06-122023-variables
source /cvmfs/icarus.opensciencegrid.org/products/icarus/setup_icarus.sh
git clone https://baltig.infn.it/icarus/prod-scripts/ --recurse-submodules --branch run2-v09_72_00_06-122023-variables
cd prod-scripts
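The setup steps above can be collected in a small helper that prints the commands for a given tag, so that they can be reviewed before being run. This is only a sketch: the function name print_setup_commands and the PROD_DIR variable are hypothetical, while the paths and the repository URL are the ones given on this page.

```shell
# Hypothetical helper (not part of prod-scripts): print the setup commands for a tag.
# PROD_DIR is an illustrative variable holding the default production folder.
PROD_DIR="/storage/gpfs_data/icarus/local/prod"

print_setup_commands() {
    local tag="$1"
    # Refuse to continue without a tag, since every path depends on it
    if [ -z "$tag" ]; then
        echo "usage: print_setup_commands <selected-tag>" >&2
        return 1
    fi
    # Only print the commands; nothing is created or downloaded here
    cat <<EOF
cd ${PROD_DIR}
mkdir ${tag}
cd ${tag}
source /cvmfs/icarus.opensciencegrid.org/products/icarus/setup_icarus.sh
git clone https://baltig.infn.it/icarus/prod-scripts/ --recurse-submodules --branch ${tag}
cd prod-scripts
EOF
}

# Usage: print_setup_commands run2-v09_72_00_06-122023-variables
```

Printing instead of executing keeps the helper safe to call anywhere; the shifter still runs the commands by hand in the order shown.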
During the campaign, the shifter must check the status of the submitted jobs every 6 hours and submit a new batch of jobs if needed. The steps, in sequential order, are: check the number of pending jobs, configure the job submission, and submit the batch.
Once the submission of the production is complete, the shifter should check the completion of the campaign.
The first step is to check the number of jobs in the pending state. This can be done as follows:
If this number is smaller than 300, go to the next step. If not, repeat this step in 6 hours.
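The decision rule above can be written as a small check. This is a sketch under stated assumptions: the function name can_submit_batch and the MAX_PENDING variable are hypothetical, and the number of pending jobs is passed in as an argument rather than queried, so the command used to obtain it is not assumed here.

```shell
# Threshold taken from the text above; the variable name is illustrative.
MAX_PENDING=300

# Hypothetical helper: succeed (exit 0) when a new batch may be submitted.
can_submit_batch() {
    local pending="$1"
    # Reject empty or non-numeric input instead of guessing
    case "$pending" in
        ''|*[!0-9]*) echo "not a number: '$pending'" >&2; return 2 ;;
    esac
    # Strictly smaller than the threshold, as required by the procedure
    [ "$pending" -lt "$MAX_PENDING" ]
}

# Usage (120 stands for the number of pending jobs read by the shifter):
# if can_submit_batch 120; then echo "submit a new batch"; else echo "check again in 6 hours"; fi
```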
If the current number of pending jobs is smaller than 300, a new batch of jobs can be submitted. First, the shifter has to configure the job submission. To do so, go to the prod-scripts folder inside the working area created in the initial setup section (i.e. cd /storage/gpfs_data/icarus/local/prod/<selected-tag>/prod-scripts). Example:
cd /storage/gpfs_data/icarus/local/prod/run2-v09_72_00_06-122023-variables/prod-scripts
Then, the shifter has to configure the job submission by editing the variables.sh file. This step differs depending on the production type, real or MC data.
For real data, modify the variables.sh file with the details of the batch of jobs to be submitted. The only variable to be modified is the YOUR_CUSTOM_RUN_LIST variable. This should be a list of numbers corresponding to the runs to submit in the batch. The list of runs for each batch is provided in the batch.info file, located in the same folder created during the setup.
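As an alternative to opening an editor just to read the current value, the lines that mention a variable can be printed directly. The helper below is a sketch: show_variables is a hypothetical name, and it only searches for the variable name as literal text, so it makes no assumption about how variables.sh writes its assignments.

```shell
# Hypothetical helper: print, with line numbers, every line of a file
# that mentions one of the given variable names.
show_variables() {
    local file="$1"
    shift
    local name
    for name in "$@"; do
        # -n adds line numbers, -F treats the name as literal text
        grep -nF -- "$name" "$file" || echo "$name: not found in $file"
    done
}

# Usage, from the prod-scripts folder:
# show_variables variables.sh YOUR_CUSTOM_RUN_LIST
```

The function accepts several names at once, which is convenient when more than one variable has to be checked.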
Here is what to do for each batch:
1. Open (with an editor, e.g. vim, nano, emacs or whatever you like) the variables.sh file and check the YOUR_CUSTOM_RUN_LIST variable (it should be empty the first time).
2. Open (with an editor, e.g. vim, nano, emacs or whatever you like) the batch.info file and find the batch corresponding to the list of the YOUR_CUSTOM_RUN_LIST variable.
3. Update the YOUR_CUSTOM_RUN_LIST variable in variables.sh with the run list of the next batch (the first batch if the variable was empty).

For MC data, modify the variables.sh file with the details of the batch of jobs to be submitted. Two variables need to be modified: STARTING_RUN and NUMBER_OF_RUNS.
The values for both variables for each step are provided in the batch.info file, located in the same folder created during the setup.
Here is what to do for each step:
1. Open (with an editor, e.g. vim, nano, emacs or whatever you like) the variables.sh file and check the STARTING_RUN and NUMBER_OF_RUNS variables (they should both be 0 the first time).
2. Open (with an editor, e.g. vim, nano, emacs or whatever you like) the batch.info file and find the step corresponding to the values of the STARTING_RUN and NUMBER_OF_RUNS variables.
3. Update the STARTING_RUN and NUMBER_OF_RUNS variables in variables.sh with the values of the next step (the first step if both were 0).

You have to create a proxy with the VOMS extension. This step should be done every time you are going to submit a batch of jobs:
voms-proxy-init --voms icarus-exp.org --valid 72:00
Heads-up: have you created a proxy with the voms extension? If not:
voms-proxy-init --voms icarus-exp.org --valid 72:00
Then, go to the prod-scripts folder inside the working area created in the initial setup section (i.e. cd /storage/gpfs_data/icarus/local/prod/<selected-tag>/prod-scripts). Example:
cd /storage/gpfs_data/icarus/local/prod/run2-v09_72_00_06-122023-variables/prod-scripts
After updating the file variables.sh with the new batch, submit it:
./submit_production
The script will automatically submit all the needed jobs. This could take a few minutes during which the shell will look unresponsive (it is not).
Once the submission of the production is complete, the shifter should check the completion of the campaign.
For standard campaigns, this is done by running:
./get_info.sh
The script creates a logs folder in the prod-scripts folder, with five files inside:
./logs/all_raw_files.log # The list of all submitted folders
./logs/duplicated_folders.log # The list of folders with multiple output files
./logs/missing_files.log # The list of folders without output files
./logs/missing_folders.log # The list of missing folders
./logs/ok_files.log # The list of folders of the correctly processed files
The shifter should open (with an editor, e.g. vim, nano, emacs or whatever you like) all the log files and check the number of rows of each file:
./logs/all_raw_files.log # Check the number of rows of this file
./logs/duplicated_folders.log # This should have 0 rows
./logs/missing_files.log # This should have 0 rows
./logs/missing_folders.log # This should have 0 rows
./logs/ok_files.log # This should have the same number of rows as the all_raw_files.log
If everything is as described above, the campaign is completed! If not, ask for guidance.
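The row checks described above can also be done without opening each file. The following is a minimal sketch, not part of prod-scripts: the function names count_rows and check_campaign_logs are hypothetical, and rows are counted with wc -l, which assumes every row ends with a newline.

```shell
# Hypothetical helper: number of rows (newline-terminated lines) in a file.
count_rows() {
    wc -l < "$1" | tr -d ' '
}

# Hypothetical helper: apply the checks listed above to a logs folder.
# Returns 0 if all checks pass, 1 if a check fails, 2 if a log file is absent.
check_campaign_logs() {
    local dir="${1:-./logs}" status=0 f n all ok
    for f in all_raw_files duplicated_folders missing_files missing_folders ok_files; do
        if [ ! -f "$dir/$f.log" ]; then
            echo "log file not found: $dir/$f.log" >&2
            return 2
        fi
    done
    # These three files should have 0 rows
    for f in duplicated_folders missing_files missing_folders; do
        n="$(count_rows "$dir/$f.log")"
        if [ "$n" -ne 0 ]; then
            echo "$f.log has $n rows (expected 0)"
            status=1
        fi
    done
    # ok_files.log should have as many rows as all_raw_files.log
    all="$(count_rows "$dir/all_raw_files.log")"
    ok="$(count_rows "$dir/ok_files.log")"
    if [ "$all" -ne "$ok" ]; then
        echo "ok_files.log has $ok rows, all_raw_files.log has $all rows"
        status=1
    fi
    if [ "$status" -eq 0 ]; then
        echo "all checks passed ($all rows)"
    fi
    return "$status"
}

# Usage, from the prod-scripts folder after running ./get_info.sh:
# check_campaign_logs ./logs
```

If any check fails, the printed message names the file to inspect before asking for guidance.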
For non-standard campaigns, specific instructions will be provided and added to this page each time.
TO DO