====== Production Guide ======

====== List of Productions ======

^ Date ^ Type ^ Tag ^
| 03/24 | MC | mc-v09_84_00_01-202403-cnaf-corrsce |
| 03/24 | DATA | run2-v09_84_00_01-202403-cnaf |
| 03/24 | MC | mc-v09_84_00_01-202403-cnaf |
| 02/24 | DATA | data_run2-v09_83_01202402-cnaf |
| 02/24 | MC | mc_nucosm-v09_83_01202402-cnaf |
| 12/23 | DATA | run2-v09_72_00_06-122023-variables |

====== General Info ======

This page details the steps needed to submit and monitor production campaigns (hereafter campaigns) at CNAF. Two main types of campaigns are possible: **real** and **MC** data. In both cases, an initial setup is needed for each new campaign. After that, the campaign is submitted in multiple steps, with each step submitting a batch of jobs. At the end, a final check of the completion of the campaign is required.

Below, details are given on the [[https://wiki.infn.it/progetti/icarus/production-guide#initial_setup|initial setup]], [[https://wiki.infn.it/progetti/icarus/production-guide#what_to_do_while_on_shift|what to do while on shift]] and how to [[https://wiki.infn.it/progetti/icarus/production-guide#check_the_completion_of_the_campaing|check the completion of the campaign]].

====== Initial Setup ======

The first step is to download and set up all the needed scripts. This must be done **only once** per campaign. All the batches of jobs for the current campaign will be submitted with the **same** scripts. Each production request has its own configuration and is associated with a (//git//) tag (''<tag>'') used to download the correct version of the scripts.
Once the tag has been provided, the shifter has to create a working directory in the default production folder (''/storage/gpfs_data/icarus/local/prod'') with the same name as the selected tag and access it:

  cd /storage/gpfs_data/icarus/local/prod
  mkdir <tag>
  cd <tag>

From this folder, set up the ICARUS environment,

  source /cvmfs/icarus.opensciencegrid.org/products/icarus/setup_icarus.sh

download the correct version of the scripts,

  git clone https://baltig.infn.it/icarus/prod-scripts/ --recurse-submodules --branch <tag>

and access the ''prod-scripts'' folder, from which all steps will be submitted.

  cd prod-scripts

The initial setup is now complete. Here is a complete example:

  cd /storage/gpfs_data/icarus/local/prod
  mkdir run2-v09_72_00_06-122023-variables
  cd run2-v09_72_00_06-122023-variables
  source /cvmfs/icarus.opensciencegrid.org/products/icarus/setup_icarus.sh
  git clone https://baltig.infn.it/icarus/prod-scripts/ --recurse-submodules --branch run2-v09_72_00_06-122023-variables
  cd prod-scripts

====== What to do while on shift ======

During the campaign, the shifter is asked to check the status of the submitted jobs **every 6 hours** and to submit a new batch of jobs if needed.
The steps, in sequential order, are:

  - [[https://wiki.infn.it/progetti/icarus/production-guide#check_queue_s_status|Check queue's status]]
  - [[https://wiki.infn.it/progetti/icarus/production-guide#configure_the_next_job_submission|Configure the next job submission (either real or MC)]]
  - [[https://wiki.infn.it/progetti/icarus/production-guide#create_a_proxy_with_voms_extensions|Create a proxy with voms extensions]]
  - [[https://wiki.infn.it/progetti/icarus/production-guide#submit_the_batch_of_jobs|Submit the batch of jobs]]

Once the submission of the production is complete, the shifter should [[https://wiki.infn.it/progetti/icarus/production-guide#check_the_completion_of_the_campaing|check the completion of the campaign]].

===== Check queue's status =====

The first step is to check the number of jobs in the //pending// state. This can be done as follows:

{{ :progetti:icarus:monitoring.png?&800 |}}

  - Open the [[https://t1metria.cr.cnaf.infn.it/d/DPv3p6zGz/batch-overview?orgId=18&refresh=10s&var-retention=one_week&var-queue=icarus&viewPanel=18|grafana page]]
  - Click on "Local Pending Jobs"
  - Move the cursor to the rightmost part of the plot (corresponding to the current situation). A popup should appear with the number of local //pending// jobs.

If this number is smaller than **300**, go to the [[https://wiki.infn.it/progetti/icarus/production-guide#configure_the_next_job_submission|next step]]. If not, repeat this step in 6 hours.

===== Configure the next job submission =====

If the current number of //pending// jobs is smaller than **300**, a new batch of jobs can be submitted. First, the shifter has to configure the job submission. To do so, the shifter has to go to the ''prod-scripts'' folder inside the working area created in the initial setup section (i.e. ''cd /storage/gpfs_data/icarus/local/prod/<tag>/prod-scripts'').
Example:

  cd /storage/gpfs_data/icarus/local/prod/run2-v09_72_00_06-122023-variables/prod-scripts

Then, the shifter has to configure the job submission by editing the ''variables.sh'' file. This step differs depending on the production type, [[https://wiki.infn.it/progetti/icarus/production-guide#configure_the_job_submission_for_real_data_production|real]] or [[https://wiki.infn.it/progetti/icarus/production-guide#configure_the_job_submission_for_mc_data_production|MC]] data.

==== Configure the job submission for real data production ====

Here, the ''variables.sh'' file has to be modified with the details of the batch of jobs to be submitted. The **only** variable to be modified is **YOUR_CUSTOM_RUN_LIST**. This should be a list of numbers corresponding to the runs to submit in the batch. The list of runs for each batch is provided in the ''batch.info'' file, located in the same folder created during the setup.

Here is what to do for each batch:

  * open (with an editor, e.g. ''vim'', ''nano'', ''emacs'' or whatever you like) the //variables.sh// file and check the **YOUR_CUSTOM_RUN_LIST** variable (it should be empty the first time)
  * open (with an editor, e.g. ''vim'', ''nano'', ''emacs'' or whatever you like) the //batch.info// file and find the batch corresponding to the current value of the **YOUR_CUSTOM_RUN_LIST** variable
  * go to the next batch in the //batch.info// file and copy the corresponding list of runs into the **YOUR_CUSTOM_RUN_LIST** variable
  * save //variables.sh//

{{ :progetti:icarus:raw-data-step.png?direct&1000 |}}

==== Configure the job submission for MC data production ====

Here, the //variables.sh// file has to be modified with the details of the batch of jobs to be submitted. Two variables need to be modified:

  * **STARTING_RUN**
  * **NUMBER_OF_RUNS**

The values of both variables for each step are provided in the //batch.info// file, located in the same folder created during the setup.
Here is what to do for each step:

  * open (with an editor, e.g. ''vim'', ''nano'', ''emacs'' or whatever you like) the //variables.sh// file and check the **STARTING_RUN** and **NUMBER_OF_RUNS** variables (they should both be 0 the first time)
  * open (with an editor, e.g. ''vim'', ''nano'', ''emacs'' or whatever you like) the //batch.info// file and find the step corresponding to the current values of the **STARTING_RUN** and **NUMBER_OF_RUNS** variables
  * go to the next step in the //batch.info// file and copy the corresponding values into the **STARTING_RUN** and **NUMBER_OF_RUNS** variables
  * save //variables.sh//

{{ :progetti:icarus:mc-step.png?direct&1000 |}}

===== Create a proxy with voms extensions =====

Create a proxy with the voms extension. This step must be repeated **every time** you are going to submit a batch of jobs:

  voms-proxy-init --voms icarus-exp.org --valid 72:00

===== Submit the batch of jobs =====

**Heads-up**: have you created a proxy with the voms extension? If not:

  voms-proxy-init --voms icarus-exp.org --valid 72:00

Then, go to the ''prod-scripts'' folder inside the working area created in the initial setup section (i.e. ''cd /storage/gpfs_data/icarus/local/prod/<tag>/prod-scripts''). Example:

  cd /storage/gpfs_data/icarus/local/prod/run2-v09_72_00_06-122023-variables/prod-scripts

After updating the //variables.sh// file with the new batch, submit it:

  ./submit_production

The script will automatically submit all the needed jobs. This can take a few minutes, during which the shell will look unresponsive (it is not).

====== Check the completion of the campaign ======

Once the submission of the production is complete, the shifter should check the completion of the campaign.
For **standard campaigns**, this is done by running:

  ./get_info.sh

The script creates a ''logs'' folder in the ''prod-scripts'' folder, with five files inside:

  ./logs/all_raw_files.log       # The list of all submitted folders
  ./logs/duplicated_folders.log  # The list of folders with multiple output files
  ./logs/missing_files.log       # The list of folders without output files
  ./logs/missing_folders.log     # The list of missing folders
  ./logs/ok_files.log            # The list of folders of the correctly processed files

The shifter should open (with an editor, e.g. ''vim'', ''nano'', ''emacs'' or whatever you like) all the log files and check the number of rows of each file:

  ./logs/all_raw_files.log       # Check the number of rows of this file
  ./logs/duplicated_folders.log  # This should have 0 rows
  ./logs/missing_files.log       # This should have 0 rows
  ./logs/missing_folders.log     # This should have 0 rows
  ./logs/ok_files.log            # This should have the same number of rows as all_raw_files.log

If everything is as described above, the campaign is completed! If not, ask for guidance.

For **non-standard campaigns**, specific instructions will be provided and added to this page each time.

====== FAQ ======

TO DO
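
As a convenience for the check described in "Check the completion of the campaign", the row counts can also be compared from the shell instead of opening each log in an editor. The following is a minimal sketch and not part of ''prod-scripts'': the helper names ''count_rows'' and ''check_logs'' are hypothetical, and it assumes that ''get_info.sh'' has already written the five log files and that one row corresponds to one line. Treating a missing log file as a failure is a conservative assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of prod-scripts) to compare the row counts
# of the logs written by get_info.sh with the expected pattern.

# Print the number of rows (lines) in file $1; fail if the file is missing.
count_rows() {
    [ -f "$1" ] || return 1
    # grep -c '' counts every line, including a last line without newline.
    grep -c '' "$1" || true
}

# Check the logs in directory $1 (default: ./logs).
# Returns 0 only if every expectation is met.
check_logs() {
    local dir="${1:-./logs}"
    local rc=0 name rows all ok

    # These three logs should have 0 rows.
    for name in duplicated_folders missing_files missing_folders; do
        if rows=$(count_rows "$dir/$name.log"); then
            echo "$name.log: $rows rows (expected 0)"
            [ "$rows" -eq 0 ] || rc=1
        else
            echo "$name.log: not found"
            rc=1
        fi
    done

    # ok_files.log should have as many rows as all_raw_files.log.
    if all=$(count_rows "$dir/all_raw_files.log") &&
       ok=$(count_rows "$dir/ok_files.log"); then
        echo "all_raw_files.log: $all rows"
        echo "ok_files.log: $ok rows (expected $all)"
        [ "$ok" -eq "$all" ] || rc=1
    else
        echo "all_raw_files.log or ok_files.log: not found"
        rc=1
    fi

    return "$rc"
}
```

Once the functions are defined (for example by pasting them into the shell from the ''prod-scripts'' folder), ''check_logs ./logs'' prints one line per log file and returns a non-zero status when any expectation is not met. In that case, ask for guidance as for the manual check.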