Technical Guide for Extracting and Processing AMS-02 Data Using ROOT, the NAIA Framework (Ntuples for AMS-Italy Analysis), and HTCondor

A Practical Setup and Workflow Document

INFN Roma Internal Note - Version 1.0 - 1 June 2025

Authors: Alessandro Bartoloni and Mustafa Mohammad Rafiei

Affiliations: INFN Roma I

Last Updated: — Mustafa Mohammad Rafiei 2025/06/18 12:19

Abstract

This guide explains the process of extracting and processing AMS-02 data using ROOT, the NAIA framework, and HTCondor. First, having CERN and CNAF accounts is essential, and connecting to the "ui-ams" server is required. The initial setup includes creating SSH keys for secure remote access.

After preparing accounts, AlmaLinux 9.5 is installed as the recommended operating system. Then, Visual Studio Code (VS Code) is installed for code development, and the Remote - SSH extension is configured to connect to "ui-ams."

Once VS Code is set up, the NAIA library is installed on the CNAF server or a local machine. Projects using NAIA require configuring the CMakeLists.txt file, and directory structures are defined for organization. The NSL library is then installed, and the usage of both libraries within a project is explained.

Next, relevant C++ code is written, compiled, and executed. To enable parallel processing for ROOT data, HTCondor is configured, and necessary ".bashrc" and ".bash_profile" settings are applied on "ui-ams."

A list of ROOT data files is prepared, and a "run.sh" script is written for execution. Job files (.sub) are created for each data file and submitted to HTCondor for parallel processing. Any held jobs can be resubmitted.

After processing, output ".root" files are merged. A similar procedure is followed for Monte Carlo simulation data, where directory structures are set, file lists are created, and job submission occurs.

Monte Carlo jobs are submitted in parallel within HTCondor, and if needed, resubmissions are performed. Finally, the processed output ".root" files are merged.

This guide covers all essential steps for efficiently and systematically executing AMS-02 data analysis, ensuring an organized workflow from setup to final processing.

1) Obtaining CNAF and CERN accounts and connecting to the ui-ams machine

1.1) CNAF and CERN account

To access AMS computing resources, you must first request valid CNAF and CERN accounts. Please note that approval may take some time.

Once your accounts are approved and you receive your username and password, you can connect to CNAF services.

1.2) Connecting to the “ui-ams” machine

The ui-ams machine is a virtual machine hosted on a physical server known as the bastion.

To access ui-ams, you must first log in to the bastion server.

Steps:

Open a terminal and connect to the bastion:

  ssh UserName@bastion.cnaf.infn.it

After logging in, connect to the virtual machine:

  ssh UserName@ui-ams

Note: bastion.cnaf.infn.it is a real server, ui-ams is a virtual machine running on the bastion.
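With OpenSSH 7.3 or newer, the two logins above can usually be combined into one command through the `-J` (jump host) option. The sketch below only prints that command for a given account name. The helper name `ams_login_cmd` is illustrative, and `UserName` is a placeholder, as in the steps above.

```shell
# Print the single-command form of the two-step login shown above.
# "ams_login_cmd" is a hypothetical helper name; pass your CNAF username.
ams_login_cmd() {
    printf 'ssh -J %s@bastion.cnaf.infn.it %s@ui-ams\n' "$1" "$1"
}

ams_login_cmd UserName
```

Running the printed command should open a session on ui-ams through the bastion. Until SSH keys are installed, a password is requested for each hop.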

1.3) Creating SSH keys for passwordless access

This step is optional but highly recommended to avoid repeated password entry.

a) Install SSH (if not already installed)

  sudo dnf install openssh-clients -y

On AlmaLinux, this package is usually already installed.

b) Generate SSH Key Pair

Recommended (modern and secure):

  ssh-keygen -t ed25519 -C "your_email@example.com"

Alternative (if required for compatibility):

  ssh-keygen -t rsa -b 4096 -C "your_email@example.com"

You can leave the passphrase empty or enter one for added security.

c) Copy Your Public Key to the Bastion Server

  ssh-copy-id username@bastion.cnaf.infn.it

If this fails, use the manual method below.

Manual Method (only if ssh-copy-id fails): Connect to the bastion:

  ssh username@bastion.cnaf.infn.it

On the server:

  mkdir -p ~/.ssh
  nano ~/.ssh/authorized_keys

On your local machine, open your public key file (~/.ssh/id_ed25519.pub or ~/.ssh/id_rsa.pub) and copy its contents.

Paste the key into the authorized_keys file on the server.

Set proper permissions:

  chmod 700 ~/.ssh
  chmod 600 ~/.ssh/authorized_keys
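The manual steps above can also be scripted. The following is a minimal sketch, assuming it runs on the server and that the public key is passed as a single line of text. The function name `add_authorized_key` is hypothetical. It skips keys that are already present and applies the same permissions as the `chmod` commands above.

```shell
# add_authorized_key SSH_DIR PUBKEY_LINE
# Appends PUBKEY_LINE to SSH_DIR/authorized_keys unless it is already there,
# then applies the permissions used in the manual steps above.
# (Hypothetical helper; on the server SSH_DIR would be "$HOME/.ssh".)
add_authorized_key() {
    ssh_dir=$1
    pubkey=$2
    mkdir -p "$ssh_dir" || return 1
    touch "$ssh_dir/authorized_keys" || return 1
    # -x: match the whole line, -F: treat the key as a fixed string
    if ! grep -qxF -- "$pubkey" "$ssh_dir/authorized_keys"; then
        printf '%s\n' "$pubkey" >> "$ssh_dir/authorized_keys"
    fi
    chmod 700 "$ssh_dir" && chmod 600 "$ssh_dir/authorized_keys"
}
```

A call would look like `add_authorized_key "$HOME/.ssh" "$(cat id_ed25519.pub)"`, assuming the public key file has first been copied to the server.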

d) Copy Public Key to ui-ams via Bastion

Use ProxyJump to transfer your public key to the virtual machine:

  ssh-copy-id -o ProxyJump=UserName@bastion.cnaf.infn.it UserName@ui-ams

e) SSH Configuration for Simpler Access

Edit your SSH config file:

  nano ~/.ssh/config

Add the following:

  Host bastion
      HostName bastion.cnaf.infn.it
      User UserName

  Host ui-ams
      HostName ui-ams
      User UserName
      ProxyJump bastion
      IdentityFile ~/.ssh/id_ed25519

Now you can simply connect with:

  ssh ui-ams
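Before relying on the short form, it can help to confirm that both aliases are actually defined in the configuration file. The sketch below is one way to do that with `awk`. The function name `has_host_entry` is hypothetical, and it only looks for literal alias names on `Host` lines (wildcard patterns are not interpreted).

```shell
# has_host_entry CONFIG_FILE ALIAS
# Succeeds when CONFIG_FILE has a "Host" line that lists ALIAS.
# (Hypothetical helper; it does not interpret wildcard patterns.)
has_host_entry() {
    awk -v want="$2" '
        tolower($1) == "host" {
            for (i = 2; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}

# Example: report whether the two aliases from this section are defined.
for alias_name in bastion ui-ams; do
    if has_host_entry "$HOME/.ssh/config" "$alias_name" 2>/dev/null; then
        echo "$alias_name: defined"
    else
        echo "$alias_name: missing"
    fi
done
```

As a complementary check, `ssh -G ui-ams` prints the options OpenSSH resolves for the alias (including `proxyjump` and `identityfile`) without opening a connection.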

f) Optional: Using RSA Instead of ed25519

If your system requires RSA for compatibility:

Generate RSA key pair:

  ssh-keygen -t rsa -b 4096 -C "your_email@example.com"

Use the same ssh-copy-id steps to transfer the key to both servers. The rest of the SSH configuration is unchanged; to select the RSA key explicitly, the ui-ams entry should read:

  Host ui-ams
      HostName ui-ams
      User UserName
      ProxyJump bastion
      IdentityFile ~/.ssh/id_rsa

Test the connection:

  ssh ui-ams

2) Installing AlmaLinux 9.5

AlmaLinux 9.5 is the recommended OS for local development of NAIA/NSL‑based C++ analysis. You can either run it in a VM on top of Windows or install it alongside Windows (dual‑boot). Dual‑boot is preferred for full native performance.

2.1) Download the AlmaLinux 9.5 ISO

- Official ISO & Cloud Images: → https://almalinux.org/get-almalinux/

- Installation Guide (step‑by‑step with screenshots): → https://wiki.almalinux.org/documentation/installation-guide.html

2.2) Installation Methods

- Virtual Machine (No disk repartitioning required)

recommended VM platforms:

Example tutorial: “AlmaLinux 9.5 Installation on VirtualBox” → https://www.servermania.com/kb/articles/how-to-install-almalinux-in-virtualbox/

- Dual‑Boot with Windows (Full native performance, ideal for compiling/testing C++ code against ROOT, NAIA, NSL)

General dual‑boot tutorial: “How to Dual‑Boot Linux and Windows” → https://itsfoss.com/dual-boot-linux-windows-guide/

2.3) Quick Installation Steps

  1. Create a bootable USB with Rufus (Windows) or `dd` (Linux/macOS).
  2. Boot from the USB and choose “Install AlmaLinux 9.5.”
  3. Partition the disk: for dual-boot, shrink the existing Windows partition, then allocate the freed space to AlmaLinux.
  4. Select software: include the “Development Tools” group to get compilers/CMake.
  5. Set the hostname, user account, and firewall rules as needed.
  6. Reboot and choose AlmaLinux or Windows at the GRUB menu.

Note: Even though most heavy analysis runs on CNAF servers, having a local AlmaLinux 9.5 environment lets you quickly build and test your C++ code (with 1–2 ROOT input files) before scaling up to batch jobs on HTCondor.
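After the installation, a quick check of which build tools are on the PATH can save time later. The sketch below is one possible way to do this. The function name `missing_tools` is hypothetical, and the tool list is an assumption based on the commands used elsewhere in this guide (`root-config` is the helper script installed with ROOT).

```shell
# missing_tools TOOL...
# Prints the name of every listed tool that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools this guide relies on later; root-config is installed with ROOT.
missing=$(missing_tools g++ cmake make git root-config)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All required tools found"
fi
```

Any tool reported as missing can then be installed with `dnf` or, in the case of ROOT, by following the ROOT installation instructions.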

3) Installing Visual Studio Code (VS Code) on AlmaLinux 9.5

Visual Studio Code (VS Code) is a lightweight yet powerful source-code editor developed by Microsoft. Known for its speed, extensibility, and robust feature set, it is widely used by developers. Note: VS Code also has versions for Windows and macOS.

3.1) Installing Visual Studio Code on AlmaLinux 9.5

Follow these steps to install Visual Studio Code (VS Code) on AlmaLinux 9.5:

Update Your System

Before installing, update your system to ensure it has the latest packages:

  sudo dnf update -y

Add the Microsoft GPG Key

To ensure the authenticity of the packages, import the Microsoft GPG key:

  sudo rpm --import https://packages.microsoft.com/keys/microsoft.asc

Add the VS Code Repository

Create a repository file for VS Code:

  printf "[vscode]\nname=packages.microsoft.com\nbaseurl=https://packages.microsoft.com/yumrepos/vscode/\nenabled=1\ngpgcheck=1\nrepo_gpgcheck=1\ngpgkey=https://packages.microsoft.com/keys/microsoft.asc\nmetadata_expire=1h\n" | sudo tee /etc/yum.repos.d/vscode.repo

Install VS Code

Now, install Visual Studio Code using the following command:

  sudo dnf install code -y

Launch VS Code

Once the installation is complete, you can launch Visual Studio Code by typing:

  code

Alternatively, you can open it via the graphical user interface (GUI).

Update VS Code

To keep VS Code up to date, use the following command:

  sudo dnf update code -y

Alternatively, check for updates within the VS Code interface.

Uninstall (Optional)

If you need to remove Visual Studio Code, use the following command:

  sudo dnf remove code -y

You can also delete the repository file by running:

  sudo rm -f /etc/yum.repos.d/vscode.repo

Note: Ensure you have internet access for these steps. If you're working offline, you can manually download the RPM package and install it. If you encounter issues like missing dependencies (e.g., `libffmpeg.so`), you may need to install additional libraries or use Flatpak/Snap.

3.2) Setting Up VS Code to Connect to ui-ams via SSH

Now that you've set up your SSH keys and configured access to `ui-ams` via the intermediate server (`bastion.cnaf.infn.it`), follow these steps to configure Visual Studio Code (VS Code) to connect to `ui-ams` via SSH.

I) Ensure SSH Keys and Configuration Are Set Up

Make sure you've already completed the SSH key generation and configuration for connecting to `ui-ams`.

II) Install Visual Studio Code on Your Laptop

If you haven't already, install Visual Studio Code on your AlmaLinux 9.5 machine.

III) Install the Remote - SSH Extension in VS Code

To enable remote access via SSH, you'll need the "Remote - SSH" extension. Follow these steps:

- Open Visual Studio Code.

- Go to the Extensions view by pressing Ctrl+Shift+X or clicking the Extensions icon on the sidebar.

- Search for "Remote - SSH" (developed by Microsoft).

- Click Install to add the extension to VS Code.

Explanation: The Remote - SSH extension allows you to connect to remote servers via SSH, edit files, run terminal commands, and debug remotely as if the server were local.

IV) Configure VS Code to Connect to ui-ams

VS Code's Remote - SSH extension uses your existing SSH configuration (`~/.ssh/config`) to connect to remote hosts.

Ensure your SSH config is correctly set up with the ProxyJump directive for `bastion.cnaf.infn.it`.

To initiate the connection:

- Click the remote indicator (the "><" button) in the bottom-left corner of VS Code; its color depends on the active theme. Alternatively, press Ctrl+Shift+P and search for "Remote-SSH: Connect to Host".

- From the list of hosts, select ui-ams.

Explanation: The Remote - SSH extension reads the `~/.ssh/config` file and automatically routes the connection through `bastion.cnaf.infn.it` using the ProxyJump configuration.

V) Authenticate and Connect

On your first connection attempt, VS Code may ask you to enter the passphrase for your SSH key (if you set one). After authenticating, VS Code will establish the connection to `ui-ams`.

You may also be prompted to verify the server's fingerprint for security purposes. Type yes to continue.

Explanation: Since you’ve already copied your SSH public key to both servers, the connection should be passwordless unless you set a passphrase for your SSH key.

VI) Set Up the Remote Environment

During the first connection, VS Code will automatically install a lightweight server component on `ui-ams`. Once installed, you'll be able to interact with `ui-ams` as if it were a local machine.

Explanation: The server component allows you to edit files, use the terminal, and install extensions on `ui-ams`.

VII) Work on ui-ams

After connecting, you can start working on `ui-ams`:

- Open Files/Folders: Use the File Explorer in VS Code to open a directory on `ui-ams` (e.g., `/home/your_username/myproject`).

- Use the Terminal: Open the integrated terminal (Ctrl+`) to run commands directly on `ui-ams`.

- Install Extensions Remotely: Install extensions (e.g., C++, Python, GitHub Copilot) on `ui-ams` to enhance your development workflow.

Explanation: Once connected, VS Code treats `ui-ams` as if it were a local machine. You can edit, run, and debug files directly on the remote server.

VIII) Test the Connection

To ensure everything is set up correctly, try opening a file or running a command in the terminal (e.g., `pwd` should return `/home/your_username` on `ui-ams`).

Explanation: This confirms that VS Code is properly connected to `ui-ams`, and you're ready to work remotely.

4) Installing the NAIA Library on CNAF or Local Machine

last edit — Alessandro Bartoloni 2025/05/31

For official instructions, refer to the NAIA documentation:

https://naia-readthedocs.readthedocs.io/en/1.2.0/build-install.html

4.1) Requirements

To build and use NAIA locally, your system must meet the following requirements:

  • A C++ compiler with full C++17 support (tested with GCC ≥ 12.1.0)
  • CMake version ≥ 3.13
  • ROOT built with C++17 support (tested with ROOT ≥ 6.28/04)

Supported platforms: CentOS7 and RHEL9 derivatives (e.g., AlmaLinux 9, Rocky Linux 9)
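The minimum versions listed above can be compared against the installed ones with a small helper. This is a sketch, not part of NAIA: the function name `version_ge` is hypothetical and it relies on the `-V` (version sort) option of GNU `sort`. The version strings in the loop are literal examples, not measurements.

```shell
# version_ge HAVE WANT
# Succeeds when version string HAVE is >= WANT (uses GNU "sort -V").
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example with literal version strings; on a real system HAVE would come
# from commands such as "cmake --version" or "g++ -dumpfullversion".
for pair in "3.26.5 3.13" "11.4.1 12.1.0" "6.28/04 6.28/04"; do
    set -- $pair
    if version_ge "$1" "$2"; then
        echo "$1 satisfies >= $2"
    else
        echo "$1 is older than $2"
    fi
done
```

On a real machine one could write, for instance, `version_ge "$(g++ -dumpfullversion)" 12.1.0`.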

If your machine has access to CVMFS, you can use precompiled NAIA versions and required dependencies located at:

  • `/cvmfs/ams.cern.ch/Offline/amsitaly/public/install/x86_64-centos7-gcc12.1/naia`
  • `/cvmfs/ams.cern.ch/Offline/amsitaly/public/install/x86_64-el9-gcc12.1/naia`

Each version includes a ready-to-use environment setup script, e.g., for CentOS7:

  /cvmfs/ams.cern.ch/Offline/amsitaly/public/install/x86_64-centos7-gcc12.1/naia/v1.1.0/setenvs/setenv_gcc6.28_cc7.sh

Note: These requirements apply only when installing NAIA on a local machine. All dependencies are already installed on CNAF (ui-ams).

4.2) Building and Installing NAIA

Follow the steps below to build and install NAIA from source.

I) Clone the NAIA (v1.2.0) Repository

Choose one of the following methods (a CERN account is required):

  git clone --recursive https://username@gitlab.cern.ch:443/ams-italy/naia.git -b v1.2.0  #Kerberos
  git clone --recursive ssh://git@gitlab.cern.ch:7999/ams-italy/naia.git -b v1.2.0 #SSH
  git clone --recursive https://gitlab.cern.ch/ams-italy/naia.git -b v1.2.0 #HTTPS

Note: Check for the latest NAIA release before cloning; the commands above use version 1.2.0.

II) Create Build and Install Directories

Create the build and install directories next to the cloned source (this assumes the repository was cloned inside ~/NAIA_library):

  mkdir -p ~/NAIA_library/naia.build ~/NAIA_library/naia.install

The resulting layout is:

  ~/NAIA_library/
  ├── naia/          # Source directory (from git clone)
  ├── naia.build/    # Build directory
  └── naia.install/  # Target installation directory

III) Build and Install NAIA

Run the following commands:

  cd ~/NAIA_library/naia.build
  cmake ../naia -DCMAKE_INSTALL_PREFIX=../naia.install
  make all install

IV) Set Up Environment Variables on CNAF (ui-ams)

Edit your shell configuration files:

Open your .bashrc:

  nano ~/.bashrc

Add the following line at the end (replace <your_username> with your CNAF username):

  source /storage/gpfs_ams/ams/users/<your_username>/NAIA_library/naia/setenvs/setenv_gcc6.28_el9.sh

The setenv_gcc6.28_el9.sh script is located in the setenvs folder inside the NAIA source directory you cloned earlier.

Save and exit

Apply the changes:

  source ~/.bashrc

Repeat the process for .bash_profile:

  nano ~/.bash_profile

Add the same line:

  source /storage/gpfs_ams/ams/users/<your_username>/NAIA_library/naia/setenvs/setenv_gcc6.28_el9.sh

Then:

  source ~/.bash_profile
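Instead of editing both files by hand, the same line can be added by a script that is safe to run more than once. The sketch below is one possible approach. The function name `add_guarded_source` is hypothetical, and it writes a guarded form of the line so that a login does not print an error if the storage area holding the script is temporarily unavailable. It uses `.`, the portable equivalent of `source`.

```shell
# add_guarded_source RC_FILE SETENV_SCRIPT
# Appends a block to RC_FILE that sources SETENV_SCRIPT only if it exists.
# Does nothing when RC_FILE already mentions SETENV_SCRIPT.
# (Hypothetical helper name; both arguments are paths you choose.)
add_guarded_source() {
    rc_file=$1
    setenv_script=$2
    touch "$rc_file" || return 1
    if grep -qF -- "$setenv_script" "$rc_file"; then
        return 0
    fi
    printf 'if [ -f "%s" ]; then\n    . "%s"\nfi\n' \
        "$setenv_script" "$setenv_script" >> "$rc_file"
}
```

It would be called once per file, for example `add_guarded_source ~/.bashrc /path/to/setenv_gcc6.28_el9.sh`, where the path is the location of the script in your own NAIA source directory.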

5) Using NAIA in the Project

last edit — Alessandro Bartoloni 2025/05/31

To use NAIA ntuples in your C++ project, you need to include:

Header files located in:

  naia.install/include

Libraries located in:

  naia.install/lib/libNAIAUtility.so  
  naia.install/lib/libNAIAContainers.so  
  naia.install/lib/libNAIAChain.so
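A quick way to confirm that the installation step produced these files is sketched below. The function name `check_naia_install` is hypothetical. The file names are the ones listed above, and the sketch assumes the libraries are installed under `lib/` as shown (some systems may use `lib64/` instead).

```shell
# check_naia_install PREFIX
# Lists expected files that are missing under PREFIX (e.g. naia.install)
# and fails if any are missing. File names are the ones listed above.
check_naia_install() {
    prefix=$1
    status=0
    for rel in include \
               lib/libNAIAUtility.so \
               lib/libNAIAContainers.so \
               lib/libNAIAChain.so; do
        if [ ! -e "$prefix/$rel" ]; then
            echo "missing: $prefix/$rel" >&2
            status=1
        fi
    done
    return "$status"
}
```

For the layout used in this guide the call would be `check_naia_install ~/NAIA_library/naia.install`.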

The recommended approach for integrating NAIA into your project is by using CMake.

5.1) CMakeLists.txt NAIA example file

NAIA provides CMake configuration files that simplify integration by automatically handling include paths and library linking.

The following example sets up a basic C++ project that:

  • Requires CMake version 3.10 or higher.
  • Uses the C++17 standard.
  • Depends on NAIA version 1.2.0 or newer.
  • Builds an executable named main from the source file src/main.cpp.
  • Links the executable with the NAIA::NAIAChain component of the NAIA library.

Below is a sample CMakeLists.txt file with inline comments explaining each step:

  
  # Specify the minimum required version of CMake to build the project
  cmake_minimum_required(VERSION 3.10)
   
  # Define the name of the project
  project(MyNAIAProject)
   
  # Set the C++ standard to C++17 (required by NAIA)
  set(CMAKE_CXX_STANDARD 17)
   
  # Find the NAIA library version 1.2.0 or higher; this is required to build the project
  find_package(NAIA 1.2.0 REQUIRED)
   
  # Define an executable named 'main' using the source file src/main.cpp
  add_executable(main src/main.cpp)
   
  # Link the 'main' executable with the NAIAChain component of NAIA
  # This makes NAIA functionality available in your code
  target_link_libraries(main PUBLIC NAIA::NAIAChain)

5.2) Recommended directory structure

Assume the following structure for your project:

  ~/Project_one/
  ├── build/           # Directory where you build the project
  ├── src/             # Contains your C++ source files (e.g., main.cpp)
  └── CMakeLists.txt   # Your build configuration file
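The layout above can be created in one step. The following sketch uses a hypothetical function name, `new_naia_project`, and writes the same CMake commands as the sample file in section 5.1 (without the comments). It does not create `src/main.cpp`, which remains your own analysis code.

```shell
# new_naia_project DIR
# Creates DIR/build, DIR/src and a DIR/CMakeLists.txt matching the example
# in section 5.1. An existing CMakeLists.txt is left untouched.
new_naia_project() {
    proj=$1
    mkdir -p "$proj/build" "$proj/src" || return 1
    [ -e "$proj/CMakeLists.txt" ] && return 0
    cat > "$proj/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.10)
project(MyNAIAProject)
set(CMAKE_CXX_STANDARD 17)
find_package(NAIA 1.2.0 REQUIRED)
add_executable(main src/main.cpp)
target_link_libraries(main PUBLIC NAIA::NAIAChain)
EOF
}
```

For example, `new_naia_project "$HOME/Project_one"` produces the structure shown above.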

5.3) Compiling and Running Your Code

To compile and run the project:

a) Navigate to the build directory:

  cd ~/Project_one/build

b) Run CMake, pointing it to the NAIA installation (use $HOME rather than ~ here, because the shell does not expand ~ inside a -D option):

  cmake .. -DNAIA_DIR=$HOME/NAIA_library/naia.install/cmake

c) Build the executable:

  make

d) Run the compiled program:

  ./main input.root output.root

Note on running on CNAF computing facility: Running this on the ui-ams machine is acceptable for testing a small number of .root files. Avoid performing large-scale data analysis directly on ui-ams—it may lead to system crashes. For large jobs, please use HTCondor, as described in the following sections.
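For a short local test over a handful of files, the single command above can be wrapped in a loop. This is a minimal sketch, assuming the executable takes an input path and an output path as its two arguments, as in the example command. The function name `run_on_list` and the `out_` prefix for output names are illustrative choices.

```shell
# run_on_list EXE LIST_FILE OUT_DIR
# Runs "EXE <input> OUT_DIR/out_<input file name>" for every non-empty line
# of LIST_FILE and stops at the first failure.
run_on_list() {
    exe=$1
    list_file=$2
    out_dir=$3
    mkdir -p "$out_dir" || return 1
    while IFS= read -r input || [ -n "$input" ]; do
        [ -n "$input" ] || continue
        # </dev/null keeps the program from consuming the rest of the list
        "$exe" "$input" "$out_dir/out_$(basename "$input")" </dev/null || return 1
    done < "$list_file"
}
```

A call such as `run_on_list ./main files.txt results` would process each path listed in `files.txt` in sequence. This is meant for one or two files only, in line with the note above.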

6) Installing the NAIA Selection Library

last edit — Alessandro Bartoloni 2025/05/31

The NAIA Selection Library (NSL) is a companion library developed specifically for use with the NAIA framework. While NAIA handles structured access to AMS ntuples and datasets, NSL provides a growing collection of utility functions and tools designed to simplify selection logic and analysis workflows.

Many functions in NSL significantly reduce code complexity and improve readability, allowing you to write cleaner, more maintainable analysis code. If you're working on an AMS project using NAIA, incorporating NSL can greatly enhance both development speed and clarity.

6.1) Requirements

✅ Note: If you are working in an environment where NAIA is already installed and configured, you should be ready to go.

The requirements for installing NSL are largely the same as for NAIA:

  • A C++ compiler with full C++17 support (tested with GCC ≥ 9.3.0)
  • CMake version ≥ 3.13
  • A ROOT installation compiled with C++17 support (tested with ROOT ≥ 6.22/08, recommended ≥ 6.26/02)

6.2) Building and Installing

Follow these steps to build and install NSL:

I) Clone the NSL repository:

  git clone https://<username>@gitlab.cern.ch:443/ams-italy/nsl.git  # (Kerberos)
  git clone ssh://git@gitlab.cern.ch:7999/ams-italy/nsl.git           # (SSH)
  git clone https://gitlab.cern.ch/ams-italy/nsl.git                  # (HTTPS)

⚠️ Note: You need a valid CERN account to access the repository.

II) Create build and install directories:

  mkdir nsl.build nsl.install

Assume the following directory structure:

  $HOME/NSL_library/nsl           # NSL source directory
  $HOME/NSL_library/nsl.build     # Build directory
  $HOME/NSL_library/nsl.install   # Installation directory

III) Build and install NSL:

Before starting, ensure the NAIADIR environment variable is set properly.

  cd $HOME/NSL_library/nsl.build
  cmake ../nsl -DCMAKE_INSTALL_PREFIX=../nsl.install
  make all install

🧠 Tip: You can also add the NSL environment setup script to your .bashrc or .bash_profile to make it persistent across sessions (similar to NAIA).
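The environment check mentioned before the build commands can be automated so that the build stops early with a clear message. The sketch below uses a hypothetical function name, `require_env`. The variable name `NAIADIR` is taken from the text above; whether the NAIA setup script exports exactly that name is not verified here.

```shell
# require_env NAME...
# Fails with a message if any named environment variable is unset or empty.
require_env() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: environment variable $name is not set" >&2
            return 1
        fi
    done
}
```

It can be chained in front of the configure step, for example `require_env NAIADIR && cmake ../nsl -DCMAKE_INSTALL_PREFIX=../nsl.install`.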

7) Using both NAIA and NSL in the project

last edit — Alessandro Bartoloni 2025/06/03

To develop a C++ project based on NAIA and NSL, you need to include both libraries properly.

Specifically, your project should have access to:

  • The NSL headers located in nsl.install/include
  • The NSL shared library: nsl.install/lib/libNSLSelections.so

Like NAIA, NSL exports a CMake target called NSL::NSLSelections, which should be linked in your project.

Important: Since NSL is an extension of NAIA, your project must fulfill all prerequisites for both libraries.

Here is a basic CMake configuration for a project that uses both NAIA and NSL:

  # Specify the minimum required version of CMake to build the project
  cmake_minimum_required(VERSION 3.10)

  # Define the name of the project
  project(MyNAIAProject)

  # Set the C++ standard to C++17 (required by NAIA)
  set(CMAKE_CXX_STANDARD 17)

  # Find the NAIA library (version 1.2.0 or higher) and the NSL library;
  # both are required to build the project
  find_package(NAIA 1.2.0 REQUIRED)
  find_package(NSL REQUIRED)

  # Define an executable named 'main' using the source file src/main.cpp
  add_executable(main src/main.cpp)

  # Link the 'main' executable with the NAIAChain component of NAIA and with
  # the NSLSelections component of NSL.
  # This makes NAIA and NSL functionality available in your code
  target_link_libraries(main PUBLIC NAIA::NAIAChain NSL::NSLSelections)

7.1) Recommended directory structure

Assume the following directory organization:

  $HOME/
  └── Project_two/
      ├── build/         # Build directory
      ├── src/           # Contains your main.cpp
      └── CMakeLists.txt # At the root of the project

Place your source file (e.g., main.cpp) inside the src/ directory.

7.2) Compiling and running the C++ code

Follow these steps:

a) Enter the build directory:

  cd $HOME/Project_two/build

b) Run CMake, pointing it to the NAIA and NSL installations:

  cmake .. -DNAIA_DIR=$HOME/NAIA_library/naia.install/cmake -DNSL_DIR=$HOME/NSL_library/nsl.install/cmake

c) Build the executable (named main):

  make

d) Run the executable:

  ./main input.root output.root

Note 1: The command above runs the program directly on the ui-ams machine. This is acceptable for testing with a single input .root file, but avoid heavy analysis on ui-ams because it can easily crash the machine.

Note 2: For heavy analysis, use HTCondor, as explained in the following sections.

7.3) An Example of C++ Code for AMS Raw Data Analysis

To analyze AMS raw data stored in .root files, you need C++ code structured around the NAIA and NSL libraries. For learning purposes, a complete example is available: ready-made C++ code dedicated to proton analysis. This code was developed by F. Faldi, M. Orcinha, F. Donnini, and V. Formato. It includes:

  • A functional implementation of data analysis using NAIA and NSL
  • Supporting scripts to run the analysis pipeline
  • Inline comments and documentation to guide users

The repository is accessible via the following CERN GitLab link:

🔗 Proton Analysis C++ code, a CERN account is required to access the repository.

This example serves as a practical starting point and reference for building your own analysis code tailored to different datasets or physics goals within the AMS framework.

