Technical Guide for Extracting and Processing AMS-02 Data Using ROOT, the NAIA Framework (Ntuples for AMS-Italy Analysis), and HTCondor
A Practical Setup and Workflow Document
INFN Roma Internal Note - Version 1.0 - 1 June 2025
Authors: Alessandro Bartoloni and Mustafa Mohammad Rafiei
Affiliations: INFN Roma I
Last Updated: — Mustafa Mohammad Rafiei 2025/06/18 12:19
Abstract
This guide explains the process of extracting and processing AMS-02 data using ROOT, the NAIA framework, and HTCondor. First, having CERN and CNAF accounts is essential, and connecting to the "ui-ams" server is required. The initial setup includes creating SSH keys for secure remote access.
After preparing accounts, AlmaLinux 9.5 is installed as the recommended operating system. Then, Visual Studio Code (VS Code) is installed for code development, and the Remote - SSH extension is configured to connect to "ui-ams."
Once VS Code is set up, the NAIA library is installed on the CNAF server or a local machine. Projects using NAIA require configuring the CMakeLists.txt file, and directory structures are defined for organization. The NSL library is then installed, and the usage of both libraries within a project is explained.
Next, relevant C++ code is written, compiled, and executed. To enable parallel processing for ROOT data, HTCondor is configured, and necessary ".bashrc" and ".bash_profile" settings are applied on "ui-ams."
A list of ROOT data files is prepared, and a "run.sh" script is written for execution. Job files (.sub) are created for each data file and submitted to HTCondor for parallel processing. Any held jobs can be resubmitted.
After processing, the output ".root" files are merged. The same procedure applies to Monte Carlo simulation data: directory structures are set up, file lists are created, jobs are submitted to HTCondor in parallel (and resubmitted if they are held), and the processed output ".root" files are merged.
This guide covers all essential steps for efficiently and systematically executing AMS-02 data analysis, ensuring an organized workflow from setup to final processing.
1) Obtaining CNAF and CERN Accounts and Connecting to the ui-ams Machine
1.1) CNAF and CERN accounts
To access AMS computing resources, you must first request valid CNAF and CERN accounts. Please note that approval may take some time.
Once your accounts are approved and you receive your username and password, you can connect to CNAF services.
1.2) Connecting to the “ui-ams” machine
The ui-ams machine is a virtual machine hosted on a physical server known as the bastion.
To access ui-ams, you must first log in to the bastion server.
Steps:
Open a terminal and connect to the bastion:
ssh UserName@bastion.cnaf.infn.it
After logging in, connect to the virtual machine:
ssh UserName@ui-ams
Note: bastion.cnaf.infn.it is a physical server; ui-ams is a virtual machine running on the bastion.
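The two-step login above can also be collapsed into one command with the OpenSSH -J (jump host) option. The sketch below only assembles and prints that command so it can be inspected before use. UserName is the same placeholder as above, and the variable names are illustrative, not part of any CNAF tooling.

```shell
# Placeholders: replace UserName with your own CNAF username.
CNAF_USER="UserName"
JUMP_HOST="bastion.cnaf.infn.it"
TARGET_HOST="ui-ams"

# -J tells ssh to reach the target through the jump host in a single step.
SSH_CMD="ssh -J ${CNAF_USER}@${JUMP_HOST} ${CNAF_USER}@${TARGET_HOST}"

# Print the command; copy it into the terminal to actually connect.
echo "$SSH_CMD"
```

With password authentication you are still asked for the password twice, once for each host.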
1.3) Creating SSH Keys for Passwordless Access
This step is optional but highly recommended to avoid repeated password entry.
a) Install SSH (if not already installed)
sudo dnf install openssh-clients -y
On AlmaLinux, this package is usually already installed.
b) Generate SSH Key Pair
Recommended (modern and secure):
ssh-keygen -t ed25519 -C "your_email@example.com"
Alternative (if required for compatibility):
ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
You can leave the passphrase empty or enter one for added security.
c) Copy Your Public Key to the Bastion Server
ssh-copy-id username@bastion.cnaf.infn.it
If this fails, use the manual method below.
Manual Method (only if ssh-copy-id fails): Connect to the bastion:
ssh username@bastion.cnaf.infn.it
On the server:
mkdir -p ~/.ssh
nano ~/.ssh/authorized_keys
On your local machine, open your public key file (~/.ssh/id_ed25519.pub or ~/.ssh/id_rsa.pub) and copy its contents.
Paste the key into the authorized_keys file on the server.
Set proper permissions:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys
d) Copy Public Key to ui-ams via Bastion
Use ProxyJump to transfer your public key to the virtual machine:
ssh-copy-id -o ProxyJump=UserName@bastion.cnaf.infn.it UserName@ui-ams
e) SSH Configuration for Simpler Access
Edit your SSH config file:
nano ~/.ssh/config
Add the following:
Host bastion
    HostName bastion.cnaf.infn.it
    User UserName

Host ui-ams
    HostName ui-ams
    User UserName
    ProxyJump UserName@bastion.cnaf.infn.it
    IdentityFile ~/.ssh/id_ed25519
Now you can simply connect with:
ssh ui-ams
f) Optional: Using RSA Instead of ed25519
If your system requires RSA for compatibility:
Generate RSA key pair:
ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
Use the same ssh-copy-id steps to transfer the key to both servers. Your existing SSH config works the same way. If you want to specify the RSA key explicitly:
Host ui-ams
    HostName ui-ams
    User UserName
    ProxyJump bastion
    IdentityFile ~/.ssh/id_rsa
Test the connection:
ssh ui-ams
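If key-based login still asks for a password, a common cause is wrong permissions on the server side, since the manual method in step c) relies on modes 700 and 600. The helper below is a minimal sketch for checking them. The function name check_ssh_perms is hypothetical, and it assumes GNU stat (as shipped with AlmaLinux).

```shell
# check_ssh_perms DIR
# Prints "OK" if DIR has mode 700 and DIR/authorized_keys has mode 600.
# Otherwise prints what was found and returns a non-zero status.
check_ssh_perms() {
    local dir="$1"
    local dir_mode file_mode
    dir_mode=$(stat -c '%a' "$dir") || return 2
    file_mode=$(stat -c '%a' "$dir/authorized_keys") || return 2
    if [ "$dir_mode" = "700" ] && [ "$file_mode" = "600" ]; then
        echo "OK"
    else
        echo "FIX: $dir is $dir_mode (want 700), authorized_keys is $file_mode (want 600)"
        return 1
    fi
}
```

Run it on the server as check_ssh_perms ~/.ssh; if it reports FIX, repeat the two chmod commands from step c).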
2) Installing AlmaLinux 9.5
AlmaLinux 9.5 is the recommended OS for local development of NAIA/NSL‑based C++ analysis. You can either run it in a VM on top of Windows or install it alongside Windows (dual‑boot). Dual‑boot is preferred for full native performance.
2.1) Download the AlmaLinux 9.5 ISO
- Official ISO & Cloud Images: → https://almalinux.org/get-almalinux/
- Installation Guide (step‑by‑step with screenshots): → https://wiki.almalinux.org/documentation/installation-guide.html
2.2) Installation Methods
- Virtual Machine (no disk repartitioning required)
Recommended VM platforms:
- Oracle VirtualBox → https://www.virtualbox.org/wiki/Downloads
- VMware Workstation → https://www.vmware.com/products/workstation-pro.html
Example tutorial: “AlmaLinux 9.5 Installation on VirtualBox” → https://www.servermania.com/kb/articles/how-to-install-almalinux-in-virtualbox/
- Dual‑Boot with Windows (Full native performance, ideal for compiling/testing C++ code against ROOT, NAIA, NSL)
General dual‑boot tutorial: “How to Dual‑Boot Linux and Windows” → https://itsfoss.com/dual-boot-linux-windows-guide/
2.3) Quick Installation Steps
- Create bootable USB with Rufus (Windows) or `dd` (Linux/macOS).
- Boot from USB, choose “Install AlmaLinux 9.5.”
- Partition the disk: for dual-boot, shrink the existing Windows partition, then allocate the freed space to AlmaLinux.
- Select software: include “Development Tools” group to get compilers/CMake.
- Set hostname, user account, and firewall rules as needed.
- Reboot and choose AlmaLinux or Windows at GRUB menu.
Note: Even though most heavy analysis runs on CNAF servers, having a local AlmaLinux 9.5 environment lets you quickly build and test your C++ code (with 1–2 ROOT input files) before scaling up to batch jobs on HTCondor.
3) Installing Visual Studio Code (VS Code) on AlmaLinux 9.5
Visual Studio Code (VS Code) is a lightweight yet powerful source-code editor developed by Microsoft. Known for its speed, extensibility, and robust feature set, it is widely used by developers. Note: VS Code also has versions for Windows and macOS.
3.1) Installing Visual Studio Code on AlmaLinux 9.5
Follow these steps to install Visual Studio Code (VS Code) on AlmaLinux 9.5:
Update Your System
Before installing, update your system to ensure it has the latest packages:
sudo dnf update -y
Add the Microsoft GPG Key
To ensure the authenticity of the packages, import the Microsoft GPG key:
sudo rpm --import https://packages.microsoft.com/keys/microsoft.asc
Add the VS Code Repository
Create a repository file for VS Code:
printf "[vscode]\nname=packages.microsoft.com\nbaseurl=https://packages.microsoft.com/yumrepos/vscode/\nenabled=1\ngpgcheck=1\nrepo_gpgcheck=1\ngpgkey=https://packages.microsoft.com/keys/microsoft.asc\nmetadata_expire=1h\n" | sudo tee /etc/yum.repos.d/vscode.repo > /dev/null
Install VS Code
Now, install Visual Studio Code using the following command:
sudo dnf install code -y
Launch VS Code
Once the installation is complete, you can launch Visual Studio Code by typing:
code
Alternatively, you can open it via the graphical user interface (GUI).
Update VS Code
To keep VS Code up to date, use the following command:
sudo dnf update code -y
Alternatively, check for updates within the VS Code interface.
Uninstall (Optional)
If you need to remove Visual Studio Code, use the following command:
sudo dnf remove code -y
You can also delete the repository file by running:
sudo rm -f /etc/yum.repos.d/vscode.repo
Note: Ensure you have internet access for these steps. If you're working offline, you can manually download the RPM package and install it. If you encounter issues like missing dependencies (e.g., `libffmpeg.so`), you may need to install additional libraries or use Flatpak/Snap.
3.2) Setting Up VS Code to Connect to ui-ams via SSH
Now that you've set up your SSH keys and configured access to `ui-ams` via the intermediate server (`bastion.cnaf.infn.it`), follow these steps to configure Visual Studio Code (VS Code) to connect to `ui-ams` via SSH.
I) Ensure SSH Keys and Configuration Are Set Up
Make sure you've already completed the SSH key generation and configuration for connecting to `ui-ams`.
II) Install Visual Studio Code on Your Laptop
If you haven't already, install Visual Studio Code on your AlmaLinux 9.5 machine.
III) Install the Remote - SSH Extension in VS Code
To enable remote access via SSH, you'll need the "Remote - SSH" extension. Follow these steps:
- Open Visual Studio Code.
- Go to the Extensions view by pressing Ctrl+Shift+X or clicking the Extensions icon on the sidebar.
- Search for "Remote - SSH" (developed by Microsoft).
- Click Install to add the extension to VS Code.
Explanation: The Remote - SSH extension allows you to connect to remote servers via SSH, edit files, run terminal commands, and debug remotely as if the server were local.
IV) Configure VS Code to Connect to ui-ams
VS Code's Remote - SSH extension uses your existing SSH configuration (`~/.ssh/config`) to connect to remote hosts.
Ensure your SSH config is correctly set up with the ProxyJump directive for `bastion.cnaf.infn.it`.
To initiate the connection:
- Click the green Remote-SSH button located in the bottom-left corner of VS Code (or press Ctrl+Shift+P and search for "Remote-SSH: Connect to Host").
- From the list of hosts, select ui-ams.
Explanation: The Remote - SSH extension reads the `~/.ssh/config` file and automatically routes the connection through `bastion.cnaf.infn.it` using the ProxyJump configuration.
V) Authenticate and Connect
On your first connection attempt, VS Code may ask you to enter the passphrase for your SSH key (if you set one). After authenticating, VS Code will establish the connection to `ui-ams`.
You may also be prompted to verify the server's fingerprint for security purposes. Type yes to continue.
Explanation: Since you’ve already copied your SSH public key to both servers, the connection should be passwordless unless you set a passphrase for your SSH key.
VI) Set Up the Remote Environment
During the first connection, VS Code will automatically install a lightweight server component on `ui-ams`. Once installed, you'll be able to interact with `ui-ams` as if it were a local machine.
Explanation: The server component allows you to edit files, use the terminal, and install extensions on `ui-ams`.
VII) Work on ui-ams
After connecting, you can start working on `ui-ams`:
- Open Files/Folders: Use the File Explorer in VS Code to open a directory on `ui-ams` (e.g., `/home/your_username/myproject`).
- Use the Terminal: Open the integrated terminal (Ctrl+`) to run commands directly on `ui-ams`.
- Install Extensions Remotely: Install extensions (e.g., C++, Python, GitHub Copilot) on `ui-ams` to enhance your development workflow.
Explanation: Once connected, VS Code treats `ui-ams` as if it were a local machine. You can edit, run, and debug files directly on the remote server.
VIII) Test the Connection
To ensure everything is set up correctly, try opening a file or running a command in the terminal (e.g., `pwd` should return `/home/your_username` on `ui-ams`).
Explanation: This confirms that VS Code is properly connected to `ui-ams`, and you're ready to work remotely.
4) Installing the NAIA Library on CNAF or Local Machine
last edit — Alessandro Bartoloni 2025/05/31
For official instructions, refer to the NAIA documentation:
https://naia-readthedocs.readthedocs.io/en/1.2.0/build-install.html
4.1) Requirements
To build and use NAIA locally, your system must meet the following requirements:
- A C++ compiler with full C++17 support (tested with GCC ≥ 12.1.0)
- CMake version ≥ 3.13
- ROOT built with C++17 support (tested with ROOT ≥ 6.28/04)
Supported platforms: CentOS7 and RHEL9 derivatives (e.g., AlmaLinux 9, Rocky Linux 9)
If your machine has access to CVMFS, you can use precompiled NAIA versions and required dependencies located at:
- `/cvmfs/ams.cern.ch/Offline/amsitaly/public/install/x86_64-centos7-gcc12.1/naia`
- `/cvmfs/ams.cern.ch/Offline/amsitaly/public/install/x86_64-el9-gcc12.1/naia`
Each version includes a ready-to-use environment setup script, e.g., for CentOS7:
/cvmfs/ams.cern.ch/Offline/amsitaly/public/install/x86_64-centos7-gcc12.1/naia/v1.1.0/setenvs/setenv_gcc6.28_cc7.sh
Note: These requirements are only necessary if installing NAIA on a local machine. All dependencies are already pre-installed on CNAF (ui-ams).
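For a local installation, the requirements above can be checked quickly from the shell. The script below is a sketch under a few assumptions: the helper names version_ge and check_tool are hypothetical, GNU sort (with -V) is available, and the minimum versions are the "tested with" values quoted in this section rather than hard limits enforced by NAIA.

```shell
# version_ge A B: succeeds if dotted version A is >= B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_tool NAME FOUND_VERSION MIN_VERSION: print a one-line verdict.
check_tool() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: OK (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

if command -v gcc >/dev/null 2>&1; then
    check_tool gcc "$(gcc -dumpfullversion)" 12.1.0
fi

if command -v cmake >/dev/null 2>&1; then
    # First line of the output looks like "cmake version 3.26.5".
    check_tool cmake "$(cmake --version | awk 'NR==1 {print $3}')" 3.13
fi

if command -v root-config >/dev/null 2>&1; then
    # ROOT may print versions such as 6.28/04; map "/" to "." before comparing.
    check_tool ROOT "$(root-config --version | tr '/' '.')" 6.28.04
    # The compiler flags reveal which C++ standard ROOT was built with.
    if root-config --cflags | grep -qF -e '-std=c++17' -e '-std=c++1z'; then
        echo "ROOT C++17: OK"
    else
        echo "ROOT C++17: flag not found in root-config --cflags"
    fi
fi
```

Tools that are not installed are skipped silently, so an empty output means none of the three was found in PATH.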
4.2) Building and Installing NAIA
Follow the steps below to build and install NAIA from source.
I) Clone the NAIA (v1.2.0) Repository
Choose one of the following methods (a CERN account is required):
git clone --recursive https://username@gitlab.cern.ch:443/ams-italy/naia.git -b v1.2.0   # Kerberos
git clone --recursive ssh://git@gitlab.cern.ch:7999/ams-italy/naia.git -b v1.2.0         # SSH
git clone --recursive https://gitlab.cern.ch/ams-italy/naia.git -b v1.2.0                # HTTPS
Note: The commands above check out version 1.2.0. Check whether a newer NAIA release is available and adjust the -b argument accordingly.
II) Create Build and Install Directories
Use the command
mkdir ~/NAIA_library/naia.build ~/NAIA_library/naia.install
Assume your project structure is as follows:
~/NAIA_library/
├── naia/           # Source directory (from git clone)
├── naia.build/     # Build directory
└── naia.install/   # Target installation directory
III) Build and Install NAIA
Run the following commands:
cd ~/NAIA_library/naia.build
cmake ../naia -DCMAKE_INSTALL_PREFIX=../naia.install
make all install
IV) Set Up Environment Variables on CNAF (ui-ams)
Edit your shell configuration files:
Open your .bashrc:
nano ~/.bashrc
Add the following line at the end (replace <your_username> with your CNAF username):
source /storage/gpfs_ams/ams/users/<your_username>/NAIA_library/naia/setenvs/setenv_gcc6.28_el9.sh
The setenv_gcc6.28_el9.sh script is located in the setenvs folder inside the NAIA source directory you cloned earlier.
Save and exit
Apply the changes:
source ~/.bashrc
Repeat the process for .bash_profile:
nano ~/.bash_profile
Add the same line:
source /storage/gpfs_ams/ams/users/<your_username>/NAIA_library/naia/setenvs/setenv_gcc6.28_el9.sh
Then:
source ~/.bash_profile
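A plain source line prints an error at every login if the script is moved or the storage area is temporarily unavailable. As an optional variation, the line can be wrapped in a guard. The sketch below assumes the same path as above; the helper name source_if_exists and the variable NAIA_SETENV are hypothetical.

```shell
# source_if_exists FILE: load FILE into the current shell only if it exists.
source_if_exists() {
    if [ -f "$1" ]; then
        . "$1"
    else
        echo "warning: $1 not found, NAIA environment not loaded" >&2
        return 1
    fi
}

# Replace <your_username> with your CNAF username, as in the line above.
NAIA_SETENV="/storage/gpfs_ams/ams/users/<your_username>/NAIA_library/naia/setenvs/setenv_gcc6.28_el9.sh"

# "|| true" keeps the rest of the startup file running if the script is missing.
source_if_exists "$NAIA_SETENV" || true
```

The warning goes to standard error, so it is visible in an interactive login without being mixed into command output.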
5) Using NAIA in the Project
last edit — Alessandro Bartoloni 2025/05/31
To use NAIA ntuples in your C++ project, you need to include:
Header files located in:
naia.install/include
Libraries located in:
naia.install/lib/libNAIAUtility.so
naia.install/lib/libNAIAContainers.so
naia.install/lib/libNAIAChain.so
The recommended approach for integrating NAIA into your project is by using CMake.
5.1) CMakeLists.txt NAIA example file
NAIA ships CMake configuration files, so a project only needs to locate the NAIA package; include paths and library linking are then handled automatically.
The following example sets up a basic C++ project that:
- Requires CMake version 3.10 or higher.
- Uses the C++17 standard.
- Depends on NAIA version 1.2.0 or newer.
- Builds an executable named main from the source file src/main.cpp.
- Links the executable with the NAIA::NAIAChain component of the NAIA library.
Below is a sample CMakeLists.txt file with inline comments explaining each step:
# Specify the minimum required version of CMake to build the project
cmake_minimum_required(VERSION 3.10)

# Define the name of the project
project(MyNAIAProject)

# Set the C++ standard to C++17 (required by NAIA)
set(CMAKE_CXX_STANDARD 17)

# Find the NAIA library version 1.2.0 or higher; this is required to build the project
find_package(NAIA 1.2.0 REQUIRED)

# Define an executable named 'main' using the source file src/main.cpp
add_executable(main src/main.cpp)

# Link the 'main' executable with the NAIAChain component of NAIA
# This makes NAIA functionality available in your code
target_link_libraries(main PUBLIC NAIA::NAIAChain)
5.2) Recommended Project Directory Structure
Assume the following structure for your project:
~/Project_one/
├── build/           # Directory where you build the project
├── src/             # Contains your C++ source files (e.g., main.cpp)
└── CMakeLists.txt   # Your build configuration file
5.3) Compiling and Running Your Code
To compile and run the project:
a) Navigate to the build directory:
cd ~/Project_one/build
b) Run CMake, pointing it to the NAIA installation:
cmake .. -DNAIA_DIR=$HOME/NAIA_library/naia.install/cmake
c) Build the executable:
make
d) Run the compiled program:
./main input.root output.root
Note on running on the CNAF computing facility: Running the program directly on the ui-ams machine is acceptable for testing a small number of .root files. Avoid performing large-scale data analysis directly on ui-ams, as it may crash the system. For large jobs, please use HTCondor, as described in the following sections.
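For the small tests described in the note, it helps to run the executable over a handful of files in one go and keep the outputs apart. The functions below are a sketch only: output_name and run_small_test are hypothetical helpers, the _out.root suffix is an arbitrary naming choice, and the executable is assumed to take an input and an output path as in step d).

```shell
# output_name INPUT OUTDIR: map e.g. /data/run123.root to OUTDIR/run123_out.root
output_name() {
    local base
    base=$(basename "$1" .root)
    echo "$2/${base}_out.root"
}

# run_small_test EXE OUTDIR FILE...
# Runs EXE once per input file and stops at the first failure.
run_small_test() {
    local exe="$1" outdir="$2" input
    shift 2
    mkdir -p "$outdir"
    for input in "$@"; do
        "$exe" "$input" "$(output_name "$input" "$outdir")" || return 1
    done
}
```

A typical call would be run_small_test ./main test_output first.root second.root, where the file names stand in for one or two real ntuples.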
6) Installing the NAIA Selection Library
last edit — Alessandro Bartoloni 2025/05/31
The NAIA Selection Library (NSL) is a companion library developed specifically for use with the NAIA framework. While NAIA handles structured access to AMS ntuples and datasets, NSL provides a growing collection of utility functions and tools designed to simplify selection logic and analysis workflows.
Many functions in NSL significantly reduce code complexity and improve readability, allowing you to write cleaner, more maintainable analysis code. If you're working on an AMS project using NAIA, incorporating NSL can greatly enhance both development speed and clarity.
6.1) Requirements
✅ Note: If you are working in an environment where NAIA is already installed and configured, you should be ready to go.
The requirements for installing NSL are largely the same as for NAIA:
- A C++ compiler with full C++17 support (tested with GCC ≥ 9.3.0)
- CMake version ≥ 3.13
- A ROOT installation compiled with C++17 support (tested with ROOT ≥ 6.22/08, recommended ≥ 6.26/02)
6.2) Building and Installing
Follow these steps to build and install NSL:
I) Clone the NSL repository:
git clone https://<username>@gitlab.cern.ch:443/ams-italy/nsl.git   # Kerberos
git clone ssh://git@gitlab.cern.ch:7999/ams-italy/nsl.git           # SSH
git clone https://gitlab.cern.ch/ams-italy/nsl.git                  # HTTPS
⚠️ Note: You need a valid CERN account to access the repository.
II) Create build and install directories:
mkdir nsl.build nsl.install
Assume the following directory structure:
$HOME/NSL_library/nsl           # NSL source directory
$HOME/NSL_library/nsl.build     # Build directory
$HOME/NSL_library/nsl.install   # Installation directory
III) Build and install NSL:
Before starting, ensure the NAIADIR environment variable is set properly.
cd $HOME/NSL_library/nsl.build
cmake ../nsl -DCMAKE_INSTALL_PREFIX=../nsl.install
make all install
🧠 Tip: You can also add the NSL environment setup script to your .bashrc or .bash_profile to make it persistent across sessions (similar to NAIA).
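Since step III depends on an environment variable being set, a small guard can catch a missing variable before the build starts. This is a sketch: require_env is a hypothetical helper, and the variable name NAIADIR is taken as written in the text above, so adjust it if your NAIA setup script exports a different name.

```shell
# require_env NAME: succeed only if NAME is exported and non-empty.
# printenv only sees exported variables, which is what child processes
# such as cmake will see as well.
require_env() {
    if [ -z "$(printenv "$1")" ]; then
        echo "error: environment variable $1 is not set or not exported" >&2
        return 1
    fi
}

# Example use before configuring the build:
#   require_env NAIADIR && cmake ../nsl -DCMAKE_INSTALL_PREFIX=../nsl.install
```

If the guard fails, source the NAIA environment script again and retry.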
7) Using both NAIA and NSL in the project
last edit — Alessandro Bartoloni 2025/06/03
To develop a C++ project based on NAIA and NSL, you need to include both libraries properly.
Specifically, your project should have access to:
- The NSL headers located in nsl.install/include
- The NSL shared library: nsl.install/lib/libNSLSelections.so
Like NAIA, NSL exports a CMake target called NSL::NSLSelections, which should be linked in your project.
Important: Since NSL is an extension of NAIA, your project must fulfill all prerequisites for both libraries.
Here is a basic CMake configuration for a project that uses both NAIA and NSL:
# Specify the minimum required version of CMake to build the project
cmake_minimum_required(VERSION 3.10)

# Define the name of the project
project(MyNAIAProject)

# Set the C++ standard to C++17 (required by NAIA)
set(CMAKE_CXX_STANDARD 17)

# Find the NAIA library version 1.2.0 or higher and the NSL library;
# both are required to build the project
find_package(NAIA 1.2.0 REQUIRED)
find_package(NSL REQUIRED)

# Define an executable named 'main' using the source file src/main.cpp
add_executable(main src/main.cpp)

# Link the 'main' executable with the NAIAChain component of NAIA and with
# the NSLSelections component of NSL
# This makes NAIA and NSL functionality available in your code
target_link_libraries(main PUBLIC NAIA::NAIAChain NSL::NSLSelections)
7.1) Recommended directory structure
Assume the following directory organization:
Home/
└── Project_two/
    ├── build/           # Build directory
    ├── src/             # Contains your main.cpp
    └── CMakeLists.txt   # At the root of the project
Place your source file (e.g., main.cpp) inside the src/ directory.
7.2) Compiling the C++ Code and Running the Executable
Follow these steps:
a) Enter the "build" directory:
cd $HOME/Project_two/build
b) Run CMake, pointing it to the NAIA and NSL installations:
cmake .. -DNAIA_DIR=$HOME/NAIA_library/naia.install/cmake -DNSL_DIR=$HOME/NSL_library/nsl.install/cmake
c) Build the executable "main":
make
d) Run the executable:
./main input.root output.root
Note 1: The command above runs the executable directly on the machine you are logged in to. On "ui-ams" this is acceptable for testing a single "input.root" file, but avoid heavy analysis there because the machine can easily crash.
Note 2: For heavy analysis, use HTCondor, which is explained in the following sections.
7.3) An Example of C++ Code for AMS Raw Data Analysis
To analyze AMS raw data stored in .root files, you need C++ code structured around the NAIA and NSL libraries. For learning purposes, a complete example is available: ready-made C++ code dedicated to proton analysis, developed by F. Faldi, M. Orcinha, F. Donnini, and V. Formato. It includes:
- A functional implementation of data analysis using NAIA and NSL
- Supporting scripts to run the analysis pipeline
- Inline comments and documentation to guide users
The repository is accessible via the following CERN GitLab link:
🔗 Proton Analysis C++ code, a CERN account is required to access the repository.
This example serves as a practical starting point and reference for building your own analysis code tailored to different datasets or physics goals within the AMS framework.