### EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH





### ALICE upgrades during the LHC Long Shutdown 2

ALICE Collaboration

#### Abstract

A Large Ion Collider Experiment (ALICE) has been conceived and constructed as a heavy-ion experiment at the LHC. During LHC Runs 1 and 2, it has produced a wide range of physics results using all collision systems available at the LHC. In order to best exploit new physics opportunities opening up with the upgraded LHC and new detector technologies, the experiment has undergone a major upgrade during the LHC Long Shutdown 2 (2019–2022). This comprises the move to continuous readout, the complete overhaul of core detectors, as well as a new online event processing farm with a redesigned online-offline software framework. These improvements will allow Pb–Pb collisions to be recorded at rates up to 50 kHz, while ensuring sensitivity for signals without a triggerable signature.

© 2023 CERN for the benefit of the ALICE Collaboration. Reproduction of this article or parts of it is allowed as specified in the CC-BY-4.0 license.

### Contents

| Section | Title |
|---------|-------|
| 1 | Introduction |
| 1.1 | Motivation |
| 1.2 | Experimental setup |
| 1.3 | Data samples |
| 1.4 | Outline |
| 2 | System design and common developments |
| 2.1 | System design |
| 2.2 | Common readout unit |
| 2.3 | The ALPIDE Chip |
| 2.3.1 | Technology, Sensing, Pixels |
| 2.3.2 | Analog Front-End and Discriminator |
| 2.3.3 | Matrix and Readout |
| 2.3.4 | Features for integration of ITS2 modules |
| 2.3.5 | Power consumption |
| 2.3.6 | Results from the experimental characterization in laboratory and beam tests |
| 2.4 | SAMPA |
| 2.4.1 | CSA and shaper |
| 2.4.2 | ADC |
| 2.4.3 | DSP and readout |
| 2.4.4 | Physical implementation and packaging |
| 2.4.5 | SAMPA performance and tests |
| 3 | Detector systems |
| 3.1 | Coordinate system |
| 3.2 | Inner Tracking System |
| 3.2.1 | Stave modules |
| 3.2.2 | Global support mechanics and services |
| 3.2.3 | Readout and powering systems |
| 3.2.4 | The readout system |
| 3.2.5 | The powering system |
| 3.2.6 | Component production, detector assembly, and commissioning on surface |
| 3.2.7 | Detector calibration |
| 3.2.8 | Installation and global commissioning |
| 3.2.9 | First results from global commissioning |
| 3.3 | Muon Forward Tracker |
| 3.3.1 | Detector layout |
| 3.3.2 | Ladder assembly and testing |
| 3.3.3 | Half-Disks |
| 3.3.4 | Cone and Barrel |
| 3.3.5 | Services |
| 3.3.6 | Readout |
| 3.3.7 | Detector commissioning |
| 3.4 | Time Projection Chamber |
| 3.4.1 | Introduction |
| 3.4.2 | Readout chamber design |
| 3.4.3 | Foil production, chamber production and quality assurance |
| 3.4.4 | Field cage |
| 3.4.5 | HV system |
| 3.4.6 | Front-end electronics and readout |
| 3.4.7 | Installation |
| 3.4.8 | Performance |
| 3.4.9 | Calibration |
| 3.5 | Fast Interaction Trigger |
| 3.5.1 | FT0 |
| 3.5.2 | FV0 |
| 3.5.3 | FDD |
| 3.5.4 | Electronics and readout scheme |
| 3.6 | Muon System |
| 3.6.1 | Muon Tracking |
| 3.6.2 | Muon Identifier |
| 3.7 | Transition Radiation Detector |
| 3.7.1 | High-voltage distribution and common mode |
| 3.7.2 | Readout |
| 3.7.3 | Detector control |
| 3.7.4 | Standalone tracking |
| 3.7.5 | Calibration |
| 3.7.6 | Quality Control |
| 3.8 | Time-of-Flight detector |
| 3.8.1 | Implementation of continuous readout |
| 3.8.2 | The new Data Readout Module (DRM2) |
| 3.8.3 | Additional upgrades in low voltage and quality control systems |
| 3.9 | High-momentum particle identification |
| 3.9.1 | Introduction |
| 3.9.2 | Upgrading of readout firmware and trigger |
| 3.9.3 | New readout firmware and readout rate |
| 3.9.4 | Detector calibration formalism |
| 3.9.5 | Other subsystems |
| 3.10 | Electromagnetic Calorimeter |
| 3.10.1 | The readout system |
| 3.10.2 | Trigger |
| 3.10.3 | Spare production |
| 3.10.4 | Front-end electronics firmware upgrade |
| 3.10.5 | Data compression |
| 3.10.6 | Calibration |
| 3.10.7 | Quality Control |
| 3.11 | Photon Spectrometer |
| 3.11.1 | Detector layout |
| 3.11.2 | Readout |
| 3.11.3 | Performance |
| 3.12 | Zero-Degree Calorimeter |
| 4 | Mechanics and integration |
| 5 | Readout and data processing |
| 5.1 | Readout data flow |
| 5.1.1 | Synchronous reconstruction |
| 5.2 | First-Level Processors |
| 5.2.1 | The FLP detector readout farm |
| 5.2.2 | Data quality control |
| 5.2.3 | Services |
| 5.2.4 | Installation and commissioning |
| 5.3 | Event Processing Nodes |
| 5.3.1 | EPN farm |
| 5.3.2 | EPN installation |
| 5.3.3 | $O^2$ data distribution |
| 5.4 | Physics data processing |
| 5.4.1 | Asynchronous reconstruction |
| 5.4.2 | Simulation |
| 5.4.3 | Analysis |
| 5.5 | Central Trigger System |
| 5.5.1 | Requirements of the Central Trigger System |
| 5.5.2 | Trigger hardware and interfaces |
| 5.5.3 | Trigger protocol and data format |
| 5.6 | Detector Control System |
| 5.6.1 | DCS computing hardware upgrades |
| 5.6.2 | DCS software upgrades |
| 5.6.3 | DCS conditions data |
| 5.6.4 | DCS operator environment |
| 6 | Physics performance |
| 7 | Conclusions and outlook |

### List of Figures

| Figure | Title |
|--------|-------|
| 1 | ALICE 2 detector systems |
| 2 | Accumulation of integrated luminosity |
| 3 | Time frame and heartbeat frame structure |
| 4 | ALICE readout architecture |
| 5 | Block diagram of the common readout unit |
| 6 | Picture of a common readout unit |
| 7 | Picture of ALPIDE |
| 8 | Cross section of ALPIDE pixel cell |
| 9 | ALPIDE architecture |
| 10 | Block diagram of the ALPIDE chip |
| 11 | Diagrams of the ITS2 inner barrel and outer barrel modules |
| 12 | ALPIDE detection efficiency and fake-hit rate |
| 13 | ALPIDE position resolution and cluster size |
| 14 | Block diagram of the SAMPA ASIC |
| 15 | Block diagram of the front-end implemented in the SAMPA ASIC |
| 16 | Block diagram of the SAMPA SAR ADC |
| 17 | Diagram of SAMPA DSP |
| 18 | SAMPA chip |
| 19 | Example response curve for 4 mV/fC configuration |
| 20 | ITS2 layout |
| 21 | ITS2 stave layout |
| 22 | ITS2 hybrid integrated circuit |
| 23 | ITS2 outer barrel HIC |
| 24 | Pictures of the ITS2 assembly |
| 25 | Space frame and cold plate cooling scheme |
| 26 | Material composition of ITS2 staves |
| 27 | ITS2 support structures |
| 28 | Readout unit design |
| 29 | Schematic description of the stave production workflow |
| 30 | ITS2 in the clean room during on-surface commissioning |
| 31 | The fake-hit rate of an inner half-barrel |
| 32 | ITS2 installation |
| 33 | Cosmic muons in ITS2 |
| 34 | Primary vertices reconstructed by ITS2 |
| 35 | Overview of Muon Forward Tracker |
| 36 | MFT detector elements |
| 37 | FIT installation |
| 38 | MFT ladder |
| 39 | Half-disk (exploded view) |
| 40 | MFT disk gluing |
| 41 | Half-cone structure |
| 42 | MFT PSU boards |
| 43 | MFT sensors noise rate |
| 44 | Event display |
| 45 | Schematic view of the ALICE TPC |
| 46 | GEM stack |
| 47 | Energy resolution as function of ion backflow |
| 48 | Exploded view of IROC |
| 49 | GEM design details |
| 50 | GEM stack powering |
| 51 | TPC readout system |
| 52 | TPC FEC layout |
| 53 | TPC $\mathrm{d}E/\mathrm{d}x$ performance in pilot beam |
| 54 | Ion tail measured in TPC GEM |
| 55 | Ion tail and common-mode effect in TPC GEM |
| 56 | Overview of FIT detectors |
| 57 | Picture of FV0 |
| 58 | Block diagram FIT readout |
| 59 | Tracking system |
| 60 | Tracking readout scheme |
| 61 | DualSAMPA |
| 62 | Electronic links |
| 63 | FLEX scheme |
| 64 | SOLAR scheme |
| 65 | Data flow |
| 66 | CRU Scheme |
| 67 | Overview of one MID half-plane in open position |
| 68 | FEERIC architecture and picture |
| 69 | Wireless threshold distribution |
| 70 | MID readout architecture |
| 71 | High-voltage status TRD supermodules |
| 72 | TRD common-mode signal |
| 73 | TRD calibration |
| 74 | TOF continuous readout implementation |
| 75 | TOF hit time distribution readout in continuous mode |
| 76 | TOF DRM2 card components |
| 77 | HMPID front-end electronics |
| 78 | HMPID data acquisition |
| 79 | HMPID event rate as a function of occupancy |
| 80 | HMPID event rate as a function of occupancy |
| 81 | Schematic view of EMCal |
| 82 | SRU readout rate |
| 83 | PHOS module |
| 84 | Signal waveform of PHOS FEE channel |
| 85 | PHOS time resolution |
| 86 | Performance of ZDC digitizer |
| 87 | Support of beampipe, ITS2, and MFT |
| 88 | ALICE 2 beampipe |
| 89 | Readout and processing overview |
| 90 | Synchronous reconstruction workflow |
| 91 | FLP dataflow |
| 92 | $O^2$ Quality Control design |
| 93 | AliECS design |
| 94 | The $O^2$ computing system monitoring design |
| 95 | The ALICE CR0 data centre which houses the EPN farm |
| 96 | Network diagram of the Run 3 $O^2$ facility |
| 97 | Data distribution software framework |
| 98 | Photograph of a CTP module |
| 99 | Trigger system overview |
| 100 | DCS software architecture |
| 101 | Access to the hardware implemented in ALF-FRED mechanism |
| 102 | DCS conditions data flow to ORACLE and ADAPOS |
| 103 | DCS data exchange with $O^2$ and external consumers |
| 104 | ALICE DCS UI with various panels used by the operators |
| 105 | Impact parameter resolution |
| 106 | Projections for charm/beauty $R_{AA}$ and $v_2$ |
| 107 | Beauty baryon-to-meson ratio projections |
| 108 | Di-electron performance |
| 109 | Performance projection for $\psi(2S)$ and $J/\psi$ |
| 110 | Energy loss from recoil jet yield suppression |

### List of Tables

| Table | Title |
|-------|-------|
| 1 | Requirements for pixel sensor |
| 2 | SAMPA key parameters |
| 3 | Comparison of main detector parameters of the previous ITS1 and the new ITS2 |
| 4 | Main layout parameters of the new ITS2 |
| 5 | Summary of the ITS2 readout connections and payload capacity |
| 6 | TPC parameters |
| 7 | Geometrical parameters of the new readout chambers |
| 8 | TPC front-end electronics parameters |
| 9 | TRD tracklet data format |
| 10 | FLP readout farm used to transfer the data from the detectors to the $O^2$ system |

### 1 Introduction

A Large Ion Collider Experiment (ALICE) was proposed, conceived, and built to study the properties of the quark–gluon plasma (QGP) in heavy-ion collisions at the Large Hadron Collider (LHC) at CERN [1]. The design was driven by the requirement to reconstruct tracks at high multiplicity in central Pb–Pb collisions and to provide particle identification over a wide range in transverse momentum ($p_{\rm T}$). In LHC Runs 1 and 2, the ALICE 1 apparatus was used to record and analyse hadronic collisions ranging from pp to Pb–Pb [2]. The measurements have provided new insights into the properties of the quark–gluon plasma as well as several other aspects of the strong interaction. A comprehensive review of this scientific output was reported in Ref. [3]. During the Long Shutdown 2 (2019–2021), major upgrades have led to the new experimental setup, ALICE 2, extending the physics capabilities of the experiment for Runs 3 and 4.

#### 1.1 Motivation

The main objectives of the upgrades in Long Shutdown 2 (LS2) are to significantly improve the capabilities of ALICE to probe the QGP with heavy-flavour quarks, and to enable completely new measurements of the thermal emission of dielectron pairs. In addition, the upgrades significantly improve the precision of measurements in several other areas, such as jet quenching phenomena probing the interactions of high-energy partons, the production of light nuclei, momentum correlations of hadrons to determine the interaction potentials of unstable particles, and the study of collective effects in collisions of protons with high multiplicity. To gain access to these areas of physics, a two-fold approach was taken: improving the pointing resolution, and increasing the readout rate capabilities of the entire system to collect larger data samples. A thinner and lighter inner tracker with the first layer closer to the interaction point improves the pointing resolution by a factor of 3 in the transverse direction and a factor of 6 in the longitudinal direction. This provides more effective suppression of backgrounds in the reconstruction of decays of heavy-flavour mesons and baryons as well as in the dielectron emission measurements. The increase of the readout rate from below 1 kHz to 50 kHz for Pb–Pb collisions leads to improved statistical precision for all measurements, even in the presence of large backgrounds. The improvements in pointing resolution and readout rate will also enable the measurement of thermal dilepton production in Pb–Pb collisions, as well as a number of new measurements of heavy-flavour production, which were out of reach of the ALICE detector in Runs 1 and 2.

#### 1.2 Experimental setup

The experimental setup consists of a central barrel contained in a solenoidal magnet (B = 0.5 T) and a forward muon system with a dipole magnet providing a total bending power of 3 T m, see Fig. 1. The central barrel detector system is designed for efficient tracking in the high track-density environment of heavy-ion collisions, covering transverse momenta from  $\sim 100 \text{ MeV}/c$  to  $\sim 100 \text{ GeV}/c$  with excellent hadron and electron identification capabilities.

Until the end of Run 2, the Inner Tracking System (ITS) which is crucial for the extrapolation of tracks to the primary vertex, consisted of two layers of Silicon Pixel Detectors (SPD), two layers of Silicon Drift Detectors (SDD), and two layers of Silicon Strip Detectors (SSD) [1]. The readout rate of the full ITS was limited to 1 kHz. The ITS was replaced with a new detector (ITS2), based on seven layers of ALPIDE monolithic active pixel sensors (MAPS), which provides better pointing resolution thanks to its reduced distance to the interaction point and better position resolution. It is also able to handle the hit densities resulting from Pb–Pb collisions at 50 kHz interaction rate.

In the radial direction, the ITS is followed by the Time Projection Chamber (TPC) extending from 0.85 m to 2.5 m in radius over a length of 5 m. With the multiwire proportional chambers used in ALICE 1, the ion backflow into the drift region had to be suppressed by active gating, which in turn limited the readout rate to about 700 Hz for Pb–Pb collisions. This limitation is removed in the upgraded TPC by employing readout chambers based on Gas Electron Multiplier (GEM) foils that reduce the ion backflow and resulting space charge in the TPC to a level that can be corrected for while operating the detector with Pb–Pb interaction rates up to 50 kHz.

**Figure 1:** ALICE 2 detector systems (see legend and text for details).

The Transition Radiation Detector (TRD) (extending from 2.8 m to 3.5 m in radius) provides additional space points for tracking, which are also used to determine the size of the distortions due to space charge effects in the TPC, as well as dE/dx measurements for particle identification, and the detection of transition radiation for electron identification. The readout electronics were upgraded to minimise the data volume and to reduce the dead time to allow data taking at high interaction rates.

The subsequent Time-of-Flight detector (TOF) allows the identification of hadrons over a wide momentum range and electrons at low momentum. Besides consolidation work on the front-end electronics, the readout was upgraded to handle the increased interaction rates.

A large part of the acceptance in the central barrel is covered by electromagnetic calorimeters. The ElectroMagnetic Calorimeter (EMCal) is realised as a Pb-scintillator sampling calorimeter with avalanche photodiode (APD) readout, whereas the PHOton Spectrometer (PHOS) uses PbWO<sub>4</sub> crystals with APD readout. All calorimeters have undergone maintenance and improvements of the readout electronics.

The High Momentum Particle Identification Detector (HMPID) is a ring-imaging Cherenkov detector that adds hadron identification capabilities at large transverse momenta over a limited acceptance. A part of the system was equipped with additional absorbers to facilitate a measurement of the interaction cross section of light antinuclei. Also here, the readout electronics were upgraded to improve the rate capability.

The muon detectors cover the forward pseudorapidity range $-4.0 < \eta < -2.5$ and use a system of absorbers to remove hadrons and identify muons. The background of secondary muons from pion and kaon decays in the muon system is small at high $p_{\rm T}$, thanks to the so-called 'muon plug' absorber, which is placed at z = 90 cm from the interaction point. The main muon detector stations use multiwire proportional chambers (muon tracking chambers, MCH), and resistive plate chambers (muon identifier, MID), both of which were equipped with new front-end electronics. As a new addition to the muon detectors in ALICE 2, the Muon Forward Tracker (MFT) consists of tracking stations with ALPIDE silicon pixel sensors that are installed in front of the muon plug to improve mass resolution and pointing resolution for the detection of secondary charmonia and muons from B-meson decays.

A set of forward detectors forms the Fast Interaction Trigger (FIT), which is used for triggering, event selection and determination of the collision time. The FIT system consists of two arrays of fast Cherenkov radiators placed on both sides of the interaction point (FT0), complemented with three sets of scintillator detectors. The interaction trigger is provided by the FT0 together with a large azimuthally segmented scintillator detector placed on the opposite side of the muon detectors, which is also used to determine the reaction-plane orientation in Pb–Pb collisions. Two additional scintillator detectors, FDD, are placed on opposite sides of the interaction point at large distances to cover $4.7 < \eta < 6.3$ and $-6.9 < \eta < -4.9$ to select diffractive and ultra-peripheral collisions with rapidity gaps. The FIT detector replaces the T0, V0 and AD detectors, which had similar functionalities in ALICE 1 [1].

The Zero-Degree Calorimeters (ZDC) are installed at  $\approx 100$  m on either side of the interaction point to help determine the centrality and event plane orientation. The readout electronics of the ZDC were upgraded to increase the readout rate to match the rest of the system.

In addition to the interventions outlined here, significant consolidation work has been performed on several subsystems which are described in the sections on individual detector systems below.

Furthermore, the readout infrastructure was completely renewed to support the continuous readout of the core detectors. The raw data from the detectors are mostly transmitted through optical links and received by First Level Processors (FLPs), where the data are assembled into time frames for further processing. A dedicated farm of Event Processing Nodes (EPNs) was installed at the experiment site for the online reconstruction of all collisions. The output of this synchronous reconstruction is stored on mass storage systems and is used for an asynchronous reconstruction stage with improved calibration. The output of the latter is then used for physics analysis. A new common software framework, O<sup>2</sup>, was developed for online and offline reconstruction as well as the physics analysis.

#### 1.3 Data samples

During LHC Runs 1 and 2, data were recorded with pp, p–Pb, Xe–Xe, and Pb–Pb collisions at a variety of collision energies. The collision and readout rates were tuned to limit pile-up in pp collisions and to keep the total space charge generated in the gas amplification in the TPC readout chambers to manageable levels. Typical collision rates were up to 8 kHz for Pb–Pb collisions and around 200 kHz for pp collisions. To make optimal use of the different readout rate capabilities across the detector systems, clusters of detectors were read out at different rates. The central barrel detectors were read out at a rate of 500 Hz to 600 Hz, while the cluster with the forward muon detectors together with V0, T0 and the silicon pixel layers for event characterisation were read out at a slightly higher rate. For specific triggers, such as coincidence triggers between the forward muon detectors and the calorimeters in the central barrel, the full detector was read out. During pp and p–Pb data taking a 'fast cluster' containing all barrel detectors except the SDD was used in order to double the effective TPC readout rate. Figure 2 shows the luminosities accumulated during Run 2 with different trigger conditions.

For Runs 3 and 4, it is planned to record pp and Pb–Pb data at interaction rates of 0.5 MHz to 1 MHz and 50 kHz, respectively. This will allow us to inspect integrated luminosities of  $200 \text{ pb}^{-1}$  and  $13 \text{ nb}^{-1}$ , respectively.



**Figure 2:** Accumulation of integrated luminosity over time for different trigger types in pp (left) and Pb–Pb (right) collisions during LHC Run 2.

#### 1.4 Outline

In this article, the upgrades made to ALICE during the LHC Long Shutdown 2 are discussed. Chapter 2 presents the readout system design, the common readout unit and the integrated circuits (ASICs) that were conceived, designed and produced for the upgrades of multiple detector systems. Chapter 3 presents the upgrades of the individual detector systems in detail. Chapter 4 details the mechanical integration of the detector components within ALICE and the interfaces with the LHC. In Chapter 5 the trigger system, the readout chain, as well as the synchronous and asynchronous processing stages are discussed. The expected performance of the upgraded detector and reconstruction is reported in Chapter 6. Chapter 7 comprises a conclusion with prospects for the LHC Run 3 and a brief outlook on the future ALICE upgrade plans.

#### 2 System design and common developments

A series of developments have been pursued commonly for multiple systems. Foremost, the readout chain was redesigned for all detectors (Sec. 2.1). A common readout unit was developed for the readout of the detectors (Sec. 2.2). The ALICE Pixel Detector (ALPIDE) chip is designed and used for both the inner tracking system and the muon forward tracker (Sec. 2.3). The SAMPA is used as front-end chip for the time projection chamber and the muon systems (Sec. 2.4).

#### 2.1 System design

In nominal operating conditions (50 kHz interaction rate for Pb–Pb) each TPC drift time period of  $\sim 100 \,\mu s$  will contain on average 5 Pb–Pb events. It was therefore decided to use a continuous, untriggered readout strategy, combined with online data compression for the upgraded readout and data acquisition system.
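The average of five events per drift window is simply the interaction rate multiplied by the drift time. The sketch below reproduces that number and, assuming Poisson-distributed collisions (an assumption not stated in the text), estimates how often a drift window holds two or more events. The function names are illustrative.

```python
import math


def mean_events_in_window(rate_hz: float, window_s: float) -> float:
    """Mean number of collisions falling inside one time window."""
    return rate_hz * window_s


def poisson_pmf(k: int, mu: float) -> float:
    """Probability of exactly k collisions, assuming Poisson statistics."""
    return math.exp(-mu) * mu**k / math.factorial(k)


# 50 kHz Pb-Pb interaction rate and a ~100 us TPC drift time (values from the text)
mu = mean_events_in_window(50e3, 100e-6)

# Probability that a drift window contains at least two overlapping events
p_overlap = 1.0 - poisson_pmf(0, mu) - poisson_pmf(1, mu)
```

Under this assumption roughly 96% of drift windows contain overlapping events, which illustrates why an event-by-event triggered readout of the TPC is not practical at this rate.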

In order to synchronise the continuous data stream across all readout and processing branches, the data stream is divided into so-called time frames (TF) of a nominal length of 128 LHC orbits ( $\sim$ 11 ms). Each TF is subdivided into heartbeat frames (HBF) with a length corresponding to an orbit of  $\sim$ 89.4 µs. Figure 3 illustrates this structure. For commissioning and calibration runs, for which the data throughput exceeds nominal conditions, all detectors also support triggered mode, in which only data from selected interactions are retained by the readout electronics. In addition, a subset of legacy detectors has not been upgraded to continuous readout and will operate in triggered mode only. For these detectors, as well as for dedicated runs, minimum bias triggers based on the fast interaction trigger detector (FIT) and the PHOS, EMCAL and TOF are distributed. In both the continuous and triggered readout mode, the detector data are time stamped with a precision of an LHC bunch crossing of 25 ns; data belonging to an HBF are grouped together into HBF packets.
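The time structure can be illustrated with a short sketch that computes the nominal TF length and splits a running bunch-crossing counter into a TF index, an HBF index within the TF, and a bunch crossing within the orbit. The value of 3564 bunch-crossing slots per orbit is a standard LHC machine parameter rather than a number given in this article, and the function names are hypothetical.

```python
ORBIT_US = 89.4       # heartbeat frame (orbit) length quoted in the text
HBF_PER_TF = 128      # nominal number of heartbeat frames per time frame
BC_PER_ORBIT = 3564   # assumption: LHC bunch-crossing slots per orbit


def tf_length_ms(orbit_us: float = ORBIT_US, hbf_per_tf: int = HBF_PER_TF) -> float:
    """Nominal time frame length in milliseconds."""
    return orbit_us * hbf_per_tf / 1000.0


def locate(global_bc: int) -> tuple:
    """Split a running bunch-crossing count into (TF, HBF within TF, BC within orbit)."""
    orbit, bc = divmod(global_bc, BC_PER_ORBIT)
    tf, hbf = divmod(orbit, HBF_PER_TF)
    return tf, hbf, bc
```

With the quoted orbit length, `tf_length_ms()` gives about 11.4 ms, matching the nominal value of roughly 11 ms.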

The upgraded ALICE system architecture is shown in Fig. 4. The Common Readout Units (CRU) are standardised PCIe FPGA-based optical I/O processor modules used by all upgraded detectors for data readout and configuration, see Sec. 2.2. Data taking is governed by the Central Trigger System (CTS) which distributes timing and trigger signals. The CTS features a two-staged distribution system consisting of one central trigger processor (CTP) and up to 18 active distribution units, the local trigger units (LTU), one for each subdetector. The CTP-LTU and LTU-CRU connections are implemented using bidirectional TTC-PON links [4, 5]. The standard timing and trigger signal distribution path goes from the CTS via the detector-specific CRUs to the detector front-ends via bidirectional radiation tolerant GBT links [6]. Trigger signals are distributed with three different latencies referred to as LM (level -1 at 425 ns), L0 (level 0 at 1200 ns), and L1 (level 1 at 6100 ns). Detectors that require latency-critical trigger signals receive them additionally on a direct path from the CTS, which is located in the cavern, to the detector front-ends on GBT links. A second group of detectors do not support continuous readout and require a trigger signal indicating the presence of an interaction with a latency of 1.6 µs. Some detectors continue to be read out via legacy readout cards (C-RORC [7]) following a hardware trigger signal to initiate the readout. They receive the clock and trigger signals via the legacy TTC system [4, 5]. For more details on the CTS, see Sec. 5.5.

**Figure 3:** Time frame and heartbeat frame structure in continuous and triggered mode. HeartBeat (HB) triggers are issued in continuous and triggered modes to all upgraded detectors. Physics triggers can be sent to upgraded detectors in triggered mode and are sent to non-upgraded detectors in all modes. HBF and TF rates are programmable with the following nominal values; HBF: 1 every orbit, ~89.4  $\mu$ s/~10 kHz, TF: 1 TF every 128 HBFs/~11 ms/~100 Hz.

The Online & Offline processing farm  $(O^2)$  contains the first level processors (FLP) and event processing nodes (EPN). The detector front-ends send the data via GBT-based links to the CRUs and C-RORCs located in the FLPs. Depending on the detector implementation, the readout data are reformatted or compressed either in the front-ends, the CRUs, or in the FLPs. The FLPs prepare Sub-Time Frames (STF) by merging all HBFs of one TF of the connected detector. Note that for most detectors the data are distributed over several FLPs. The FLPs ship the STFs of all subdetectors via a network to the EPN farm where they are merged into TFs. In order to compress the data to be stored, the O<sup>2</sup> system performs a first synchronous online reconstruction pass, converts the data into compressed time frames (CTF) and sends them to the storage system, from where they are accessed asynchronously for further processing. In total, a raw data throughput of 3.4 TB/s is processed in a continuous manner by the readout system. After zero suppression and data compression in the front-ends, the CRUs, and the FLPs, a data throughput of 635 GB/s is processed by the data network and the EPN farm.
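As a rough illustration of these throughput figures, the sketch below computes the overall reduction factor between the raw and the compressed data stream, and the approximate data volume that one nominal time frame represents at the EPN input. It assumes decimal units (1 TB = 1000 GB) and a steady data rate; both are simplifications, and the function names are illustrative.

```python
RAW_TB_S = 3.4           # raw throughput into the readout system (from the text)
EPN_GB_S = 635.0         # throughput into the data network and EPN farm (from the text)
TF_MS = 128 * 89.4e-3    # nominal time frame length: 128 orbits of ~89.4 us


def reduction_factor(raw_tb_s: float, out_gb_s: float) -> float:
    """Ratio of raw to reduced throughput, assuming 1 TB = 1000 GB."""
    return raw_tb_s * 1000.0 / out_gb_s


def data_per_tf_gb(rate_gb_s: float, tf_ms: float) -> float:
    """Data volume accumulated during one time frame at a steady rate."""
    return rate_gb_s * tf_ms / 1000.0


factor = reduction_factor(RAW_TB_S, EPN_GB_S)    # a bit more than 5
tf_volume = data_per_tf_gb(EPN_GB_S, TF_MS)      # a few GB per time frame
```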

The detectors are configured via the detector control system (DCS) which is connected to the detector front-ends via the CRU. The experiment control system (ECS) governs the entire data taking process via direct network connections to the central systems (DCS, CTS, FLP, EPN).

#### 2.2 Common readout unit

For all upgraded detectors the Common Readout Unit (CRU) serves as interface between detector frontend links, the O<sup>2</sup> FLP processors, the CTS and DCS. The CRUs are custom developed FPGA-based Gen 3 PCI Express plug-in cards installed in the FLPs. The card (named PCI40) was originally developed for LHCb [8] and has requirements fully compatible with ALICE. ALICE adopted the PCI40 for its CRU and joined the qualification and test effort and has developed firmware for use in the experiment.

Figure 4: ALICE readout and control system architecture.

The CRU hardware features up to 48 high-speed, bidirectional, 10 Gb/s optical links using 12-lane Minipod parallel optical transmitters (AFBR-812 and AFBR-822) and receivers from Avago/Broadcom. They are accessible from the CRU front-panel through MTP (Multi-Fiber Termination Push-on) optical ribbon cable connectors and establish the interface to the detector front-end electronics using the GBT protocol [6] implemented in the FPGA. GBT links are the result of a common development for all LHC experiments to provide a radiation tolerant transmission chip (GBTx, SCA) and optical transceiver set (VTTx, VTRx) to be used on the detector front-end cards communicating with the data acquisition and detector control systems via optical links. The GBTx provides a data bandwidth of 4.48 Gb/s in wide-bus mode, or 3.2 Gb/s in GBT mode, depending on whether forward error correction with superior correction capability for radiation induced transmission errors is activated. The slow control adapter ASIC (SCA) is an auxiliary chip compatible with the GBTx. Connected to a GBTx, it allows the control of ADCs and digital IOs. VTTx and VTRx are radiation tolerant dual optical transmitter and transceiver components compatible with the GBTx ASIC.
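The two GBTx bandwidth figures can be reproduced from the GBT frame layout. The sketch below assumes the commonly documented format, which is not spelled out in this article: a 120-bit frame sent at the nominal 40 MHz LHC clock, carrying 80 user bits plus a forward error correction field in GBT mode, and 112 user bits in wide-bus mode where the error correction field is reused for data.

```python
LHC_CLOCK_HZ = 40e6    # assumption: nominal bunch-crossing clock
FRAME_BITS = 120       # assumption: total GBT frame length per clock period

# Assumed user payload per frame in each mode
USER_BITS = {"gbt": 80, "wide_bus": 112}


def user_bandwidth_gbps(mode: str) -> float:
    """User data bandwidth in Gb/s for the given GBT frame mode."""
    return USER_BITS[mode] * LHC_CLOCK_HZ / 1e9


line_rate_gbps = FRAME_BITS * LHC_CLOCK_HZ / 1e9   # raw serial line rate
```

Under these assumptions the user bandwidth is 3.2 Gb/s with forward error correction and 4.48 Gb/s without, in agreement with the values quoted above.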

The number of data links used for each CRU and the use of the forward error correction are adapted to the subdetector needs. In most detector implementations, 24 data links are connected to one CRU. Table 10 in Sec. 5.2 shows the number of CRUs, the number of readout links and the data throughput into and out of one CRU for each subdetector.

For the GBT downlink to the detector front-ends, which carries the timing and trigger signals as well as the configuration data, up to 320 Mb/s are available from the GBTx ASIC on a configurable number of pins. The single word transmission protocol (SWT) has been developed to provide the front-end designers with a common configuration data framework.

One of the two CRU SFP+ optical transceivers is used to connect the CRUs to the CTS system via bidirectional TTC-PON [5] links. The TTC-PON link allows the distribution of timing and trigger signals with constant latency from the CTS to the CRU over passive optical splitters with a bandwidth of up to 9.6 Gb/s. The links carry the LHC clock with a jitter below 20 ps (rms) and synchronise all 474 CRUs and the connected detectors to each other. The upstream links from all CRUs to the CTS carry detector buffer status information, see Sec. 5.5.2.

The 16-lane (x16) PCI Express card edge connector provides the interface between the CRU and the ALICE O<sup>2</sup> FLPs, in which up to 3 CRUs are installed. The interface achieves  $\sim$ 90 Gb/s sustainable data throughput from the CRU to the memory of the FLP computers [9]. Depending on subdetector implementation, the CRU FPGA forwards data that has already been formatted and compressed in the detector front-end, or performs detector-specific formatting, compression and base line reconstruction. In both cases, the data stream to the FLP consists of data packets compatible with the HBF structure (see Sec. 2.1). A central FPGA firmware framework provides the interfaces to CTS, FLP, CRU and the subdetectors. Subdetector-dependent functionality, such as link decoding, adding HBF structure, compression or data processing is added via a dedicated user logic (UL) firmware plug-in to the central FPGA firmware. A detailed description of the CRU firmware design can be found in [9].

Figure 5 shows the block diagram of the module functionality and Fig. 6 shows a photograph of the CRU card.

#### 2.3 The ALPIDE Chip

#### 2.3.1 Technology, Sensing, Pixels

The ALPIDE chip [10] is a Monolithic Active Pixel Sensor (MAPS) [11] implemented in a 180 nm CMOS technology for imaging sensors provided by TowerJazz (Tower Semiconductor since March 2022) [12]. It was designed for the upgrade of the Inner Tracking System (ITS2) to meet the requirements summarized in Table 1.

The ALPIDE chip (Fig. 7) measures 15 mm by 30 mm and includes a matrix of  $512 \times 1024$  sensing pixels, each one measuring  $29.24 \mu m \times 26.88 \mu m (z \times r\varphi)$ . Analog biasing, control, readout and interfacing functionalities are implemented in a peripheral region of  $1.2 \times 30 \text{ mm}^2$  (Fig. 9).
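The quoted chip size can be cross-checked against the pixel pitch and matrix dimensions. The sketch assumes that the 1024 columns run along the 30 mm side (z) and the 512 rows along the 15 mm side ($r\varphi$), with the 1.2 mm periphery added to the short side; this orientation is inferred from the numbers rather than stated in the text.

```python
N_ROWS, N_COLS = 512, 1024
PITCH_Z_UM = 29.24       # pixel pitch along z (from the text)
PITCH_RPHI_UM = 26.88    # pixel pitch along r-phi (from the text)
PERIPHERY_MM = 1.2       # height of the peripheral region (from the text)

# Assumed orientation: columns along z, rows along r-phi
matrix_long_mm = N_COLS * PITCH_Z_UM / 1000.0
matrix_short_mm = N_ROWS * PITCH_RPHI_UM / 1000.0
chip_short_mm = matrix_short_mm + PERIPHERY_MM

# Fraction of the nominal 15 mm x 30 mm chip area covered by the pixel matrix
active_fraction = matrix_long_mm * matrix_short_mm / (15.0 * 30.0)
```

The result is about 29.94 mm by 14.96 mm, consistent with the nominal 15 mm by 30 mm, with the pixel matrix covering a little over 90% of the chip area.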

The ALPIDE chips are fabricated on substrates with a high-resistivity (> 1 k $\Omega \cdot cm$ ) epitaxial layer on p-type substrate. Typical values for the thickness of the epitaxial layer are in the range between 18 and 30 µm. Figure 8 illustrates that a charged particle crossing the sensor liberates charge carriers in the material. The electrons released in the epitaxial layer can diffuse laterally while they remain vertically confined by potential barriers at the interfaces with the overlying p-wells and the underlying p-type substrate. The signal sensing elements are n-well diodes (~2 µm diameter). Their area is typically 100 times smaller than the pixel cell area. The electrons that reach the depletion volume of a diode (or carriers that are released directly inside it) induce a current signal at the input of the pixel front-end.



**Figure 5:** Block diagram of the common readout unit (CRU): the CRU forms the interface between the first-level processors (via PCIe), the central trigger system (via TTS), and the detectors (via TTS and FE).

| Parameter                                                     | Inner Barrel             | Outer Barrel             |
|---------------------------------------------------------------|--------------------------|--------------------------|
| Chip dimensions [mm × mm]                                     | 15 × 30                  | 15 × 30                  |
| Silicon thickness [µm]                                        | 50                       | 100                      |
| Spatial resolution [µm]                                       | 5                        | 10 (5)                   |
| Detection efficiency                                          | > 99%                    | > 99%                    |
| Fake-hit probability [evt <sup>-1</sup> pixel <sup>-1</sup> ] | $< 10^{-6} (\ll 10^{-6})$ | $< 10^{-6} (\ll 10^{-6})$ |
| Integration time [µs]                                         | < 30 (10)                | < 30 (10)                |
| Power density $[mW/cm^2]$                                     | $< 300 \ (\sim 35)$      | $< 100 \ (\sim 20)$      |
| TID radiation hardness <sup>*</sup> [krad]                    | 270                      | 10                       |
| NIEL radiation hardness <sup>*</sup> [1 MeV $n_{eq}/cm^2$ ]   | $1.7 \times 10^{12}$     | $1 \times 10^{11}$       |
| Readout rate, Pb–Pb interactions [kHz]                        | 100                      | 100                      |

**Table 1:** General requirements for the pixel sensor chip for the upgrade of the ALICE inner tracking system. In cases where the actual ALPIDE performance is significantly better than the requirements, the actual performance is indicated in parentheses and italics. (\*) Radiation load integrated over 6 years of operation.

The manufacturing process also provides a deep p-well layer that can be used to shield the epitaxial layer from the n-wells of the pmos transistors. These would otherwise compete with the sensing diodes in collecting the electrons, strongly impairing the charge collection. This feature permits the use of full CMOS circuits, including pmos transistors, in the active area.

A reverse bias voltage can be applied to the substrate. This increases the depletion volume around the n-well collection diodes and reduces the capacitance of the input junction. All these aspects contribute to increasing the S/N ratio.

#### 2.3.2 Analog Front-End and Discriminator

Each pixel cell contains a sensing diode, a front-end amplifier and a shaping stage, a discriminator and a digital section (see the insert in Fig. 9). The digital section includes a multi-event buffer with three hit storage registers and a pixel mask register.



**Figure 6:** Picture of a CRU; bottom left, FPGA cooling radiator, bottom right, power mezzanine; top row; 3 out of 8 Minipods installed; top left, fiber optics cable to MPO connector on front panel; bottom left, SFP transceivers.



Figure 7: Photograph of the ALPIDE chip on a test carrier.



Figure 8: Schematic cross-section of a pixel cell.



Figure 9: Architecture of the ALPIDE chip.

In every pixel, there is a pulse injection capacitor for injection of test charge into the input of the frontend. A digital-only pulsing mode is also available, directly forcing the setting of the in-pixel memory cells, substituting the latching of a discriminated pulse. The analog and digital pulsing patterns are fully programmable. These features are used routinely for testing and calibration.

The front-end and the discriminator are continuously active. They feature a non-linear response and their transistors are biased in weak inversion. The total power consumption of the pixel cell is 40 nW. The small signal gain of the front-end is 4 mV/e, the equivalent noise charge is 3.9 e, while the minimum threshold is below 100 e. The typical value of the capacitance of the sensing diode is 2.5 fF. The input capacitance of the front-end is below 2 fF. The output of the front-end has a peaking time of the order of  $2 \mu s$ , while the discriminated pulse has a typical duration of  $5 \mu s$  to  $6 \mu s$ . The front-end and the discriminator act as an analogue delay line. This allows operating the chip in triggered mode when, as it happens in ALICE, the latency of the incoming trigger is comparable with the peaking time of the front-end. A common threshold level is applied to all the pixels.

The latching of the discriminated hits in the storage registers is controlled by global STROBE signals. A pixel hit is stored into one of three in-pixel latch cells if a STROBE pulse is applied to the pixel while the output of the front-end is above threshold. The generation of the internal STROBE signals can be either triggered by an external command or optionally initiated by an internal sequencer. The duration of the STROBE pulses is programmable. Two major operating modes are supported. In triggered mode the STROBE and the frame readout are triggered externally from an event synchronous command. In continuous mode the strobe is asserted periodically and for a duration almost equal to the period. The event frames are continuously integrated and read out.

#### 2.3.3 Matrix and Readout

The readout of the frame data from the matrix is zero-suppressed and is executed by an array of circuits named *priority encoders* (Fig. 9). The priority encoder provides to the periphery the address of the first pixel with a hit in its double column, selecting it according to a hardwired topological priority.

During one hit transfer cycle a pixel with a hit is selected, its address is encoded and transferred to the periphery and finally the in-pixel memory element is reset. The address of the next pixel with a hit in the double column is then calculated. This cycle is repeated until the addresses of all pixels initially presenting a valid hit at the inputs of a priority encoder have been transferred to the periphery and all the hit storage registers in the double column have been reset.
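The hit transfer cycle can be pictured as a simple loop over the latched hits of one double column. The following is a behavioural sketch only: the real priority encoder is a combinatorial circuit with a hardwired topological priority, whereas here the priority order is hypothetically taken as "lowest index first" and the function name is illustrative.

```python
def read_double_column(hit_flags):
    """Drain one double column: repeatedly select the highest-priority hit,
    emit its address and reset that pixel's storage element."""
    latched = [bool(flag) for flag in hit_flags]  # snapshot of in-pixel hit registers
    addresses = []
    while any(latched):
        addr = latched.index(True)   # hypothetical priority: lowest index first
        addresses.append(addr)       # address transferred to the periphery
        latched[addr] = False        # in-pixel memory element reset
    return addresses
```

The loop performs one iteration per hit and none at all for an empty double column, which mirrors the zero-suppressed nature of the readout.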

Each priority encoder is a fully combinatorial circuit and it is steered by sequential logic in the periphery during the readout of a matrix frame. It is implemented in a very narrow region between the pixels, extending vertically over the full height of the columns. There is no free running clock distributed in the matrix and there is no signaling activity if there are no hits to read out. The average energy needed to encode the address of a hit pixel is of the order of 100 pJ. Power is consumed proportionally to the readout rate and to the average hit occupancy of the frames. The readout of the matrix consumes around 3 mW under normal conditions. The priority encoders also implement the buffering and distribution of readout and configuration signals to the pixels.

The 512 double columns and the corresponding priority encoders are functionally grouped in 32 regions  $(512 \times 32 \text{ pixels})$ , each of them with 16 double columns being read out by 16 priority encoder circuits (Fig. 10). There are 32 corresponding region readout units in the chip periphery, each one executing the readout of a region. They steer the priority encoders, latch the encoded pixel hit address, perform additional data reduction and formatting and buffer the hit data into memories. The 16 double columns inside each region are read out sequentially, while the 32 regions are read out in parallel. The data from the 32 region readout units are assembled and formatted by a top readout unit module.
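The grouping into regions amounts to simple integer arithmetic on the column index. The sketch below maps a pixel column to its region and to the double column within that region, assuming that regions and double columns are numbered contiguously from column 0; the actual numbering scheme of the chip is not specified here, so the mapping is hypothetical.

```python
N_REGIONS = 32
DCOLS_PER_REGION = 16
COLS_PER_DCOL = 2
N_COLS = N_REGIONS * DCOLS_PER_REGION * COLS_PER_DCOL  # 1024 pixel columns


def locate_column(col: int) -> tuple:
    """Map a pixel column (0..1023) to (region, double column within region)."""
    if not 0 <= col < N_COLS:
        raise ValueError(f"column {col} outside matrix")
    dcol = col // COLS_PER_DCOL              # global double-column index (0..511)
    return divmod(dcol, DCOLS_PER_REGION)    # (region, local double column)
```

Each region thus spans 32 pixel columns, and hits from different regions can be handled by different region readout units in parallel.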

Figure 10: Block diagram of the ALPIDE chip.

Data can be transmitted on two different readout ports. The largest capacity data readout interface is a 1.2 Gb/s serial data port with differential signaling. The serial transmission is 8b/10b encoded, therefore the maximum data throughput is 960 Mb/s. The serial port can optionally operate at reduced line rates (600 Mb/s or 400 Mb/s). A bidirectional parallel data port with single-ended signaling is also available, with a capacity of 320 Mb/s. This port enables the implementation of an inter-chip data transfer and relaying protocol designed to integrate multi-chip modules without additional external devices. This is used in the modules of the ITS2 outer barrel.
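The relation between line rate and usable throughput follows from the 8b/10b code, which transmits ten line bits for every eight data bits. A minimal sketch for the three supported line rates (function and variable names are illustrative):

```python
def payload_mbps(line_rate_mbps: float) -> float:
    """Usable data rate of an 8b/10b encoded link: 8 data bits per 10 line bits."""
    return line_rate_mbps * 8 / 10


# Payload throughput for the three serial line rates mentioned in the text
rates = {line_rate: payload_mbps(line_rate) for line_rate in (1200, 600, 400)}
```

For the 1.2 Gb/s line rate this gives the 960 Mb/s quoted above; the reduced line rates scale in the same proportion.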

The ALPIDE chip has custom control interfaces. There is a differential control port supporting bidirectional (half duplex) serial signaling at 40 Mb/s and a second, single-ended control port. The two control interfaces and the dedicated internal logic allow interconnecting multiple chips on a module and controlling them via the differential interface of only one of the chips, which acts as hub of the control bus. The control bus is also used to distribute broadcast commands and synchronization messages to the chips, most notably the trigger commands.

The periphery of the chip contains fourteen 8-bit analog DACs for the biasing of the pixel front-ends. The analog section of the periphery also contains a band-gap reference and a temperature sensing circuit. An ADC with 11 bit resolution is available for monitoring and testing purposes, and can probe the outputs of the DACs, the analog as well as digital supply voltages, the band-gap voltage and the temperature sensor.

#### 2.3.4 Features for integration of ITS2 modules

The ALPIDE chip has specific design features to enable the integration of multi-chip detector modules, to minimise the electrical wiring between modules and off-detector electronics and to provide common interfaces across the ITS2 staves. Two different hybrid modules built with ALPIDE sensors are used in the upgraded ALICE ITS2 (Fig. 11): one in the three innermost layers constituting the inner barrel and the other in the staves of the remaining layers of the outer barrel (see also Sec. 3.2.1).

The ITS2 inner barrel module includes nine ALPIDE chips. They share common differential control and clock distribution buses. Each chip transmits its own data off-detector at maximum line rate (1.2 Gb/s) on point-to-point high speed serial links.

The ITS2 outer barrel module contains fourteen chips, arranged in two subgroups of seven. One chip in each group, called *master*, acts as control hub and data relaying chip. Only the master chips communicate with the external electronics through differential clock and control buses shared between multiple modules and through point-to-point differential wire-line links for the transmission of data.

Figure 11: Diagrams of the ITS2 inner barrel and outer barrel modules.

Each of the master chips connects to six neighbouring chips, forwards the main clock to them and bridges the control transactions on electrical interconnects that are local to the modules.

The chips neighbouring the master use a shared parallel local bus to transfer their data to the master. The master chip relays the data from the slave chips on the serial output port driving the point-to-point links. In this configuration the master chips transmit data on the serial data port using a lower bit rate (400 Mb/s).

Grouping of data from neighbouring chips and transmitting at lower rate are possible in the ITS2 outer barrel layers given the lower occupancy. In addition to reducing the total number of copper links, this scheme achieves a significant reduction of the power consumption given that only one out of seven line drivers is maintained active.

Differential copper wire lines directly connect the ALPIDE chips to the off-detector electronics. These links reach a length of 8 m. The electrical receivers and transmitters on the ALPIDE chips were designed and tailored to the electrical and protocol levels to operate with these long interconnects.

#### 2.3.5 Power consumption

The ALPIDE sensor chip has three power supplies: one analog domain, one digital domain and a power supply dedicated to the Phase-Locked Loop (PLL) of the high speed serial data transmitter. The power consumption of the analog section, dominated by the analog front-ends, is typically 24 mW. The digital power consumption includes the pixel digital sections, readout modules, peripheral circuits, and I/Os. It depends strongly on the configuration and operating conditions. In the nominal conditions of the ALICE ITS2, it is about 130 mW.

The output serial links are driven by a data transmission unit including a PLL, a fast serializer and a line driver stage with a typical power consumption of about 52 mW. The data transmission unit is enabled in all the chips of the ITS2 inner barrel. In the outer barrel modules it is active only in the master chips and disabled in the remaining chips, that is, only 1 out of 7 sensors consumes this extra power.

**Figure 12:** ALPIDE sensor chip detection efficiency and fake-hit rate vs global threshold setting. Beam test results (6 GeV/*c* pions, orthogonal incidence). ALPIDE substrate reverse bias: -3 V.

The power dissipation density is about  $47 \text{ mW/cm}^2$  in the ITS2 inner barrel modules and around  $35 \text{ mW/cm}^2$  in the outer barrel modules.

The readout of the matrix and the digital periphery consume power in proportion to the clocking frequency, the readout rate and the pixel occupancy. In less demanding applications not requiring the high speed links and the full rate capabilities, the power consumption can be reduced considerably with various techniques including slowing down or suspending the primary clock and using the single ended I/Os to read out data at low rates.

#### 2.3.6 Results from the experimental characterization in laboratory and beam tests

The ALPIDE chip and its prototype predecessors have been characterised with an extensive test program including laboratory tests and a series of beam tests. A summary of key results is given in this section. The full set of results and details on the methodologies will be presented in a separate paper.

The laboratory measurements were based on the sensible usage of the built-in test pulse charge injection circuitry and on the systematic analysis of threshold scans and noise measurements. These allowed a thorough characterisation of the distributions of pixel thresholds, the fractions of pixels requiring masking, and the residual fake-hit rate after masking. Their dependencies on operating conditions were accurately established.

A set of ALPIDE prototype sensors were characterised in beam tests to quantify detection performance and hit-position resolution. The samples under test were located at the center of beam telescopes acting as precision trackers. The telescopes were themselves constructed with ALPIDE sensors and had six detection planes: three upstream and three downstream of the Device Under Test (DUT). The measurements were based on reconstructing particle tracks in the telescope and projecting them onto the DUT plane. The presence of a matching cluster, its size in pixels and its centroid were the basis for measuring the detection efficiency and the hit-position resolution.

Figure 12 provides a summary of the beam test results on the detection efficiency and the fake-hit rate as a function of the global threshold setting. Data of eight different samples are shown, including two non-irradiated DUTs and pairs of devices exposed to increasing doses of ionising radiation before the tests. The samples exposed to the TID level of 200 krad (75% of the lifetime dose) also received a combined Non-Ionising Energy Loss (NIEL) fluence at a level corresponding to 1.3 times the fluence expected over the total lifetime. The Total Ionising Dose (TID) level of 500 krad (190% of the lifetime dose) includes a combined fluence that is 3.2 times the lifetime fluence. Two devices were also irradiated with neutrons for a cumulated non-ionizing energy loss of  $1.7 \times 10^{13}$  [1 MeV  $n_{eq}/cm^2$ ], corresponding to ten times the fluence expected over the full detector lifetime.

**Figure 13:** ALPIDE chip hit-position resolution and average cluster size as a function of global threshold setting. Beam test results with 6 GeV/c pions with perpendicular incidence. ALPIDE substrate reverse bias: -3 V.

The results show a large operating margin for the threshold setting between 50 and 250 electrons, providing a detection efficiency above 99% and a fake-hit rate that is several orders of magnitude smaller than the required value of  $10^{-6}$  fake-hit probability per pixel per frame. The ALPIDE sensor proved to be extremely well performing in terms of noise. Masking the ten most noisy pixels out of the 524288 in the matrix (less than 0.002%) resulted in a residual fake-hit noise level below the sensitivity of these experiments ( $2 \times 10^{-11}$ ).
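To put these fake-hit numbers in perspective, the sketch below computes the fraction of masked pixels and the expected number of fake hits per chip and frame for a given per-pixel probability. It treats the fake-hit probability as uniform over all pixels, which is a simplification; the function names are illustrative.

```python
N_PIXELS = 512 * 1024   # pixels in the ALPIDE matrix


def masked_fraction_percent(n_masked: int, n_pixels: int = N_PIXELS) -> float:
    """Share of the matrix that is masked, in percent."""
    return 100.0 * n_masked / n_pixels


def expected_fake_hits_per_frame(prob_per_pixel: float, n_pixels: int = N_PIXELS) -> float:
    """Mean number of fake hits per chip and frame for a uniform per-pixel probability."""
    return prob_per_pixel * n_pixels


at_requirement = expected_fake_hits_per_frame(1e-6)    # requirement level
at_sensitivity = expected_fake_hits_per_frame(2e-11)   # measurement sensitivity
```

At the required level of $10^{-6}$ a chip would show about half a fake hit per frame on average, while at the measurement sensitivity the expectation drops to roughly $10^{-5}$ fake hits per frame.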

Figure 13 shows the beam test results on the hit-position resolution (black markers and lines, upper band) and the average cluster size (red markers and lines, lower band) as a function of the global threshold. The data sets refer to the same samples as in Fig. 12. The hit-position resolution is better than  $6 \mu m$  for thresholds below 300 electrons and better than  $5 \mu m$  for a threshold below 140 electrons. As expected, the average cluster size depends on the threshold setting, because the threshold cuts away part of the shared charge diffusing into pixels adjacent to the seed pixel. It ranged between 1.5 and 2.5 pixel hits in the range of interest.

The tests also showed that the chip-to-chip performance variations were negligible, that the chips with combined TID and NIEL irradiation performed similarly to the non-irradiated chips and that sufficient operational margin was present also in the samples with a NIEL dose ten times larger than the one expected over the lifetime.

#### 2.4 SAMPA

The SAMPA [13] is a 32-channel custom front-end ASIC for the readout of gaseous detectors and specifically for the ALICE Muon Chambers (MCH, Sec. 3.6.1) and Time Projection Chamber (TPC, Sec. 3.4). Each of the 32 channels contains a Charge Sensitive Amplifier (CSA) and a 10-bit 20 MSample/s ADC. The digitised data of all 32 channels is made available on serial links as either a raw data stream or preprocessed by an internal Digital Signal Processor (DSP), supporting both the continuous and triggered readout of the upgraded ALICE system. The SLVS serial output links support 320 Mb/s and are compatible with the input links (e-links) on the serial transceiver ASIC (GBTx) of the GBT-links used in ALICE for data transmission between the detectors and the CRUs. Depending on the data transfer rate needed in the application, the SAMPA data can be routed via a programmable number of up to 11 serial links.
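A rough estimate of how many serial links the raw ADC stream needs can be obtained from the channel count, ADC resolution and sampling rate. The sketch ignores packet headers, framing and any reduction by the DSP, so it is only a lower bound on the raw payload and not a description of the link configuration actually used by the detectors. The sampling rates in the comments are the application values of 5 MSa/s (TPC) and 10 MSa/s (MCH).

```python
import math

N_CHANNELS = 32     # channels per SAMPA
ADC_BITS = 10       # ADC resolution
ELINK_MBPS = 320    # rate of one serial output link
MAX_ELINKS = 11     # maximum number of serial links per chip


def raw_rate_mbps(sampling_msps: float) -> float:
    """Raw ADC data rate of one chip in Mb/s, without any protocol overhead."""
    return N_CHANNELS * ADC_BITS * sampling_msps


def elinks_needed(sampling_msps: float) -> int:
    """Minimum number of serial links that can carry the raw ADC stream."""
    n = math.ceil(raw_rate_mbps(sampling_msps) / ELINK_MBPS)
    if n > MAX_ELINKS:
        raise ValueError("raw stream exceeds the capacity of all serial links")
    return n


tpc_links = elinks_needed(5)    # 5 MSa/s
mch_links = elinks_needed(10)   # 10 MSa/s
```

In this simplified picture a raw stream at the maximum ADC rate of 20 MSa/s would not fit on the available links.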

| Parameter                          | MCH                       | TPC                        |
|------------------------------------|---------------------------|----------------------------|
| Input polarity                     | pos                       | neg                        |
| Input charge linear range          | 500 fC                    | 100 fC and 67 fC           |
| Sensor capacitance                 | 40-80 pF                  | 12-25 pF                   |
| Gain                               | 4 mV/fC                   | 20 mV/fC and 30 mV/fC      |
| Gain channel-to-channel variation  | 1.5 %                     | 1.5 %                      |
| Gain linearity                     | 0.5 % up to 85 % of range | 0.5 % up to 85 % of range  |
| Channel-to-channel cross talk      | <0.3 %                    | <0.2 %                     |
| Noise                              | 2000 e<sup>-</sup> @ 60 pF | 600 e<sup>-</sup> @ 12 pF |
| Peaking time                       | 330 ns                    | 170 ns                     |
| Baseline return                    | <550 ns                   | <500 ns                    |
| ADC sampling rate (max. 20 MSa/s)  | 10 MSa/s                  | 5 MSa/s                    |
| ADC ENOB                           | >9.2                      | >9.2                       |
| ADC INL                            | < 1 LSB (abs.)            | < 1 LSB (abs.)             |

**Table 2:** SAMPA key parameters.



Figure 14: Block diagram of the SAMPA ASIC.

The block diagram of the SAMPA ASIC is shown in Fig. 14. The front-end is composed of a cascade connection of a CSA (Charge Sensitive Amplifier), a differential semi-Gaussian pulse shaper and an Analog-to-Digital Converter (ADC). The CSA and the pulse shaper convert signals into a semi-Gaussian pulse with an amplitude proportional to the total charge injected on the input. SAMPA was designed and fabricated in 130 nm CMOS technology and it operates at a nominal supply voltage of 1.25 V. In order to adapt the SAMPA to its two applications in the MCH and TPC, the sensitivity, polarity and peaking time of the front-end can be adjusted via external pins. SAMPA supports positive and negative polarity of the input charge and has three different gain modes with different sensitivity and peaking time: 20 mV/fC@160 ns, 30 mV/fC@160 ns for the TPC and 4 mV/fC@300 ns for the MCH. Table 2 summarizes the main characteristics and performance of the SAMPA.
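A quick consistency check on the three gain modes is to multiply each gain by its linear input range from Table 2. The sketch below does this; the pairing of 100 fC with 20 mV/fC and of 67 fC with 30 mV/fC is an assumption based on the order in which the values are listed, and the names are illustrative.

```python
# (gain in mV/fC, linear input charge range in fC), taken from Table 2.
# The pairing of the two TPC ranges with the two TPC gains is assumed.
GAIN_MODES = {
    "MCH, 4 mV/fC": (4.0, 500.0),
    "TPC, 20 mV/fC": (20.0, 100.0),
    "TPC, 30 mV/fC": (30.0, 67.0),
}


def full_range_amplitude_v(gain_mv_per_fc: float, range_fc: float) -> float:
    """Shaper output amplitude (in V) at the top of the linear charge range."""
    return gain_mv_per_fc * range_fc / 1000.0


if __name__ == "__main__":
    for mode, (gain, charge_range) in GAIN_MODES.items():
        print(f"{mode}: {full_range_amplitude_v(gain, charge_range):.2f} V")
```

With this pairing, all three modes map their linear charge range onto roughly the same output amplitude of about 2 V.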

Analog front-end reference voltages (nominal values 450 mV, 600 mV, and 750 mV) are generated internally with temperature compensation and can be adjusted via configuration registers. The ADC requires an external voltage reference of 1.1 V. The DSP eliminates signal perturbations, distortion of the pulse shape, offsets, and signal variations due to changes in the environment. An I2C interface allows setting control registers.



Figure 15: Block diagram of the front-end implemented in the SAMPA ASIC.

#### 2.4.1 CSA and shaper

The SAMPA front-end is composed of a positive/negative polarity CSA with a capacitive feedback $C_f$ and a resistive feedback $R_f$ connected in parallel, converting the input charge signal ($Q$) into a voltage step signal proportional to $Q/C_f$. The discharge resistor ($R_f$) provides baseline restoration and reduces pile-up effects in the CSA (Fig. 15). A pole-zero cancellation resistor ($R_{pz}$) eliminates the undershoot generated by the long time constant of the output step signal of the CSA. The step signal is fed to a bandpass filter constituted by a first-order high-pass filter $C_{dif}R_{dif}$ (differentiator) and two bridged-T second-order low-pass filters (integrators). After that, a non-inverting stage (NIS) generates a semi-Gaussian output pulse with an amplitude proportional to the input charge. The amplifier of the first shaper is a scaled-down version of the CSA amplifier. In order to provide the second shaper with a differential mode input, a copy of the first shaper is included. This copy is connected in unity gain configuration to minimize its noise contribution. The second shaper consists of a fully differential amplifier with a Miller configuration and a common-mode feedback network. It has the same functionality as the first shaper and implements two further poles and a zero, creating a CR-(RC)<sup>4</sup> semi-Gaussian shaper together with the differentiator and the first shaper stage.
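To give a feeling for the pulse shape, the sketch below evaluates the textbook step response of an idealised CR-(RC)<sup>4</sup> shaper in which all stages share one real time constant $\tau$, so that the pulse is proportional to $(t/\tau)^4 e^{-t/\tau}$ and peaks at $t = 4\tau$. This equal-time-constant form is an assumption made for illustration; the actual pole placement of the bridged-T stages is not specified here, and the function name is hypothetical.

```python
import math


def semi_gaussian(t_ns: float, peaking_time_ns: float, order: int = 4) -> float:
    """Idealised CR-(RC)^n pulse, normalised to unit amplitude at the peak.

    Assumes identical real time constants tau = peaking_time / order,
    which is a simplification of the real shaper.
    """
    if t_ns <= 0.0:
        return 0.0
    tau = peaking_time_ns / order
    x = t_ns / tau
    # (x/n)^n * exp(n - x) equals 1 at x = n, i.e. at t = n * tau
    return (x / order) ** order * math.exp(order - x)


if __name__ == "__main__":
    for peaking in (160.0, 300.0):  # the two peaking times quoted in the text
        samples = [semi_gaussian(t, peaking) for t in range(0, 1001, 100)]
        print(peaking, [round(s, 3) for s in samples])
```

In this idealisation, the 160 ns and 300 ns peaking times correspond to $\tau = 40$ ns and $\tau = 75$ ns, respectively.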

The gain of the front-end is controlled by  $R_G$ , which is an array of parallel resistances that are switched by configuration registers. The peaking time of the semi-Gaussian shaper is adjusted for each operation mode (160 ns and 300 ns) by external configuration control of an array of parallel capacitors. These front-end configurations are performed with transmission gates used as low resistance switches.

#### 2.4.2 ADC

The 32-channel 10-bit SAMPA ADC features a sampling frequency of up to 20 MSa/s defined by an external clock. The MCH and the TPC use the SAMPA with 10 MHz and 5 MHz sampling clock, respectively.

The SAMPA ADC is based on the successive-approximation register (SAR) architecture [14]. It is shown in Fig. 16. A differential capacitive DAC is implemented with the split capacitor topology. Top-plate sampling with MSB (Most Significant Bit) preset to achieve full-range sampling is used. A switching strategy with low energy dissipation per cycle is used.
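The principle of a SAR conversion can be illustrated with a short sketch: the converter performs a binary search, testing one bit per cycle from the MSB downwards against the output of a DAC. The code below models an ideal, single-ended, unipolar converter only; it does not represent the differential split-capacitor DAC, the MSB preset or the switching strategy of the SAMPA ADC, and `v_ref` is a hypothetical full-scale parameter.

```python
def sar_convert(v_in: float, v_ref: float, n_bits: int = 10) -> int:
    """Ideal SAR conversion: binary search for the largest code whose
    DAC level does not exceed the input voltage."""
    code = 0
    for bit in reversed(range(n_bits)):        # MSB first
        trial = code | (1 << bit)              # tentatively set this bit
        v_dac = v_ref * trial / (1 << n_bits)  # ideal DAC output for the trial code
        if v_in >= v_dac:                      # comparator decision
            code = trial                       # keep the bit
    return code


if __name__ == "__main__":
    for v in (0.0, 0.25, 0.5, 0.999):
        print(v, sar_convert(v, v_ref=1.0))
```

Each conversion takes exactly `n_bits` comparator decisions, which is why the architecture is attractive for low-power, moderate-speed applications.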

#### 2.4.3 DSP and readout

Figure 16: Block diagram of the SAMPA SAR ADC.

**Direct ADC Serialization** In the Direct ADC Serialization (DAS) mode, the SAMPA sends out the unmodified raw data stream from all 32 ADC channels via 11 serial links, bypassing the readout processor and DSP. In this mode, most of the digital circuitry is powered down via clock gating, keeping active only the communication links. Ten SLVS links are used to send the 10-bit data samples of each ADC channel. The 11<sup>th</sup> link is used to provide a synchronisation clock. Optionally, a split mode can be activated, such that data from ADC channels 0 to 15 (16 to 31) are transmitted on serial links 0 to 4 (5 to 9). This allows connection of an odd number of SAMPAs to an even number of serial transmitters, as in the case of the TPC readout, where five SAMPAs are connected to two GBTx transmitter chips.

**DSP** The SAMPA DSP (Fig. 17) implements fully parallel data processing on the 32 channels and supports both continuous and triggered readout operation. When in DSP mode, the data coming from the ADC are received via the pre-trigger buffer with programmable depth of up to 192 10-bit words per channel. In triggered operation, the pre-trigger buffer delays the data and allows the collection of the detector signal samples before the arrival of the trigger signal.
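The role of the pre-trigger buffer can be pictured as a fixed-length delay line: every new ADC sample pushes out the sample that entered `depth` clock cycles earlier, so that samples preceding the trigger are still available when the trigger arrives. The following sketch models one channel under simplifying assumptions; the class name, the zero-filled initial state and the interface are illustrative and do not describe the actual hardware implementation.

```python
from collections import deque


class PreTriggerBuffer:
    """Single-channel delay line with a programmable depth in samples."""

    MAX_DEPTH = 192  # maximum number of 10-bit words per channel (from the text)

    def __init__(self, depth: int) -> None:
        if not 0 <= depth <= self.MAX_DEPTH:
            raise ValueError(f"depth must be between 0 and {self.MAX_DEPTH}")
        # Assumption: the buffer starts out filled with zeros.
        self._fifo = deque([0] * depth)

    def push(self, sample: int) -> int:
        """Store a new sample and return the one delayed by `depth` cycles."""
        self._fifo.append(sample)
        return self._fifo.popleft()


if __name__ == "__main__":
    buf = PreTriggerBuffer(depth=3)
    print([buf.push(s) for s in [1, 2, 3, 4, 5]])
```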



Figure 17: Diagram of digital signal processing chain in the SAMPA.

The pre-trigger buffer is followed by a section of several configurable pipelined digital filters for signal conditioning. The filter blocks are:

- The **baseline correction 1** subtracts a given pedestal value for a fixed time after a trigger and applies a configurable Infinite Impulse Response (IIR) filter to correct slow fluctuations of the baseline.

- The **tail cancellation** corrects long tails via a Digital Shaper, using a cascade of four fully configurable first order IIR filters which also can be used as general low-pass or band-pass filter.
- The **baseline correction 2 and 3** offer a moving average Finite Impulse Response (FIR) filter and a non-linear slope-based filter.
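The idea behind tail cancellation with first-order IIR stages can be sketched as follows: each stage places a zero on the decay constant of an unwanted exponential tail and replaces it with a faster-decaying pole. The difference equation and the coefficient values below are generic illustrations chosen for this example; they are not the SAMPA register definitions or its fixed-point implementation.

```python
def first_order_iir(samples, zero, pole):
    """Apply y[n] = x[n] - zero * x[n-1] + pole * y[n-1] to a sequence."""
    out = []
    x_prev = 0.0
    y_prev = 0.0
    for x in samples:
        y = x - zero * x_prev + pole * y_prev
        out.append(y)
        x_prev, y_prev = x, y
    return out


def cascade(samples, stages):
    """Chain several first-order stages, given as (zero, pole) pairs."""
    for zero, pole in stages:
        samples = first_order_iir(samples, zero, pole)
    return samples


if __name__ == "__main__":
    # A slow exponential tail decaying by a factor 0.95 per sample ...
    tail = [0.95 ** n for n in range(10)]
    # ... is turned into a fast one decaying by a factor 0.5 per sample.
    print([round(v, 4) for v in cascade(tail, [(0.95, 0.5)])])
```

A cascade of up to four such stages, as described above, can cancel several tail components with different time constants.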

The SAMPA is equipped with 3.2 Gb/s output data bandwidth to extract the full raw data stream for up to 10 MSa/s. This feature is used in the TPC application. For the MCH application, data preprocessing in the SAMPA is used by applying a zero-suppression algorithm, removing all data below a threshold configurable for each channel. In addition, a cluster sum algorithm is available, where instead of delivering time and amplitude of each sample above threshold, consecutive active samples are added up to clusters in time and only the sum of the values and the time of arrival are delivered. The data are formatted for transmission in either continuous or triggered mode. A Hamming-code-protected output buffer handles data size fluctuations and distributes the data to the activated serial links.
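The two data-reduction steps can be illustrated with a minimal sketch for a single channel. It makes several simplifying assumptions that are not taken from the SAMPA specification: a sample is kept only if it is strictly above the threshold, the time of arrival of a cluster is taken to be the index of its first sample, and no pre- or post-samples are retained. All names are illustrative.

```python
def zero_suppress(samples, threshold):
    """Return a list of (start_index, values) for each run of consecutive
    samples above the threshold; everything else is dropped."""
    clusters = []
    current = None
    for t, value in enumerate(samples):
        if value > threshold:
            if current is None:          # a new run starts here
                current = (t, [])
                clusters.append(current)
            current[1].append(value)
        else:
            current = None               # run ended
    return clusters


def cluster_sum(samples, threshold):
    """Reduce each run to (start_index, sum of its values)."""
    return [(start, sum(values)) for start, values in zero_suppress(samples, threshold)]


if __name__ == "__main__":
    adc = [0, 1, 5, 9, 4, 0, 0, 7, 8, 1]
    print(zero_suppress(adc, threshold=3))
    print(cluster_sum(adc, threshold=3))
```

In this toy example, ten samples are reduced to two clusters, and the cluster sum further reduces each cluster to a single value plus a time stamp.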

#### 2.4.4 Physical implementation and packaging

The SAMPA ASIC die is 8.9 mm wide and 9.5 mm long with 350 flip chip bond pads. As visible in the left panel of Fig. 18, only a minor part of the die is devoted to the analog circuits (left on the picture), while the largest fraction contains the digital blocks, with part of the area being occupied by the buffer memories. During the implementation, special care was taken to isolate the power domains of the different circuits. There are five different power domains: CSA, Shaper and Output Buffer, ADC, core digital logic and SLVS IO drivers. The SAMPA features a 372 ball 15 mm  $\times$  15 mm, 1.2 mm thick, Thin Fine-pitch Ball Grid Array (TFBGA) package with 0.65 mm ball pitch in order to be compatible with the MCH integration requirements. The high number of available balls allowed multiple connections to VDD and GND pads, reducing the inductive and resistive loss. The package includes filtering capacitors for ADC power connections and for on-chip ADC reference voltages. A QR-code on the SAMPA package (right panel in Fig. 18) encodes the wafer lot-ID and a unique chip serial number, allowing the identification and tracking of each ASIC.



Figure 18: SAMPA bare die (left) and TFBGA packaged chip (right).

#### 2.4.5 SAMPA performance and tests

The main specifications and performance of the SAMPA are listed in Table 2 above. Figure 19 shows the response curve for the 4 mV/fC gain setting.



Figure 19: Example response curve for 4 mV/fC configuration.

Robustness measurements of the CSA against saturation in case of multiple consecutive signals were performed, showing that an average current of at least 30 nA can be sustained for 60 $\mu$s without significant baseline shift, indicating that the SAMPA can withstand this charge rate indefinitely.

The SAMPA functionality was verified successfully against the highest expected radiation load of 2.1 krad. Robustness of the SAMPA against single event upsets (SEU) and single event latchups (SEL) for an expected maximum flux of high-energy hadrons of $3.4\,\text{kHz/cm}^2$ has been verified. An upper limit for the SEL cross section of $10^{-7}\,\text{cm}^2$ for ions with a linear energy transfer of 16 MeV cm<sup>2</sup> mg<sup>-1</sup> has been measured.

80000 SAMPAs have been tested using a robotic test system to verify the functionality of the digital blocks and to check that the output baseline, noise, gain and peaking time lie within narrow intervals around the nominal values. The SAMPA mass production yield was 79.6%. A breakdown of the rate of different kinds of failures can be found in Table 14 of [15].

#### 3 Detector systems

In the following subsections, each of the ALICE detector systems is presented, with emphasis on the upgrades that were installed during LHC Long Shutdown 2. Each system is presented in a separate subsection, starting from the inner tracking system, the muon forward tracker and the time projection chambers, which have undergone the most significant changes.

#### 3.1 Coordinate system

The global reference coordinate system used in ALICE is a right-handed system with the z-axis pointing along the beam line, in the direction away from the muon arm, the y-axis pointing vertically up, and the x-axis pointing horizontally towards the center of the LHC. The nominal interaction point is the origin of the coordinate system. The two sides of the detector along the beam axis are referred to as the C side, where the muon arm is positioned, and the A side, where FV0 is positioned.

#### 3.2 Inner Tracking System

The new Inner Tracking System (ITS2) [16] uses the ALPIDE sensor (described in Sec. 2.3) and represents the largest-scale application of Monolithic Active Pixel Sensors (MAPS) in a high-energy physics experiment. The main goal of the ITS upgrade is to improve the precision of the reconstruction of the primary vertex as well as of decay vertices originating from heavy-flavour hadrons, and the performance in the detection of low- $p_T$  particles. Additionally, readout rates of 50 kHz in Pb–Pb and 400 kHz in pp collisions are required. In order to achieve this performance, the following key improvements were made in comparison with the previous ITS:

- Granularity increased for all layers with pixel sensors with a cell size of  $29.24 \,\mu m \times 26.88 \,\mu m$ . The number of layers for the inner barrel was increased from two to three, raising the total number of layers from six to seven.
- New beam pipe with a central beryllium section with an outer radius reduced from 28 mm to 18 mm (see Chapter 4).
- Innermost detector layer moved closer to the interaction point, from 39 mm to 22.4 mm.
- Material budget reduced to 0.36%  $X_0$  per layer for the innermost layers and limited to 1.10%  $X_0$  per layer for the outer layers.

Table 3 reports a list of main parameters of the old ITS1, used in Runs 1 and 2, and of the new ITS2. The new design improves the tracking efficiency and momentum resolution at low  $p_T$  as well as the impact-parameter resolution by a factor of three and five in the  $r\varphi$ - and z-coordinate, respectively, at a  $p_T$  of 500 MeV/*c* [1].

An overview of the ITS2 structure is shown in Fig. 20. The detector is grouped into the inner barrel (IB) consisting of the three innermost layers, and the outer barrel (OB) arranged in two double layers. The radial position of each layer (listed in Table 4) was optimized to achieve the best performance in terms of pointing resolution, $p_{\rm T}$ resolution, and tracking efficiency in the high track-density environment of Pb–Pb collisions. The pseudorapidity coverage of the detector is $|\eta| < 1.22$ for the most luminous 90% of the interaction region, i.e. for interaction vertices located in the range of approximately $\pm 10$ cm around the nominal interaction point along the beam axis, see also Fig. 34. The total surface area of the sensors is $\sim 10\,\text{m}^2$ instrumented with about 12.5 billion pixels with binary readout. The detector is operated at room temperature (20°C to 25°C), which is stabilized by water cooling. The radiation load at the innermost layer is expected to be 270 krad of Total Ionising Dose (TID) and $1.7 \times 10^{12}$ 1 MeV n<sub>eq</sub>/cm<sup>2</sup> of Non-Ionising Energy Loss (NIEL), for 10 years of ALICE running, met by the ALPIDE sensors (see Sec. 2.3). In order to meet the material budget requirements, the silicon sensors are thinned down to $50\,\mu$m and $100\,\mu$m in the inner and outer barrel, respectively.

|                                            | ITS1                                             | ITS2                                                       |
|--------------------------------------------|--------------------------------------------------|------------------------------------------------------------|
| Technology                                 | Hybrid pixel, strip, drift                       | MAPS                                                       |
| No. of layers                              | 6                                                | 7                                                          |
| Radius                                     | 39–430 mm                                        | 22–395 mm                                                  |
| Rapidity coverage                          | $\lvert\eta\rvert \leq 0.9$                      | $\lvert\eta\rvert \leq 1.3$                                |
| Material budget / layer                    | $1.14\%\,X_0$                                    | inner barrel: $0.36\%\,X_0$; outer barrel: $1.10\%\,X_0$   |
| Pixel size                                 | $425\,\mu\text{m} \times 50\,\mu\text{m}$        | $27\,\mu\text{m} \times 29\,\mu\text{m}$                   |
| Spatial resolution ($r\varphi \times z$)   | $12\,\mu\text{m} \times 100\,\mu\text{m}$        | $5\,\mu\text{m} \times 5\,\mu\text{m}$                     |
| Readout                                    | Analogue (drift, strip), digital (pixel)         | Digital                                                    |
| Max rate (Pb–Pb)                           | 1 kHz                                            | 50 kHz                                                     |

Table 3: Main parameters of the ITS1 and the ITS2.

Figure 20: Schematic layout of the ITS2. The three innermost layers form the inner barrel, the middle and outer layers form the outer barrel.

| Layer no. | Average radius (mm) | Stave length (mm) | No. of staves | No. of HICs/stave | Total no. of chips |
|-----------|---------------------|-------------------|---------------|-------------------|--------------------|
| 0         | 23                  | 271               | 12            | 1                 | 108                |
| 1         | 31                  | 271               | 16            | 1                 | 144                |
| 2         | 39                  | 271               | 20            | 1                 | 180                |
| 3         | 196                 | 844               | 24            | 8                 | 2688               |
| 4         | 245                 | 844               | 30            | 8                 | 3360               |
| 5         | 344                 | 1478              | 42            | 14                | 8232               |
| 6         | 393                 | 1478              | 48            | 14                | 9408               |

Table 4: Main layout parameters of the new ITS2.
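The chip counts of Table 4, the quoted number of pixels and the quoted sensor area can be cross-checked with a few lines of arithmetic. The sketch below combines the layout of Table 4 with the number of chips per HIC (9 in the inner barrel, 14 in the outer barrel, see Sec. 3.2.1), the $1024 \times 512$ pixel matrix of the ALPIDE (Sec. 3.2.3) and the pixel cell size quoted above. It assumes that the sensitive area of a chip is simply the pixel matrix area; the variable names are illustrative.

```python
# (layer, staves, HICs per stave, chips per HIC), from Table 4 and Sec. 3.2.1
ITS2_LAYOUT = [
    (0, 12, 1, 9),
    (1, 16, 1, 9),
    (2, 20, 1, 9),
    (3, 24, 8, 14),
    (4, 30, 8, 14),
    (5, 42, 14, 14),
    (6, 48, 14, 14),
]
PIXELS_PER_CHIP = 1024 * 512     # ALPIDE pixel matrix
PIXEL_AREA_UM2 = 29.24 * 26.88   # pixel cell size in square micrometres

chips_per_layer = {
    layer: staves * hics * chips for layer, staves, hics, chips in ITS2_LAYOUT
}
total_chips = sum(chips_per_layer.values())
total_pixels = total_chips * PIXELS_PER_CHIP
# Assumption: sensitive area equals the pixel matrix area (1 m^2 = 1e12 um^2).
sensitive_area_m2 = total_pixels * PIXEL_AREA_UM2 * 1e-12

if __name__ == "__main__":
    print(chips_per_layer)
    print(f"{total_chips} chips, {total_pixels / 1e9:.1f} billion pixels, "
          f"{sensitive_area_m2:.1f} m^2")
```

The per-layer products reproduce the last column of Table 4, and the totals are consistent with the roughly 12.5 billion pixels and about 10 m<sup>2</sup> quoted in the text.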

#### 3.2.1 Stave modules

The basic detector unit, called stave, consists of the following elements (Fig. 21):

- **Hybrid Integrated Circuit (HIC)**: an assembly of a polyimide Flexible Printed Circuit (FPC) on which a number of pixel chips, namely 9 and 14 for the inner and outer barrel staves, respectively, and some passive components, are bonded. Figures 22 and 23 show photos of an inner and outer barrel HIC.
- **Cold plate**: a carbon fibre sheet with high thermal conductivity with embedded polyimide cooling pipes, which is either integrated within the space frame (for the inner barrel staves) or attached to the space frame (for the outer barrel staves).
- **Space frame**: a carbon fibre truss-like support structure providing the mechanical support and the necessary stiffness to the assembly of HICs on cold plates.

Figure 21: Layout of the staves of the inner and outer barrels.

The HICs are glued to the cold plate: one HIC for the inner barrel, and 8 and 14 HICs for the middle and outer layers, respectively. The cold plate is in thermal contact with the pixel chips to remove the generated heat. For the inner barrel, each stave consists of a single HIC and cold plate assembly. In the outer barrel, staves are further segmented in azimuth into two half-staves. Each half-stave extends over the full length of the stave and consists of a cold plate on which four or seven modules (HICs) are glued, depending on the length of the stave.

Figure 22: Inner HIC seen from the sensor side. The green tabs are used for fixing and handling, and are removed before mounting the HIC.

**Hybrid Integrated Circuit** As shown in Fig. 21, the inner barrel HIC includes one row of 9 sensors, whereas the outer barrel HIC comprises two rows of 7 sensors each, as visible in Fig. 23, bottom. The HICs consist of an assembly of ALPIDE chips glued to an FPC, which provides the connection to analogue and digital power rails as well as p-well and substrate bias voltages. Differential pairs of traces of 100 $\mu$m width and spacing are used to distribute control and clock signals in a local bus and to read out individual pixel sensors. In the inner-barrel staves, all 9 sensors share the same aluminium power bus on the FPC, while in the outer-barrel staves, a dedicated aluminium power bus extends over all FPCs of the half-stave and provides analogue and digital power as well as ground connections. The baseline powering scheme is based on a conservative parallel connection: all chips in a HIC are directly connected to the analogue and digital power planes of the FPC, which are in turn fed by the power bus serving the half-stave. The electrical connection to the HICs is made by means of thin aluminium cables soldered onto the HIC, as visible in Fig. 23. To minimise the material budget, aluminium was chosen as conductor (having a radiation length of 8.9 cm compared to 1.44 cm for copper) for the FPCs of the inner barrel. Since the resistivity of aluminium is 1.5 times larger than that of copper, the thickness of the power lines must be correspondingly increased. A thickness of 25 $\mu$m ensures a voltage drop below 50 mV over the full length, as well as an attenuation suitable for signal transmission up to 1.2 Gbps. Polyimide Upilex-S75 was selected as substrate because it has a small thermal expansion coefficient (0.01% at 200°C) and therefore provides good dimensional stability during the aluminium coating by sputtering in vacuum. The material budget requirements for the outer barrel FPC are less severe and allow for a more standard production procedure, using copper-clad Pyralux, with a substrate of 75 $\mu$m and 18 $\mu$m metal layer. With the external power bus, the thinner copper traces are compatible with voltage drop requirements and the readout rate of 400 Mbps, sufficient for the lower occupancy of the outer layers. The power bus in the outer barrel is connected to the HICs via short
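The material-budget argument for aluminium conductors can be made quantitative with the numbers quoted above. The sketch below compares a 25 $\mu$m aluminium layer with a copper layer of the same sheet resistance, which is thinner by the resistivity ratio of 1.5. It considers the conductor only, for traces of equal width and length, and ignores the substrate and any other layers; these simplifications and the function name are assumptions of this illustration.

```python
X0_CM = {"Al": 8.9, "Cu": 1.44}    # radiation lengths quoted in the text
RESISTIVITY_RATIO_AL_TO_CU = 1.5   # quoted in the text


def fraction_of_x0(thickness_um: float, material: str) -> float:
    """Thickness expressed as a fraction of a radiation length."""
    return thickness_um * 1e-4 / X0_CM[material]  # 1 um = 1e-4 cm


al_thickness_um = 25.0
# Copper thickness giving the same resistance for equal trace width and length
cu_thickness_um = al_thickness_um / RESISTIVITY_RATIO_AL_TO_CU

al_budget = fraction_of_x0(al_thickness_um, "Al")
cu_budget = fraction_of_x0(cu_thickness_um, "Cu")
ratio = cu_budget / al_budget

if __name__ == "__main__":
    print(f"Al: {100 * al_budget:.3f}% X0, Cu: {100 * cu_budget:.3f}% X0, "
          f"ratio Cu/Al = {ratio:.2f}")
```

Under these assumptions, the copper conductor would contribute about four times more material, in units of radiation length, than the aluminium one.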

The main requirements for the chip to FPC interconnection are: (i) compact module layout with minimal dead area; (ii) highly reliable and stable mechanical connection; (iii) high quality, low inductance electrical connection. A custom-made automatic Module Assembly Machine (MAM), named ALICIA and supplied by IBS-Precision Engineering (see top left panel of Fig. 24), was used to achieve a reproducible accuracy and the required production speed at the various HIC assembly sites. It implements electrical testing, dimension measurement, integrity inspection and alignment for assembly.

Using a stencil manufactured in an adhesive film (90 µm thick), very precise spots of Araldite 2011 (0.6 mm diameter, 160 per chip) were applied on the FPC clamped on a gripper jig. After the chips were aligned by the MAM onto the vacuum chuck with a position accuracy of better than 5 µm and a spacing of 150 µm, the FPC was positioned precisely on top. Shims of 50 µm were used to ensure a sufficient gap for the glue and variations related to tolerances of tooling (planarity:  $\pm 10$ µm) and components (FPC thickness:  $\pm 10$ µm; chip thickness:  $\pm 5$ µm). The assembly procedure was validated by mechanical tests where on average a pull strength of 44 N/chip and a peel strength of 3 N were measured. The electrical interconnection used a novel approach of wire bonding through the FPC vias (Fig. 24 bottom left panel).



Figure 23: Top view and bottom view (sensor side) of an outer barrel HIC. The yellow cables connect the HIC to the power bus.

In order to account for the clearance necessary for the wedge bonding tool, the FPC vias have an oblong shape ( $1.2 \text{ mm} \times 0.4 \text{ mm}$ ); in addition 300 µm interconnection pads were implemented on the top surface. Wire bonding was performed using 25 µm aluminium wire (three wires per connection); a typical pull force of 11 cN with a standard deviation of 0.8 cN was measured per wire.



Figure 24: Pictures of the ITS2 assembly. Top left: A view of the MAM used for chip inspection and HIC assembly. Top right: Photo of an outer barrel stave, with power bus cables opened on the two sides. Bottom: Photo of an inner barrel stave, with detailed views shown in the inset.

**Space frame and cold plate** The layout of the ITS2 stave mechanics and cooling consists of a space frame and one or two cold plates. A large effort was devoted to the design of the lightest possible mechanical supports to maintain the silicon sensors in an accurate position while providing the cooling to remove the heat dissipated by the sensors. A novel technology was developed to directly embed polyimide pipes inside the cold plate, an assembly of highly thermally conductive carbon fibre laminate (see Fig. 25).



Figure 25: Space frame and cold plate cooling scheme. (Left) Inner Barrel; (Right) Outer Barrel

For mechanical stability, the cold plate is stiffened by the space frame, a light filament-wound carbon structure with a triangular cross section. The implementations of the cold plate and space frame differ in the inner and outer barrel to satisfy different geometrical and thermal constraints (Fig. 25). In order to guarantee electrical insulation, a Parylene coating was applied on the cold plates (Parylene volume resistivity is  $1.4 \times 10^{17} \,\Omega$  cm, measured on a 25 µm thick layer). Such a structure provides a highly efficient heat dissipation with single-phase liquid flow up to a power density of 0.5 W/cm<sup>2</sup> produced by the silicon chips glued on top of it. As can be seen in Fig. 26, this solution meets the quite stringent requirement of a very low material budget, in particular for the inner barrel stave (0.36%  $X_0$ /layer). The same solution for the integration of the cooling pipes was adopted for the outer barrel staves. The material budget requirements for the much larger outer barrel layers are less stringent and the different layout corresponds to an average budget of 1.10%  $X_0$ /layer (Fig. 26).

#### 3.2.2 Global support mechanics and services

In addition to the requirements of minimising the material in the sensitive region and ensuring high accuracy of the relative position of the detector sensors, discussed in the previous section, the ITS2 mechanical structure fulfils the following design criteria:

- provide an accurate position of the detector with respect to the TPC and the beam pipe;
- locate the first detector layer at a minimum distance to the beam pipe wall;
- ensure thermo-mechanical stability over time;
- ensure accessibility for maintenance and inspection.

Also, the design of the support of the detector and services has to take into account the requirements set by the integration of the new Muon Forward Tracker (MFT) and the Fast Interaction Trigger system (FIT), which are installed very close to the ITS2. The main mechanical support structures of the ITS2 are shaped in two barrels made from carbon fibre composite. The inner shell supports the three innermost layers, while the outer shell supports the outer four layers. Each barrel is divided into top and bottom halves, which are installed sequentially around the beam pipe. Each barrel is composed of a detector section and a service section, as shown in Fig. 27. The staves are housed in the detector barrel and are connected to the readout and power systems via signal and power cables which are routed through the service barrel to the ALICE miniframe. The pipes that connect the on-detector cooling system to the cooling plant in the cavern are also routed through the service barrels.

Figure 26: Azimuthal distribution of single contributions to the material budget of an inner (left) and outer (right) barrel stave layer. The relative contribution of each component to the total material budget is quoted.

**Detector support structure** The main structural components of the detector barrels are the end-wheels and the cylindrical and conical structural shells. Two light composite end-rings provide the reference plane for the fixation of the two extremities of each stave. The position of the staves in the reference plane is given by a ruby sphere, matching an insert in the mechanical connectors at both extremities. This system ensures accurate positioning, within a few  $\mu$ m, during the assembly and provides the possibility to dismount and re-position the stave with the same accuracy in case of maintenance. Finally, the staves are clamped by a bolt. The end-wheels on the A-side also provide the feed-through for the services.

An outer cylindrical structural shell connects the opposite end-wheels of the barrel and prevents external loads from being transferred directly to the staves. As shown in Fig. 27, in order to minimise the material budget in the detection area, the following design choices were adopted:

- The inner barrel is conceived as a cantilever structure supported at one end outside the outer barrel acceptance;
- The outer barrel has no intermediate mechanical structures between the four detection layers within the detector acceptance.

The design of the outer barrel allows the separate assembly inside the TPC of all half-layers, which then are combined sequentially, starting from the outermost layer. The mechanical connection between the two double layers is provided by two conical structural shells located at the extremities of the detection area (Fig. 27). All the barrel support structures are attached to the cage (Fig. 87), acting as the main supporting element inside the TPC bore.



Figure 27: Overview of the mechanical structure of the ITS2. The upper panel shows the Inner Barrel, the Outer Barrel staves and the MFT in the back. The lower panel shows the IB and OB conical structural shells supporting the respective services.

#### 3.2.3 Readout and powering systems

The readout and powering systems are composed of 192 identical readout units and 142 power boards, and have complete control over all sensor operations, including power management, triggering, data readout and slow control. One of the major goals for the ITS upgrade was to minimize the detector material budget, which for the inner layers is on average $0.36\%\,X_0$ per layer [16]. Reducing the sensor power consumption relaxes the cooling requirements and hence decreases the passive mass in the system. Transferring data from the sensors to the front-end electronics represents a significant part of the total power budget. To reduce the power consumption, intermediate conversions between electrical and optical layers were avoided. Therefore, the high-speed transceiver on the sensor [17] directly drives the differential line connecting it to the front-end electronics, using the shortest possible path in order to achieve the target bit-rate of 1.2 Gb/s within the power budget [18]. Consequently, both readout units and power boards are located within the ALICE L3 magnet, in a magnetic field of ~0.5 T and exposed to the radiation environment. All system components were validated for these conditions [19].

The matrix of  $1024 \times 512$  pixels of the ALPIDE sensor is digitally read out through serial links at 1.2 Gb/s or 400 Mb/s in the inner and outer barrel, respectively. Because of the higher occupancy, the sensors in the inner barrel are read out individually whereas those in the outer barrel are read out in groups of seven using the master-slave mode described in Sec. 2.3. As mentioned in Sec. 3.2.1, the middle and outer layers share the HIC as a common building block and are identical from the readout point of view. Such a HIC is composed of two rows of seven sensors, where each row implements the aforementioned master-slave topology.



Figure 28: Schematic design and interconnections of the ITS2 readout unit.

#### 3.2.4 The readout system

To maximize the modularity of the system, the readout electronics are organized in 192 autonomous readout units, one for each stave. As schematically shown in Fig. 28, the readout units provide control and trigger and read the high-speed data lines from the ALPIDE chips. The readout units are identical, and only the I/O connections layer in the firmware adapts to the connected stave. Each readout unit connects to a common readout unit (CRU), see Sec. 2.2, through the optical Versatile Link [20], which is custom made by CERN and provides a 3.2 Gb/s radiation-tolerant physical transport layer including forward error correction. One downlink is used to transmit the slow-control commands from the counting room to the readout units, while up to three upstream links from each readout unit can transmit in parallel to achieve the needed readout bandwidth. A separate link is used to receive the trigger from the Central Trigger Processor (CTP).

The maximum necessary data bandwidth is determined by the collision system, the interaction rate, and the characteristics (pixel size, noise, etc.) and positioning of the sensors. In the inner barrel, the maximum link bandwidth of 1.2 Gb/s suffices for the readout of Pb–Pb collisions at interaction rates up to 100 kHz. For the middle and outer layers, the required bandwidth is lower, even though the sensors are read out in groups of seven in master-slave mode. For those layers, the data link is configured for a maximum bandwidth of 400 Mb/s between the master chips and the readout unit. Table 5 gives an overview of the data flow and bandwidth in the different parts for each layer. Each link represents a direct connection between a master sensor and a readout unit, while the payload is the actual available bandwidth for the data once the 8b/10b transmission encoding overhead is accounted for.

| Layer | Staves | Links per stave | Link bandwidth [Gb/s] | Link payload [Gb/s] | Bandwidth per stave [Gb/s] | Payload per stave [Gb/s] | Bandwidth per layer [Gb/s] | Payload per layer [Gb/s] |
|-------|--------|-----------|-----------|---------|-----------|-----------|-----------|-----------|
| 0     | 12     | 9         | 1.2       | 0.96    | 10.8      | 8.6       | 129.6     | 104       |
| 1     | 16     | 9         | 1.2       | 0.96    | 10.8      | 8.6       | 172.8     | 138       |
| 2     | 20     | 9         | 1.2       | 0.96    | 10.8      | 8.6       | 216.0     | 173       |
| 3     | 24     | 16        | 0.4       | 0.32    | 6.4       | 5.1       | 153.6     | 123       |
| 4     | 30     | 16        | 0.4       | 0.32    | 6.4       | 5.1       | 192.0     | 154       |
| 5     | 42     | 28        | 0.4       | 0.32    | 11.2      | 9.0       | 470.4     | 376       |
| 6     | 48     | 28        | 0.4       | 0.32    | 11.2      | 9.0       | 537.6     | 430       |
| Total | 192    |           |           |         |           |           | 1872      | 1498      |

Table 5: Summary of the ITS2 readout connections and payload capacity.

The ITS2 can operate in two modes, triggered and continuous. In triggered mode, pixel hits on the sensors are latched into the sensor memory and then transmitted to the readout boards only if a trigger command arrives within a few microseconds after the event that generated them. In continuous mode, data are always recorded and transmitted, segmented in time frames of programmable duration, where all the events within the same time frame share the same timestamp.

The total bandwidth per stave was adapted to match the capacity of 3 CERN Versatile Link upstream connections per readout unit, with aggregate payload bandwidth of 9.6 Gb/s ( $3 \times 3.2$  Gb/s). The system would saturate only at 200 kHz of Pb–Pb collisions. A detailed description of the components and operating modes of the readout units can be found in Ref. [19].
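The entries of Table 5 follow from the link counts, the link speeds, and the 8b/10b encoding overhead. As a cross-check, the minimal Python sketch below recomputes them and compares the payload of each stave with the capacity of three uplinks. The names `LAYERS`, `stave_bandwidth`, and `summarize` are illustrative choices made here and do not refer to any ALICE software.

```python
# Per-layer readout parameters taken from Table 5:
# layer -> (staves, links per stave, link bandwidth in Gb/s)
LAYERS = {
    0: (12, 9, 1.2),
    1: (16, 9, 1.2),
    2: (20, 9, 1.2),
    3: (24, 16, 0.4),
    4: (30, 16, 0.4),
    5: (42, 28, 0.4),
    6: (48, 28, 0.4),
}

ENCODING_EFFICIENCY = 8 / 10  # 8b/10b: 8 payload bits per 10 transmitted bits
UPLINK_CAPACITY = 3 * 3.2     # Gb/s, three Versatile Link uplinks per readout unit


def stave_bandwidth(links, link_bw):
    """Return (raw, payload) bandwidth of one stave in Gb/s."""
    raw = links * link_bw
    return raw, raw * ENCODING_EFFICIENCY


def summarize(layers=LAYERS):
    """Return per-layer rows and the detector totals, all in Gb/s."""
    rows, total_raw, total_payload = [], 0.0, 0.0
    for layer, (staves, links, link_bw) in layers.items():
        raw, payload = stave_bandwidth(links, link_bw)
        rows.append((layer, raw, payload, staves * raw, staves * payload))
        total_raw += staves * raw
        total_payload += staves * payload
    return rows, total_raw, total_payload


rows, total_raw, total_payload = summarize()
for layer, raw, payload, layer_raw, layer_payload in rows:
    fits = payload <= UPLINK_CAPACITY
    print(f"layer {layer}: stave {raw:5.1f} raw / {payload:4.2f} payload, "
          f"layer {layer_raw:6.1f} raw / {layer_payload:6.1f} payload, "
          f"fits in 3 uplinks: {fits}")
print(f"total: {total_raw:.1f} raw, {total_payload:.1f} payload")
```

In this sketch the largest per-stave payload (layers 5 and 6) stays below the 9.6 Gb/s offered by three uplinks, in line with the statement above.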

### 3.2.5 The powering system

The ITS2 ALPIDE sensors require 1.8 V analogue and digital power rails; a reverse bias can be applied to the sensor substrate. Power regulation happens through linear regulators mounted on custom power boards, which are housed in the same rack as the corresponding readout unit. The power boards have a built-in I2C [21] interconnection to monitor their functional parameters in real time, including the sourced currents and voltages. The readout unit has full control over the power board through the I2C bus: power sequencing, monitoring, and tuning are therefore managed from the counting room through the same Versatile Link used to manage the data acquisition and read the sensors, as shown in Fig. 28.

As described in Sec. 3.2.1, staves are connected to the powering system via aluminium power buses. The inner barrel staves have the main power conductors integrated in the FPC that also carries the signal and control lines, while a separate power bus and bias bus are required for each half stave of the middle and outer layers. In the inner staves all nine sensors share the same power bus, while in the middle and outer layers the power delivery path is (half-)separate for each HIC (see Fig. 24). The power boards can drive up to 16 analogue and digital power rails, providing supply for a full outer barrel stave, which is composed of 14 HICs. The main power supplies are CAEN mainframes (EasyCrates) populated with 61 A3009B radiation tolerant CAEN power modules complemented by 4 CAEN A2518 LV modules for the reverse substrate bias located in the racks in the cavern and in the counting rooms, respectively.

The power board consists of two power units, which contain 16 low voltage and 8 reverse-substrate bias channels. The two power units feature independent I2C control interfaces and output connectors. They are based on radiation tolerant LDO regulators, shunt resistors, overcurrent protection circuitry, current and voltage measuring circuitry, and remote voltage-setting circuitry.

A power unit can power an inner-barrel stave, a middle-layer stave or an outer-layer half-stave. In order to allow every readout unit to control the power units of the attached detector segment, some channels have to stay unused. A total of 142 power boards are used to supply the entire ITS2.

### 3.2.6 Component production, detector assembly, and commissioning on surface

The assembly and characterization of HICs and staves were distributed over 13 sites spread across North America, Europe and Asia. The workflow is schematically represented in Fig. 29.

The produced staves were transported to CERN where a large clean room was built to allow assembly of the full detector as well as on-surface commissioning activities before the installation in the ALICE cavern. Staves were assembled into layers by mounting them on the support structures. The layers were assembled into four separate half-barrels, two for the inner and outer barrels each. Each half-barrel was then connected to the readout units, power boards, and the cooling system. Throughout the year 2020, the full detector was under commissioning on the surface, with the half-barrels located next to each other to facilitate stepwise integration (Fig. 30).

During commissioning, the detection efficiency, readout performance, and noise levels were characterised. As an example of the results obtained during the commissioning, the fake-hit rate measured for one inner-barrel half-barrel is reported here. Randomly distributed triggers were sent to all the chips in the layer and the generated hits were registered. These hits are due to noise as well as to cosmic rays crossing the detector in coincidence with the trigger. In the inner barrel, the fake-hit rate was found to be dominated by roughly one hundred pixels per half-barrel, as shown in Fig. 31 [22], where colors indicate how often a pixel fired in $15 \times 10^6$ events acquired at a trigger rate of 50 kHz using a charge threshold of $100 e^-$; for example, 24782 pixels fired once in the sample. Masking these noisy pixels leads to a fake-hit rate of $1 \times 10^{-10}$ /pixel/event. This is significantly better than the target value of $1 \times 10^{-6}$ /pixel/event [16]. The majority of pixels which remain after this masking show one or two hits, which is consistent with the expected rate from cosmic muons.

Figure 29: Schematic description of the stave production workflow.

**Figure 30:** ITS2 in the clean room during on-surface commissioning. The lower left shows a zoomed-in view of the half barrels of the outer barrel (OB) and inner barrel (IB) type.



Figure 31: Fake-hit rate of an inner half-barrel as a function of the number of masked pixels.
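The dependence of the fake-hit rate on the number of masked pixels can be illustrated with a short Python sketch. It sorts the pixels by their hit count, masks the noisiest ones, and divides the remaining hits by the number of active pixels and events. The function name `fake_hit_rate` and the toy data are invented for illustration and are not taken from the ITS2 software or measurements.

```python
from collections import Counter


def fake_hit_rate(hits_per_pixel, n_pixels, n_events, n_masked=0):
    """Fake-hit rate in hits/pixel/event after masking the noisiest pixels.

    hits_per_pixel maps a pixel identifier to its number of recorded hits;
    pixels that never fired can be omitted from the mapping.
    """
    counts = sorted(hits_per_pixel.values(), reverse=True)
    remaining_hits = sum(counts[n_masked:])
    # Masked pixels no longer count as active channels.
    return remaining_hits / ((n_pixels - n_masked) * n_events)


# Toy data: 3 very noisy pixels plus 1000 pixels that fired exactly once.
toy_hits = Counter({("hot", i): 10_000 for i in range(3)})
toy_hits.update({("single", i): 1 for i in range(1000)})

N_PIXELS = 1_000_000
N_EVENTS = 1_000_000
for n_masked in (0, 1, 2, 3):
    rate = fake_hit_rate(toy_hits, N_PIXELS, N_EVENTS, n_masked)
    print(f"masked pixels: {n_masked}, fake-hit rate: {rate:.2e} /pixel/event")
```

In the toy sample, masking the three hot pixels lowers the rate by more than an order of magnitude, which mirrors the qualitative behaviour described above, where a small number of pixels dominates the fake-hit rate.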

### 3.2.7 Detector calibration

The calibration procedure for the ITS2 consists of two main steps, namely the identification and masking of noisy pixels and the optimization of the in-pixel discriminator thresholds. The calibration has to be performed on a regular basis both to determine the best operating point of the detector and to measure its performance for the chosen operating point at regular intervals. The majority of the calibration scans are based on the injection of analogue or digital pulses into the single pixels, most notably the measurements of the pixel thresholds and their tuning to optimal values. The scan is executed completely by the DCS software, which controls sequencers on the readout units that trigger pulses to the chip. The main challenge in the threshold calibration is related to the high data rate generated by 12.5 billion channels. A threshold scan of the full detector with 50 charge injection points and 50 hits per point results in about $3 \times 10^{13}$ hits or approximately 100 TB of raw hit data. If the scan is performed as fast as the on-detector bandwidth allows, these data are collected in slightly less than 1 hour, resulting in a data rate of 20–30 GB/s. However, a full scan is generally used as a reference and is not needed on a daily basis. In fact, to ensure a good threshold calibration of the detector it is sufficient to pulse about 1% of the pixels; such a scan can be completed in under 5 minutes.
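The numbers quoted for the full threshold scan can be reproduced with a few lines of arithmetic. The Python sketch below uses only values from the text; the average number of bytes per hit is not quoted in the text and is derived here from the stated 100 TB, and the linear scaling of the reduced scan is an assumption.

```python
N_PIXELS = 12.5e9          # channels in the ITS2
INJECTION_POINTS = 50      # charge steps per pixel
HITS_PER_POINT = 50        # injections per charge step
RAW_VOLUME_BYTES = 100e12  # approximately 100 TB of raw hit data
SCAN_TIME_S = 3600         # upper bound, "slightly less than 1 hour"

total_hits = N_PIXELS * INJECTION_POINTS * HITS_PER_POINT
# Implied average size of one hit in the raw data (derived, not quoted).
bytes_per_hit = RAW_VOLUME_BYTES / total_hits
rate_gb_per_s = RAW_VOLUME_BYTES / SCAN_TIME_S / 1e9

# Reduced scan pulsing about 1% of the pixels, assuming the data volume
# scales linearly with the number of pulsed pixels.
reduced_volume_tb = 0.01 * RAW_VOLUME_BYTES / 1e12

print(f"total hits:      {total_hits:.2e}")
print(f"bytes per hit:   {bytes_per_hit:.1f}")
print(f"data rate:       {rate_gb_per_s:.1f} GB/s")
print(f"1% scan volume:  {reduced_volume_tb:.1f} TB")
```

The resulting data rate lies within the 20–30 GB/s range given above, and the total of about $3 \times 10^{13}$ hits is recovered.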

In order to process the large amount of data in a timely way, the analysis is performed in a distributed manner on the EPNs (see Sec. 5.3), making use of the ALICE DPL (see Chapter 5). Once the pixel thresholds are properly tuned, the next calibration step is performed, namely the detection and masking of the noisiest pixels. In general, the fraction of noisy pixels masked is below 0.1%, which leads to an overall fake-hit rate of about $10^{-8}$ hits/event/pixel on average, well below the design requirement of $10^{-6}$ hits/event/pixel.

After a successful calibration, the addresses of the pixels to be masked and the ALPIDE register settings for threshold tuning are sent off to be stored in the configuration database using a dedicated WinCC panel in the DCS.

### 3.2.8 Installation and global commissioning

The installation of the ITS2 in ALICE started in January 2021, when the services were transferred from the on-surface commissioning hall to the experimental cavern. The detector was installed on rails in the so-called "cage" (see Sec. 4), hosting the beam pipe, ITS2 and MFT inside the ALICE TPC. Several insertion tests were performed on the surface to optimise the procedures and to identify potential interferences. For the final installation inside the experimental apparatus, up to six cameras were used to continuously monitor key contact points during insertion, since the clearance between staves of top and bottom barrels is of the order of a millimeter. Furthermore, surveys were carried out to determine the exact position of the detector elements and the beam pipe and to visualize them with the help of three-dimensional scans carried out earlier on the surface. At each step of the insertion process, CAD models of the insertion process were compared with the camera images to verify the positioning of the detector elements. The process started in March 2021, with the outer barrel installation. Figure 32 (top) shows layer 3 in position in ALICE, around the beam pipe. The visible surface is covered by the power distribution bus. After the installation, the outer barrel was thoroughly tested before the inner barrel was inserted into its final position in close vicinity to the beam pipe, in May 2021. Figure 32 (bottom) shows the bottom half of the inner barrel in its final position at around 1 mm distance from the beam pipe. The full detector was installed without damaging any component.

### 3.2.9 First results from global commissioning

After connection and verification of the detector and its services, the focus was set on the central system integration and on gaining experience with the final framework for the operation of the detector. First cosmic muon tracks traversing the full detector, like the one shown in Fig. 33, could be acquired using continuous readout without a dedicated trigger signal. These tracks were found by matching three hit points in the layers 4 to 6 of the outer barrel and requiring another hit point in the inner barrel, at a rate of about 0.02 Hz, while those traversing only the outer barrel were more frequent (0.5 Hz). The commissioning campaign continued throughout the year and allowed the optimisation of the detector control system (DCS), the calibration procedures, and the readout parameters. During the pilot beams with pp collisions at injection energy (450 GeV) in October 2021, reconstructed tracks from the ITS2 were used to determine the position of the primary vertex, as shown in Fig. 34, and to continuously monitor it online in the QC plots. The primary vertex position is one example of the quantities that are continuously monitored by the online quality control system of the O<sup>2</sup> analysis framework to assess the quality of the data acquired with the ITS2.



**Figure 32:** Top: Outer barrel surrounding the beam pipe with the Muon Forward Tracker (MFT) in the background. Bottom: ITS2 inner barrel bottom half-barrel next to the beam pipe, outer barrel and MFT in the background.



Figure 33: Event display of a cosmic muon traversing all layers of ITS2 twice, no magnetic field.



**Figure 34:** Longitudinal distribution of the primary vertex positions from ITS2 tracks reconstructed online during the LHC pilot beam in October 2021.



Figure 35: Schematic view of the Muon Forward Tracker (left) and its integration with the central barrel (right).

### 3.3 Muon Forward Tracker

The Muon Forward Tracker detector (MFT), see Fig. 35, is a high position resolution silicon detector, which has been designed to extend the physics program of the muon spectrometer (see Sec. 3.6). Its primary goal is to improve the pointing resolution of muons by matching the tracks reconstructed downstream of the hadron absorber to those reconstructed inside the MFT upstream of the absorber [23]. This approach allows the removal of multiple scattering effects in the hadron absorber and improves the pointing resolution of muon tracks down to about 100 µm. The MFT is located between the interaction point and the front absorber and surrounds the beam pipe at the closest possible distance. It provides charged particle tracking in the pseudorapidity interval $-3.6 < \eta < -2.45$, which covers most of the muon spectrometer acceptance. The acceptance boundaries are defined on one side by the size of the beam pipe, and on the other side by the volume and position of the ITS2, the FIT-C and the beam pipe support, as shown in Fig. 35.
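To relate the quoted pseudorapidity interval to the geometry, the standard definition $\eta = -\ln \tan(\theta/2)$ can be inverted to obtain the polar angle $\theta$ with respect to the beam axis. The short Python sketch below does this for the two acceptance limits; the helper name `polar_angle_deg` is an illustrative choice.

```python
import math


def polar_angle_deg(eta):
    """Polar angle theta in degrees, measured from the +z beam axis."""
    return math.degrees(2.0 * math.atan(math.exp(-eta)))


for eta in (-2.45, -3.6):
    theta = polar_angle_deg(eta)
    print(f"eta = {eta:5.2f}: theta = {theta:7.2f} deg, "
          f"{180.0 - theta:5.2f} deg from the -z axis")
```

The acceptance thus corresponds to tracks within roughly 3.1° to 9.9° of the beam axis on the negative-$z$ side.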

### 3.3.1 Detector layout

The MFT has a projective geometry (see Fig. 35) based on five disks, coaxial with the beam pipe and labelled D00 (innermost) to D04 (outermost), the first two (D00 and D01) being identical and the others (D02, D03 and D04) having diameters that increase with the distance from the interaction point. To ease assembly and insertion, the detector is divided into two identical halves, labelled H0 for the bottom part and H1 for the top part. The MFT is composed of a total of 936 ALPIDE silicon sensors (see Sec. 2.3) distributed on both faces of the ten half-disks and arranged in detection modules called ladders. Each ladder is a hybrid integrated circuit with two to five sensors (depending on the position within the disk), which are glued and interconnected on a flexible printed circuit (FPC) board to provide the power and readout connections. In order to minimize the material budget, the silicon-pixel sensors constituting the MFT are thinned down to the same thickness of 50 µm as the ITS2 inner barrel sensors (see Sec. 3.2), and the FPCs to which they are connected are made of polyamide with two layers of aluminum on either side. Each ladder is connected to a PCB that is located outside the acceptance, external to each half face. The MFT contains 280 ladders whose positions were defined to ensure an 85% overlap of the sensors between the two faces of each disk. Each face of a half-disk is subdivided into four zones (each containing between three and five ladders), which yields a total of 80 zones for the full MFT.
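The segmentation numbers quoted above can be checked with simple arithmetic. The Python sketch below recomputes the number of zones and the total number of pixels from the disk layout and the $1024 \times 512$ pixel matrix of the ALPIDE sensor; all constant names are illustrative.

```python
N_DISKS = 5
HALVES = 2                      # H0 (bottom) and H1 (top)
FACES_PER_HALF_DISK = 2
ZONES_PER_FACE = 4
N_SENSORS = 936
PIXELS_PER_SENSOR = 1024 * 512  # ALPIDE pixel matrix

n_half_disks = N_DISKS * HALVES
n_zones = n_half_disks * FACES_PER_HALF_DISK * ZONES_PER_FACE
n_pixels = N_SENSORS * PIXELS_PER_SENSOR

print(f"half-disks: {n_half_disks}")
print(f"zones:      {n_zones}")
print(f"pixels:     {n_pixels:,}")
print(f"average sensors per zone: {N_SENSORS / n_zones:.1f}")
```

The pixel count of about 490 million is the denominator that enters any per-pixel rate quoted for the full MFT.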

The ten half-disks are then assembled into half-cones. The first three half-disks are connected to a set of motherboards that provide the connection of the readout lines with 6.5 m long copper cables, which run alongside the whole absorber towards the front-end electronics boards. For the two larger half-disks, the same type of copper cables is used, connected directly to the PCBs. Each half-cone also houses a Power Supply Unit (PSU), which controls and monitors the powering of the zones to guarantee the ladder safety and is located outside the acceptance between the last two half-disks. The disks and the PSU are water-cooled and air ventilation is used to ensure temperature homogeneity inside the confined space where the MFT is mounted. Figure 36 shows an exploded view of the different elements composing the detector. The two half-cones are fixed to two end-cap patch-panels which in turn are fixed to large carbon fibre composite structures, called half-barrels, that are used to insert and position the MFT within the ALICE internal cage, see Sec. 3.5. The services are routed along the half-barrels and through the patch-panels to reach the detector. The patch-panels are mechanical pieces used also to support the FIT-C detector and to interconnect the readout cables from the half-cones to the readout units, which are located 6 m away beside the front absorber. Figure 37 shows a fully assembled half-cone and the MFT in its final position.



Figure 36: Detailed view of the elements composing the MFT detector.



**Figure 37:** Left-hand panel: fully assembled half-cone of the MFT with patch-panel and the FIT-C detector. Right-hand panel: the MFT in its final position; the cooling and power services can be seen along the half-barrels.

### 3.3.2 Ladder assembly and testing

The basic element of the MFT detector, called a ladder, is composed of an aluminum FPC on which silicon-pixel sensors are glued and interconnected. The length of the ladders varies from 2 to 5 chips to match the size of the half-disks. Each FPC is equipped on one side with footprints for placing the sensors and a 70-pin connector for powering the chips and transmitting the high-speed readout signals, and on the other side with microelectronic components (resistors and capacitors) that decouple the power supply (analog and digital) of the sensors and adapt the impedance of the data lines. Given the variable length of the ladders, the FPC design was optimized in order to reduce the maximum voltage drop to 100 mV and to ensure the best transmission of the high-speed data lines.

The ladder assembly took place in the clean room of the CERN EP/DT Departmental Silicon Facility using a three-axis digitally controlled placement machine, called ALICIA (ALICE Integrated Circuit Inspection and Assembly machine), which places the chips with a precision of 5 $\mu$m on a specially machined stainless steel support with a lattice of very small holes to hold the chips in position by suction. At the same time, the FPC is positioned on a suction support which also keeps it perfectly flat and positioned with a precision better than 300 $\mu$m. Small dots of Araldite-type two-component glue are applied to the FPC using a stainless-steel stencil with conical holes. The FPC is flipped and positioned opposite to the chips with the help of precision centering pins. The weight of the FPC carrier provides sufficient pressure to spread out the glue and a spacer between the FPC and the chips ensures that the final thickness of the glue layer is around 50 $\mu$m. After curing of the glue, the assembly is removed from ALICIA, visually inspected and brought to the CERN bond-lab, where the connection between the FPC and the chips is realized by ultrasonic micro-wire bonding on the metallized pads of the FPC. Each micro-interconnection consists of 3 wires with 25 $\mu$m diameter that pass through the vias of the FPC to be connected to the 74 pads on the surface of the sensors. The assembly of the ladder is then completed and a final visual inspection verifies the quality of the interconnections. Figure 38 shows a picture of an assembled ladder.



**Figure 38:** Example of an assembled MFT ladder. Upper picture: back side of the FPC with the glue spots on which the sensors are glued. Bottom picture: front view of the assembled ladder.

Once the ladder is assembled, it is transferred to the MFT laboratory where it undergoes a battery of tests to qualify its performance. In the laboratory, two test benches allow each ladder to be qualified from an electrical and functional point of view. First of all, the ladder is gradually powered with analog and digital voltage and its power consumption is checked. Then, the ladder is connected to an acquisition system developed specifically for the ITS2 and the MFT projects, which allows its functionality to be tested in terms of electronic noise, number of dead or defective pixels, and transmission speed of the digitized data. Each test is associated with a qualification grade which is a function of the performance measured against the expected specifications and automatically determined by the qualification software. At the end of these tests, the ladder is qualified according to four grades:

- "Gold" for a ladder that works perfectly and whose pixels and circuitry respond exactly as expected.
- "Silver" when the fraction of pixels which do not respond correctly is between 0.1‰ and 4‰ and the ladder can be used without any problem.
- "Bronze" when the fraction of defective pixels is between 0.4% and 1% (4‰ to 10‰) and the ladder, although functional, is used as a spare rather than for equipping the MFT detector.
- "non-compliant" when a ladder does not pass the tests because of, for example, damaged chips, a defect of the FPC or improper handling. This ladder is discarded.
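The grading scheme above can be summarised as a simple classification on the fraction of defective pixels. The Python sketch below is a simplified illustration: it assumes that "Gold" corresponds to a defective fraction below 0.1‰, that the upper limits of each band are inclusive, and that all other test outcomes are condensed into a single flag. The function name `grade_ladder` and its parameters are hypothetical and do not reflect the actual qualification software.

```python
PIXELS_PER_SENSOR = 1024 * 512  # ALPIDE pixel matrix


def grade_ladder(n_defective, n_sensors, passed_functional_tests=True):
    """Return the qualification grade of a ladder (simplified sketch).

    n_defective: number of pixels that do not respond correctly
    n_sensors:   number of ALPIDE sensors on the ladder (2 to 5)
    passed_functional_tests: False for damaged chips, FPC defects, etc.
    """
    if not passed_functional_tests:
        return "non-compliant"
    fraction = n_defective / (n_sensors * PIXELS_PER_SENSOR)
    if fraction < 0.1e-3:   # assumed Gold limit: below 0.1 per mille
        return "Gold"
    if fraction <= 4e-3:    # 0.1 to 4 per mille
        return "Silver"
    if fraction <= 1e-2:    # 0.4% to 1%
        return "Bronze"
    return "non-compliant"


for n_bad in (0, 1_000, 10_000, 20_000):
    print(n_bad, grade_ladder(n_bad, n_sensors=3))
```

For a three-sensor ladder, the four example counts fall into the Gold, Silver, Bronze, and non-compliant categories, respectively.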

To fully equip the MFT (including one additional half detector and 20% spare modules), 500 ladders were manufactured, tested, qualified and mounted on the disks in one year, with a fraction of gold- and silver-qualified ladders of around 91%.

### 3.3.3 Half-Disks

As shown in Fig. 39, each half-disk is composed of a support to which a heat exchanger is glued and two PCBs are screwed. The ladders are glued onto each face of the heat exchanger, screwed to the support, and connected to the PCBs. Four different types of half-disks were designed since the first two disks (D00 and D01) are identical.

The half-disk supports, which form the mechanical interface between the elements of the half-disk and the cone structure, were also designed to ease integration of services (cooling pipes and power cables). To minimise the material budget, the support structures are made of PEEK (PolyEtherEtherKetone plastic).

The ladders are connected to a PCB to route the readout, slow control, and clock signals from the ladders to connectors located at the periphery of the half-disk. The PCBs of half-disks D00, D01 and D02 are connected to motherboards that relocate the connection to readout cables for integration reasons. For half-disks D03 and D04, the readout cables are directly connected to the PCBs. The PCB also distributes the different voltages (analog, digital, and reverse bias) to the ladders. Two connectors are located on the left and right sides, one for zones 0 and 1, and the other for zones 2 and 3. The PCB is equipped with decoupling capacitors located close to the ladder and power connectors. In addition, a temperature sensor (PT100) allows the measurement of the local temperature and the acquisition of a reference for the temperature information given by the ALPIDE chips. The temperature signal is sent to the PSU on a dedicated line of the power cables.

The heat exchanger was designed to keep the ALPIDE sensors at a temperature below 30°C using water cooling and to have a total material budget per half-disk below 0.7% of a radiation length. It is composed of two K13D2U carbon fiber cold plates glued to each side of a 14 mm thick core made of Rohacell foam. To circulate cooling water under each sensor, 3 or 4 kapton tubes of 1 mm diameter are glued and covered by a carbon fleece. Two manifolds made of PEEK are glued on each side of each heat exchanger to distribute water through the kapton pipes.

The heat exchangers are qualified in several steps. First, the internal structure is inspected using X-ray tomography in order to check the quality of the gluing and the integrity of the cooling pipes. Before closing the second manifold, the water flow exiting each pipe is measured in order to verify that none of the pipes are pinched. Finally, cooling tests are performed by using resistive patches to simulate the heat generated by the sensors. The temperature is monitored with PT100 sensors located on the heat exchanger and a thermal camera. The goal is to check the homogeneity of the cooling.

The ladders are glued to the half-disks using the Dow Corning SE4445-CV silicone glue. The pattern of glue deposition was studied to avoid any flow outside the area of each sensor (see Fig. 40). The planarity of the heat exchanger is around 50 $\mu$m and the final thickness of the glue is around 50 $\mu$m. The ladders are positioned using a gantry and plugged into the disk PCB. Electrical and communication tests of the sensors are performed to check their proper functioning. In case of failure in the electrical and functional tests, the ladder can be replaced before the glue is fully cured: the remaining glue is removed and a new ladder is glued in its place.

### 3.3.4 Cone and Barrel

Each MFT half-cone is supported by a mechanical structure on which three motherboards for the readout are mounted (see Sec. 3.3.6). The different half-disks and the PSUs (see Sec. 3.3.5) with their support are mounted on this structure and the different services (readout cables, power cables, and cooling pipes) are routed inside this structure. The environment of the MFT, in particular the presence of the very fragile beam pipe, imposed several constraints on the design of the cone. As the MFT cone is supported from the side close to disk D04, the displacement due to gravity is the largest at disk D00. This displacement has to be kept below 100 µm to avoid interference with the beam pipe support flanges which are positioned very close to the detector. The cone support structure was produced from aluminum. In order to homogenise the temperature inside the cone, air guides produced by a 3D-printer were added, see Fig. 41. Reference targets can be fixed to the support structure for the purpose of geometry surveys.

Figure 39: Exploded view of a half-disk (D00-01).

Figure 40: Left: Half-disk during ladder gluing. Right: Glue deposition pattern.

Figure 41: Half-cone structure with air ducts in light blue.

Each half-cone is fixed onto a half-barrel which is built from composite material along which the services (power cables, water pipes, and air ducts) are run as shown in the right-hand panel of Fig. 37. On the A-side, a patch-panel (PP2) is mounted to guide these services. On the C-side, the patch-panel has the following functions:

- position and support a half-cone
- connect and guide the services from/to the half-cone (this is where the filter boards are fixed, see Sec. 3.3.5)
- position and support the FIT-C detector.

Finally, the half-barrels are equipped with wheels that run on guide rails for the insertion of the half-cones into their final position.

### 3.3.5 Services

The ALPIDE sensors require three different voltage supplies (analog, digital and reverse bias) which are locally generated via DC-DC converters in a PSU in order to minimize the material budget in the barrel by reducing the number of copper power lines and moving the fine power distribution as close as possible to the active detector area. The MFT is equipped with four PSUs, each powering one face of the five half-disks of the same half-MFT. Each PSU is composed of two boards (see Fig. 42): a main board ensures power distribution from the DC-DC converters and a mezzanine board controls the main board and is equipped with one GBT-SCA to send measurements to the DCS (voltages, currents, temperatures, humidity, and status of zones). The different functionalities of the PSU are:

- Conversion of the external power supply for the analog and digital circuitry of the sensors by FEASTMP-CLP DC-DC converters<sup>1</sup>. One DC-DC converter is used to power two zones of the same half-disk face (0-1 or 2-3) with analog voltage. For digital voltage, one DC-DC converter is also used to power two zones for half-disks 00-01-02 and one zone for half-disks 03 and 04 since they have more sensors per zone and the output current is limited to 4 A. The voltage drop from the PSU to the sensors is taken into account and the output voltages of the DC-DC converters are adjusted accordingly.
- Detection of latch-up events through the measurement of currents per zone (analog, digital, and reverse bias). In case of latch-up, all voltages of the zone are switched off and the information of the line that has generated this event is encoded and transmitted.
- Communication with the DCS, realised through GBT-SCAs, to control the DC-DC converters, reset zones, and adjust the reverse bias voltage and the thresholds on analog, digital and reverse bias currents. It is also used to send the measurements of voltages, currents, and temperatures of the half-disks, of the inlet and outlet of cooling of the PSU, of the ambient temperature (two sensors on the mezzanine board), and finally of the humidity (sensor located on the mezzanine board). The status of zones and the latch-up information are also sent to the DCS.
- Fail-safe procedure: in case of a loss of communication with the DCS for more than 6 s, all voltages are automatically switched off and the DC-DC converters are disabled.

Figure 42: MFT PSU boards with mezzanine assembled on their support.

<sup>1</sup>A DC-DC converter is an electronic circuit that converts direct current from one voltage level to another.

All the PSUs are controlled through a dedicated CRU via four intermediate boards, called PSU-Interfaces (one per PSU), equipped with GBTx ASICs.
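The latch-up protection and the fail-safe procedure listed above can be illustrated with a toy model of a single zone. The Python sketch below is purely illustrative: the class `ZoneSupervisor`, its methods, and the current limits are invented here, and the actual protection logic is implemented in the PSU hardware and firmware. Only the 6 s timeout is taken from the text.

```python
import time


class ZoneSupervisor:
    """Toy model of the protection logic for one PSU zone (illustrative only)."""

    def __init__(self, current_limits, timeout_s=6.0, clock=time.monotonic):
        self.current_limits = current_limits  # hypothetical thresholds in A, per line
        self.timeout_s = timeout_s            # DCS silence tolerated before switch-off
        self.clock = clock
        self.last_dcs_contact = clock()
        self.powered = True
        self.trip_reason = None

    def dcs_heartbeat(self):
        """Record that a message from the DCS has just been received."""
        self.last_dcs_contact = self.clock()

    def update(self, currents):
        """Apply latch-up and watchdog checks; return True while the zone stays on."""
        if not self.powered:
            return False
        for line, value in currents.items():
            if value > self.current_limits[line]:
                # Latch-up: switch off the whole zone and remember the offending line.
                self.powered = False
                self.trip_reason = f"latch-up on {line}"
                return False
        if self.clock() - self.last_dcs_contact > self.timeout_s:
            self.powered = False
            self.trip_reason = "DCS communication lost"
        return self.powered


# Demonstration with a fake clock so that no real waiting is needed.
now = [0.0]
limits = {"analog": 1.0, "digital": 2.0, "bias": 0.01}   # invented values
nominal = {"analog": 0.5, "digital": 1.2, "bias": 0.001}

zone = ZoneSupervisor(limits, clock=lambda: now[0])
now[0] = 5.0
print(zone.update(nominal))  # DCS silent for 5 s only
now[0] = 6.5
print(zone.update(nominal))  # DCS silent for more than 6 s
print(zone.trip_reason)
```

Injecting the clock as a parameter keeps the sketch testable without waiting for the timeout to elapse in real time.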

### 3.3.6 Readout

The MFT readout is based on the same general architecture used for the ITS2 detector (based on the Readout Unit front-end board, see Sec. 3.2.4), with an implementation specifically designed for the MFT geometry. In particular, the 50 $\Omega$ differential pairs connecting the sensors to the external readout are routed from the disk to the cone section, then to the patch panel, and finally along the sides of the ALICE front absorber to the readout crates. This requires a specific design of the routing elements and cables in order to provide a reliable connection quality, in particular for the high-speed data link at 1.2 Gb/s used for data transmission over a distance of several meters. The MFT readout architecture contains a larger number of connections and passive elements than that of the ITS2, due to constraints on the number of cables that can be routed through the barrel detectors. All signals are instead routed through the limited space between the absorber and the TPC, requiring a sequence of passive dispatching elements, each one inducing specific constraints in terms of impedance matching, which is a crucial parameter for the quality of high-speed signal transmission.

### 3.3.7 Detector commissioning

The two halves of the MFT detector, together with a third spare half, were assembled by the end of 2019. Over almost one year, the detector was fully qualified and commissioned in the laboratory in order to assess and optimize its operation in terms of powering, cooling, and readout. A detailed study of the noise rates was performed and is summarised in Fig. 43, which shows the fake-hit rate as a function of the number of masked pixels. A noise occupancy below $10^{-7}$ hits per pixel and per event is obtained by masking only 138 pixels out of a total of 490 million pixels. This result is well within the specifications for the detector.

**Figure 43:** Noise occupancy, measured in hit/pixel/event as a function of the number of masked pixels over the whole MFT detector.

Figure 44: Display of the reconstructed muon tracks in the MFT from a TED shot event.

The installation of the MFT into the ALICE experiment took place in December 2020 and required several months of activity to route all the services and integrate this new detector within the ALICE central systems. A detailed commissioning phase confirmed that the noise levels measured in the laboratory were unchanged after the complex installation inside the ALICE cavern. Moreover, the MFT was ready to take data in October 2021 during the commissioning of the LHC injection lines, using proton beams injected into the transfer line, which are dumped on the Target Extraction Dump (TED). Interactions of the proton beam with the TED, which is located around 30 m upstream from the ALICE cavern, produce a shower of muons traversing the ALICE detector. As shown in the event display in Fig. 44, the MFT detector was able to detect and reconstruct these muon showers, proving its readiness for data taking in Run 3.



Figure 45: Schematic view of the ALICE TPC.

# 3.4 Time Projection Chamber

This section summarizes the upgrade of the ALICE Time Projection Chamber (TPC). A more detailed description of the upgrade can be found in [15, 24].

### 3.4.1 Introduction

The TPC was successfully operated in pp, p–Pb, Pb–Pb, and Xe–Xe collisions at a variety of collision energies [2, 25] during LHC Runs 1 and 2 (2009 to 2018). Its active volume has a cylindrical shape with a length and outer diameter of about 5 m, resulting in a total active volume of 88 m<sup>3</sup> (see Fig. 45). It covers a symmetric pseudorapidity interval around midrapidity ( $|\eta| < 0.9$ ) at full azimuth. The field cage has a high-voltage electrode in its center, which divides the active volume into halves. The inner diameter of the central field cage drum is 114 cm, which provides the necessary space for the installation of the ITS. Each of the two endplates is subdivided into 18 azimuthal sectors. Each sector houses one inner (IROC) and one outer readout chamber (OROC).

During Runs 1 and 2, the readout chambers were based on multiwire proportional chamber (MWPC) technology. MWPCs have to be operated with an active ion gating grid in order to collect the ions from the amplification region. Otherwise, these ions would drift back into the drift volume, where they would lead to substantial space-charge distortions. The gating grid, however, has to be opened by a trigger and closed again after each event, and such a triggered readout is not compatible with the goals of the ALICE upgrade described in Chapter 5. Instead, the upgraded TPC must be read out continuously. This implies that the previous readout system, including the readout wire chambers and the front-end electronics, needed to be replaced. At the same time, the excellent performance achieved in Runs 1 and 2 needed to be maintained in order to achieve the ambitious ALICE physics program for Runs 3 and 4. A dE/dx resolution of 5% (for isolated tracks) translates into a requirement on the local energy resolution of better than 14% at the <sup>55</sup>Fe peak.

| Parameter                         | Value                                        |
|-----------------------------------|----------------------------------------------|
| Detector gas                      | Ne-CO<sub>2</sub>-N<sub>2</sub> (90-10-5)    |
| Gas volume                        | $88\,\mathrm{m}^3$                           |
| Drift voltage                     | -100 kV                                      |
| Drift field                       | 400 V/cm                                     |
| Maximal drift length              | 250 cm                                       |
| Electron drift velocity           | 2.58 cm/µs                                   |
| Maximum electron drift time       | 97 µs                                        |
| $\omega\tau$ ($B = 0.5\,\mathrm{T}$) | 0.32                                      |
| Electron diffusion coefficients   | $D_{\rm T} = 209\,\mu{\rm m}/\sqrt{{\rm cm}}$, $D_{\rm L} = 221\,\mu{\rm m}/\sqrt{{\rm cm}}$ |
| Ion drift velocity                | 1.168 cm/ms                                  |
| Maximum ion drift time            | 214 ms                                       |

Table 6: Parameters of the upgraded TPC. Table taken from [15].

On average, at a collision rate of 50 kHz, tracks from five collisions pile up within the TPC drift time window of about $100\,\mu\text{s}$. Without continuous readout, not all of these interactions can be read out. With continuous readout, however, novel gas amplification techniques are required in order to provide sufficient ion blocking without an active ion gate. The requirement to keep the ion-induced space-charge distortions at a tolerable level leads to an upper limit of 2% for the fractional ion backflow (defined as the ion escape probability per effective electron-ion pair produced in the gas amplification stage) at the operational gas gain of 2000.

Gas Electron Multipliers (GEMs) [26] provide a viable solution to this challenge. They can be arranged in stacks, creating layers of amplification stages, which can be tuned accordingly. With a careful optimization of the gain share among the GEMs, and by efficiently blocking the path of back-drifting ions that emerge from subsequent layers, the required low ion backflow can be achieved.

The gas mixture for the operation of the upgraded TPC is Ne-CO<sub>2</sub>-N<sub>2</sub> (90-10-5), i.e. 90 parts of Ne, 10 parts of CO<sub>2</sub>, and 5 parts of N<sub>2</sub>. No changes to the existing gas hardware were necessary, since the TPC had already been operated with this gas mixture in Runs 1 and 2. A mixture based on Ne has the advantage of a higher ion mobility compared to similar Ar-based mixtures, which directly reduces the magnitude of the space-charge distortions by nearly a factor of two [27].

Table 6 summarizes the most important TPC parameters. The drift time for ions from the readout plane to the central electrode is 214 ms at a drift field of 400 V/cm. Therefore, at 50 kHz, around  $10^4$  collisions partially contribute to the space-charge distribution.
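The pile-up figures quoted above follow directly from the drift parameters in Table 6. The short calculation below is only an illustrative cross-check under simple assumptions (constant drift velocities over the full 250 cm, a fixed 50 kHz interaction rate); the variable names are chosen here and do not come from the ALICE software.

```python
# Values quoted in the text and in Table 6.
collision_rate_hz = 50e3                     # Pb-Pb interaction rate
max_drift_length_cm = 250.0                  # maximal drift length
electron_drift_velocity_cm_per_us = 2.58
ion_drift_velocity_cm_per_ms = 1.168

# Maximum drift times for electrons (microseconds) and ions (milliseconds).
electron_drift_time_us = max_drift_length_cm / electron_drift_velocity_cm_per_us
ion_drift_time_ms = max_drift_length_cm / ion_drift_velocity_cm_per_ms

# Average number of collisions occurring within one drift time.
events_in_electron_window = collision_rate_hz * electron_drift_time_us * 1e-6
events_in_ion_window = collision_rate_hz * ion_drift_time_ms * 1e-3

print(f"electron drift time: {electron_drift_time_us:.1f} us")
print(f"ion drift time:      {ion_drift_time_ms:.1f} ms")
print(f"collisions per electron drift time: {events_in_electron_window:.1f}")
print(f"collisions per ion drift time:      {events_in_ion_window:.0f}")
```

The results reproduce the tabulated drift times (about 97 µs and 214 ms), the roughly five overlapping collisions within the electron drift window, and the order of $10^4$ collisions contributing to the space charge.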

### 3.4.2 Readout chamber design

**Figure 46:** Schematic view of a stack with four GEM foils. The baseline settings for the voltages across the four GEMs, the transfer fields between the GEMs, and the induction field between GEM 4 and the pad plane are indicated as well.

**Figure 47:** Energy resolution $\sigma(^{55}\text{Fe})$ as a function of ion backflow (IBF) in a 4-GEM stack (S-LP-LP-S) in Ne-CO<sub>2</sub>-N<sub>2</sub> (90-10-5). The gas gain is kept at 2000 in all measurements by adjusting the voltages on GEM 3 and GEM 4 at a fixed ratio of 0.8 or 0.95. Figure from [15].

The new readout chambers of the upgraded TPC are based on stacks of four GEM foils. Foils with standard (S, $140\,\mu$m) and large (LP, $280\,\mu$m) hole pitch are combined in an S-LP-LP-S configuration, which is shown in Fig. 46. Most of the ions are produced in the last amplification step, i.e. GEM 4. Their drift path is efficiently blocked by the upper GEM layers. This is achieved by carefully optimizing the GEM voltages and transfer fields, and by choosing GEM hole patterns that avoid the accidental alignment of holes in subsequent layers. As Fig. 47 shows, an extended operational region satisfying the requirements (ion backflow below 2% and $\sigma$(<sup>55</sup>Fe) below 14%) can be found with an S-LP-LP-S setup. The data exhibit a characteristic anticorrelation between ion backflow and the relative energy resolution $\sigma$(<sup>55</sup>Fe). This effect is largely due to the operational conditions at GEM 1, since ions emerging from this layer have a large probability to escape into the drift volume. In order to minimize the number of ions produced at GEM 1, the gas amplification in this layer has to be reduced. This, however, also leads to an effective loss of primary ionization, and therefore to a degradation of the energy resolution.

A readout chamber consists of a trapezoidal aluminium frame (*Al-body*), a fiberglass plate (*strongback*) and a *pad plane* made of a multilayer printed circuit board (PCB). Figure 48 shows an IROC with the individual components. While an IROC is assembled from one Al-body, one strongback, one pad plane and one GEM stack, an OROC is assembled from one Al-body, one strongback, three pad planes and three GEM stacks labeled OROC 1, OROC 2 and OROC 3. The stacks consist of four GEM foils stretched and glued onto fiberglass epoxy frames, each containing a spacer cross. The geometrical parameters of the new readout chambers are summarized in Table 7.

Figure 49 shows the most important features of the GEM design. The design of the Al-bodies includes a copper pipe for temperature control by water cooling. The electric potential is provided to the GEM stacks via feedthroughs. The strongback provides electrical insulation between pad plane and Al-body and reduces the pad capacitance to ground. The pad planes are made of a 3.2 mm thick FR4 multilayer board. The top PCB layer consists of copper readout pads, arranged in pad rows. The pad planes do not contain ground electrodes in order to minimize the capacitance to ground at the preamplifier input. The pads are connected to traces routed to vertical 40-pin female connectors on the backside of the pad plane. The routing of the traces is done using three additional PCB layers. The signal routing was designed to minimize the trace length. Four connectors in radial direction group 160 pads to connect to one front-end card (see Sec. 3.4.6).



Figure 48: Exploded view of an IROC with chamber body components and GEM frames. Figure taken from [15].

The GEM foils for the ALICE TPC upgrade were produced using the single-mask technique [28]. The top side of each foil is subdivided into high-voltage segments with an area of about $100\,\text{cm}^2$. The segmentation limits the currents in case of electrical discharges and minimizes the affected area in case a segment develops a short circuit. The gap between adjacent segments is $200\,\mu\text{m}$. An additional $100\,\mu\text{m}$ space is added between the segment boundaries and the surface containing GEM holes. Electric potentials are applied to a foil through wires soldered to HV flaps placed on the top and bottom sides of the foil. From here, the potential is distributed via a 1 mm wide copper trace running along three sides of the foil. Each foil segment is connected to the HV trace via a 5 M$\Omega$ loading resistor ($R_{\text{load}}$).

### 3.4.3 Foil production, chamber production and quality assurance

In total, 45 IROCs and 40 OROCs were assembled over several years at several production sites. More than 300 shipments of material and subcomponents between the different production sites were necessary. Standardized transport and testing procedures and well-defined assembly and quality-assurance protocols were followed in order to ensure high quality and reliability of the assembled ROCs. All assembly activities involving GEM foils were performed in clean rooms of ISO class 5 to 7, taking all precautions to avoid contamination of the GEMs.

The GEM foils were extensively tested before and after each transport. These tests consisted of an optical inspection and a measurement of the leakage current [29]. An excessive current in a segment points to a shorted GEM segment. An increased current may come from a defect or contamination and was considered a potential danger. Shorted and contaminated GEMs were sent back to the production site for cleaning. An advanced quality assurance procedure, performed once for each GEM, consisted of a long-term (at least 5 hours) leakage current measurement and an optical survey [30–32]. During the optical survey, microscope photographs were taken of the entire GEM. The pictures were stitched together and analyzed for defects and hole-size nonuniformities. In total, 829 GEM foils were produced in the course of the project. The final yield of good foils after all production and quality assurance steps was about 91%.

| Parameter                          | Value                                                                                                             |
|------------------------------------|-------------------------------------------------------------------------------------------------------------------|
| Readout chambers                   |                                                                                                                   |
| Total number                       | $2 \times 2 \times 18 = 72$                                                                                       |
| Readout technology                 | 4-GEM stack, single mask, standard (S, 140 µm) and large (LP, 280 µm) hole pitch GEMs in S-LP-LP-S configuration |
| Effective gas gain                 | 2000                                                                                                              |
| Inner (IROC)                       |                                                                                                                   |
| Total number                       | $2 \times 18 = 36$                                                                                                |
| Active range                       | $848.5 < r < 1321\,\mathrm{mm}$                                                                                   |
| Number of HV segments per GEM foil | 18                                                                                                                |
| Pad rows                           | 63                                                                                                                |
| Total pads (IROC)                  | 5280                                                                                                              |
| S:N (MIP)                          | 20:1                                                                                                              |
| Outer (OROC)                       |                                                                                                                   |
| Total number                       | $2 \times 18 = 36$                                                                                                |
| Active range                       | $1347 < r < 2464\,\mathrm{mm}$                                                                                    |
| Total pads (OROC)                  | 9280                                                                                                              |
| S:N (MIP)                          | 30:1                                                                                                              |
| Pad rows                           | 89                                                                                                                |
| OROC 1                             |                                                                                                                   |
| Active range                       | $1347 < r < 1687\,\mathrm{mm}$                                                                                    |
| Number of HV segments per GEM foil | 20                                                                                                                |
| Pad rows                           | 34                                                                                                                |
| Number of pads                     | 2880                                                                                                              |
| OROC 2                             |                                                                                                                   |
| Active range                       | $1708 < r < 2068\,\mathrm{mm}$                                                                                    |
| Number of HV segments per GEM foil | 22                                                                                                                |
| Pad rows                           | 30                                                                                                                |
| Number of pads                     | 3200                                                                                                              |
| OROC 3                             |                                                                                                                   |
| Active range                       | $2089 < r < 2464\,\mathrm{mm}$                                                                                    |
| Number of HV segments per GEM foil | 24                                                                                                                |
| Pad rows                           | 25                                                                                                                |
| Number of pads                     | 3200                                                                                                              |

Table 7: Geometrical parameters of the new readout chambers.

The framing of a GEM foil consisted of stretching the foil and positioning it above a frame covered with a thin layer of epoxy glue. The framed foils were subsequently mounted on the preassembled readout chamber bodies, and the HV wires were soldered to the top and bottom HV flaps. The stacks are not glued, such that they can be disassembled and rebuilt in case of problems. After HV connection, the resistance and capacitance across each foil were measured in order to identify possible issues at the earliest possible stage. The assembled detectors were mounted in gas-tight test and transport vessels and qualified, before being sent to CERN for acceptance tests, storage and installation in the TPC field cage.

At CERN, the accepted ROCs underwent a final stability test under irradiation. For these tests, the nominal gas mixture and the final components of the HV system were used. These tests were performed in the ALICE cavern during LHC operation or at the CERN Gamma Irradiation Facility (GIF++) [33, 34].

Those ROCs that were not accepted, or that were replaced after the initial commissioning of the TPC at the surface, have been refurbished so that they are available as good spare chambers for a possible replacement campaign in the future. The TPC could be brought to the surface again during the Long Shutdown 3 of the LHC in the years 2026 to 2028. The refurbishment of the ROCs includes in particular the assembly and installation of new GEM stacks.

### 3.4.4 Field cage

**Figure 49:** GEM design details. The detail on top shows the $200\,\mu$m separation between the copper sectors and an additional clearance of $100\,\mu$m for the GEM holes from the copper edge. Figure taken from [15].

The field cage of the TPC is described in detail in [25]. The central drift electrode is biased to a potential of about -100 kV and generates a uniform electric drift field with the help of potential strips that are suspended close to the walls of the inner and outer field cage vessels. These strips are powered through a resistor chain housed in a water-cooled rod. In addition, aluminium strips are glued to the walls of the field cage vessels at a certain distance. These *guard rings* are powered through separate resistor chains. Their purpose is to prevent local charging-up of the surfaces. The potential of the last strip of each resistor chain (both resistor rods and guard rings) is set to a value similar to that of the GEM 1 top electrodes facing the drift volume, which is around -3.26 kV for the nominal configuration. Compared to the original MWPC-based TPC, this potential is now higher and has to be set at the last strips with additional power supplies. The existing HV connections in flanges in the aluminium endplates of the field cage, and the connections to the last strips, had to be adapted for the higher potential ratings. Suitable last resistors to ground were chosen to allow for a small current to ground.

### 3.4.5 HV system

The baseline settings for the voltages across the GEMs, for the transfer fields between the GEMs, and for the induction field between GEM 4 and the pad plane were optimized with respect to operational stability under the radiation load expected in Run 3. The settings are indicated in Fig. 46. The main feature is a very low transfer field  $E_{T3}$  between GEM 3 and GEM 4 of only 100 V/cm. The other two transfer fields and the induction field  $E_{ind}$  are kept at typical values around 3500 V/cm. The highest gain is provided in GEM 3 and GEM 4, while the gain in GEM 1 is relatively low. As a consequence, most ions are created around GEM 4. Their drift into the drift region of the TPC is hindered by the large-pitch foils utilized for GEMs 2 and 3, and by the very low transfer field  $E_{T3}$ .

An equalization of the gain across all 144 GEM stacks on the TPC is achieved by adjusting the voltages in GEM 3 and GEM 4. The induction field is corrected correspondingly in order to ensure that the potential on the GEM 1 top electrode remains uniform over all stacks, such that the uniformity of the drift field in the TPC drift volume is not disturbed.

A new HV system was designed for the operation of the GEM-based ROCs. A detailed description can be found in [15]. In order to maximize operational safety, while at the same time providing the highest possible flexibility, a power supply system with cascaded channels was chosen. In this way, the potentials at the various electrodes can be easily adjusted, and a safe operation of the GEM stacks can be guaranteed. A schematic view of the high-voltage system including all loading and protection resistors is shown in Fig. 50. A shunt resistor in the voltage distribution for the GEM 4 top electrode allows the currents of all GEM stacks to be read periodically. During the operation of the TPC, current variations in the GEM stacks directly relate to variations of the local track density in the TPC drift volume. In a high-definition current meter, the currents are digitized at a rate of typically 1 kHz (8 kHz maximum) by 24-bit ADCs with a resolution of 3 nA. From these data, three-dimensional maps of the space charge from back-drifting ions can be extracted in order to calibrate drift-field distortions (see Sec. 3.4.9). Finally, the powering scheme also includes the possibility to inject a pulser signal into the HV line of each GEM 4 bottom electrode for calibration purposes.

**Figure 50:** Detailed powering scheme of a GEM stack. Each subsequent high-voltage channel is stacked on top of the lower-lying channel. The ground reference is defined by a separate line connected to the ground of the detector. The line for GEM 4 top is shunted with a resistor ($R_{shunt}$) inside the high-definition current meter. Each line is connected to the detector through a decoupling resistor ($R_{dec}$). The signal from a calibration pulser is coupled via a capacitor ($C_{pulser}$) to the line for GEM 4 bottom. Individual loading resistors ($R_{load}$) are mounted on all segments on the top sides of the GEMs. Figure taken from [15].

### 3.4.6 Front-end electronics and readout

**Figure 51:** Schematic view of the TPC readout system. Five SAMPA chips amplify, shape, and digitize the current signals picked up on the connected pads. Two GBTx ASICs multiplex the digitized data. GBTx 0 forwards the data from two and a half SAMPA chips to a VTRx. GBTx 1 forwards the data from the other two and a half SAMPAs to one VTTx module (two optical transmit channels). GBTx 0 also receives configuration data and the reference clock through the VTRx. The reference clock is distributed to the other components. A GBT-SCA chip is used for monitoring and configuration. Figure taken from [15].

A schematic view of the front-end electronics and readout system is shown in Fig. 51. A single Front-End Card (FEC) processes the signals from 160 input channels. The pulses are transformed into differential, semi-Gaussian voltage signals and then digitized in five SAMPA ASICs (see Sec. 2.4). All ADC values are transmitted off-detector through two optical links. In this way, all data are available such that flexible filter algorithms can be applied in the FPGA-based readout cards (CRU, see Sec. 2.2). One down-link is needed for control and configuration of a FEC. In total, 3276 FECs are needed to read out the TPC. This leads to 6552 data links and a total data throughput of 3.28 TB/s to the 360 CRUs for TPC readout. In the CRU FPGA, the data are corrected for the common-mode effect and ion tails are removed from the signals, before data reduction (zero suppression) is applied. Moreover, the signals are integrated for each channel over 1 ms, and these data are sent out separately as input to calculate three-dimensional space-charge maps for distortion correction. An overview of the correction of the TPC signals in the CRU and of the calibration of the TPC data is given in Sec. 3.4.9.
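The FEC count, link count and raw data rate quoted above can be cross-checked against the pad counts in Table 7 and the ADC parameters in Table 8. The sketch below is only a consistency check: it assumes that every ADC sample is shipped as 10 bits without any protocol overhead and that 1 TB means $10^{12}$ bytes, and its variable names are illustrative.

```python
# Pad counts per sector (Table 7) and front-end parameters (Table 8 and text).
pads_iroc = 5280
pads_oroc = 2880 + 3200 + 3200      # OROC 1 + OROC 2 + OROC 3
sectors = 2 * 18                    # 18 azimuthal sectors on each endplate
channels_per_fec = 160
data_links_per_fec = 2
adc_bits = 10
sampling_rate_hz = 5_000_000

channels = (pads_iroc + pads_oroc) * sectors
fecs = channels // channels_per_fec
data_links = fecs * data_links_per_fec

# Raw ADC payload only; headers and link encoding are deliberately ignored.
throughput_tb_per_s = channels * adc_bits * sampling_rate_hz / 8 / 1e12

print(channels, fecs, data_links, round(throughput_tb_per_s, 2))
```

Under these assumptions the calculation gives 524160 channels, 3276 FECs, 6552 data links and a raw payload of about 3.28 TB/s, in agreement with the numbers in the text.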

The parameters for the front-end electronics system are summarized in Table 8. With respect to the readout system utilized in Runs 1 and 2 [25], the new FEC has to meet two new requirements: continuous readout and negative input signal polarity. For the charge-sensitive amplifier (CSA), a saturation limit of 30 nA was required in order to accommodate the expected average rate of primary ionization clusters (up to about 3 nA per front-end channel) and, in addition, fluctuations due to the local track multiplicity. The conversion to digital values takes place with a gain of 20 mV/fC, at a sampling rate of 5 MSa/s, and with a precision of 10 bits.

| Parameter                  | Value                     |
|----------------------------|---------------------------|
| Readout mode               | continuous                |
| Number of channels         | 524160                    |
| Number of FECs             | 3276                      |
| Signal polarity            | negative                  |
| Average system noise (ENC) | 670 e                     |
| Conversion gain            | 20 mV/fC                  |
| Dynamic range              | 100 fC ($30 \times$ MIP)  |
| Peaking time               | 160 ns                    |
| CSA saturation limit       | 30 nA                     |
| ADC number of bits         | 10                        |
| ADC sampling rate          | 5 MSa/s                   |
| Power consumption (total)  | 56 mW per channel         |

Table 8: Parameters of the upgraded front-end electronics. Table taken from [15].

**Figure 52:** Layout of the final TPC FEC PCB Rev. 1a. The components are mounted on both sides of the board. The figure shows the top side with five SAMPAs, two GBTx, one GBT-SCA, one VTRx, one VTTx and some other components. On the bottom side a few additional small components and the connectors to the detector are placed. Figure taken from [15].

The TPC FEC is shown in Fig. 52. Its form factor is similar to that of the card used in Runs 1 and 2. The 8-layer FEC PCB utilizes rigid-flex technology, where rigid and flexible substrates are laminated together into a single structure. The radiation-hard GBT link system [35] (see Sec. 2.2) is used for data transfer and control. The clock for the digital circuitry is received from the CRU. The electrical links between GBTx and SAMPA use the SLVS standard and are operated at 160 Mb/s. The ADC sampling clock of 5 MHz is derived by division from the SLVS link clock inside the SAMPA chips. The monitoring of FEC operational parameters is based on the GBT-SCA 12-bit ADC and includes 14 measurements per FEC. FEC control is achieved via the GBT-SCA as well.
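As a rough illustration of how the link speed and the sampling rate relate, the following sketch estimates the raw output rate of one SAMPA chip and the number of 160 Mb/s electrical links this would require. It assumes that the 160 channels of a FEC are shared equally among the five SAMPAs, that only the 10-bit samples are transmitted (no headers or synchronization words), and that the link clock runs at the bit rate of 160 MHz. The actual number of electrical links used per chip is not stated in this section, so the result is a lower bound derived here, not a design parameter.

```python
# Numbers quoted in the text; the equal channel sharing is an assumption.
channels_per_fec = 160
sampas_per_fec = 5
adc_bits = 10
sampling_rate_hz = 5_000_000
elink_rate_bps = 160_000_000

channels_per_sampa = channels_per_fec // sampas_per_fec

# Raw sample payload produced by one SAMPA chip, in bits per second.
sampa_rate_bps = channels_per_sampa * adc_bits * sampling_rate_hz

# Smallest number of electrical links that can carry this payload
# (integer ceiling division).
min_elinks_per_sampa = -(-sampa_rate_bps // elink_rate_bps)

# Division factor between an assumed 160 MHz link clock and the ADC clock.
clock_division = elink_rate_bps // sampling_rate_hz

print(channels_per_sampa, sampa_rate_bps, min_elinks_per_sampa, clock_division)
```

With these inputs, each SAMPA serves 32 channels and produces 1.6 Gb/s of raw samples, which needs at least ten 160 Mb/s links, and the 5 MHz sampling clock corresponds to a division factor of 32.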

The SAMPA chips are operated in DAS mode, where the DSP is bypassed (see Sec. 2.4). In this mode, the power consumption is 9 mW per channel, which adds up to about 1.5 W for the whole FEC. Additional power is needed for the GBT components and for the voltage regulators. The total power consumption of the full FEC is about 9 W (56 mW per channel). The power is supplied using the same Low Voltage (LV) system as in Runs 1 and 2 [25]. In particular, two low-voltage channels are used for each TPC sector (91 FECs) to supply the analog (2.25 V, 85 A) and digital (3.25 V, 185 A) power. Each FEC is surrounded by water-cooled copper envelopes. The heat transfer from the hottest components (the two GBTx ASICs, the five SAMPA ASICs and the voltage regulators) to the copper plates is optimized by the addition of flexible heat-transfer pads.
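The per-channel and per-card power figures can be related with a few lines of arithmetic. The sketch below uses only the rounded values quoted in the text, so the results are approximate; the sector total is an estimate derived here and is not a number given in the source.

```python
# Rounded values from the text.
channels_per_fec = 160
sampa_power_per_channel_mw = 9      # SAMPA in DAS mode
total_power_per_channel_mw = 56     # full FEC, including GBT and regulators
fecs_per_sector = 91

sampa_power_per_fec_w = channels_per_fec * sampa_power_per_channel_mw / 1000
total_power_per_fec_w = channels_per_fec * total_power_per_channel_mw / 1000

# Estimate only: FEC power summed over one sector, ignoring cable losses.
fec_power_per_sector_w = total_power_per_fec_w * fecs_per_sector

print(sampa_power_per_fec_w, total_power_per_fec_w, round(fec_power_per_sector_w))
```

The SAMPA share comes out at about 1.4 W per FEC and the total at about 9 W per FEC, consistent with the rounded values above.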

### 3.4.7 Installation

The upgrade of the TPC was carried out in a 15-month period during the Long Shutdown 2 of the LHC in a dedicated clean room on the surface at the site of ALICE at LHC Point 2. The procedure included the deinstallation of the old readout chambers and the installation of the new GEM detectors, the modification of the field cage, the installation of the new front-end electronics and services, and a series of basic functionality and performance tests.

After the end of Run 2, the TPC was disconnected in December 2018 and January 2019, and then moved from the experimental cavern to the surface in February and March 2019. It was installed in the clean room after removal of the old services (cables, pipes and hoses), front-end electronics and Service Support Wheels (SSW) [25], and after extensive cleaning. Initially, all MWPC ROCs were uninstalled on one side of the field cage, and the endplate was closed with aluminium panels. In a second step, the new GEM ROCs were installed. When all GEM chambers had been installed on the first side, the TPC was lifted and rotated by 180 degrees for chamber installation on the second side.

After completion of the ROC installation, the modified SSWs were installed on the two sides of the TPC. Each SSW supports the front-end electronics and related services (LV and fibers), the HV infrastructure (protection resistor boxes), the manifolds for the various circuits for cooling water, and the drift gas manifolds.

In a next step, the front-end electronics (3276 FECs) were installed and the corresponding services (power cables, fibers and cooling tubes) were connected. A first commissioning phase was then carried out for pairs of sectors in order to verify all chambers and the electronics. Various modes of data taking allowed for testing and improving the readout and reconstruction workflow. The acquired data were also used for calibration purposes and validation of the detector simulation. The data sample included pedestal, pulser and laser runs, as well as samples containing cosmic tracks and charge clusters from irradiation with an X-ray source. In August 2020, the TPC was transported back to the experimental cavern for installation into the ALICE magnet.

After installation in the central barrel of the ALICE experiment, the TPC was connected to its service infrastructure in the winter of 2020/21. The necessary connections include the hoses for the water cooling of the electronics and of the auxiliary systems, the LV cables, the HV cables, the fiber patch cords and a few additional cables (pulser and laser control).

### 3.4.8 Performance

After connection and verification of the services, the commissioning of the data processing chain, of the readout and reconstruction workflows and of the calibration procedures started. A first highlight was the low-luminosity pilot beams provided by the LHC in October 2021. Figure 53 shows an online plot from data recorded during this period. No track selection criteria or calibration were applied to these data. The data come from the online quality control system and demonstrate the particle identification performance for tracks reconstructed online in the graphics processing units installed in the $O^2$ EPN servers (see Sec. 5.3).

From summer 2022 onwards, the TPC routinely recorded data at higher luminosities. The data are affected by a baseline shift due to the common-mode effect. This effect is well known, and occurs in the ROCs due to capacitive coupling of the GEM 4 bottom electrode to the readout pads. In addition, during the analysis of the data collected during the commissioning phase, a characteristic tail was identified, which appears in particular for signals with large amplitudes. Simulations showed that part of the tail is induced by ions that are created just below the holes of GEM 4. Due to the local electric field configuration, these ions move fast enough to induce a signal on the pad plane. In addition, due to the rather high induction field applied for the TPC GEMs, amplification occurs in the full induction gap. The gain in the induction gap is very small, but the ions produced there nevertheless contribute to the ion tail. A high-precision measurement of the ion tail (using an overlay of many signals from laser tracks in the TPC) is shown in Fig. 54. Figure 55 visualizes the common-mode effect together with the effect of the ion tail on data collected by a single readout channel at high occupancy.



**Figure 53:** TPC dE/dx as a function of momentum *p*. Online plot from quality control during the pilot run with 900 GeV collisions in October 2021. No track selection criteria were applied. The magnetic field was 0.5 T.

### 3.4.9 Calibration

In order to operate the detector, some basic calibration steps are necessary. The baseline (pedestal value) and noise of each electronics channel are extracted from the data collected in special pedestal calibration runs, where all filtering on the CRU is inactive. The mean of the baseline distribution defines the pedestal value, while the sigma corresponds to the noise of a given channel. The baseline values need to be uploaded to the logic in the CRU FPGAs in order to subtract them from the input signals. The thresholds for the zero suppression filter algorithm in the CRU are derived from the noise values (typically  $3\sigma$ ).
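The pedestal, noise and threshold definitions above can be sketched for a single channel as follows. This is a minimal illustration, not the CRU firmware: the function names are hypothetical, the sample values are invented for the example, and the sample-by-sample threshold comparison is an assumption, since the actual zero-suppression logic in the FPGA is not described here.

```python
from statistics import fmean, pstdev


def pedestal_and_noise(adc_samples):
    """Return (mean, sigma) of the baseline samples of one channel."""
    pedestal = fmean(adc_samples)
    noise = pstdev(adc_samples, mu=pedestal)
    return pedestal, noise


def zero_suppress(adc_samples, pedestal, noise, n_sigma=3.0):
    """Keep (time bin, pedestal-subtracted value) pairs above n_sigma * noise."""
    threshold = n_sigma * noise
    return [
        (time_bin, sample - pedestal)
        for time_bin, sample in enumerate(adc_samples)
        if sample - pedestal > threshold
    ]


# Invented ADC values standing in for a pedestal run and a physics run.
pedestal_run = [70, 71, 69, 70, 72, 68, 70, 71, 69, 70]
physics_run = [70, 71, 90, 120, 85, 72, 69]

pedestal, noise = pedestal_and_noise(pedestal_run)
kept = zero_suppress(physics_run, pedestal, noise)
print(pedestal, noise, kept)
```

In this toy example only the three time bins that carry the pulse survive the $3\sigma$ cut, while baseline fluctuations are discarded.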

The correction algorithm for the common-mode effect requires the upload of configuration parameters (one per channel) into the logic in the CRU FPGAs as well. The parameter is extracted from the data collected in special calibration pulser runs, where a pulser signal is injected into the HV line of each GEM 4 bottom electrode. It describes the local geometry (distance between pad plane surface and GEM 4 bottom electrode), which is influenced by sagging of the GEM foils.

The correction algorithm for the ion tail filter requires two configuration parameters for each channel, describing the shape of the tail and its amplitude relative to the pulse amplitude. These parameters are extracted for each pad from physics data acquired with beam.

On top of the basic calibrations, more complex calibration of the TPC data is needed. The drift velocity can be extracted from laser tracks that can be generated inside the drift volume of the TPC in special laser calibration runs or during physics runs. The gain can be extracted for each pad in special calibration runs where the radioactive decay of the <sup>83</sup>Kr isotope is measured, or from analysing the tracks generated by particles in the physics runs. Different inputs may be used for the correction of the space-charge distortions.



Figure 54: Measured shape of the ion tail in data recorded with laser tracks.

- An interpolation method using external track references in ITS, TRD and TOF may be used. This method was extensively tested during Run 2, where some distortions due to imperfections were already present in certain regions.
- A reference average distortion map may be scaled with the actual luminosity of the interactions at the given moment for each time frame.
- The signals collected continuously in the TPC front-end electronics are integrated (1 ms integration time) for each channel inside the CRU and may be used for building a three-dimensional map of the space-charge distribution.
- Finally, the information from the high-definition current meters that sample the analog currents at all GEM 4 top electrodes for all GEM stacks may be used for building similar space-charge maps with lower granularity.

The latter two methods require as input the ion drift velocity at the given electric field strength.
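The luminosity-scaling option listed above can be sketched as follows. This is only an illustration under the assumption that the distortions scale linearly with the instantaneous luminosity; the voxel keys and the function name are hypothetical and do not reflect the actual map format.

```python
def scale_distortion_map(reference_map, reference_luminosity, current_luminosity):
    """Scale every entry of a reference distortion map linearly with luminosity.

    reference_map: dict mapping a voxel index (hypothetical layout) to a distortion value.
    Assumption: distortion is proportional to luminosity.
    """
    if reference_luminosity <= 0:
        raise ValueError("reference luminosity must be positive")
    factor = current_luminosity / reference_luminosity
    return {voxel: factor * shift for voxel, shift in reference_map.items()}
```

Such a scaled map would be recomputed for each time frame from the luminosity measured at that moment.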

In addition, static distortions play an important role. They are related to a small misalignment between readout chambers and central electrode, and to misalignment of the magnetic field and the drift field inside the field cage due to imperfections. The static distortions are constant in time for a given detector configuration.

A two-stage process (see Chapter 5) has been implemented for the processing and calibration of physics data. The first stage is performed synchronously with the collection of the data, and focuses on cluster finding and the association of clusters to tracks. For this purpose, the mean space-charge distortions scaled with the current luminosity are used. The reconstructed tracks have sufficient precision to allow matching to the external detectors (ITS, TRD and TOF). The compressed data are written to permanent storage. The second reconstruction stage is performed on the compressed data in asynchronous mode. It aims at a further improvement of the data quality, in particular in terms of the space-charge distortion calibrations. It may employ a combination of the described methods, as well as other more refined calibration input.

**Figure 55:** Visualization of the ion tail and of the common-mode effect for one readout channel zoomed around the baseline region. The data are from a simulation with 30% pad occupancy and without noise from the electronics. The common-mode effect shifts the baseline to below zero. The ion tail is visible for signals with large amplitude. Both effects are corrected in the CRU FPGAs.

# 3.5 Fast Interaction Trigger

The Fast Interaction Trigger (FIT) [36] serves as an interaction trigger, online and offline luminometer, initial indicator of the vertex position, and forward multiplicity counter. Offline analysis of FIT data provides the precise collision time for TOF-based particle identification, yields the collision centrality and the event plane orientation, and provides the main input for the measurement of cross sections of diffractive processes. The FIT consists of five distinct detector stations, positioned at different locations along the beam line. Three different detector technologies, as detailed below, are used. An illustration of FIT is shown in Fig. 56; the distance from the interaction point (IP) and pseudorapidity coverage of the different arrays are displayed in the inset table. The naming convention relates to the similar ALICE detectors used during Run 2. FT0 is the successor of T0 [37], which owes its name to the fact that it was used to provide a start time. FV0 is the successor of V0 [38], which provided the vertex location. Finally, FDD (Forward Diffractive Detector) is the successor of the ALICE Diffractive detector (AD) [39], which detected charged particles at large pseudorapidity for the selection of diffractive and ultra-peripheral events.

**Figure 56:** View of the FIT detectors illustrating the relative sizes of each component. From left to right FDD-A, FT0-A, FV0, FT0-C, and FDD-C are shown. Note that FT0-A and FV0 have a common mechanical support. FT0-A is the small quadrangular structure in the centre of the large, circular FV0 support. Note that all detectors are planar with the exception of FT0-C, which has a concave shape centered on the IP. The inset table lists the distance from the interaction point and the pseudorapidity coverage for each component.

A new, fast electronics and readout system [40] that can handle the larger interaction rates in Runs 3 and 4 has been designed and implemented for all FIT subdetectors [41].

# 3.5.1 FT0

The FT0 consists of two arrays of quartz Cherenkov radiators, FT0-A and FT0-C, which are optically coupled to MicroChannel Plate-based photomultipliers (MCP). The FT0-A is located at 3.3 m from the IP and comprises 24 MCPs and 96 quartz radiators. Due to the close proximity to the IP, the FT0-C support has a concave shape (as seen from the IP), positioning all 28 MCPs such that each of the 112 quartz radiators is at a distance of 84 cm from the nominal IP. The Photonis XP85002/FIT-Q MCPs are factory-customized versions of the Planacon XP85012. The customization is a new back-plane design for FT0 which groups the usual 64 anodes into four outputs, one for each of the four optically isolated quartz radiators, each with a thickness of 2 cm and an area of $2.65 \times 2.65 \text{ cm}^2$. This segmentation provides the granularity for measurements of multiplicity in central Pb–Pb collisions, while minimizing the dead areas due to MCP edges and the optical isolation of the radiators. In order to obtain the best possible timing resolution, the signal path from each MCP anode to the front-end electronics has the same length. The intrinsic time resolution of each quadrant is $\sigma_t \approx 13 \text{ ps}$ [42]. Accounting for signal deterioration along the 30 m long signal cables and processing by the front-end electronics, the achieved time resolution of FT0 is about 25 ps for a single minimum-ionising particle. Simulation studies of FT0 with the PYTHIA event generator [43] and the GEANT detector response simulation [44] indicate that the efficiency of the minimum bias trigger for pp collisions is $\geq 98\%$ for the OR of the two sides and $\geq 77\%$ for coincidences between FT0-A and FT0-C.

# 3.5.2 FV0

FV0 is a large, segmented scintillator disk with a novel light collection scheme [45], assuring short pulses, to achieve a single MIP time resolution of about 200 ps, and a very uniform response across the entire detection surface. The active element of FV0 is a 4 cm thick EJ-204 plastic scintillator divided into five concentric rings of equal pseudorapidity coverage. The outer diameter of the largest ring is 144 cm and the inner diameter of the smallest is 8 cm. The four inner rings are subdivided into eight sectors of 45 degrees each, while the outermost ring, due to its large area, has 16 sectors. A grid of equal-length, clear Asahi fibers is attached to the back side (as viewed from the IP) of the scintillator, as can be seen in Fig. 57. At the other end, the fibers from each sector are bundled and optically coupled to Hamamatsu R5924-70 PMTs. This way, the 48 sectors of FV0 are mapped to 48 independent readout channels. This segmentation, combined with the information from the other forward detectors, is sufficient to yield the required centrality and event plane resolution. Together with FT0, FV0 provides the needed input to generate minimum bias and multiplicity triggers at the 'minus one' trigger level (LM). With a total latency below 425 ns, this is the fastest trigger in ALICE. In addition, the FV0 monitors the LHC background conditions and luminosity.
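The FV0 segmentation can be checked with a short sketch. The sector counts and the inner and outer radii (4 cm and 72 cm) follow from the text; the distance of the detector plane from the IP, `Z_ASSUMED_CM`, is an assumption made here for illustration only, since the value is given in the inset table of Fig. 56 and not in the text. The ring boundaries are computed for an ideal planar disk and need not coincide with the as-built dimensions.

```python
import math


def eta_from_radius(r_cm, z_cm):
    """Pseudorapidity of a point at transverse radius r on a plane at distance z."""
    theta = math.atan2(r_cm, z_cm)
    return -math.log(math.tan(theta / 2.0))


def radius_from_eta(eta, z_cm):
    """Inverse of eta_from_radius for a plane at distance z."""
    theta = 2.0 * math.atan(math.exp(-eta))
    return z_cm * math.tan(theta)


def ring_boundaries(r_inner_cm, r_outer_cm, z_cm, n_rings=5):
    """Radii that split [r_inner, r_outer] into rings of equal pseudorapidity width."""
    eta_max = eta_from_radius(r_inner_cm, z_cm)
    eta_min = eta_from_radius(r_outer_cm, z_cm)
    step = (eta_max - eta_min) / n_rings
    return [radius_from_eta(eta_max - i * step, z_cm) for i in range(n_rings + 1)]


# Four inner rings with eight sectors each, outermost ring with 16 sectors.
SECTORS_PER_RING = [8, 8, 8, 8, 16]
N_CHANNELS = sum(SECTORS_PER_RING)

Z_ASSUMED_CM = 316.0  # assumption, not taken from the text
boundaries = ring_boundaries(4.0, 72.0, Z_ASSUMED_CM)
```

The channel count obtained from the sector list reproduces the 48 readout channels quoted above.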



**Figure 57:** Photograph of one half of the FV0. The optical fibers connect the scintillators to the PMTs on the rim of the support structure, the black structure seen here. The center wall has been removed to show the scintillator, the surface matrix structure, and the optical fibers.

# 3.5.3 FDD

The FDD [46] comprises two nearly identical arrays, FDD-A and FDD-C, surrounding the beam pipe on opposite sides of the IP. Each array consists of eight rectangular scintillator pads with a size of  $21.6 \times 18.1 \times 2.5$  cm<sup>3</sup>. These eight pads are assembled in two overlapping layers of four sectors each. To make clearance for the beam pipe, a quadrant was removed from the innermost corner of each scintillator plate. The radius of the removed quadrant is 6.2 cm on the FDD-A scintillators and 3.7 cm on the FDD-C, as illustrated in Fig. 56. Each pad has two wavelength shifting (WLS) bars attached to the opposite sides of the scintillator. Clear optical fibers carry the light from the WLS to H8409-70 PMTs. There are eight independent FDD channels on each side of the IP.

The FDD covers a large pseudorapidity interval (see table in Fig. 56) and is sensitive to the presence of even a single MIP. As such, it is an ideal system to tag interactions characterised by large rapidity gaps, such as those from photon-induced ultra-peripheral collisions or diffractive processes. In the pp programme, the main physics goals to which the FDD contributes are the studies of centrally produced exclusive states, and measurements of cross sections for single and double diffraction and for inelastic processes. Regarding the physics objectives in Pb–Pb and p–Pb collisions, the FDD provides an independent measurement of centrality based on the charged-particle multiplicity in an intermediate pseudorapidity range between the ITS and the ZDC and contributes to the selection of ultra-peripheral collisions as well as their classification into exclusive or dissociated channels.



### 3.5.4 Electronics and readout scheme

**Figure 58:** Schematic diagram of the FIT readout electronics. The MicroChannel Plates (MCP) are described in the FT0 section.

All three subsystems of FIT use the same front-end and readout electronics based on just two custom-designed modules: a Processing Module (PM) and a Trigger and Clock Module (TCM). One PM provides twelve independent inputs. Each subdetector has only one TCM while the number of PMs is determined by the number of channels. Each PM is connected to the dedicated TCM via an HDMI cable to transmit "pre-trigger" data, slow-control data, and the LHC clock. The commands, configuration data, and status data are sent from the detector control system to the TCMs via a 1 Gb Ethernet optical link using IPbus (a UDP-based protocol) [40]. A schematic diagram of the FIT electronics, using FT0 as an example, is shown in Fig. 58. The triggers and the measured event rates for the luminosity measurements are transmitted from the TCMs via the same connection. The PMs are configured from TCMs via an HDMI SPI connection. The PMs and TCMs are connected to the ALICE DAQ with GBT links. The FIT delivers the produced trigger signals to the central trigger system. Custom-made laser calibration systems provide pulses used for time and amplitude calibration, as well as monitoring of ageing and radiation damage of the FIT detector components.
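Since one PM provides twelve inputs, a lower bound on the number of PMs per array follows directly from the channel counts quoted in the detector sections. The sketch below is a rough estimate only: it assumes one PM input per readout channel and dense packing, whereas the installed configuration may reserve inputs for other purposes. The function name is hypothetical.

```python
import math

INPUTS_PER_PM = 12  # one PM provides twelve independent inputs


def min_processing_modules(n_channels):
    """Smallest number of PMs that can host n_channels, assuming one input per channel."""
    return math.ceil(n_channels / INPUTS_PER_PM)


# Channel counts as quoted in the FT0, FV0 and FDD sections.
CHANNELS = {"FT0-A": 96, "FT0-C": 112, "FV0": 48, "FDD-A": 8, "FDD-C": 8}
MIN_PMS = {name: min_processing_modules(n) for name, n in CHANNELS.items()}
```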

The FIT detectors were installed in ALICE in 2021. Initial commissioning of the detectors was performed in October 2021 with low-intensity proton beams in the LHC at the injection energy of 450 GeV per beam, corresponding to a collision energy of 900 GeV. These pp collisions were used both to check the integration of the FIT detectors in the ALICE data processing chain and to get a first, preliminary look at FIT performance. We note that such low multiplicity collisions give only a lower bound on the performance of the FIT detectors, particularly on the time resolution of FT0. Using this data set, the time resolution of FT0 was found to be 26 ps. Further checks on this first data set are being performed, and full integration with the online systems and software framework is being completed at the time of writing.

# 3.6 Muon System

The forward muon arm was described in [1]. It consists of a composite absorber of about 10 interaction lengths, made from layers of both high- and low-Z materials and located at a distance of 90 cm from the nominal interaction point, a large dipole magnet with a 3 Tm integrated field placed outside the L3 barrel magnet, and ten planes of very thin, high-granularity cathode pad tracking chambers. The muon arm is completed by a second muon filter (seven interaction lengths of iron) located after the last tracking station and upstream of four planes of resistive plate chambers, which are used for muon identification. The spectrometer is shielded by a dense conical absorber tube, of about 60 cm outer diameter, which protects the chambers from secondary particles created in the beam pipe.

The increased luminosity of the LHC at the ALICE interaction point after LS2 required an upgrade of the front-end and readout electronics on both the muon tracking and muon identifier subsystems.

### 3.6.1 Muon Tracking

The Muon Tracking Chambers (MCH) [1] consist of 156 multiwire proportional chambers with cathode pad readout (cathode pad chambers) with more than one million electronic channels. The system has five tracking stations, each of which is composed of two chambers. Because of the different sizes of the stations, ranging from a few square meters for station 1 to more than  $30 \text{ m}^2$  for station 5, two different designs were adopted. The first two stations are based on a quadrant structure, with the readout electronics distributed on their surface (left panel of Fig. 59). Four independent quadrants form one chamber. For the larger stations (3 to 5), a slat architecture was chosen (right panel of Fig. 59). The largest slat size is  $40 \times 240 \text{ cm}^2$  and the electronics are mounted on the top and bottom parts of each slat. Slats are mounted on a support to form one half-chamber. One half-chamber consists of 9 slats for station 3, and 13 slats for stations 4 and 5. The tracking system covers a total area of about  $100 \text{ m}^2$ .
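The figure of 156 chambers can be cross-checked against the station layout. The sketch below assumes that each slat chamber consists of two half-chambers (as the name suggests) and that the total counts individual quadrants and slats; both are interpretations of the text rather than statements taken from it.

```python
CHAMBERS_PER_STATION = 2
QUADRANTS_PER_CHAMBER = 4          # stations 1 and 2
HALF_CHAMBERS_PER_CHAMBER = 2      # assumption: two half-chambers form one chamber
SLATS_PER_HALF_CHAMBER = {3: 9, 4: 13, 5: 13}


def detection_elements(station):
    """Number of quadrants (stations 1-2) or slats (stations 3-5) in one station."""
    if station in (1, 2):
        return CHAMBERS_PER_STATION * QUADRANTS_PER_CHAMBER
    return (CHAMBERS_PER_STATION * HALF_CHAMBERS_PER_CHAMBER
            * SLATS_PER_HALF_CHAMBER[station])


TOTAL_ELEMENTS = sum(detection_elements(s) for s in range(1, 6))
```

With these assumptions, the 16 quadrants and 140 slats add up to the 156 detection elements quoted above.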



**Figure 59:** Left: Station 2 of the tracking system; the readout electronics are distributed on the surface of a quadrant. Right: Stations 4 and 5 of the tracking system; the readout electronics are mounted along the top and bottom edges of the slats.

The detector chambers are unchanged from Runs 1 and 2, while the front-end and the readout electronics were upgraded to accommodate the larger interaction rates for Runs 3 and 4.

The electronics can run either in the default dead-time-free continuous mode or in triggered mode. The readout data flow is schematically shown in Fig. 60. The Front-End Cards (FEC) continuously send data at 80 Mbit/s through an electrical link to the SOLAR (Sampa to Optical Link for Alice Readout) readout boards which connect to the CRU (see Sec. 2.2) through GBT optical links at 3.2 Gbit/s.

#### The DualSAMPA front-end electronic cards

The front-end electronic cards, called DualSAMPA, host two chained SAMPA chips (see Section 2.4) of 32 channels each and three low voltage regulators. Since the detectors are the same ones used in Runs 1 and 2, the dimensions and the layout of the connectors for the DualSAMPA cards on the electronic PCBs remain the same as for the previous FEC. Moreover, two types of cards were produced, each with the same functionalities but with different dimensions to suit the quadrant and slat detector layouts. The DualSAMPA board is shown in Fig. 61.

Figure 60: The Muon Tracking readout scheme

**Figure 61:** The two geometries of the DualSAMPA boards (DS12 on the left and DS345 on the right), with the white connector plug socket on PCB and on the other side the black connector connecting to SOLAR boards.

Out of the 19300 DualSAMPA produced (11000 DS345 for slats of stations 3, 4, and 5, and 8300 DS12 for quadrants of stations 1 and 2), 16900 are installed in the cavern (9700 DS345, 7200 DS12).
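The production numbers can be related to the channel count of the system with simple arithmetic. The sketch below assumes that every input of every installed card counts as an electronic channel, so the result is an upper bound on the number of connected pads.

```python
CHANNELS_PER_SAMPA = 32
SAMPAS_PER_CARD = 2
CHANNELS_PER_CARD = CHANNELS_PER_SAMPA * SAMPAS_PER_CARD  # 64

PRODUCED = {"DS345": 11000, "DS12": 8300}
INSTALLED = {"DS345": 9700, "DS12": 7200}

# Cards that were produced but not installed in the cavern.
NOT_INSTALLED = {kind: PRODUCED[kind] - INSTALLED[kind] for kind in PRODUCED}

# Upper bound on the number of electronic channels (assumes all inputs are used).
INSTALLED_CHANNELS = sum(INSTALLED.values()) * CHANNELS_PER_CARD
```

The result, about 1.08 million channels, is consistent with the "more than one million electronic channels" quoted for the MCH.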



**Figure 62:** Left: Flex mounted on a slat connecting five DualSAMPA cards linking through a green flat ribbon cable to the readout board. Right: Large electronic PCBs covering the surface of a quadrant.





#### The readout electronic FLEX links and large electronic PCBs

The link between the DualSAMPA and the readout cards consists of a flexible circuit (FLEX) and a flat ribbon cable for the slats, while a large electronic PCB and a flat ribbon cable are used for the quadrants (see left and right panels of Fig. 62).

Each DualSAMPA has dedicated data and clock lines while the trigger lines are daisy chained to feed up to 5 DualSAMPA (see Fig. 63) using an I2C line. An active buffer was added to the I2C line to ensure good signal integrity.

More than 3000 FLEX PCBs of 24 different types were produced depending on the number of DualSAMPA to address, the geometry, and the pad density; 2760 of these were installed, the remainder serving as spares.

#### The SOLAR readout cards

Each FLEX/ribbon cable is plugged into one of the eight ports of a SOLAR readout board, allowing the board to read out up to 40 DualSAMPA boards (see Fig. 64). The GBTx chip of the SOLAR board acts as a serializer to send the signals from the different DualSAMPA to the CRU through GBT optical links. The SOLAR board also hosts a GBT-SCA chip handling the eight I2C command and control lines, one optical transmitter/receiver VTRx and two DC/DC FEAST converters.

A total of 624 SOLAR boards are hosted in 112 custom SOLAR crates, with up to six boards each.

#### The data flow from SAMPA to the CRU User Logic

In the SAMPA chip, the signal of each electronic channel is amplified with a gain of 4 mV/fC, shaped with a shaping time of 300 ns, then sampled and digitized at 10 MHz by a 10-bit ADC, and finally processed digitally with baseline correction and zero suppression before being formatted. The SAMPA format consists of data samples from a signal waveform with its time stamp and size, together with a header containing mainly the bunch crossing number, the SAMPA address and the channel address of the SAMPA chip.

Figure 64: The SOLAR board: functional diagram (left panel) and photo of the board itself (right).

Figure 65: Data flow scheme.

The signals of the 64 channels of the two chained SAMPA chips of a FEC are serialized at 80 Mbit/s (2 bits at 40 MHz). The first port of the SOLAR board handles the first 2 bits of the first DualSAMPA while the second port takes care of the 2 bits of the second DualSAMPA and so on, combining all 40 ports, which results in a 3.2 Gbit/s data optical transmission to one input of a CRU. The electrical and optical links are always transmitting data, independent of the type of information (physics data, synchronisation, etc.).
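The quoted link speeds follow from the numbers in this paragraph, as the short calculation below shows. The variable names are chosen here for illustration.

```python
BITS_PER_CLOCK = 2          # each DualSAMPA sends 2 bits per clock cycle
CLOCK_HZ = 40_000_000       # 40 MHz clock
LINKS_PER_GBT = 40          # DualSAMPA data streams combined on one optical link

# Electrical link rate of one DualSAMPA: 80 Mbit/s.
elink_rate_bps = BITS_PER_CLOCK * CLOCK_HZ

# Combined rate on one GBT optical link: 3.2 Gbit/s.
gbt_rate_bps = elink_rate_bps * LINKS_PER_GBT

# Number of data bits carried per 25 ns clock cycle on one GBT link.
bits_per_gbt_frame = BITS_PER_CLOCK * LINKS_PER_GBT
```

The 80 bits per clock cycle obtained here correspond to the 80 bits that the CRU user logic deserializes for each GBT link.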

The MCH CRU user logic receives data from the 24 GBT links (see Sec. 2.2), each one handling 40 DualSAMPA channels. For each GBT link, the user logic deserializes the 80 bits, forms the SAMPA words, removes the SYNCH words and inserts error checks and configuration conditions. These 64-bit SAMPA words contain the payload, the GBT link identifier, the DualSAMPA channel identifier, and error bits. The user logic then embeds the TTC signals into this stream, constructs the RDH (Readout Data Header) and transmits words of 256 bits to the First Level Processor (FLP, see Sec. 5.2), as shown in Figs. 65 and 66.
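The deserialization step can be pictured with a minimal sketch that splits each 80-bit frame into 40 two-bit values and appends them to per-DualSAMPA streams. The bit ordering (stream 0 in the least significant bits) and the function names are assumptions made for illustration; the actual firmware mapping is not specified here.

```python
def split_gbt_frame(frame, n_links=40, bits_per_link=2):
    """Split one 80-bit frame (given as an int) into n_links values of bits_per_link bits.

    Assumption: stream i occupies bits [2*i, 2*i + 1] of the frame.
    """
    mask = (1 << bits_per_link) - 1
    return [(frame >> (i * bits_per_link)) & mask for i in range(n_links)]


def deserialize(frames, n_links=40, bits_per_link=2):
    """Accumulate the 2-bit values of successive frames into one stream per DualSAMPA."""
    streams = [[] for _ in range(n_links)]
    for frame in frames:
        for link, value in enumerate(split_gbt_frame(frame, n_links, bits_per_link)):
            streams[link].append(value)
    return streams
```

The SAMPA words would then be formed from each per-DualSAMPA stream independently.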

No specific processing is performed in the FLPs. The cluster finding, the cluster fitting and the track finding are done on the EPNs (see Section 5.3).

Quality Control (QC) processes have two steps: the QC error check task is performed on the entire raw data at the FLP level, while the detector occupancy and pseudo-efficiency are monitored from decoded data samples in QC tasks on EPNs. While the detector occupancy QC uses digits (signal from each front-end electronic channel), the pseudo-efficiency task uses the pre-clusters (groups of pad hits that are close in time and space). The charge of the pre-clusters is also verified. These tasks will allow the monitoring of the detector performance.



Figure 66: CRU scheme.

### 3.6.2 Muon Identifier

The Muon Identifier (MID) is the present designation of the Muon Trigger system [1], which was operational in ALICE during LHC Runs 1 and 2.

The detector is composed of 72 single gap Resistive Plate Chamber (RPC) detectors, organised in two stations of two planes each, located at a distance of 16 m and 17 m from the interaction point, respectively. In both stations the two planes are 17 cm apart. The total detection area is about  $150 \text{ m}^2$ . An overview picture of one half-plane of the MID, in open (maintenance) position, taken during the FEERIC card installation (see next section) in 2019, is shown in Fig. 67.

The RPC signals are collected by means of a total of 20992 readout strips, each of them connected to Front-End Electronics (FEE). The output signals from the FEE, in LVDS standard with a width of 25 ns, are propagated via multiwire copper cables to the local cards, which are part of the readout electronics.

The FEE cards, which are located on the RPC detectors, were replaced during LS2. The main motivation is to reduce the ageing of the RPCs and improve the rate capability during the upcoming data taking periods. The ASIC of the past FEE, called ADULT [47], was upgraded to a new type, called FEERIC [48, 49]. Unlike ADULT, FEERIC performs amplification of the RPC analog signals. Thanks to this upgrade, the ALICE RPCs can be operated at lower gain, with a reduction by a factor 3–5 of the charge produced in the gas gap, hence limiting ageing effects.

The readout electronics, composed of 234 local cards and 16 regional cards, was also completely replaced to sustain the larger data flow associated with the higher collision rate in Runs 3 and 4.

Although all RPC detectors were still operational at the end of Run 2, a few of them were drawing a relatively large current after having accumulated a charge of up to $20 \text{ mC/cm}^2$ in the gas. It was therefore decided to replace those RPCs with completely new ones. For the longer term, a crucial R&D programme on new environment-friendly gas mixtures [50] for RPCs, based on tetrafluoropropene, which is characterised by a very low Global Warming Potential (GWP), has been launched.

Figure 67: Overview of one MID half-plane in open position.

### **FEERIC electronics**

The FEERIC 8-channel ASIC is designed in the AMS 0.35 µm CMOS technology. Its main components are (see Fig. 68, top scheme) a transimpedance amplifier, a zero-crossing discriminator, and a one-shot circuit which inhibits retriggering during 100 ns. The operating threshold is typically 70 mV corresponding to a charge of approximately 130 fC at the readout strip level. Details of the performance of the FEERIC electronics are given in [51]. Figure 68, bottom panel, shows a picture of a FEERIC card. A total of 2720 cards (spare included) were produced in the second half of 2017. The installation of the FEERIC cards on the RPCs in the ALICE cavern was completed in July 2019.

During Run 2, one of the 72 ALICE RPCs was equipped with FEERIC electronics and showed satisfactory performance and stability [49]. The charge released in the gas gap, around 30 pC per charged particle crossing the RPC at nominal high voltage, was typically four times smaller as compared to the one with ADULT.

A new wireless threshold distribution system for the FEERIC cards was developed. Two masters (one on each side of the cavern) and 24 nodes close to the RPCs (see Fig. 69, right panel) were installed in the ALICE cavern in 2019 to remotely control the threshold of each of the 2384 installed FEERIC cards. The masters are controlled via Ethernet and communicate via the high-level ZigBee wireless protocol with the nodes, which are I2C chained to the FEERIC cards on the RPCs (see Fig. 69, left panel). A further upgrade of this system to a more powerful WiFi Ethernet-based system is planned for the winter shutdown of 2022. Both master and node share the same hardware and firmware. A card acts either as master or node, depending on the configuration stored in its EEPROM, which also retains the last requested threshold values. The latter are restored at power on.

#### **Readout electronics**

The LVDS digital signals from the FEERIC electronics, so-called strip-patterns of 16 bit length (one bit corresponding to one RPC readout strip), are received by the local cards. Each local card receives the strip patterns corresponding to 16 horizontal and 16 (or 8 in some cases) vertical readout strips, on both sides of the same RPC, from each of the four detection planes. The vertical readout strips (maximum length 73 cm) cover the full RPC width while the horizontal strips (maximum length 55 cm) are segmented all along the RPC length. The details of each local card inputs are given in [52]. The entire setup consists of 234 local cards, housed in 16 VME-9U crates used as mechanical support and for power supply. The signals from up to 16 local cards are collected by a regional card via the e-links on the J2 bus in each crate. Each regional card is interfaced to a Common Readout Unit (CRU) by means of two GBT links at 3.2 Gb/s.

In total, the MID uses two CRUs, housed in a single FLP computer. The MID readout architecture is shown in the left panel of Fig. 70, while a picture of the three types of readout cards is shown in the right panel. Simulations of the expected bandwidth for Pb–Pb collisions at 50 kHz rate, based on Run 2 data, indicate that this design includes a safety factor of more than one order of magnitude.

Data corresponding to MID self-triggered physics events [53, 54] are transmitted from the local and regional cards to the CRU using the GBT up-links. Colliding beam particles in the LHC are of course the main source of such events. Every 25 ns (40 MHz) a new self-triggered event is potentially stored in the local and regional FIFOs. Only non-empty events are stored in these FIFOs. In the standard configuration a non-empty event corresponds to, at least, a non-zero strip pattern.



Figure 68: FEERIC ASIC architecture (top) and FEERIC card picture (bottom).



Figure 69: Wireless threshold distribution scheme (left) and master or node electronics card (right).



**Figure 70:** MID readout architecture (left) and readout cards (right picture) with local (LOC) (top right), regional (REG) (top left) and J2 bus (bottom) between local and regional.

It is important to note that, per self-triggered event, the transfer of the regional FIFO data takes five clock cycles at 40 MHz, while the transfer of the local FIFO data takes 9–21 clock cycles, depending on the number of non-empty detector planes. This means that the data from different local and regional cards, corresponding to the same bunch crossing, arrive asynchronously in the CRU. This also means that the local and regional FIFOs saturate in case they are filled at the full clock frequency of 40 MHz. For instance, the FIFO saturation could happen in case of a very high level of noise at the FEE level. It should be noted that a busy bit would be set in such a case.
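The saturation argument can be made quantitative with a small calculation. The sketch assumes back-to-back transfers with no additional overhead and ignores the FIFO depth, so the rates are idealized upper limits; the function names are hypothetical.

```python
CLOCK_MHZ = 40.0
CLOCK_PERIOD_NS = 1000.0 / CLOCK_MHZ  # 25 ns


def transfer_time_ns(n_cycles):
    """Time needed to transfer one self-triggered event out of a FIFO."""
    return n_cycles * CLOCK_PERIOD_NS


def max_sustained_rate_mhz(n_cycles):
    """Highest event rate that can be drained continuously (idealized, no overhead)."""
    return CLOCK_MHZ / n_cycles


REGIONAL_CYCLES = 5
LOCAL_CYCLES_MIN, LOCAL_CYCLES_MAX = 9, 21

regional_rate = max_sustained_rate_mhz(REGIONAL_CYCLES)
local_rate_best = max_sustained_rate_mhz(LOCAL_CYCLES_MIN)
local_rate_worst = max_sustained_rate_mhz(LOCAL_CYCLES_MAX)
```

In this idealized picture the regional FIFOs can be drained at no more than 8 MHz and the local ones at roughly 2 to 4 MHz, well below the 40 MHz at which events can potentially be written, which is why a sustained fill at the full clock frequency leads to saturation.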

At the first stage of data processing in the CRU, the user logic performs zero-suppression and raw data header construction using the central trigger (CTP) orbit information. The output of the user logic is transmitted by words of 256 bits to the FLP. At this level, the data coming from the different GBT links are assembled in C++ structures and synchronized to provide the information corresponding to a given bunch crossing. The local and regional cards always operate in continuous readout mode. However, the system can also run in triggered mode, with the CRU transmitting only data corresponding to a time window centred on a bunch crossing which coincides with a trigger from the CTP.
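The synchronization of data from different GBT links can be pictured as a grouping by bunch crossing. The sketch below is written in Python for brevity, whereas the text states that the FLP software uses C++ structures; the fragment layout `(link_id, orbit, bunch_crossing, payload)` and the function name are hypothetical.

```python
from collections import defaultdict


def group_by_bunch_crossing(fragments):
    """Group asynchronously arriving fragments by (orbit, bunch crossing).

    fragments: iterable of (link_id, orbit, bunch_crossing, payload) tuples
    (hypothetical layout). Returns a dict ordered in time, mapping
    (orbit, bunch_crossing) to a dict {link_id: payload}.
    """
    events = defaultdict(dict)
    for link_id, orbit, bunch_crossing, payload in fragments:
        events[(orbit, bunch_crossing)][link_id] = payload
    return dict(sorted(events.items()))
```

In triggered mode, the same grouping could be followed by a selection of the bunch crossings that fall inside a time window around a CTP trigger.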

The local and regional cards respond also to all types of triggers delivered by the CTP and received via the GBT down-link [53].

The commissioning of the complete MID detection chain started in autumn 2020. The first muon tracks, in coincidence with MFT and MCH, were registered in October 2021 by dumping proton beams in the TED, as explained in Sec. 3.3.7.

### 3.7 Transition Radiation Detector

The construction, operation and performance of the TRD are presented in [55]. The TRD contributes to the overall momentum determination of charged particles, as it provides up to six track segments (tracklets, each of a geometric length of about 3.5 cm) for each charged particle in the acceptance ( $|\eta| < 0.9$ ). In addition, the TRD allows electrons to be identified via the detection of transition radiation. Using a likelihood method enhanced with machine learning techniques, it is possible to suppress pions by a factor of more than 100 while retaining an electron (positron) efficiency of 90%.

The TRD has the capability to trigger on events based on the charged particle content, including the identification of electrons, within about 8 µs after a collision. This feature was used in LHC Run 2 to select collisions with charmonia, jets or atomic nuclei. Here we focus on modifications implemented for the high rate running in LHC Runs 3 and 4.

### 3.7.1 High-voltage distribution and common mode

During TRD operation in Runs 1 and 2 a number of anode and drift channels of individual chambers developed high currents and were eventually not operational any more. Based on similar experience on the TPC, and on experience from the repair of one TRD supermodule (SM) during LS1, the built-in decoupling capacitors in the on-detector high-voltage distribution system were suspected to cause the observed behaviour. At that time, the construction of the TRD was still ongoing, and the last four SMs were built without certain capacitors (4.7 nF) in the high-voltage distribution system. In total, until the end of Run 2, 70 anode channels and 20 drift channels were taken out of operation from a total of 522 chambers installed in 18 SMs.

The high-voltage distribution system with the decoupling capacitors on filter boards is mounted directly on each chamber and therefore encased in the hull of the SMs. By milling cut-outs into the casing and removing the top cover, it was possible to access the filter boards of all 30 chambers in a SM. Each anode and each drift channel holds a 4.7 nF capacitor; the anode wire plane is segmented in eight or six sectors of two pad rows each, decoupled from each other by a 2.2 nF capacitor. Measurements of the removed capacitors confirmed that they were the cause of the high-voltage failures. Most of the problems could be traced to failing 4.7 nF capacitors, but a small fraction (in the percent range) of the 2.2 nF capacitors had also failed. Therefore, all capacitors on the filter boards were removed from a total of nine TRD SMs. This number was determined by the turnaround time of SM deinstallation from the space frame of ALICE, repair, test and reinstallation during the first year of LS2. Before reinstallation of each SM, long-term high voltage, low voltage, cooling and readout tests were performed to ensure proper detector operation. In the nine SMs, 96% of the non-operational chambers (80 out of 83) could be restored. Figure 71 displays the configuration of the individual SMs in terms of installed decoupling capacitors. Based on experience, the expected number of failures of the remaining capacitors until the end of Run 4 is estimated to be low enough that good tracking capability in all sectors is ensured for the entire period of operation.

As the capacitors were meant to buffer high charge deposits in the chambers, their removal results in larger induced common-mode signals on readout pads in the same high-voltage segment. The measured common-mode signal is shown in Fig. 72 and is about three times larger than with the capacitors in place, consistent with the expectation from the remaining capacitance of the readout chamber. This effect will be corrected at the software level based on the measured local charge deposit.

## 3.7.2 Readout

The readout chain has been optimised in the past for a high event inspection rate at Level -1 (LM) with a fast calculation of the L1 trigger contribution (LM tracklet data readout time $< 8\,\mu$s, L1 decision time $< 6\,\mu$s), while transferring large, high resolution raw data for events accepted beyond the L1 level (L1 raw data readout time $\approx 300\,\mu$s) [55]. In Run 3, the L1 trigger functionality is no longer required and the detector must provide readout rates as high as feasible while writing all events to permanent storage. No data shall be discarded in the readout chain and the fraction of recorded events in 1 MHz interaction rate pp collisions or 50 kHz Pb–Pb collisions shall be maximised.

**Figure 71:** The status of TRD supermodules concerning capacitors in the high-voltage distribution.

**Figure 72:** Induced common-mode signal with and without capacitors for the anode high-voltage. The baseline of pads in the same high-voltage segment as a cosmic-ray particle with an integrated signal between 10 000 and 12 500 ADC counts is shown before (red) and after (blue) the removal of the capacitors. For comparison, the baseline from pads in a high-voltage segment without hits is shown in green.

The applied solution is presented in the following sections. Simulations confirm that it enables collecting more than 70% of the events in a 50 kHz interaction rate Pb–Pb running scenario.

**Optimisation of the existing FEE.** In order to achieve a high event-readout rate in Run 3, only tracklets are read out, a mode which has been used to find fast L1 trigger contributions in Run 2. The maximum data volume per LM trigger and per Multi Chip Module (MCM, processing signals from 18 readout pads) is four words of 32 bits each. The usage of the available bits is no longer optimised for triggering, but for physics analysis.

Previously, each MCM processed and transmitted up to four tracklets, where each tracklet was transmitted as a 32-bit word. However, even in the most central Pb–Pb collisions, a density of four tracklets per MCM has rarely been reached. Therefore, only three tracklets per MCM are allowed in the Run 3 data format. The estimated fraction of tracklets lost by this measure in central Pb–Pb events is below 1 %. The freed-up 32-bit word is used as a header that stores position information about the MCM and reserves eight bits of PID information per tracklet. It is followed by one to three 32-bit tracklet words, each storing the position within the MCM, the slope and an additional twelve bits of PID information of the tracklet. The details are shown in Table 9.

**Table 9:** TRD tracklet data format. Each MCM that has reconstructed at least one tracklet will send a header with shared coordinate information and eight bits of PID information per tracklet. For each reconstructed tracklet, one additional payload word with additional position and PID information, as well as the reconstructed tracklet angle (slope), will be stored.

| Word           | Bits  | Content  |
|----------------|-------|----------|
| Header         | 31    | 1        |
| Header         | 30–27 | padrow   |
| Header         | 26–25 | col      |
| Header         | 24–17 | HPID2    |
| Header         | 16–9  | HPID1    |
| Header         | 8–1   | HPID0    |
| Header         | 0     | 1        |
| Payload (1-3x) | 31–21 | position |
| Payload (1-3x) | 20–9  | LPID     |
| Payload (1-3x) | 8–1   | slope    |
| Payload (1-3x) | 0     | 0        |

The PID information per reconstructed tracklet will increase from eight to 20 bits, which will be used to store charge information from three time slices with six or seven bit dynamic range each. Simulations have shown that the expected performance with this data format is similar to an offline analysis with the same number of time slices. The tracklet position and slope will also be stored with higher precision than in previous runs.
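As an illustration of the data format in Table 9, the following Python sketch packs and unpacks the header and payload words. It is not the ALICE decoder. The bit positions follow the table, while the function names, the treatment of the slope as a raw unsigned field, and the way the header and payload PID bits are concatenated into one 20-bit value are assumptions made for this example.

```python
# Illustrative sketch only: bit positions follow Table 9, everything else
# (names, PID concatenation order, unsigned slope) is an assumption.

def pack_header(padrow, col, hpid):
    """Build a header word; hpid = [HPID0, HPID1, HPID2], eight bits each."""
    word = (1 << 31) | ((padrow & 0xF) << 27) | ((col & 0x3) << 25)
    for i, value in enumerate(hpid):
        # HPID0 sits in bits 8-1, HPID1 in bits 16-9, HPID2 in bits 24-17.
        word |= (value & 0xFF) << (1 + 8 * i)
    return word | 1  # bit 0 is 1 for a header


def pack_payload(position, lpid, slope):
    """Build a payload word; bit 0 stays 0."""
    return ((position & 0x7FF) << 21) | ((lpid & 0xFFF) << 9) | ((slope & 0xFF) << 1)


def unpack_header(word):
    """Split a 32-bit header word into its fields."""
    if (word >> 31) & 1 != 1 or word & 1 != 1:
        raise ValueError("not a header word: bits 31 and 0 must both be 1")
    return {
        "padrow": (word >> 27) & 0xF,                             # bits 30-27
        "col": (word >> 25) & 0x3,                                # bits 26-25
        "hpid": [(word >> (1 + 8 * i)) & 0xFF for i in range(3)],  # HPID0..HPID2
    }


def unpack_payload(word):
    """Split a 32-bit payload word into its fields."""
    if word & 1 != 0:
        raise ValueError("not a payload word: bit 0 must be 0")
    return {
        "position": (word >> 21) & 0x7FF,  # bits 31-21
        "lpid": (word >> 9) & 0xFFF,       # bits 20-9
        "slope": (word >> 1) & 0xFF,       # bits 8-1, kept as a raw value
    }


def decode_mcm(words):
    """Decode one header followed by one to three payload words."""
    header = unpack_header(words[0])
    payloads = words[1:]
    if not 1 <= len(payloads) <= 3:
        raise ValueError("expected one to three payload words")
    tracklets = []
    for i, word in enumerate(payloads):
        fields = unpack_payload(word)
        # Assumption: HPIDi belongs to the i-th payload and forms the upper
        # eight bits of the 20-bit PID value.
        fields["pid"] = (header["hpid"][i] << 12) | fields["lpid"]
        fields["padrow"] = header["padrow"]
        fields["col"] = header["col"]
        tracklets.append(fields)
    return tracklets
```

Since bit 31 of a payload word belongs to the position field, the sketch distinguishes headers from payloads by bit 0 alone.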

In Run 3, the TRD uses a physics trigger sent by the CTP at LM latency (575 ns, see Sec. 5.5). In addition, the TRD supports a new trigger type, called calibration trigger. The calibration trigger, also sent at LM latency, enables the shipping of tracklet data and, additionally, the full raw data. This makes it possible to trigger a full readout for a small fraction of events, facilitating detector calibration. Apart from that, the calibration trigger is interpreted by the FEE as a command to reload its configuration parameters from Hamming-protected memory areas. This is a precautionary measure that mitigates the impact of Single Event Upsets (SEUs) on data taking, which are sporadically observed on some isolated half chambers as Link Monitor Errors (LMEs). An LME of a particular half chamber occurs whenever the data sent by the half chamber cannot be correlated with the corresponding triggers.

**Common Readout Unit (CRU).** For Run 3, the Global Tracking Units used previously [55] are replaced by CRUs (see Sec. 2.2). The CRUs receive the data directly from the FEE via 1044 custom optical links based on 8-bit/10-bit encoding. Every CRU provides 30 link inputs, implying that in total 36 CRUs are in use (two per TRD SM). They are housed in twelve First Level Processors (FLPs).
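The link and CRU counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes one optical link per half chamber, which is consistent with 1044 links for 522 chambers; the constant names are chosen for this example only.

```python
import math

CHAMBERS = 522                 # installed readout chambers
LINKS_PER_CHAMBER = 2          # assumption: one optical link per half chamber
SUPERMODULES = 18
CHAMBERS_PER_FULL_SM = 30      # a fully equipped supermodule
LINK_INPUTS_PER_CRU = 30
FLPS = 12

links = CHAMBERS * LINKS_PER_CHAMBER
# A full supermodule needs 60 link inputs, hence two CRUs.
crus_per_sm = math.ceil(CHAMBERS_PER_FULL_SM * LINKS_PER_CHAMBER / LINK_INPUTS_PER_CRU)
crus = SUPERMODULES * crus_per_sm
spare_inputs = crus * LINK_INPUTS_PER_CRU - links
crus_per_flp = crus // FLPS

print(links, crus, spare_inputs, crus_per_flp)
```

The 36 spare link inputs reflect the fact that 18 fully equipped supermodules would hold 540 chambers, whereas 522 are installed.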

The FPGA firmware on the CRU is composed of a common logic, and a TRD-specific user logic. It controls the readout process of the detector and receives, buffers and formats the data for the  $O^2$  system. All CRUs are connected to the LTU to receive trigger information and to signal a detector busy status to the CTP (see Sec. 2.1). Each CRU determines an individual busy status contribution depending on the status of the readout of the connected FEE links. The CTP combines the busy status contributions from all CRUs in order to determine a global busy status of the detector.

Before being written to permanent storage, the data are reformatted and compressed, either on the FLPs or on the EPNs (see Secs. 5.2 and 5.3). The following points describe sequentially the process of acquiring an event, explaining the role of the CRU and its interactions with other readout components:

- 1. The Central Trigger Processor (CTP) sends a trigger at LM latency (physics or calibration) via the Local Trigger Unit (LTU) to the FEE and to the CRUs in parallel. The trigger to the FEE is shipped via a legacy Timing, Trigger and Control (TTC) [4] network. The trigger to the CRU is sent via trigger distribution networks, using the new Trigger and Timing Control via TTC-PON technology. Nine networks are necessary to achieve the minimum latency (see Sec. 5.5). The CRUs store all information from the received trigger message (e.g. orbit and bunch crossing id) in internal buffers.
- 2. Upon the arrival of the trigger, the FEE begins recording the data, while primary charge drifts towards the anode region. Each CRU receives the trigger at approximately the same time and internally opens a time window to wait for the input links to send all the acquired data. The timeout is programmable. In addition, each CRU generates its busy status contribution and sends it to the CTP. The TTC-PON upstream communication feature is used to transmit the busy status signal. This prevents the CTP from sending any other trigger as long as any CRU contributes an active busy signal in order to avoid confusion of the FEE state machines.
- 3. When the FEE has acquired and processed the data, it starts shipping them via the optical links. At the end of the transmission, the FEE appends specific end markers. The CRUs record the data received on all input links. In case no data end marker is recognised by the CRU within the programmable timeout or data words are received outside the data expectation window, the CRU marks the concerned link as erroneous (LME) and excludes it from data taking until a manual or automated recovery takes place. The CRU stores all received data in large internal data buffers with size sufficient to hold entire calibration data events at maximum multiplicity. When the CRU has confirmed the reception of end markers on all active links, or the timeouts have been reached, the CRU releases its busy status contribution. The CTP considers the detector as busy until all 36 CRUs have released their busy contribution.
- 4. Once the detector side of the event acquisition is finalised, and the data are stored in internal buffers, the CRU is ready to acquire the next event. The buffered data are reformatted and shipped to the readout system in parallel. The CRU packs the data into packets of a maximum size of 8 kB and equips these packets with Raw Data Headers (RDHs). In addition, TRD-specific headers are inserted into the data stream. The headers contain various information, in particular the trigger timestamp information needed in order to link the acquired data to other detector data during the reconstruction.
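The busy handling in steps 2 and 3 can be summarised in a toy model. The sketch below is a simplified illustration, not the CRU firmware: the class and function names are hypothetical, the timeout value in the usage note is a placeholder, and only the missing-end-marker condition is modelled (data arriving outside the expectation window is not).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Cru:
    """Toy model of the per-CRU busy contribution (hypothetical names)."""
    links: Set[int]
    timeout_us: float
    excluded: Set[int] = field(default_factory=set)  # links flagged with an LME

    def handle_trigger(self, end_marker_us: Dict[int, Optional[float]]) -> float:
        """Return the time after the trigger at which this CRU releases busy.

        end_marker_us maps a link id to the arrival time of its end marker,
        or to None if no end marker arrives at all.
        """
        release = 0.0
        # Links excluded by an earlier LME are no longer waited for.
        for link in sorted(self.links - self.excluded):
            arrival = end_marker_us.get(link)
            if arrival is None or arrival > self.timeout_us:
                # No end marker within the programmable timeout: flag an LME
                # and exclude the link until a recovery takes place.
                self.excluded.add(link)
                release = max(release, self.timeout_us)
            else:
                release = max(release, arrival)
        return release


def detector_busy_release(crus: List[Cru],
                          end_marker_us: Dict[int, Optional[float]]) -> float:
    """The detector stays busy until every CRU has released its contribution."""
    return max(cru.handle_trigger(end_marker_us) for cru in crus)
```

With two CRUs of two links each and a placeholder timeout of 50 µs, an event in which one link never sends its end marker keeps the detector busy for the full timeout. On the following event that link is already excluded, so the busy time is set by the slowest remaining link.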

## 3.7.3 Detector control

A special feature of the upgraded TRD DCS system is that the readout chain status of all half chambers is made available to the DCS system by the CRU. The CRU firmware contains a dedicated error state machine for all half chambers. If a connected half chamber shows a misbehavior that can be detected at the CRU level, the corresponding state machine enters an error state. This error state is stored in a dedicated CRU register, which is read by the DCS system via the ALFRED [56] system (Sec. 5.6.2). The obtained status of all half chambers is displayed on a dedicated DCS panel in order to monitor LMEs.

## 3.7.4 Standalone tracking

A standalone tracking algorithm for the TRD was implemented using a Kalman filter approach. The seeding uses the direction and position information of all pairs of TRD tracklets. The track reconstruction efficiency and transverse momentum resolution were determined by matching tracklets to tracks reconstructed by the TPC. For the TRD standalone tracking, a momentum resolution of about 9% for 500 MeV/*c* particles was achieved for the case of six tracklets in the fit. By including the primary vertex information as an additional constraint, a momentum resolution better than 4% was achieved. The TRD standalone tracking algorithm was used to identify and study photon conversions and nuclear interactions in front of and within the TRD. It will also be used for the TRD drift velocity calibration in Run 3.

## 3.7.5 Calibration

The Run 3 TRD calibration procedure is similar to the one employed before, except for the drift velocity calibration, which is based on a new development. The angle between a TRD tracklet and the corresponding TRD track,  $\Delta \alpha$ , is measured as a function of the track impact angle for each chamber. A model with two free parameters, the effective drift velocity  $v_D^{eff}$  and the Lorentz angle  $\alpha_L$  (the angle between the velocity of drifting electrons and the drift field), is used to fit the distributions. A typical example is depicted in Fig. 73 (left). The effective drift velocity compensates for the ion-tail effect, which systematically changes the tracklet angle. The true physical drift velocity  $v_D^{true}$  is about 35% larger than  $v_D^{eff}$ . A closure test with  $5 \times 10^4$  events using Run 2 data demonstrates that the average angular difference between tracklets and TRD tracks is zero, as shown in Fig. 73 (right).

About  $4 \times 10^5$  minimum bias pp-equivalent events are needed for an update of the calibration parameters. This is similar to what was used in Run 2 with about 600 to 3000 tracklets per chamber. The seeding and Kalman filter procedures need on average 10 ms per p–Pb event. In total, not more than 20 minutes for one update of the calibration parameters is needed on a single CPU core.



**Figure 73:** Left:  $\Delta \alpha$  versus impact angle for a typical TRD chamber in Run 2, having a fixed uncalibrated drift velocity. The quoted values refer to the Run 2 calibration procedure (upper row) and to the new calibration scheme (lower row). Right: Average  $\Delta \alpha$  versus impact angle for all TRD chambers after the calibration was applied. The red band shows the RMS of the distribution.

## 3.7.6 Quality Control

The QC system consists of tasks that are running in various parts of the  $O^2$  system and produce QC objects, mostly in the form of histograms. The following items are controlled:

- The data arriving from the FEE via the CRU are validated, which allows the detection of disabled or malfunctioning parts in the readout tree or of SEUs in the FEE.
- Zero-suppressed ADC data from all calibration events are analyzed to reconstruct the average, time-dependent signal shape for each of the 522 readout chambers. These histograms are versatile low-level monitoring tools for many aspects of the operation of the TRD, including trigger timing, drift velocity and gas gain.
- Tracklets from a small fraction of events are used to monitor the local reconstruction of track segments in the FEE of each chamber.
- The tracking QC monitors the efficiency of the synchronous and asynchronous reconstruction algorithms at the tracklet and track level.
- Residuals between reconstructed tracks and tracklets are analyzed in the asynchronous stage to monitor the impact of alignment and calibration on the detector performance.

The data from these QC tasks are further processed by checker algorithms to provide automated notifications and trending.

In Run 3, the upgraded TRD system successfully recorded calibration data (for gain uniformity correction) with a <sup>83</sup>Kr source in standalone mode, and collision data with the whole ALICE setup.

## 3.8 Time-of-Flight detector

The ALICE Time-Of-Flight (TOF) detector [57, 58] is a large array of Multi-gap Resistive-Plate Chamber (MRPC) strip detectors, where each strip is read out by 96 pads each with  $2.5 \times 3.5$  cm<sup>2</sup> area. Groups of 91 strips are organized in supermodules, covering the 18 sectors of the ALICE spaceframe. Each of the supermodules is read out by four custom VME crates, each hosting nine or ten Time-to-digital-converter Readout Module (TRM) boards, one Data Readout Module (DRM) card and one Local Trigger Module (LTM). While the DRM acts as master and has interfaces with the central systems, the LTM elaborates trigger information and sets the threshold on the NINO ASIC chips hosted on the front-end cards.

The TOF upgrade for Run 3 mostly concerns part of its readout electronics. It enables continuous readout, in line with the ITS and the TPC, with the aim of fully exploiting the particle identification power of the TOF in the intermediate momentum range. The intervention needed to adapt to continuous readout was relatively limited thanks to the very small intrinsic dead time ( $\sim 10$  ns) of the MRPC detector and its front-end electronics and the fact that the High Performance TDC (HPTDC) has on-board buffering resources for digitized data, as detailed in the next subsection.

## 3.8.1 Implementation of continuous readout

The TRM cards are equipped with 30 HPTDCs operated in very high resolution mode, with 24.4 ps bin width. The specifications of the HPTDC (and its performance once integrated in the TRM cards) are detailed elsewhere [59], but it is important to recall here its trigger matching function. Based on time tags, the HPTDC allows the trigger latency to be programmable over a large dynamic range and also supports overlapping triggers, where individual hits may be assigned to multiple events. Once a trigger is received, only stored hits starting from a given time and for a limited matching window are moved to the readout FIFO and made ready for further stages of readout. During Runs 1 and 2, with a limited high-rate capacity in the barrel detectors of ALICE, the trigger was limited to a few kHz. In this situation, the internal HPTDC buffers for the TOF were configured with a latency window of 6500 ns (corresponding to the latency of the triggers reaching the TOF crates) and with a matching window of 600 ns, to comfortably collect all hits registered in the TOF detector associated to the triggered collision.

In this configuration, continuous readout may be achieved by applying a strictly periodic trigger with frequency  $f_{\rm T}$  and matching window  $m_{\rm w} = 1/f_{\rm T}$ . Figure 74 (left) illustrates the underlying idea. When a trigger is delivered with a constant 50 kHz frequency and the latency and matching windows are set to 20 µs, all hits are read out. Figure 74 (right) shows the curve of allowed values, together with the limitations of the system. On the one hand, the latency window cannot be set at a value larger than half of an LHC orbit. On the other hand, as discussed in the ALICE Readout Upgrade TDR [60], the trigger frequency cannot be too high, given the time spent reading the HPTDC chains in the TRM cards (two HPTDC chains of 15 chips) with a fixed readout deadtime of 3.2 µs for token-passing operations among chips alone. More generally, the readout time on average has to be less than  $1/f_{\rm T}$ . Considering also the readout time over the VME backplane (up to 10 TRMs per crate have to be read) an optimal operation point was found with  $f_{\rm T} = 33$  kHz. The procedure was verified in a test system, sending random hits to several TRM cards, programmed with appropriate latency and matching windows (29800 ns). The periodic triggers, hereafter labelled as TOF special triggers (TT), were delivered at fixed bunch crossing. The orbit is split in three parts with TT occurring at BC numbers 51, 1177, and 2673. Given the flat distribution of hits over the LHC orbit, Fig. 75 shows that no hits were lost.

**Figure 74:** (left) HPTDC programming in Run 1 and 2 operations (top arrow) and in Run 3 (bottom arrow). The three trigger levels L0, L1 and L2a are replaced by a periodic trigger with a given frequency, mimicking a continuous readout. All hits (black lines) are read out and can be associated to physical events at a later stage. (right) Possible selection of parameters (fixed trigger frequency  $f_T$  and matching window width  $m_w$ ) to realize a continuous readout. The green circle corresponds to the chosen point of operations.

**Figure 75:** Hit time within the orbit of randomized hits sent at fixed rate to HPTDC inside a TOF crate operated in continuous readout mode. No holes are seen in the distribution, which means that random hits are received in all 3564 bunch crossings through the whole LHC orbit.
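The relation $m_{\rm w} = 1/f_{\rm T}$ for the periodic-trigger scheme can be checked numerically. The following Python sketch is an idealised illustration under stated assumptions: it uses the nominal 25 ns bunch-crossing spacing, places the triggers at equal spacing with an arbitrary phase rather than at the bunch crossings used in the detector, and opens each matching window forward from its trigger. All function names are hypothetical.

```python
BC_PER_ORBIT = 3564   # bunch-crossing slots per LHC orbit
BC_NS = 25.0          # nominal bunch-crossing spacing in ns (assumption)


def periodic_trigger(n_per_orbit):
    """Return (trigger frequency in kHz, matching window in ns) for
    n_per_orbit equally spaced triggers, using m_w = 1 / f_T."""
    window_ns = BC_PER_ORBIT / n_per_orbit * BC_NS
    return 1e6 / window_ns, window_ns


def windows_covering(bc, trigger_bcs, window_bc):
    """Count the matching windows that contain bunch crossing bc.

    Idealisation: each window starts at its trigger bunch crossing, spans
    window_bc bunch crossings and wraps around the end of the orbit.
    """
    return sum(((bc - t) % BC_PER_ORBIT) < window_bc for t in trigger_bcs)


f_khz, window_ns = periodic_trigger(3)
triggers = [0, 1188, 2376]  # equally spaced, arbitrary phase

# With a window equal to the trigger period, every bunch crossing is
# covered exactly once, so no hit is lost and none is duplicated.
exact = [windows_covering(bc, triggers, 1188) for bc in range(BC_PER_ORBIT)]

# A slightly wider window (29800 ns is 1192 bunch crossings) still leaves
# no holes, at the cost of a small overlap between consecutive windows.
wide = [windows_covering(bc, triggers, 1192) for bc in range(BC_PER_ORBIT)]

print(f"f_T = {f_khz:.2f} kHz, m_w = {window_ns:.0f} ns")
```

Three triggers per orbit give a frequency between 33 and 34 kHz, in line with the chosen operation point. A window shorter than the trigger period would leave uncovered bunch crossings, which would appear as holes in a distribution such as the one in Fig. 75.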

## 3.8.2 The new Data Readout Module (DRM2)

In order to keep up with the planned increase of luminosity and of the interaction rate (up to 1 MHz in pp collisions and 50 kHz in Pb–Pb collisions), a new Data Readout Module 2 (DRM2) was designed [61]. With respect to the existing DRM module (hereafter DRM1) it has a more modern FPGA (Microsemi IGLOO2). Overall, it replaces the connections to the central readout and from the central trigger in the DRM1, which were based respectively on the Detector Data Link (DDL) [62] and the TTC system [4], with just one bidirectional GBT link (see Sec. 2.2). The GBTx ASIC is hosted on the DRM2 and the GBT protocol is implemented in the FPGA on the CRU at the receiving end. The new system for each DRM2 has a user bandwidth to the central readout system (CRU) of 3.2 Gb/s, corresponding to the bandwidth available on a single GBT link.

As mentioned, the readout is implemented with special TOF triggers at fixed bunch crossing values with  $f_T = 33$  kHz, setting a matching window of 30 µs in the HPTDC installed in the TRMs to achieve continuous readout. The same link is also used for receiving triggers and a low-jitter clock, which is distributed to the front-end electronics as primary clock. For the TOF detector, the quality of this clock is crucial, and a campaign of measurements on the clock received from the common readout unit has been carried out. A clock jitter (RMS) as low as ~10 ps was measured in the laboratory, which is compatible with the requirements. Nevertheless, a dedicated line of clock distribution is available (same as during Runs 1 and 2) with a similar jitter.



**Figure 76:** The DRM2 card: on the left the VTRX transceiver and the GBTx ASIC (covered by a heat dissipating panel). On the right the ARM piggy-back card is visible. The additional optical receivers for the SCL and the LHC clock are in the middle of the front panel.

A picture of the DRM2 card is shown in Fig. 76. It is a narrow 9U VME card  $(16 \text{ cm} \times 33 \text{ cm})$  with the same form factor as the DRM1 and the TRM boards. The heart of the board is a Microsemi Flash-based IGLOO2 FPGA (M2GL090-FG676 with silicon revision 3), which drives the trigger and data flows inside the crate. This device has been chosen since the expected TID (Total Ionizing Dose) for the board (placed at  $\approx$ 4 meters from the beam pipe) is 0.13 krads in 10 years, which is acceptable for such a device. The advantage is that the FPGA configuration memory is immune to single event upsets, so that scrubbing is not needed. Results of irradiation tests on several components of the DRM2, including the IGLOO2 FPGA, commercial optical transceivers, and staging RAM have been reported in [63].

The FPGA's GBTx connection consists of a single 40-bit-wide parallel lane of 80 MHz differential signals. The same configuration was previously tested on a GBTx test board developed before designing the DRM2, where a bit-error ratio lower than  $10^{-14}$  and a total jitter on the received clock around 50 ps had been measured [64]. As on the DRM1, an additional optical Slow Control Link (SCL) is implemented. The SCL has a dual role: configuration and monitoring. It provides a firmware implementation of the CONET2 protocol developed by CAEN [65]. Via this link, all DRM2s are connected to commercial A3818 PCIe cards, housed in Linux machines hosted in the DCS network. The SCL is used for configuration of the front-end electronics and programming of all VME cards. In addition, while the data collected are immediately sent to the CRU via the GBT link, the firmware also stores them in the staging RAM (1M × 36 bit SSRAM from Cypress: CY7C1460KV33) for transfer (1 MB buffers) to the DCS machines. From these data (only a portion of all data are inspected via the SCL), some values such as temperatures are extracted and reported in the DCS via DIM servers. In addition, Quality Control programs run on these data.

A block of hardware inherited from the DRM1 is the ARM microprocessor mounted on the A1500 piggy-back card provided by CAEN. This CPU implements, via JTAG over the VME backplane, the programming of the Actel APA750 and APA600 installed in the TRM and LTM cards, respectively. The connection on the front panel to the console port of this CPU was improved, with respect to the DRM1, by using a commercial RS232-USB interface to provide a more modern USB interface to connect a laptop computer. The ARM CPU is also able to program the firmware of the IGLOO2 FPGA. As for the A1500 mounted on the DRM1, thanks to the modified Ethernet interface (which has been validated to operate in magnetic field), all the firmware updates of the VME cards (DRM2, TRM and LTM) can be remotely executed.

Finally, the DRM2 distributes the clock to all VME cards inside the crate. This is an entirely new functionality with respect to the DRM1. Previously, only every second TOF crate had a clock distribution module (Clock and Pulse Distribution Module, CPDM). This complicated the power-up and configuration sequences, and created single points of failure (e.g. via the dependence on the crate power supply). The DRM2 distributes a local clock to all cards in the VME crate in a user-selectable way, using either the clock received via GBTx or the clock received directly from the LHC interface. For the latter, an optical receiver from PD-LD/NECSEL is used, with ST plug-type, with pinout compatible with the Truelight TRR-1B43-000, which was previously widely used for TTC applications. This configuration minimizes the jitter of the clock distributed to the TRM cards.

All the DRM1s were removed and disassembled during the first months of 2019. The A1500 ARM piggy-back cards were tested and prepared for installation on the DRM2 cards. The procedure for validation and test of the DRM2 production (completed during 2019) is described in [66]. The installation of all DRM2 cards, partially delayed by the Covid-19 pandemic, was completed in June 2020, with full commissioning starting in September 2020. The MRPC high voltages were ramped to the nominal value (V = 6500 V on each stack) in the same month. The TOF detector actively participated in the data taking with the LHC pilot beams in October 2021.

The front-end software needed for configuration of the electronics was upgraded and deployed in a staged way during spring 2020.

## 3.8.3 Additional upgrades in low voltage and quality control systems

During LS2, several other TOF systems linked to the readout upgrade were subject to key improvements and maintenance to prepare for the intense data taking foreseen in Run 3. Among many interventions, we highlight in particular:

- The DC/DC systems (CAEN modules A1395 and A1396): these modules supply power to the four crates of each TOF supermodule. They receive a DC 48 V power supply via bus-bars from outside the L3 Magnet and provide LV power supply for the VME boards and the front-end cards on the MRPC modules. A solid state fuse that was subject to frequent breakdown was replaced. Additionally, a study via proton irradiation at the Centro di Protonterapia in Trento in 2019 investigated the cause of SEU events registered in 2018 at high irradiation-rates that produced sudden loss of communication with the module. The addition of a filter capacitor on the reset line of the microprocessor on the A1396 fixed the problem. A full refurbishment of all modules (entailing dismounting 216 modules from the detector) was completed in 2019-2020.
- All DRM2s are equipped with an ARM microprocessor (AT91RM9200 from Atmel) running Linux. During Runs 1 and 2 these CPUs were used exclusively to perform firmware upgrades on the VME boards via the JTAG interface on the VME backplane using Actel software for APA FPGAs. Using the cross-platform development tools provided, a full slow control DIM server has been deployed on these CPUs. This provides an additional channel to monitor voltages and temperature on the cards (even if the SCL is not connected). More importantly, thanks to a different hardware implementation on the DRM2 with respect to the DRM1, via the server running on ARM CPUs it will be possible to reset the CONET link and to reset the Microsemi FPGA of the DRM2. These two emergency resets may be used in case of loss of communication with the DRM2 (on the SCL), without the need of executing a power cycle. This may be useful especially because the DRM2 provides the primary clock to all TRM cards and a DRM2 power cycle would cause the loss of the clock and therefore the need of a power cycle in all VME slots in the crate.
- The procedure for the control and validation of the recorded data has been integrated into the O<sup>2</sup> framework as part of the Quality Control (QC) project. For example, counters reporting errors detected in some HPTDC on the TRM are provided, as well as information on the hit rate on all 150 000 channels of the TOF detector. Noisy channels, identified by a hit rate greater than 1 kHz, can be disabled individually. Quality control tasks monitor the reconstruction and calibration process at various levels in order to provide a detailed insight into the various steps of the data processing. The QC code runs on dedicated computing nodes (FLPs or EPNs) to monitor the TOF data stream, and in the DCS system, sampling data through the SCL. For the SCL part, the QC task is primarily intended to monitor temperatures of front-end electronics while readout is ongoing, as well as to monitor the hit rate, quickly turning off very noisy channels that could prevent the continuous readout (on average each event has to be read out in  $30 \,\mu$ s, see Sec. 3.8.1). While a certain degree of duplication exists, the FLP and EPN QC provides more aggregated information for the whole TOF, like the average multiplicity per matching window.

- The data flow from the CRUs is processed by the CPUs to perform the first level of data decoding and some preprocessing, which produces a second level of raw data, providing a zero-suppressed data stream where the relevant information is stored in a compact format. This effectively reduces the output bandwidth from the FLP to the EPNs by a factor of four for very high multiplicity events, and by a much larger factor for low multiplicity events. This allows the framework to make the best use of the available computing resources on the FLPs by performing low-level data monitoring. The QC system is able to access the preprocessed data directly on the FLPs for monitoring of the raw data stream as early as possible.
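The noisy-channel criterion quoted in the quality-control item above (a hit rate greater than 1 kHz) can be expressed in a few lines. The sketch below is only an illustration of that selection; the function name, the input format (a mapping from channel index to hit count) and the live-time argument are assumptions and do not reflect the actual QC code.

```python
def noisy_channels(hit_counts, live_time_s, threshold_hz=1000.0):
    """Return the sorted channel indices whose hit rate exceeds threshold_hz.

    hit_counts maps a channel index to the number of hits recorded during
    live_time_s seconds (hypothetical input format).
    """
    if live_time_s <= 0:
        raise ValueError("live_time_s must be positive")
    return sorted(ch for ch, n in hit_counts.items()
                  if n / live_time_s > threshold_hz)


# Example: three channels observed for two seconds.
counts = {0: 500, 1: 2500, 2: 2000}
flagged = noisy_channels(counts, live_time_s=2.0)
print(flagged)
```

In this example only channel 1 (1250 Hz) is flagged; channel 2 sits exactly at 1 kHz and is kept, because the criterion in the text is a rate strictly greater than the threshold.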



**Figure 77:** From left to right: HMPID Front-End Electronics (FEE), Readout Control Board (RCB) with the readout FPGA, the TTCRx and the Source Interface Unit (SIU). On the right, the C-RORC cards installed on  $O^2$  FLP computers.

## 3.9 High-momentum particle identification

## 3.9.1 Introduction

The ALICE High Momentum Particle IDentification (HMPID) detector [1, 67] is designed to identify hadrons at  $p_T > 1 \text{ GeV}/c$ . It is based on Ring Imaging Cherenkov detectors (RICH). Seven MWPCs, equipped with CsI segmented photocathodes, detect Cherenkov patterns. Together with the momentum measured by the TPC, they allow the determination of the particle mass. During Run 3, the main goal of the detector is to identify light nuclei and corresponding anti-nuclei at high transverse momenta in the central rapidity region, up to 12 GeV/c for the deuteron and triton, and up to 10 GeV/c for <sup>3</sup>He. During LS2, the readout firmware was upgraded to allow an increase of the event readout rate. In addition, to measure the inelastic cross section of (anti-)deuterons in the momentum range 0.2 to 2.2 GeV/c, two aluminum absorbers of 8 cm thickness were installed in front of two RICH modules. In one module, two out of three Cherenkov radiator gas vessels were leaking, whereas the second RICH was located in a favourable position for the installation of the second absorber, needed for the required statistical abundance of the measurement. The consequent loss of acceptance of the PID measurement is largely compensated with the remaining modules by an event readout rate ten times higher than in Run 2. The detector stayed in place during the long shutdown, since moving it to a laboratory for repair and upgrades was considered too risky. Instead, a redesign of the readout firmware was carried out, as explained in the next section.

## 3.9.2 Upgrading of readout firmware and trigger

Figure 77 shows the block diagram of the HMPID readout chain. The Readout Control Board (RCB) houses the readout FPGA, the TTCRx mezzanine card and the Source Interface Unit (SIU) interfacing the trigger and DAQ systems. The FPGA firmware (FW) synchronises the trigger and data readout and is a key element of the readout performance. The C-RORC cards are installed on O<sup>2</sup> First Level Processor (FLP) computers and connected via optical links to the HMPID readout electronics.

The block diagram for the HMPID DAQ system is shown in Fig. 78. Fourteen optical links, two per RICH module, are connected to four C-RORC cards on two FLPs.

# 3.9.3 New readout firmware and readout rate

In Run 2, the detector was operated at lower readout rates, in line with the TPC (2 kHz and 800 Hz, respectively, in pp and Pb–Pb collisions). As a result, the effective increase in data sample size between Run 2 and 3 can be up to a factor 14 and 10, respectively, in pp and Pb–Pb collisions. In order to improve the readout speed in Run 3, a new readout firmware was designed, tested in the laboratory, and finally deployed on the RICH modules. In laboratory tests, the HMPID has reached a readout rate up to 28 kHz for pp collisions (readout data headers only, a factor five improvement with respect to Run 2) and 9 kHz in Pb–Pb collisions (about a factor three improvement with respect to Run 2).

**Figure 78:** HMPID full DAQ structure. Four C-RORC boards are installed on two First Level Processor (FLP) computers of the $O^2$ data acquisition environment. Fourteen optical links connect the 14 RCBs, two per RICH module.

The readout firmware runs on the ALTERA Stratix II EP2S15F484C5 FPGA housed on the Readout and Control Board (RCB). The firmware improvements listed below resulted in higher and more stable data acquisition rates. The improvements are:

- shortening of the event readout time by 90 µs due to the omission in Run 3 of the L2a trigger level with its long latency,
- skipping of empty readout columns (in pp collisions only one out of seven events has a track in the HMPID acceptance, which results in an occupancy of 0.17%),
- digitization and data transfer from the FEE cards to the column memory buffer in parallel with the decoding of the L1 trigger message in the TTCRx card; on average, this step completes 15.6 µs after the arrival of the LM trigger,
- masking of failing electronic columns, which is applied during the configuration of the FEE.

The final readout performance, as measured in the ALICE experiment with simulated occupancy, is shown in Fig. 79. With 0.17% occupancy (pp collisions), the measured acquisition rate is 22 kHz, whereas with 2% occupancy (in Pb–Pb collisions) it is 9.6 kHz. During the LHC pilot beam campaign in October 2021, the HMPID recorded events at 11.2 kHz with the zero bias trigger, consistent with the measured readout performance. In fact, the LHC filling scheme had two colliding bunches with a $\sim 20 \,\mu s$ separation. On average, only the first one was accepted in the HMPID, whereas the bunch structure expected during the normal operation of the LHC will allow the full exploitation of the readout performance.
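The 11.2 kHz recorded during the pilot beam can be cross-checked with a short calculation. The sketch below is illustrative only: it assumes a fixed busy time per event equal to the inverse of the 22 kHz acquisition rate quoted above, and it uses the LHC revolution frequency of about 11.245 kHz, which is not quoted in the text. The function name and the busy model are hypothetical simplifications.

```python
# LHC revolution frequency (machine parameter, not quoted in the text).
LHC_REVOLUTION_HZ = 11_245.0

# Values quoted in the text.
MAX_RATE_PP_HZ = 22_000.0    # measured acquisition rate at 0.17% occupancy
BUNCH_SEPARATION_S = 20e-6   # spacing of the two colliding bunches


def accepted_rate(n_bunches, separation_s, busy_s, f_rev=LHC_REVOLUTION_HZ, n_orbits=1000):
    """Accepted zero-bias rate for equally spaced colliding bunches in each orbit.

    Simplified model (assumption): a crossing is accepted only if the
    detector is not still busy with a previously accepted crossing.
    """
    orbit_s = 1.0 / f_rev
    accepted = 0
    busy_until = float("-inf")
    for orbit in range(n_orbits):
        for i in range(n_bunches):
            now = orbit * orbit_s + i * separation_s
            if now >= busy_until:
                accepted += 1
                busy_until = now + busy_s
    return accepted / (n_orbits * orbit_s)


busy_time_s = 1.0 / MAX_RATE_PP_HZ  # about 45 microseconds, longer than the bunch spacing
pilot_rate = accepted_rate(2, BUNCH_SEPARATION_S, busy_time_s)
print(f"accepted rate: {pilot_rate / 1e3:.1f} kHz")
```

In this model the second bunch always arrives while the detector is still busy, so only one crossing per orbit is accepted and the rate equals the revolution frequency, about 11.2 kHz.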

## 3.9.4 Detector calibration formalism

The detector-calibration method has changed with respect to Run 2. During a standalone run without zero suppression, a dedicated workflow at the EPN level computes the average values of the pedestal distributions and the corresponding standard deviation. These values are archived in the Calibration and Constants Data Base (CCDB, see Sec. 5). The DCS retrieves these data from the CCDB and uploads the pedestal values and the standard deviations in the readout electronics, via the ALFRED mechanism (see Sec. 5.6). The granularity of this calibration mechanism is at the level of a single readout column, with a single calibration file per column. Each C-RORC link configures 24 columns, which correspond to the right or left half of a RICH module.
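The pedestal computation described above can be sketched as follows. This is a minimal illustration, not the actual EPN workflow: the function names and the nested-dictionary input layout are assumptions, and only the link and column counts are taken from the text.

```python
from statistics import mean, pstdev

LINKS = 14             # optical links, two per RICH module (from the text)
COLUMNS_PER_LINK = 24  # readout columns configured by each link (from the text)


def pedestal_table(samples_by_channel):
    """Map each channel to (pedestal mean, standard deviation)."""
    return {ch: (mean(adc), pstdev(adc)) for ch, adc in samples_by_channel.items()}


def calibration_objects(samples_by_column):
    """Build one calibration object per readout column (per-column granularity)."""
    return {col: pedestal_table(channels) for col, channels in samples_by_column.items()}


# Toy input (hypothetical): column id -> channel id -> ADC samples
# recorded in a standalone run without zero suppression.
toy_run = {0: {0: [10, 12, 14], 1: [20, 20, 20]}}
objects = calibration_objects(toy_run)

# Number of per-column calibration objects for the full detector.
total_columns = LINKS * COLUMNS_PER_LINK
print(objects, total_columns)
```

With 14 links and 24 columns per link, the full detector would need 336 such per-column objects.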

## 3.9.5 Other subsystems

**Detector Control System** The DCS for the HMPID was upgraded, providing the following new features:

- uploading of the pedestal and sigma values to the readout electronics via the new ALFRED formalism;
- monitoring of the busy time of a single RO link;
- automatic ramp-up of tripped HV channels.

**Figure 79:** Event rate as a function of the detector occupancy, which is on average about 0.17% and 2% in pp and Pb–Pb collisions, respectively.

**Absorbers for anti-deuteron inelastic cross section measurement** Another important achievement during LS2 is the installation of two aluminium absorbers of 8 cm thickness, corresponding to half an interaction length for the anti-deuteron inelastic cross section measurement in the momentum range of 0.2 to 2.2 GeV/c. The (anti-)deuterons impinging on the two absorbers will be identified using the dE/dx measured by the TPC and the time-of-flight measured by the TOF detector. The detection of secondary particles produced in the hadronic interaction with the target nuclei will be carried out by the pad-segmented cathodes of the HMPID-MWPCs installed right behind the absorbers.

During Run 3, the statistical precision of this measurement is expected to be in the range 5–10% in the momentum interval starting at 0.2 GeV/*c* for p–Pb collisions at $\sqrt{s_{\text{NN}}} = 8.8$ TeV and 5–8% in the momentum interval starting at 0.2 GeV/*c* for Pb–Pb collisions at $\sqrt{s_{\text{NN}}} = 5.5$ TeV (see Fig. 80). A systematic uncertainty of at most 5.5% is expected based on a conservative estimate.

In ALICE an effort is ongoing to measure the (anti-)deuteron inelastic cross section, also at low momentum, using the TRD as an absorber. This detector has an average mass number  $\langle A \rangle \approx 8$  considering the gas mixture, the active detector materials, and the support structure. The measurement using aluminum (A = 27) as a target will provide complementary information to other existing approaches and these measurements will allow the study of the mass-number dependence of the (anti-)deuteron inelastic cross section at low momentum, which is currently unknown.



**Figure 80:** Expected relative statistical uncertainty on the anti-deuteron absorption cross section in p–Pb and Pb–Pb collisions for the full Run 3 data sample and for one half of the Pb–Pb sample.

### 3.10 Electromagnetic Calorimeter

The Electromagnetic Calorimeter (EMCal) [68, 69] was designed for the measurements of electrons from heavy-flavor hadron decays, of the electromagnetic component of jets, and of direct photons and neutral mesons. The calibration procedures and achieved performance during Runs 1 and 2 are described in detail in [70].

The calorimeter remains a trigger detector during Run 3 and will continue to provide L0, L1-$\gamma$ and L1-jet triggers (described in Sec. 5.5). It is also possible to read out the calorimeter with any other trigger provided by the CTP (e.g. FIT minimum bias trigger). The hardware did not need any modifications to comply with the Run 3 requirements. However, in order to be able to operate during Run 3, the following actions were taken: spare hardware was produced in order to ensure operation during Runs 3 and 4, and the front-end electronics firmware was upgraded in order to satisfy the specifications for Run 3. Details will be given below.

The EMCal is a shashlik-type lead-scintillator sampling calorimeter comprising 4416 individual modules that are grouped into 20 Super Modules (SM). Each of the modules is composed of four optically isolated towers, resulting in 17 664 individual towers in total. The optical readout of each tower is provided using wavelength shifting fibers coupled to an Avalanche Photo Diode (APD).

The front face dimensions of the towers are  $6 \times 6 \text{ cm}^2$  resulting in individual tower acceptance of  $\Delta \eta \times \Delta \varphi \simeq 0.0143 \times 0.0143$  at  $\eta = 0$ . The towers are arranged within the SMs such that each tower is approximately projective to the interaction vertex in  $\eta$  and  $\varphi$ . The towers are operated at  $\sim 25^{\circ}$ C ambient temperature with a nominal APD gain of  $\simeq 30$ , to achieve a 14-bit effective dynamic energy range from  $\sim 16 \text{ MeV}$  to  $\sim 250 \text{ GeV}$  per tower.

The overall design of the calorimeter is heavily influenced by its integration within the ALICE setup [1]. SMs of 3 different sizes are used: full-size, 2/3-size and 1/3-size. Each full-size SM consists of $12 \times 24 = 288$ modules arranged in 24 strip modules of $12 \times 1$ modules each. The 1/3 and 2/3 size SMs consist of $4 \times 24 = 96$ and $12 \times 16 = 192$ modules, respectively.

The detector consists of two parts that cover two different regions in azimuth, as illustrated in Fig. 81 (see also Fig. 1). The main segment of the EMCal consists of ten full-size SMs and two 1/3-size SMs covering $|\eta| < 0.7$ in pseudorapidity and $80^{\circ} < \varphi < 187^{\circ}$ in azimuth, while six 2/3-size SMs and two 1/3-size SMs are installed around the PHOS detector, covering $0.22 < |\eta| < 0.7, 260^{\circ} < \varphi < 320^{\circ}$ and $|\eta| < 0.7, 320^{\circ} < \varphi < 327^{\circ}$. The latter part of the detector is sometimes referred to as Di-Jet Calorimeter (DCal). In the following, we will use the term EMCal to refer to the full system, and DCal only when the distinction between the two segments is useful.
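The module and tower counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers stated in the text; the dictionary names are illustrative.

```python
TOWERS_PER_MODULE = 4

# Modules per super module (SM) type, as quoted in the text.
MODULES_PER_SM = {"full": 12 * 24, "two_thirds": 12 * 16, "one_third": 4 * 24}

# Number of SMs of each type: main EMCal segment plus DCal segment.
SM_COUNT = {"full": 10, "two_thirds": 6, "one_third": 2 + 2}

total_sms = sum(SM_COUNT.values())
total_modules = sum(MODULES_PER_SM[kind] * SM_COUNT[kind] for kind in SM_COUNT)
total_towers = total_modules * TOWERS_PER_MODULE

print(total_sms, total_modules, total_towers)
```

The totals reproduce the 20 SMs, 4416 modules and 17 664 towers given in the text.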

The SMs are located at  $R \simeq 4.3$  m in radial distance from the beamline, inserted into support frames situated between the time-of-flight detector and the ALICE L3 magnet. The weight of a single full-size SM is  $\simeq 7.7$  tons, and the total weight of all 20 SMs is  $\simeq 120$  tons. More details regarding the mechanical structure and Front-End Electronics (FEE) can be found in [68, 69].

An individual EMCal tower is read out with an avalanche photodiode and preamplifier mounted on the tower. The preamplifier signal is split into energy and trigger shaper channels on the FEE boards [71]. The energy shaper signals are sampled at 10 MHz with 10-bit resolution using ALTRO chips [72]. Prior to digitisation, each energy signal is split into high and low gain channels, each shaped separately, with a gain ratio of 16 to provide an effective dynamic range of 14 bit. Each FEE board provides readout of the high and low gain channels from 32 towers.
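The 14-bit effective dynamic range follows from combining the 10-bit ADC with the gain ratio of 16 between the two shaper channels. A minimal sketch of this arithmetic, using the approximate 250 GeV upper limit quoted earlier (variable names are illustrative):

```python
import math

ADC_BITS = 10        # ALTRO resolution (from the text)
GAIN_RATIO = 16      # high-gain to low-gain ratio (from the text)
E_MAX_GEV = 250.0    # approximate upper end of the dynamic range (from the text)

# A gain ratio of 16 extends the range by log2(16) = 4 bits.
effective_bits = ADC_BITS + math.log2(GAIN_RATIO)

# Energy equivalent of one high-gain ADC count if the full range is used.
lsb_mev = E_MAX_GEV * 1000.0 / 2 ** effective_bits

print(f"{effective_bits:.0f} bits, least significant bit ~ {lsb_mev:.1f} MeV")
```

The resulting step of roughly 15 MeV is consistent with the lower limit of about 16 MeV quoted for the tower energy range.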

**Figure 81:** Schematic view of the EMCal, consisting of two disjunct detector segments, in the top-left hemisphere and the bottom-right hemisphere (DCal), covering approximately opposite locations in azimuth. The PHOS calorimeter inside the DCal segment is indicated in brown.

The ALTRO chips are configured to record 15 10-bit time (pre-)samples per readout channel per event to cover the $1.5 \,\mu$s integration window. The data are compressed by discarding samples close to the reference level (pedestal) that contain no useful information ("zero suppression"), reducing substantially the data volume. The pedestals are obtained from special runs with no pre-programmed pedestal or signal present.
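The idea behind zero suppression can be illustrated with a simplified sketch. It is not the ALTRO implementation, which has additional features that are not described here; the function name, the threshold value and the toy waveform are assumptions.

```python
def zero_suppress(samples, pedestal, threshold):
    """Keep (sample index, ADC value) pairs that exceed the pedestal by more than threshold."""
    return [(i, adc) for i, adc in enumerate(samples) if adc - pedestal > threshold]


# Toy waveform (hypothetical): a pulse on top of a pedestal of 50 ADC counts.
waveform = [50, 51, 49, 80, 120, 90, 52]
kept = zero_suppress(waveform, pedestal=50, threshold=3)
print(kept)
```

Only the three samples belonging to the pulse survive, so the stored data volume scales with the signal content rather than with the number of recorded samples.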

# 3.10.1 The readout system

Each SM is equipped with a readout concentrator, the so-called Scalable Readout Unit (SRU) [73]. The SRU interconnects with each FEE board through a custom daughter card which was designed for the EMCal FEE board. It provides interface compatibility between the SRU and the EMCal FEE board to provide the Data, Trigger, Clock and Control (DTC) links. The maximum bandwidth of a DTC link on the SRU is 2 Gb/s. In the EMCal application, the bandwidth of the DTC link is conservatively limited to 20 MB/s due to the hardware capability of the rather outdated FEE board FPGA (Altera ACEX 1K Family EP1k100QC208-3). However, the DTC link does not limit the EMCal data throughput.

Each SRU has a total of 40 point-to-point links to connect to 37 FEE and three TRU boards for the full size EMCal SMs, and sends the data to an FLP through two Detector Data Links (DDL), see Sec. 5.2. The SRU board integrates a TTCrx (LHC Trigger, Timing, and Control receiver) [74], which can receive trigger and timing information from the ALICE Trigger system. It also has three SFP+ ports directly connected to the FPGA's high speed serial transceivers for serial data transport at up to 5 Gb/s and an additional SFP+ port that provides a 10 Gb Ethernet link. For the EMCal application, one of these transceivers is used for the Ethernet connection to the ALICE detector control system, while the other two transceivers are used for the two DDL links to transmit the data to an FLP.

# 3.10.2 Trigger

The EMCal provides inputs to the L0 and L1 trigger decisions in ALICE (Sec. 5.5). The trigger subsystem resides in specific hardware boards. The analog signals of  $2 \times 2$  adjacent towers are summed in the FEE boards and transmitted to a Trigger Region Unit (TRU) board, where the  $2 \times 2$  tower sums from twelve FEE cards ( $2 \times 2$  sums from 96 channels) are digitized at the LHC clock frequency of 40 MHz [75]. The digitized  $2 \times 2$  tower sums are summed over time samples with pre-sample pedestal subtraction to provide an integral energy measurement, referred to as time sum. Finally, overlapping  $4 \times 4$  tower digital sums are formed within each TRU and a peak finding algorithm is used to find a signal peak. Each  $4 \times 4$  sum signal peak amplitude is then compared against a threshold to provide a L0 trigger output that indicates the presence of a high energy shower in the TRU region (1 TRU covers 1/3 of the area for a full-size SM). The L0 trigger decision from each TRU is passed to a Summary Trigger Unit (STU), which performs the logical OR of the L0 outputs from all TRUs to provide a single L0 input to the ALICE CTP.
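The L0 patch logic described above can be sketched as a sliding-window sum over the grid of $2 \times 2$ tower sums. This is an illustration under simplifying assumptions: the peak finding in time is omitted, the grid dimensions and threshold are toy values, and the function names are hypothetical.

```python
def tru_l0(fastor, threshold):
    """Return True if any overlapping 2x2 window of trigger channels exceeds threshold.

    `fastor` is a 2D list of time-summed 2x2 tower sums, so each window
    of 2x2 trigger channels corresponds to 4x4 towers.
    """
    rows, cols = len(fastor), len(fastor[0])
    for r in range(rows - 1):
        for c in range(cols - 1):
            patch = fastor[r][c] + fastor[r][c + 1] + fastor[r + 1][c] + fastor[r + 1][c + 1]
            if patch > threshold:
                return True
    return False


def stu_l0(tru_decisions):
    """Logical OR of the L0 outputs of all TRUs."""
    return any(tru_decisions)


# Toy TRU regions (hypothetical amplitudes in ADC counts).
quiet = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
shower = [[1, 1, 1], [1, 40, 30], [1, 20, 10]]
l0 = stu_l0([tru_l0(quiet, 50), tru_l0(shower, 50)])
print(l0)
```

Because the windows overlap, a shower shared between neighbouring trigger channels is still contained in at least one patch.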

Upon reception of an accepted L0 trigger from the CTP, the digitized time-summed $2 \times 2$ tower sums from each TRU are passed to the STU. In the STU the $4 \times 4$ overlapping tower sums are formed again, but across TRU boundaries over the full acceptance to provide an improved L1 high energy shower trigger referred to as L1-$\gamma$ trigger [76]. At the same time, tower sums over a large $8 \times 8$ trigger channel window ($16 \times 16$ towers) and a $16 \times 16$ trigger channel window ($32 \times 32$ towers) are also formed to provide a L1 jet trigger. Both L1 triggers allow two thresholds to be defined for the event selection.

In order to reduce the bias due to multiplicity fluctuations in heavy-ion collisions, there is a direct communication between the STUs of the main EMCal segment and the DCal to consider the underlying event background in the online L1 trigger decision. The background is estimated based on the median of the energies deposited in  $8 \times 8$  trigger channel ( $16 \times 16$  towers) windows in the opposing segment of the detector. The background is subtracted from the signal amplitude and then compared against a threshold to provide L1 triggers.
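The median-based background subtraction can be sketched as follows. The sketch assumes non-overlapping windows for the background estimate and sliding windows for the signal; the text does not specify this, and the function names, the small window size used in the example and the amplitudes are hypothetical.

```python
from statistics import median


def window_sums(grid, size, step):
    """Sums over size x size windows of trigger channels, moved in increments of step."""
    rows, cols = len(grid), len(grid[0])
    sums = []
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            sums.append(sum(grid[r + i][c + j] for i in range(size) for j in range(size)))
    return sums


def l1_jet_fired(signal_grid, opposite_grid, threshold, size=8):
    """Compare background-subtracted window sums against a threshold.

    The background is the median window sum in the opposite detector segment.
    """
    background = median(window_sums(opposite_grid, size, step=size))
    return any(s - background > threshold for s in window_sums(signal_grid, size, step=1))


# Toy example with 2x2 windows instead of 8x8, to keep the grids small.
underlying_event = [[1] * 4 for _ in range(4)]
with_jet = [row[:] for row in underlying_event]
with_jet[1][1] = 11
fired = l1_jet_fired(with_jet, underlying_event, threshold=5, size=2)
print(fired)
```

Using the median rather than the mean keeps the background estimate insensitive to a jet that happens to fall into the opposite segment.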

# 3.10.3 Spare production

In order to guarantee a smooth operation of the detector through Run 3, additional FEE boards were produced. They are identical to the ones used during Runs 1 and 2. A total of 100 front-end cards and six TRUs, amounting to 15% of the units used in the experimental cavern, were produced. In addition, 2 STUs were produced as spares.

# 3.10.4 Front-end electronics firmware upgrade

The firmware of the SRU and of the STU had to be upgraded in order to satisfy the requirements concerning the readout rate for Run 3. In particular, in order to increase the readout rate for the anticipated 50 kHz minimum bias Pb–Pb interaction rate for Run 3, a multi-event buffering (MEB) logic was implemented in the SRU firmware, allowing a trigger to be accepted while the data from the previous trigger are being processed. The importance of multi-event buffering for the data recording rate as a function of interaction rate is shown in Fig. 82. The left-hand plot shows the predictions expected from Monte Carlo simulations, and the results from measurements with black events are shown on the right-hand plot. To estimate the SRU readout rate, which depends on detector occupancy, some Pb–Pb data from Run 2 were used. Pedestal data are used to create the load expected for minimum bias Pb–Pb collisions. For early readout rate estimates [60], the detector occupancy was emulated by masking ALTRO channels in the FEE configuration (open markers on the left-hand side plot). This was improved later by applying a relatively high value for the baseline and by suppressing the data in the ALTRO channels (full markers). This procedure yields results that are closer to real data. The readout rate decreased by $\sim 10\%$ compared to preliminary expectations for both single- and multi-event buffering configurations. With four event buffers, a readout rate of $\sim 35$ kHz is expected for minimum bias Pb–Pb collisions at 50 kHz, and the results from Monte Carlo simulations are in good agreement with the results from measurements at Point 2.
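The benefit of multi-event buffering can be illustrated with a toy queue simulation. This is not the Monte Carlo used for Fig. 82: it assumes Poisson-distributed triggers, a fixed readout time per event of 25 µs chosen purely for illustration, and first-in-first-out processing. All names and parameter values other than the 50 kHz interaction rate are hypothetical.

```python
import random
from collections import deque


def accepted_rate(interaction_rate_hz, readout_time_s, n_buffers, n_events=100_000, seed=1):
    """Toy estimate of the accepted trigger rate with n_buffers event buffers."""
    rng = random.Random(seed)
    now = 0.0
    finish_times = deque()  # completion times of events still occupying a buffer
    accepted = 0
    for _ in range(n_events):
        now += rng.expovariate(interaction_rate_hz)  # next trigger arrival
        while finish_times and finish_times[0] <= now:
            finish_times.popleft()                   # free buffers that were read out
        if len(finish_times) < n_buffers:
            # Readout starts when the previous event is done, or immediately if idle.
            start = finish_times[-1] if finish_times else now
            finish_times.append(start + readout_time_s)
            accepted += 1
    return accepted / now


ASSUMED_READOUT_TIME_S = 25e-6  # hypothetical value, not taken from the text
rate_single = accepted_rate(50_000.0, ASSUMED_READOUT_TIME_S, n_buffers=1)
rate_meb4 = accepted_rate(50_000.0, ASSUMED_READOUT_TIME_S, n_buffers=4)
print(f"1 buffer: {rate_single / 1e3:.1f} kHz, 4 buffers: {rate_meb4 / 1e3:.1f} kHz")
```

With a single buffer every trigger arriving during readout is lost, whereas additional buffers absorb fluctuations in the arrival times, so the accepted rate approaches the limit set by the readout time itself.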

In addition, further improvements were implemented in the SRU firmware for increasing the readout stability and the physics performance. In particular, a synchronization between the LHC 40 MHz clock and the ALTRO 10 MHz clock was implemented in order to perform online time calibration during Run 3.

**Figure 82:** SRU readout rate as a function of interaction rate for different Multi-Event Buffering (MEB) schemes. The left panel shows simulated data, while the right panel shows measured performance.

The firmware of the STU was upgraded to conform to the Run 3 trigger and DAQ protocols. The STU readout time depends strongly on the data size to be sent from STU to DAQ. It is $\sim 50 \,\mu$s when sending the data from all trigger channels needed for the full QC. However, for physics analysis only the channels contributing to the trigger decision are selected by the STU FPGA, resulting in a readout time of $\sim 20 \,\mu$s. During normal operation for physics data taking, the data from all channels will be recorded for only $\lesssim 1\%$ of events, and the average readout time is expected to be close to $20 \,\mu$s.
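The statement about the average readout time is a weighted mean of the two readout modes. A minimal sketch, taking the approximate values from the text and using an illustrative function name:

```python
def mean_readout_time_us(full_fraction, t_full_us=50.0, t_selected_us=20.0):
    """Average STU readout time for a given fraction of events read out in full."""
    return full_fraction * t_full_us + (1.0 - full_fraction) * t_selected_us


average_us = mean_readout_time_us(0.01)
print(f"{average_us:.1f} us")
```

With 1% of the events read out in full, the average exceeds the selective-readout time by only about 0.3 µs.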

#### 3.10.5 Data compression

In order to reduce the amount of data written to tape, a fit of the time series of raw ADC signals is performed per tower, extracting amplitude and peak time. This fit is performed during the synchronous reconstruction. Amplitude and time, as well as tower index and gain type are encoded in a 48-bit word per tower. Further compression can be achieved by removing low tower energy signals which are rejected in the clusterization process and will therefore not contribute to physics measurements.
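Packing several quantities into one fixed-width word can be sketched as below. The field widths and their order are hypothetical, chosen only so that they add up to 48 bits; the text does not give the actual layout, and the stored values are assumed to be already quantized to non-negative integers.

```python
# Hypothetical layout: (field name, width in bits), least significant field first.
FIELDS = (("tower", 15), ("time", 11), ("amplitude", 20), ("gain", 2))
WORD_BITS = sum(width for _, width in FIELDS)


def pack_cell(**values):
    """Pack the integer field values into a single word."""
    word, shift = 0, 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack_cell(word):
    """Recover the field values from a packed word."""
    out, shift = {}, 0
    for name, width in FIELDS:
        out[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return out


cell = {"tower": 17663, "time": 600, "amplitude": 123456, "gain": 1}
word = pack_cell(**cell)
print(WORD_BITS, unpack_cell(word) == cell)
```

A 15-bit tower index is sufficient in this sketch because the 17 664 towers fit below $2^{15} = 32768$.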

#### 3.10.6 Calibration

The calibration procedure is based on existing calibration procedures used during Run 2. At the beginning of the data taking process of Run 3, a sample of events will be used to determine the absolute energy scale for each tower, based on the comparison of the reconstructed $\pi^0$ mass to the nominal mass. Identification of bad channels, which need to be removed in the analysis process, and calibration of the time measurement with respect to the collision time are based on the event-by-event tower energy and time measurement and are performed for all data blocks. Calibration histograms are filled and calibration parameters are determined in an automated procedure during the synchronous and asynchronous reconstruction using the O<sup>2</sup> calibration software framework. The remaining time-dependence of the energy calibration, resulting mainly from the sensitivity to the temperature, is calibrated using the light-emitting diode (LED) system, by generating an ultra-bright blue light triggered by the CTP [68].

#### 3.10.7 Quality Control

Monitoring of the data quality is based on the Quality Control framework within the ALICE  $O^2$  computing framework. The EMCal Quality Control is designed to provide sufficient information in order to identify problems during the data taking process, and to decide on the usability of data blocks for physics measurements. Quality Control for EMCal consists of the following tasks:

- Raw Data level: A fraction of the raw data is inspected in order to find problematic parts of the detector, and to check for errors in the raw stream.
- Digit level: Tower-based quantities (energy, position, time) after the fit to the raw data are monitored in order to find inactive or noisy regions of the detector.
- Cluster level: A calorimeter cluster, an aggregate of adjacent calorimeter cells with energy above the noise threshold, is the main object delivered by the reconstruction software. Basic EMCal cluster observables, such as energy, position, number of contributing towers, and others, are monitored at the level of the synchronous and asynchronous reconstruction stages for a fraction of the data.
- Trigger level: Trigger-level digits are monitored for a small fraction of the data in order to identify noisy regions in the trigger system.

The Quality Control is histogram-based. The histograms are further processed to produce derived observables which are monitored continuously to follow the time evolution of the detector state.

### 3.11 Photon Spectrometer

The Photon Spectrometer (PHOS) [77] is a precise electromagnetic calorimeter which specialises in the detection of photons with high energy and spatial resolutions. PHOS covers a limited acceptance at mid rapidity |y| < 0.13 and azimuthal angle  $250^{\circ} < \varphi < 320^{\circ}$ . The layout of the PHOS detector surrounded by the DCal is depicted in Fig. 81. The physics objectives of the PHOS detector are the measurement of direct photon yields in the energy range from  $\approx 0.1$  to 100 GeV, azimuthal anisotropy of photon emission, photon-hadron correlations, as well as measurements of light neutral mesons  $\pi^0$ ,  $\eta$ ,  $\omega$  with transverse momenta  $p_{\rm T}$  above  $\approx 0.6$  GeV/c, with the upper  $p_{\rm T}$  limit being driven mainly by available statistics.

### 3.11.1 Detector layout

PHOS consists of 4 modules assembled from the active detection elements consisting of lead tungstate (PbWO<sub>4</sub>) crystals with avalanche photodiode (APD) photodetectors and preamplifiers. These detection elements, called cells, compose rectangular matrices which are called modules. Three PHOS modules covering the azimuthal angle range $260^{\circ} < \phi < 320^{\circ}$ consist of $64 \times 56$ cells, and one module at the angles $250^{\circ} < \phi < 260^{\circ}$ is a matrix of $32 \times 56$ cells. Figure 83 depicts one cell and cells stacked into a module matrix. The front surface of each crystal is positioned at a distance of 460 cm from the beam axis.

**Figure 83:** PHOS detection element consisting of a PbWO<sub>4</sub> crystal and APD with preamplifier (left) and a fraction of a cell matrix of one PHOS module (right).

The active detection elements of the cells are made of lead tungstate, PbWO<sub>4</sub>, an inorganic crystalline scintillator which has a high density  $\rho = 8.3$  g/cm<sup>3</sup>, a radiation length  $X_0 = 0.89$  cm, and Molière radius  $R_M = 2.0$  cm. The light yield of the crystal is about 0.3% of the light yield of NaI. The luminescence of PbWO<sub>4</sub> has a wide spread in the region of visible photons with a maximum at  $\lambda_{max} = 420$  nm. The light yield of the crystals changes by -2.5% for every Kelvin temperature change around the operating temperature.

The PHOS modules are operated at a temperature of  $-25^{\circ}$ C to achieve an increase of the light yield by a factor of three compared to room temperature. In order to ensure these working conditions and to provide thermal stabilization of 0.1%, the crystal matrices of the PHOS modules are housed in a thermoinsulated volume cooled down by C<sub>6</sub>F<sub>14</sub> for which the flow is provided by the cooling plant installed outside the ALICE solenoid magnet at about 10 m from the PHOS modules. The PHOS modules also contain a so-called "warm volume" with front-end electronics. The warm volume and the cold crystal volume are contained in air-tight boxes, through which dry nitrogen is blown in order to maintain a low humidity. The environment inside the modules is monitored by a set of temperature and humidity sensors.

## 3.11.2 Readout

For LHC Run 3, like in the previous Runs 1 and 2, PHOS remains a triggered detector, i.e. acquiring data upon receiving a trigger from the ALICE trigger system. The new ALICE trigger protocol provides 2-level triggers with the level-0 trigger generated 0.8 µs after the collision, followed by the level-1 trigger with a latency of 6.5 µs. The PHOS readout system receives the L0 trigger generated by one of the ALICE trigger detectors (FIT, EMCal, TOF, PHOS, etc.). The choice of the L0 trigger source is configured by the central trigger processor (CTP). After the L0 trigger is received, the data are processed and then shipped to the FLP upon receiving the L1 trigger. The busy signal is lowered after sending the whole data payload to the FLP. The dead time depends on the payload size and varies from 20 to 55 µs. Shipping the L1 trigger is performed by the CTP, and the absence of the L1 trigger within the time window corresponding to the L0–L1 latency is considered as a rejection of the triggered event.

Energy deposited in each PHOS cell by high-energy particles is detected by an APD Hamamatsu S8664-55 with a sensitive area of $5 \times 5 \text{ mm}^2$. The APD gain is adjusted to the nominal value of 50 by setting the bias voltage with an accuracy of $10^{-3}$ in the range from 210 to 400 V. The APD signals are passed to charge-sensitive preamplifiers with an output signal proportional to the APD charge conversion. Dual-gain shapers with an average gain ratio of 16.7 generate semi-Gaussian signals with a rise time of 2.1 µs, which are further digitized by a 10-bit sampling ADC (ALTRO [72]) at a sampling rate of 10 MHz. The dynamic range of photons detected in PHOS spans from 5 MeV to 5 GeV in the high-gain channels and from 80 MeV to 80 GeV in the low-gain channels. The number of samples is configurable via ALTRO registers and is chosen to be 37 in order to cover the rising edge of the signal and its maximum. The sampled digitized waveform of a signal in one channel is shown in Fig. 84. One front-end board processes 64 signals generated by high-gain channels and low-gain channels from 32 PHOS cells. Data collection from ALTRO, FEE board configuration and data shipping to the readout units is provided by the Altera ACEX 1K Family EP1k100 FPGA. The design of the PHOS front-end board is described in [78].



**Figure 84:** Digitized signal waveform from one channel of the PHOS FEE board. Sampling time is 100 ns, digitization of sampled amplitude is 10 bits.

The PHOS generates triggers at the L0 and L1 levels to select events with high-energy deposition in the PHOS cells. The input to the trigger decision starts from the analog sum of the amplitudes of the group of $2 \times 2$ cells implemented in the FEE boards as "fast-OR" signals. Each FEE board produces 8 such "fast-OR" signals. The L0 trigger is produced by the Trigger Region Units (TRU) [75] covering an area of $16 \times 28$ cells. The TRU measures the energy deposits in a sliding window of $2 \times 2$ "fast-OR" channels, or $4 \times 4$ cells. If the energy in at least one window exceeds the configurable threshold, the TRU generates the L0 trigger. The whole detector is inspected by 28 TRUs. All 28 L0 triggers generated by the TRUs are collected by the Summary Trigger Unit [79] (STU), which performs a logical OR operation of the inputs and generates the common L0 trigger if at least one TRU generated a trigger. The TRU boards deployed by PHOS are similar to those designed for the EMCal detector, with the only difference being the number of channels: PHOS TRU has 112 channels, whereas the EMCal one has 96 channels. While the STU boards are electronically identical for PHOS and EMCal, different firmware is used in the two cases.

FEE boards are read out by a point-to-point protocol via the designated DTC links (described earlier in the EMCal section) to the readout concentrator, the Scalable Readout Unit (SRU) [73]. One SRU can serve up to 40 front-end boards connected to its 40 DTC ports. The PHOS readout topology is closely related to the geometry of the PHOS modules, using 28 FEE boards and 2 TRUs per SRU. The whole PHOS detector is read out by 14 SRUs. Triggering and synchronization of the SRU is provided by the TTC signal distributed by the ALICE central trigger. The TTC clock and trigger are propagated by the SRU to each FEE board or TRU via the DTC links. The SRU raises the busy signal upon receiving the L0 trigger via TTC and releases the busy signal when all FEE boards and TRUs are read out. Data collected by the SRU are shipped to the FLP via the DDL link with a bandwidth of 2.125 Gb/s. All electronic modules (FEE, TRU, SRU) remain unchanged compared to their hardware state during Run 2. However, the SRU firmware was adapted to comply with the new trigger protocol. The change from the 3-level trigger sequence of Run 2 to the L0–L1 sequence results in a significant reduction of the busy time, allowing an increase of the readout rate.
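The readout and trigger topology numbers quoted for PHOS are mutually consistent, as the following cross-check shows. It uses only values stated in the text; the variable names are illustrative.

```python
# Cell matrices: three modules of 64 x 56 cells and one of 32 x 56 cells.
total_cells = 3 * 64 * 56 + 32 * 56

CELLS_PER_FEE = 32           # each FEE board serves 32 cells
CELLS_PER_FAST_OR = 2 * 2    # analog sum of 2 x 2 cells
CELLS_PER_TRU = 16 * 28      # area covered by one TRU
FEE_PER_SRU = 28
TRU_PER_SRU = 2
DTC_PORTS_PER_SRU = 40

fee_boards = total_cells // CELLS_PER_FEE
fast_or_per_fee = CELLS_PER_FEE // CELLS_PER_FAST_OR
trus = total_cells // CELLS_PER_TRU
tru_channels = CELLS_PER_TRU // CELLS_PER_FAST_OR
srus = fee_boards // FEE_PER_SRU
ports_used = FEE_PER_SRU + TRU_PER_SRU

print(total_cells, fee_boards, fast_or_per_fee, trus, tru_channels, srus, ports_used)
```

The result reproduces the 8 fast-OR signals per FEE board, the 112 channels per TRU, the 28 TRUs and the 14 SRUs, with 30 of the 40 DTC ports in use on each SRU.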

## 3.11.3 Performance

Since the PHOS active detection elements, photodetectors and front-end electronics in Run 3 remain the same as they were during Run 2, the physics performance of the detector achieved during Run 2 remains valid for the upcoming Run 3.

The high light yield, short radiation length and small Molière radius of lead tungstate enable high energy and spatial resolutions for the PHOS. The energy resolution measured in a beam test is parameterized by the equation [80]

$$\frac{\sigma_E}{E} = \sqrt{\left(\frac{a}{E}\right)^2 + \frac{b^2}{E} + c^2} \tag{1}$$

with a = 0.013 GeV, b = 0.036 GeV<sup>1/2</sup>, c = 0.011 and the photon energy *E* expressed in GeV. Spatial resolution was evaluated in Monte Carlo simulations and indirectly confirmed by the mass resolution of $\pi^0$ mesons in data collected by PHOS during LHC Runs 1 and 2. The value of the noise parameter *a* of the energy resolution (1) is determined by the APD intrinsic noise and design of the APD preamplifiers and FEE boards. The stochastic term *b* is driven by the light yield of the PbWO<sub>4</sub> scintillator at the nominal working temperature and by the light collection efficiency defined by the surface area of the APD. Since the PHOS electronics has not changed since Runs 1 and 2, the energy resolution parametrized by (1) remains valid for Run 3.

The spatial resolution is parametrised as [81]

$$\sigma_x = \sqrt{A^2 + \frac{B^2}{E}},\tag{2}$$

where parameters A and B depend on the photon incident angle; averaging over all angles yields A = 0.96 mm,  $B = 2.29 \text{ mm} \cdot \text{GeV}^{1/2}$ . One of the key performance parameters of PHOS is the two-photon separation distance, i.e. the distance at which individual photons striking the PHOS surface can be identified. As discussed in [81], PHOS can distinguish photon showers split by at least one PHOS cell,  $\delta_r = 2.2 \text{ cm}$ . This feature is especially important in the high-multiplicity environment of heavy-ion collisions, as well as for resolving single photons from  $\pi^0$  decay at high  $p_{\text{T}}$ .
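Equations (1) and (2) can be evaluated numerically to get a feeling for the resolutions at a given photon energy. The sketch below simply codes the two parametrisations with the parameter values quoted in the text; the function names are illustrative.

```python
import math


def energy_resolution(e_gev, a=0.013, b=0.036, c=0.011):
    """Relative energy resolution sigma_E / E from Eq. (1), with E in GeV."""
    return math.sqrt((a / e_gev) ** 2 + b ** 2 / e_gev + c ** 2)


def spatial_resolution_mm(e_gev, big_a=0.96, big_b=2.29):
    """Spatial resolution sigma_x in mm from Eq. (2), angle-averaged parameters."""
    return math.sqrt(big_a ** 2 + big_b ** 2 / e_gev)


for energy in (1.0, 10.0):
    print(
        f"E = {energy:4.1f} GeV: "
        f"sigma_E/E = {100 * energy_resolution(energy):.2f}%, "
        f"sigma_x = {spatial_resolution_mm(energy):.2f} mm"
    )
```

At 1 GeV this gives a relative energy resolution of about 4% and a spatial resolution of about 2.5 mm; both improve with energy until the constant terms dominate.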

The mass resolution for $\pi^0$ and $\eta$ mesons in pp collisions at $\sqrt{s} = 13$ TeV in Run 2 is discussed in [82]. The mass resolution is affected by large incident angles of photons at low transverse momenta and by the splitting procedure of overlapping showers at high $p_T$. Photon showers from $\pi^0$ decay start to overlap in PHOS at $p_T > 25$ GeV/*c*, and shower splitting efficiency is reduced with the growth of the $\pi^0$'s $p_T$. At $p_T \approx 50$ GeV/*c* the most probable distance between decay photons from the $\pi^0$ becomes one PHOS cell, and the photons can no longer be resolved. At such high $p_T$, the reconstructed $\pi^0$ mass is distorted by overlapping showers, and the mass resolution becomes rather large. The widths of the $\pi^0$ and $\eta$ mesons measured with PHOS in Run 3 will remain the same, $\sigma_{\pi^0} = 4.5$ MeV/$c^2$ and $\sigma_{\eta} = 15$ MeV/$c^2$.
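The $p_T$ scale at which the decay photons merge can be estimated from two-body decay kinematics. The sketch below assumes a symmetric decay at midrapidity, for which the opening angle is minimal and satisfies $\sin(\theta/2) = m_{\pi^0}/E$, and converts the angle into a distance on the PHOS surface with a small-angle approximation. The $\pi^0$ mass value and the function name are not from the text; the 460 cm distance and the 2.2 cm cell size are.

```python
import math

PI0_MASS_GEV = 0.135          # neutral pion mass (standard value, not quoted in the text)
PHOS_DISTANCE_CM = 460.0      # distance of the crystal front face from the beam axis
CELL_SIZE_CM = 2.2            # two-photon separation distance quoted in the text


def min_photon_separation_cm(pt_gev):
    """Minimum distance between the two decay photons at the PHOS surface.

    Assumes a pi0 at midrapidity, so that E = sqrt(pT^2 + m^2), and a
    symmetric decay, for which the opening angle is smallest.
    """
    energy = math.hypot(pt_gev, PI0_MASS_GEV)
    theta_min = 2.0 * math.asin(PI0_MASS_GEV / energy)
    return PHOS_DISTANCE_CM * theta_min


for pt in (25.0, 50.0):
    d = min_photon_separation_cm(pt)
    print(f"pT = {pt:.0f} GeV/c: {d:.1f} cm = {d / CELL_SIZE_CM:.1f} cells")
```

Under these assumptions the separation is about two cells at 25 GeV/*c* and close to one cell at 50 GeV/*c*, in line with the behaviour described above.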

Photon identification in PHOS is based on charged-particle background rejection using matching between charged-particle tracks and clusters, as well as selections of cluster-shape parameters [81] to discriminate between electromagnetic and hadronic showers. Measurements of the arrival time could also be a strong criterion to identify fast neutral clusters produced by photons and slow clusters produced by heavy neutral hadrons such as neutrons and antineutrons. The intrinsic time resolution of PbWO<sub>4</sub> is rather good and can reach  $\sigma_t = 0.15$  ns at photon energy E = 1 GeV. However, the front-end electronics deployed by PHOS is not designed for precise time measurement. Timing-resolution dependence on photon energy, measured using physics data collected by PHOS during Run 2, is shown in Fig. 85. At energies below a few hundred MeV, the time resolution  $\sigma_t$  rises above 10 ns. The best time resolution of 2 ns is achieved for an energy  $E \approx 5$  GeV, but then deteriorates because the low-gain channel is used for larger signals. The achieved time resolution is sufficient to separate photons of energies E > 1 GeV produced in different bunch crossings with 25-ns intervals.



Figure 85: PHOS time resolution dependence on photon transverse momentum during the Run 2 data taking.

As mentioned above, the PHOS performance will not change in Run 3. Hence, the PHOS energy and time resolutions and the related systematic uncertainties on photon and neutral meson measurements achieved in Run 2 will not improve. However, the ability to collect larger data samples with the new readout strategy will reduce the statistical uncertainties by a factor of two to five with respect to Run 2.

## 3.12 Zero-Degree Calorimeter

The aim of the ZDC upgrade is to cope with the high collision rate foreseen for Runs 3 and 4. Since the zero-degree calorimeters withstood the irradiation during Run 1 and 2 operations well, the calorimeter stacks are unchanged [1]. However, the infrastructure had to be consolidated and the readout system upgraded.

Concerning the infrastructure, two actions were performed. Firstly, the control electronics of the movable platform was upgraded. This platform is used to move the ZDC calorimeters to a garage position where they are shielded from potential beam losses during beam injection or adjustment operations. Moreover, it allows them to be aligned with the average neutron (proton) impact position during data taking. Secondly, additional power supplies for the voltage dividers of the ZDC photomultipliers were installed in order to stabilize the gain in the foreseen high event rate conditions.

The main upgrade activity concerned a new readout system based on faster electronics. The Run 1 and 2 readout electronics were based on VME charge-to-digital converters with a conversion time of $\sim 10\,\mu$s, which cannot sustain an event rate of 50 to 100 kHz without dead time (taking also into account a possible luminosity increase beyond the LS3 baseline). Moreover, in order to fully exploit the ALICE physics potential in ultra-peripheral heavy-ion collisions, the ZDC aims to take data in continuous (autotrigger) readout mode. This operating condition is particularly challenging since the acceptance of the ZDC covers not only nucleon emission from hadronic interactions but also nucleon emission from electromagnetic dissociation [83–85], which has $\sim 50$ times higher cross sections for Pb–Pb collisions at LHC energies. The design Pb–Pb readout rate of 100 kHz will be accompanied by an additional $\sim 5$ MHz event rate, mostly uncorrelated between the two neutron ZDCs (ZNA on side "A" and ZNC on side "C" of the experiment, at positive and negative pseudorapidities, respectively), resulting from electromagnetic interactions that do not involve barrel detectors.

Because of the low number of channels to be instrumented, the new readout system is based on commercial digitizers. In particular, ALICE will use VITA 57 FPGA Mezzanine Card (FMC) digitizers, which allow continuous sampling of the signal waveform followed by real-time analysis on an FPGA. The bandwidth available through the FMC connection from the digitizer to the FPGA is sufficient for the full waveform to be analyzed. Fast trigger and selection algorithms are executed on the FPGA and the relevant portions of the waveform (see below) are transferred to the acquisition and reconstruction system via optical GBT links (see Sec. 2.2).

To preserve the time and charge resolution and to match the bandwidth of the ZDC signals, the digitizers should have about 12-bit resolution (with an effective number of bits of $\sim 10$) and a sampling frequency of 0.5 to 1 GHz. Since the photomultiplier signal is unipolar, the digitizer has to be DC coupled. After evaluating a few modules, the ADC3112 FMC [86], which mounts the ADS5409 digitizer [87], was chosen. Thanks to the shielded location of the readout electronics, there is no requirement of radiation hardness. The FMC is hosted on the carrier "Intelligent FPGA Controller" IFC\_1211 [88] with a Kintex UltraScale XCKU40 FPGA. The ADC3112 on-board oscillator is locked to the LHC revolution frequency recovered from a GBT link and dispatched through the FMC connector. Since the ADC will acquire 24 samples per bunch crossing, it will run at a frequency of $\sim 960$ MHz. Internally, each ADC channel is acquired by two digitizers which work in interleaved mode. In order to reduce the data size, low-pass filtering with digital downsampling is enabled on the ADC. This has the benefit of improving the measurement accuracy by averaging over the even and odd samples, removing the need to correct for the slightly different gains and offsets between the two circuits. The data throughput to the FPGA will therefore be reduced to 12 samples per bunch crossing at $\sim 480$ Msps, simplifying the firmware design.
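The sampling arithmetic and the effect of the factor-two downsampling can be illustrated with a short sketch. It assumes the nominal LHC bunch-crossing frequency of about 40.079 MHz and models the downsampling as a plain average of each even/odd sample pair, which is a simplification of the low-pass filter implemented in the ADC. The function names are illustrative and not taken from the ZDC firmware.

```python
# Illustrative sketch: sampling rates and pairwise downsampling.
# Assumption: nominal LHC bunch-crossing frequency (~40.079 MHz).
BC_FREQUENCY_MHZ = 40.079
SAMPLES_PER_BC = 24


def sampling_rate_mhz(samples_per_bc, bc_frequency_mhz=BC_FREQUENCY_MHZ):
    """Sampling rate needed to acquire a fixed number of samples per bunch crossing."""
    return samples_per_bc * bc_frequency_mhz


def downsample_by_two(samples):
    """Average each (even, odd) pair of samples.

    Simplified stand-in for the on-chip low-pass filter and decimation:
    every output value mixes one sample from each interleaved converter,
    so a constant offset difference between the two averages out.
    """
    if len(samples) % 2:
        raise ValueError("expected an even number of samples")
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]


# Toy bunch crossing: flat baseline at 100 counts, with the odd-sample
# converter reading 2 counts higher than the even-sample one.
raw = [100 + (2 if i % 2 else 0) for i in range(SAMPLES_PER_BC)]
decimated = downsample_by_two(raw)

print(f"ADC rate: {sampling_rate_mhz(24):.0f} MHz")
print(f"rate to FPGA: {sampling_rate_mhz(12):.0f} Msps")
print(f"{len(decimated)} samples per bunch crossing, first value {decimated[0]}")
```

In this toy case the even/odd offset pattern disappears from the decimated samples, which is the benefit described above.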

A critical aspect of the ZDC operation in Run 3 is triggering at high rates in Pb–Pb with the bunch spacing reduced to 50 or 25 ns, since the duration of the photomultiplier signals will be comparable to or longer than the bunch spacing. This is complicated by the large signal dynamics (from 1 to $\sim 60$ neutrons in the acceptance of the neutron calorimeters). In order to identify the presence of a signal, a differential trigger algorithm was developed. Samples at different times are compared (sample $y_i$ with sample $y_{i+shift}$, where shift is a tunable parameter from 3 to 5 samples). If two consecutive differences are above threshold, the trigger condition is satisfied, effectively rejecting fake triggers due to electronic noise, and the bunch is flagged for acquisition. This autotrigger condition drives the acquisition in continuous readout mode, while in triggered mode the readout system acquires data regardless of the autotrigger flag. The same flags are also used to measure the interaction rate, which is used to estimate the instantaneous luminosity.
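The differential trigger condition can be sketched in a few lines. This is a minimal illustration and not the FPGA firmware: it assumes negative-going photomultiplier pulses, so that the difference $y_i - y_{i+shift}$ is positive on the leading edge, and the threshold value and waveforms are invented for the example.

```python
def differential_trigger(samples, shift=4, threshold=10):
    """Return True if two consecutive differences exceed the threshold.

    Assumption: pulses are negative-going, so the leading edge gives
    samples[i] - samples[i + shift] > 0.
    """
    diffs = [samples[i] - samples[i + shift] for i in range(len(samples) - shift)]
    # Requiring two consecutive differences suppresses single-sample noise.
    return any(a > threshold and b > threshold for a, b in zip(diffs, diffs[1:]))


# Hypothetical 12-sample waveforms (ADC counts).
baseline = [100] * 12
noise_spike = [100, 100, 100, 100, 100, 100, 80, 100, 100, 100, 100, 100]
pulse = [100, 100, 100, 100, 90, 70, 50, 40, 45, 55, 65, 75]

for name, waveform in [("baseline", baseline), ("noise spike", noise_spike), ("pulse", pulse)]:
    print(name, differential_trigger(waveform))
```

A single-sample spike produces only isolated differences above threshold and is rejected, whereas a pulse with a leading edge spanning several samples satisfies the condition.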

The measurements of signal arrival time and amplitude need to take into account the baseline (pedestal) oscillations and the possible presence of a signal in an earlier bunch crossing (pile-up).

Two methods for pedestal evaluation were implemented. Given the bunch structure of the LHC, which alternates "trains" of colliding bunches with "gaps" where no collisions can occur, it is possible to measure the pedestal from portions of the digitized data where no collision can occur. These are prescribed by a filling map uploaded to the front-end at each fill. Using this information, the pedestal average for each LHC orbit is computed and then transmitted on GBT. This makes it possible to account for a low-frequency drift of the baseline and to obtain an accurate reference. A second method makes it possible to subtract the pedestal effectively in the presence of noise at higher frequencies. For each trigger (or autotrigger), in addition to the bunch where the signal peaks ($BC_0$), the 12 samples of the preceding bunch crossing will be transferred in order to evaluate and correctly subtract the pedestal in case of a significant discrepancy with respect to the orbit average computed with the first method.
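The interplay of the two pedestal estimates can be sketched as follows. This is an illustration under stated assumptions rather than the actual implementation: the data layout (a dictionary of samples per bunch crossing), the function names, and the discrepancy threshold `max_discrepancy` are hypothetical.

```python
def orbit_pedestal(orbit_samples, colliding_bcs):
    """Average baseline over bunch crossings in which no collision can occur.

    orbit_samples: dict mapping bunch-crossing id -> list of ADC samples
    colliding_bcs: set of bunch-crossing ids marked as colliding in the filling map
    """
    values = [s for bc, samples in orbit_samples.items()
              if bc not in colliding_bcs for s in samples]
    if not values:
        raise ValueError("no non-colliding bunch crossings available")
    return sum(values) / len(values)


def subtract_pedestal(signal_bc, previous_bc, orbit_average, max_discrepancy=2.0):
    """Subtract the orbit average unless the preceding bunch crossing disagrees with it."""
    local = sum(previous_bc) / len(previous_bc)
    pedestal = local if abs(local - orbit_average) > max_discrepancy else orbit_average
    return [s - pedestal for s in signal_bc]


# Toy orbit with three bunch crossings; only BC 1 is colliding.
orbit = {0: [100] * 12, 1: [60] * 12, 2: [102] * 12}
average = orbit_pedestal(orbit, colliding_bcs={1})
print(average)
print(subtract_pedestal([101, 90, 70], previous_bc=[101] * 12, orbit_average=average))
print(subtract_pedestal([110, 90], previous_bc=[110] * 12, orbit_average=average))
```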

Concerning pile-up from a signal in an earlier bunch crossing, in autotrigger mode all ZDC signals are transmitted and reconstructed, which allows pile-up to be identified and corrected. In triggered mode, on the other hand, the firmware ensures that the information on the signal inducing pile-up is not lost due to trigger selectivity. Consequently, for each triggered bunch crossing, up to four bunch crossings will be transferred: the triggered one, the preceding one (pedestal evaluation), and additionally $BC_{-2}$ and $BC_{-3}$ in case a pile-up signal is detected.
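The selection of bunch crossings shipped in triggered mode can be written down compactly. The sketch below is illustrative only: the function name is hypothetical, and the wrap-around of bunch-crossing identifiers at the orbit boundary is ignored for simplicity.

```python
def bunch_crossings_to_transfer(triggered_bc, pileup_detected):
    """List the bunch crossings transferred for one triggered bunch crossing.

    Always includes the triggered bunch crossing and the preceding one
    (pedestal evaluation); BC-2 and BC-3 are added only if pile-up is detected.
    """
    bcs = [triggered_bc, triggered_bc - 1]
    if pileup_detected:
        bcs += [triggered_bc - 2, triggered_bc - 3]
    return bcs


print(bunch_crossings_to_transfer(100, pileup_detected=False))
print(bunch_crossings_to_transfer(100, pileup_detected=True))
```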

During Pb–Pb data taking in 2018, a prototype of the ZDC system was tested in parallel to the ALICE data acquisition by using a custom system based on LabVIEW reading the ADC3112 mounted on a Xilinx evaluation board or using the IOxOS IFC\_1210 carrier. An example of the achieved performance is shown in Fig. 86. The resolution for single 2.76 TeV neutron emission detected by ZNC is $\sim 17\%$, an improvement with respect to the $\sim 20\%$ of the previous electronics. The time resolution w.r.t. the ALICE L0 trigger is $\sim 0.35$ ns, a value that is comparable with the performance of the previous system.



**Figure 86:** Performance of the digitizer during Pb–Pb 2018 data taking in the operating conditions chosen for Run 3. Left: the lower part of the triggered spectrum of the ZNC common photomultiplier in Pb–Pb collisions, where the emission of a single 2.76 TeV neutron and multiples are visible. The spectrum is fitted to a superposition of Gaussian functions whose peak positions $\mu_i$ are related to the neutron multiplicity by the relation $\mu_i = \mu_{1n} \times i$ and their widths by the relation $\sigma_i = \sigma_{1n} \sqrt{i}$, where *i* is the neutron multiplicity and $\mu_{1n}$ and $\sigma_{1n}$ are the mean and the r.m.s. of the single neutron peak, respectively. The autotrigger algorithm effectively rejects pedestal events. Right: the arrival time of ZNC common photomultiplier signals w.r.t. the reference ALICE L0 trigger signal.
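The fit model described in the caption of Fig. 86 can be expressed as a short function. The sketch below only evaluates the model and does not perform the fit; the per-peak amplitudes are free parameters of the illustration, and the numerical inputs are taken from the $\sim 17\%$ single-neutron resolution quoted in the text.

```python
import math


def neutron_spectrum(x, mu_1n, sigma_1n, amplitudes):
    """Sum of Gaussians for neutron multiplicities i = 1 .. len(amplitudes).

    Peak positions follow mu_i = mu_1n * i and widths sigma_i = sigma_1n * sqrt(i).
    """
    total = 0.0
    for i, amplitude in enumerate(amplitudes, start=1):
        mu_i = mu_1n * i
        sigma_i = sigma_1n * math.sqrt(i)
        total += amplitude * math.exp(-0.5 * ((x - mu_i) / sigma_i) ** 2)
    return total


def relative_resolution(mu_1n, sigma_1n, multiplicity):
    """sigma_i / mu_i, which scales as 1 / sqrt(i) in this model."""
    return sigma_1n * math.sqrt(multiplicity) / (mu_1n * multiplicity)


# Energies in units of the single-neutron peak position, with a 17% 1n resolution.
for i in (1, 2, 4):
    print(i, round(relative_resolution(1.0, 0.17, i), 4))
```

In this model the relative width of the *i*-neutron peak improves as $1/\sqrt{i}$, so the multi-neutron peaks remain separated from their neighbours only up to a limited multiplicity.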

## 4 Mechanics and integration

The principal layout and infrastructure of the original ALICE detector is described in Ref. [1]. During LS1 (2013 to 2014), the DCal detector had been added as an extension of the electromagnetic calorimeter on a 60 degree azimuthal acceptance opposite of the EMCal detector. For this purpose, new support rails and a new support structure holding the PHOS and the DCal modules were installed in the bottom part of the L3 magnet. These new support rails are also used for injecting 10000 m<sup>3</sup>/h of cold air into the L3 magnet volume for stabilising the air temperature around the ALICE detector.

For the LS2 upgrade, the global mechanical structures of ALICE remained unchanged. The most important modification was related to the support of the beam pipe and the ITS2 detector. In the original ALICE setup, the TPC had to be moved to the parking position in order to carry out maintenance of the ITS2 detector. This required the disconnection of about 30% of all ALICE services and would therefore only have been possible in a long shutdown of more than one year. In addition, the beam pipe, ITS2, and TPC were connected in a way that did not allow relative adjustment, so alignment of the beam pipe with the nominal LHC beamline required the adjustment of the TPC or even of the entire ALICE experiment. Such an operation had been carried out in 2008.

For the ALICE 2 detector, the support structures of the ITS2 and the beam pipe inside the TPC were therefore completely re-designed. The cage, a support structure made from carbon fiber material, was installed inside the TPC as shown in Fig. 87. The cage holds the beam pipe and has a rail system that allows the installation of the ITS2 and MFT detectors with the TPC in place. This makes it possible to perform maintenance of the ITS2 detector during a year end technical stop of about three months. In addition, it allows the alignment of the beam pipe with the nominal beamline within a range of  $\pm 4$  mm without the need to move the TPC.



**Figure 87:** The cage: a support structure for beam pipe, ITS2, and MFT. Shown is the cage (magenta) with the bottom half of ITS and MFT as well as the beam pipe already installed.

The new ALICE beam pipe has a central beryllium section with a length of 888 mm, an outer diameter of 36 mm, and wall thickness of 0.8 mm (Fig. 88).



Figure 88: Beam pipe installed for the ALICE 2 detector with an outer diameter of 36 mm.

## 5 Readout and data processing

In this section, an overview of the readout concepts and the data flow is given. Subsequently, the individual systems for experiment and detector control, triggering, data acquisition, synchronous and asynchronous event reconstruction, and the processing of analysis object data are discussed.

#### 5.1 Readout data flow

In order to minimise the costs and requirements for data processing and storage, the ALICE computing model for Runs 3 and 4 is designed for a maximum compression of the data volume read out from the detectors synchronously with data taking [89]. In order to compress the large data flow from the TPC, tracks are reconstructed online. Moreover, data for detector calibration are extracted during online processing, avoiding additional offline calibration passes over the full data set. Online data processing is performed in two steps on the ALICE online/offline facility (O<sup>2</sup>) located at Point 2. The facility consists of two types of computing nodes: the First Level Processors (FLP) located in the experiment access shaft (CR1), and the Event Processing Nodes (EPN) in dedicated computing containers (CR0), see Fig. 89. The facility also provides the network for data distribution, large disk storage capacity, and interfaces with the GRID and the permanent data store at the Tier 0 computing center.

The upgraded online system supports both continuous and triggered readout. Legacy sub-systems not upgraded to continuous readout are not capable of reading the full event rate and thus require a hardware trigger signal. These detectors are therefore read out whenever they are not busy. Triggered readout for all detectors is also used for commissioning and calibration runs. Data produced by the detectors are transferred to the Common Readout Units (CRU) (see Sec. 2.2) where they are compressed, multiplexed, and then transferred to the memory of the FLPs.

During the revolution period of the LHC ($\sim 88.92$ µs = "LHC orbit") there is a number of bunch crossings (BC), dependent on the LHC filling scheme, at which collisions can occur. The ALICE data stream is divided into so-called heartbeat frames (HBF), which have a duration of one LHC orbit and are synchronized with the LHC clock. A configurable number of HBFs form a time frame (TF), which represents the data container for data processing and replaces the traditional event entity. The nominal TF length is 128 orbits ($\sim 11.4$ ms). At 50 kHz interaction rate, it contains on average 569 Pb–Pb collisions. Continuous and triggered data are tagged by HBF and BC identifiers.
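The numbers quoted for the time frame follow directly from the orbit duration, as the following short calculation shows. Only the function and constant names are introduced here; all input values are those given in the text.

```python
ORBIT_DURATION_US = 88.92   # LHC revolution period
ORBITS_PER_TF = 128         # nominal time frame length in orbits


def time_frame_length_ms(n_orbits=ORBITS_PER_TF, orbit_us=ORBIT_DURATION_US):
    """Duration of a time frame in milliseconds."""
    return n_orbits * orbit_us / 1000.0


def mean_collisions_per_tf(interaction_rate_khz, n_orbits=ORBITS_PER_TF):
    """Average number of collisions in one time frame (kHz x ms = count)."""
    return interaction_rate_khz * time_frame_length_ms(n_orbits)


print(f"TF length: {time_frame_length_ms():.2f} ms")
print(f"collisions per TF at 50 kHz: {mean_collisions_per_tf(50):.0f}")
```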

The FLPs perform a first level of data compression to 900 GB/s by zero suppression. In addition, they have the possibility to perform calibration tasks based on local information from the part of the detector they serve. One example is the TPC for which a first calibration step is already performed on the CRU. The signals from the GEM readout detectors feature an ion tail and at high occupancy a common baseline shift, that is best removed as early as possible. A Sub Time Frame (STF) comprises all HBFs belonging to a TF from one FLP. After all FLPs have built their STFs of an individual TF, an available EPN is selected and all STFs are sent there and the full TF is built.
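The assembly of a full TF from the STFs of all FLPs can be pictured with a small bookkeeping class. This is a conceptual sketch only and is not the O<sup>2</sup> data distribution code: the class and method names are hypothetical, and the scheduling of an available EPN is not modelled.

```python
class TimeFrameBuilder:
    """Collect one sub time frame (STF) per FLP and release the complete TF."""

    def __init__(self, flp_ids):
        self.expected = frozenset(flp_ids)
        self.pending = {}  # TF id -> {FLP id: STF payload}

    def add_sub_time_frame(self, tf_id, flp_id, payload):
        """Register an STF; return the full TF once all FLPs have contributed."""
        if flp_id not in self.expected:
            raise ValueError(f"unknown FLP {flp_id}")
        stfs = self.pending.setdefault(tf_id, {})
        stfs[flp_id] = payload
        if set(stfs) == self.expected:
            return self.pending.pop(tf_id)
        return None


builder = TimeFrameBuilder(flp_ids=[1, 2, 3])
for flp in (1, 2, 3):
    result = builder.add_sub_time_frame(tf_id=7, flp_id=flp, payload=f"stf-from-flp{flp}")
    print(flp, "complete" if result is not None else "waiting")
```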

A dedicated FLP is used to collect and process data from the Detector Control System (DCS) in two workflows. The first one processes DCS data shipped via the ALICE datapoint server and stores detector conditions like voltages, temperature, and pressure in compact objects (see also Sec. 5.6.3). The second one processes configuration files sent by detectors as well as LHC information. The calibration objects are stored in the condition and calibration database (CCDB) and from there they are read by the following processing stages. Another dedicated FLP is used to collect all trigger signals sent by the CTP to the detectors.

The EPN farm consists of 280 servers hosting 8 GPUs and 64 CPU cores each. The capacity has been dimensioned such that it can achieve a first reconstruction pass (referred to as synchronous reconstruction), extraction of calibration objects for subsequent asynchronous reconstruction passes, and data compression. The compressed data are aggregated into so-called compressed time frames (CTF) replacing the original raw data and written to a disk buffer at an output rate of about 130 GB/s. The disk buffer has a raw capacity of 150 PB and is managed by the EOS system [90]. The erasure coding configuration used for storage protection reduces the usable capacity to about 120 PB.

Figure 89: Overview of the components of the $O^2$ data read out and processing systems and the main data flows.
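The disk-buffer figures quoted above can be put into perspective with a short calculation. The fill time is an illustrative upper bound that assumes uninterrupted writing at the quoted output rate with no data being removed from the buffer, which is not a statement about actual operation; decimal units (1 PB = $10^6$ GB) are assumed.

```python
RAW_CAPACITY_PB = 150
USABLE_CAPACITY_PB = 120
OUTPUT_RATE_GB_PER_S = 130


def usable_fraction(usable_pb=USABLE_CAPACITY_PB, raw_pb=RAW_CAPACITY_PB):
    """Fraction of the raw capacity left after erasure coding."""
    return usable_pb / raw_pb


def fill_time_days(capacity_pb=USABLE_CAPACITY_PB, rate_gb_per_s=OUTPUT_RATE_GB_PER_S):
    """Days needed to fill the buffer when writing continuously (1 PB = 1e6 GB)."""
    seconds = capacity_pb * 1e6 / rate_gb_per_s
    return seconds / 86400.0


print(f"usable fraction: {usable_fraction():.0%}")
print(f"continuous-write fill time: {fill_time_days():.1f} days")
```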

Calibration data from EPNs are aggregated on dedicated nodes, processed, and stored in the CCDB. CCDB objects are distributed back to the whole  $O^2$  farm through multi-casting and migrated to the offline CCDB as well as to GRID storage elements, for usage by the ongoing synchronous reconstruction steps and for the later asynchronous processing and simulation, respectively.

The CTFs are transferred to the GRID for archiving. After data taking and full detector calibration, two or more asynchronous reconstruction passes are performed on the GRID as well as on the EPN farm. The output of these reconstruction passes is stored as Analysis Object Data (AOD), the input for physics analysis. For specific physics signals, a further data size reduction and speed-up of the corresponding analyses is achieved by filtering out events of interest and writing out only the minimum event information needed. The processing of pp data follows the same chain with an additional step of selection of interesting collisions during an asynchronous reconstruction pass. The asynchronous reconstruction passes are followed by Monte Carlo production cycles taking into account the time-dependent detector conditions.

Besides the computing infrastructure, a common software framework has been developed within which all online and offline components are operated [91]. It consists of three main layers. The Transport Layer has been developed in collaboration with GSI (FAIR) and it uses the FairMQ message passing toolkit [92] with FairMQDevices as its main building blocks. It enables efficient parallelism by providing abstraction of network and inter-process communication as well as by supporting shared memory backed message passing for devices on the same node. The data model provides language agnostic and extensible descriptions of messages that are passed between devices [93]. It provides support for various back-ends such as a so-called zero-copy format (a format that optimises performance by allowing files or portions thereof to be mapped efficiently to memory and buffers to be shared between processes), serialisation based on the ROOT data analysis framework [94], and Apache Arrow [95] for analysis and integration with external tools. Finally, the Data Processing Layer (DPL) abstracts computation as a set of data processors organized in a logical data flow specifying how data are transformed. Depending on the deployment environment, the data flow is mapped to a concrete topology and from there to a set of processes running FairMQ devices.



Figure 90: Synchronous reconstruction workflow.

#### 5.1.1 Synchronous reconstruction

A schematic representation of the synchronous reconstruction workflow is shown in Fig. 90. The main objectives of the synchronous processing are the reduction of the data rate from the TPC, which accounts for most of the raw data volume, and the extraction of data for calibration. This is achieved by performing clustering and full track reconstruction in the TPC and removing background hits from the data. Moreover, cluster space point coordinates are stored as relative coordinates, thus reducing the entropy and allowing for efficient ANS entropy encoding [96] of the data. The TPC space charge distortion calibration uses the information of fully reconstructed barrel tracks including ITS, TOF, and TRD information. However, only a small fraction of all tracks needs to be fully reconstructed to gain sufficient data. Hence, the full TPC reconstruction needed for data compression is the most demanding step in terms of computing time. The online processing makes extensive use of Graphics Processing Units (GPU), which provide a speed-up by about a factor of 50 [97] compared to one CPU core in an EPN server, without compromising the physics performance.

The TPC reconstruction code has been developed starting from the existing Run 2 High Level Trigger (HLT) algorithms. It starts with the cluster finding and is followed by tracking comprising the track finding, track merging, fitting, and compression steps. The presence of Space Charge Distortions (SCD) of up to 10 cm represents a particular challenge for the reconstruction of continuous data. In absence of triggers, which provide reference for the drift time estimate, the *z*-positions of clusters are unknown. However, this information is needed for *z*-dependent quantities used during track reconstruction: the corrections of the SCDs, the magnetic field strength, and the cluster error parameterisation. Therefore, TPC tracking is first performed without these corrections. Since the distortion effects are smooth, the track finding is not strongly affected. Track seeds are extrapolated to the beam line and the most probable *z*-coordinate is calculated under the assumption that the track is from a primary particle and the vertex is at the interaction point. If the track turns out to be a secondary, an average pseudorapidity is assumed. The track is refitted with the corresponding corrections. The average SCD corrections require a first order correction map obtained from simulation or a previous reference run which is scaled by the instantaneous luminosity. In addition, the 1D integrated digital currents containing information about the fluctuations of the number of ion pile-up events and of the track multiplicity are used to achieve a partial correction of SCD fluctuations. During synchronous reconstruction, cluster positions can be corrected with a precision of $\mathscr{O}(\text{mm})$ which is sufficient for correct cluster associations to tracks. The full correction with a precision of $\mathscr{O}(100\,\mu\text{m})$ will be performed during asynchronous reconstruction.

Two options for TPC data compression are supported by the software. In the first option (A) clusters from background (for example from noisy pads or charge clouds related to low momentum protons) and clusters that are associated to or in the proximity of background tracks are rejected. Background tracks include those from very low momentum particles spiralling around the magnetic field lines, track segments with large inclination with respect to the TPC pad rows, and clusters from secondary legs of looping low-momentum tracks used for physics. However, clusters in a tube around good tracks are protected. For the second option (B), only clusters that are attached to, or in the proximity of identified good tracks that may be used for physics analysis are kept. The estimated rejection fractions for options A and B are 12.5 - 39.1% and 37 - 53%, respectively. While option B yields lower data size it bears the risk that in case the SCD corrections are not precise enough track merging and partially also track following might lose good tracks or parts thereof. Optimal performance of option A requires identification of hits from particles with momenta below 10 MeV/c, since they contribute about 15% of all TPC hits. Tracking in this momentum region is challenging and is currently under development.

Further data size compression is achieved by converting the cluster properties from the single-precision floating point format used in reconstruction to custom integer and floating point formats with exactly as many bits as needed for the intrinsic TPC resolution. The entropy is reduced before the ANS encoding for further data compression. This includes the following steps. Coordinates of hits that are not assigned to tracks are sorted by geometrical coordinates and the difference to the previous hit is stored. Raw coordinates (row, pad, time) of hits assigned to tracks are stored relative to the extrapolated track (Track Model Compression). Cluster properties, maximum charge, total charge, and cluster size, are encoded together in order to profit from their correlation.
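The entropy reduction obtained by sorting unassigned hits and storing differences can be demonstrated on a toy example. The sketch below is not the TPC compression code: the pad numbers are invented, and the Shannon entropy of the symbol distribution is used as a simple proxy for the size after entropy coding.

```python
import math
from collections import Counter


def shannon_entropy_bits(symbols):
    """Shannon entropy per symbol (in bits) of the empirical distribution."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def delta_encode(sorted_values):
    """Store each value as the difference to the previous one (first one relative to 0)."""
    deltas, previous = [], 0
    for value in sorted_values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    values, current = [], 0
    for delta in deltas:
        current += delta
        values.append(current)
    return values


# Hypothetical pad coordinates of unassigned hits, sorted before encoding.
pads = sorted([3, 17, 4, 18, 5, 19, 6, 20, 7, 21, 8, 22])
deltas = delta_encode(pads)

print(f"entropy of raw coordinates: {shannon_entropy_bits(pads):.2f} bits/symbol")
print(f"entropy of differences:     {shannon_entropy_bits(deltas):.2f} bits/symbol")
```

The track model compression mentioned above follows the same idea, with the extrapolated track position taking the role of the previous hit.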

Synchronous data processing of the remaining detectors is performed on CPU cores in parallel with the GPU processing. For the ITS and the muon spectrometer system (MFT, MCH, MID), processing starts with space point reconstruction (clustering). For the barrel calorimeters EMCal, DCal, and PHOS, the cell properties (time, amplitude) are determined by fitting the raw time distributions. Clusterization is performed in order to select cells to write to the CTF, while final clustering is performed during analysis. Data for time calibration and dead-channel maps are extracted. For FT0, the reconstruction of collision times is performed for the needs of barrel global tracking and vertexing. FT0, FV0, and FDD digits converted from raw data are stored in the CTF.

For a subsample of tracks selected from peripheral collisions (about 1% of all tracks), full tracking including all barrel detectors is performed, i.e. ITS tracking after clustering, matching of ITS tracks to TPC tracks, and finally track matching to TRD and TOF. As in Run 2, residuals between global tracks and TPC clusters are used to create 3-dimensional space charge distortion maps with a granularity of 1-2 minutes when running with Pb beams and 10 minutes in pp collisions. These maps together with the TPC integrated digital currents recorded during synchronous processing become part of the calibration used in asynchronous processing.

Global barrel tracks are also used to obtain fast TPC drift time and TRD calibration (gain,  $t_0$ ,  $E \times B$ , and drift velocity). Moreover, the drift of the LHC clock with time (due to temperature changes that impact the fiber refractive index and the distribution of the LHC clock time to the experiments), which affects the reference for the time of flight measurement as a global offset, is calibrated using global tracks matched to TOF. At the same time, the TOF channel-level offset in the measured times related to the cable lengths and electronics is determined. In addition, other calibration algorithms are running during online reconstruction, particularly, for the determination of the interaction region, calorimeter bad channels, and gain parameters. The general way to perform these online calibrations is to extract for every TF compact data related to the parameters being calibrated and send them to dedicated aggregator servers. The workflows running on these servers attribute the incoming calibration data to time slots with a granularity characteristic for each calibration type and automatically create a CCDB object for every slot once they have accumulated enough data for processing. During synchronous processing, also input data are accumulated for those calibration constants that need a large amount of data or are too demanding to be determined synchronously. One example is the TOF channel time slewing. The corresponding calibration information is extracted before the asynchronous reconstruction takes place, and the CCDB is updated.
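The slot-based aggregation performed on the calibration servers can be sketched as follows. This is a schematic illustration with hypothetical names and a trivial "calibration" (the mean of the accumulated values); the upload of the resulting object to the CCDB is not modelled.

```python
class SlotAggregator:
    """Collect per-time-frame calibration values into fixed-length time slots."""

    def __init__(self, slot_length_s, min_entries):
        self.slot_length_s = slot_length_s
        self.min_entries = min_entries
        self.open_slots = {}       # slot index -> accumulated values
        self.closed_slots = set()  # slots for which an object was already created

    def add(self, timestamp_s, values):
        """Add values from one time frame; return a calibration object once a slot is full."""
        slot = int(timestamp_s // self.slot_length_s)
        if slot in self.closed_slots:
            return None  # late data for a finished slot are ignored in this sketch
        data = self.open_slots.setdefault(slot, [])
        data.extend(values)
        if len(data) < self.min_entries:
            return None
        del self.open_slots[slot]
        self.closed_slots.add(slot)
        return {
            "valid_from_s": slot * self.slot_length_s,
            "valid_until_s": (slot + 1) * self.slot_length_s,
            "mean": sum(data) / len(data),
            "entries": len(data),
        }


aggregator = SlotAggregator(slot_length_s=60, min_entries=4)
print(aggregator.add(5.0, [1.0, 2.0]))
print(aggregator.add(30.0, [3.0, 4.0]))
```

The slot length and the minimum number of entries correspond to the granularity and statistics requirements that differ between calibration types.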

The final processing step consists of compressing all data stored in the CTF using the rANS algorithm, a variant of Asymmetric Numeral System coders, which allows the entropy limit to be reached [96, 98].
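The principle of rANS coding can be shown with a compact toy implementation. The sketch below is not the O<sup>2</sup> encoder: it keeps the whole coder state in a single arbitrary-precision Python integer and therefore needs no renormalisation, and it derives the frequency table from the message itself. Symbols are encoded in reverse order so that decoding returns them in the original order.

```python
from collections import Counter


def build_tables(symbols):
    """Symbol frequencies, cumulative frequencies, and their total."""
    freqs = Counter(symbols)
    cumul, total = {}, 0
    for symbol in sorted(freqs):
        cumul[symbol] = total
        total += freqs[symbol]
    return freqs, cumul, total


def rans_encode(symbols, freqs, cumul, total):
    """Fold all symbols into one integer state (toy version, no renormalisation)."""
    state = 1
    for symbol in reversed(symbols):
        f = freqs[symbol]
        state = (state // f) * total + cumul[symbol] + state % f
    return state


def rans_decode(state, n_symbols, freqs, cumul, total):
    """Recover n_symbols from the state; also return the final state."""
    decoded = []
    for _ in range(n_symbols):
        slot = state % total
        symbol = next(s for s in freqs if cumul[s] <= slot < cumul[s] + freqs[s])
        state = freqs[symbol] * (state // total) + slot - cumul[symbol]
        decoded.append(symbol)
    return decoded, state


# Hypothetical low-entropy message, e.g. coordinate differences.
message = [0] * 12 + [1] * 3 + [5]
freqs, cumul, total = build_tables(message)
encoded = rans_encode(message, freqs, cumul, total)
decoded, final_state = rans_decode(encoded, len(message), freqs, cumul, total)

print(f"encoded state uses {encoded.bit_length()} bits for {len(message)} symbols")
print("round trip ok:", decoded == message)
```

Frequent symbols increase the state by only a small factor per encoding step, which is why the number of bits in the final state approaches the entropy of the message.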

#### 5.2 First-Level Processors

The O<sup>2</sup>/FLP subsystem includes the First-Level Processors (FLPs) detector readout farm, the data quality control system, and the services for control, configuration, monitoring, logging, and bookkeeping.

#### 5.2.1 The FLP detector readout farm

| Detector | Link type | Readout links (DDL) | Readout links (GBT) | Readout boards (C-RORC) | Readout boards (CRU) | Readout nodes (FLPs) |
|----------|-----------|---------------------|---------------------|-------------------------|----------------------|----------------------|
| CPV      | GBT    |               | 16   |                | 1   | 1             |
| CTP      | GBT    |               | 14   |                | 1   | 1             |
| EMC      | DDL    | 40            |      | 8              |     | 2             |
| FIT      | GBT    |               | 34   |                | 3   | 3             |
| HMP      | DDL    | 14            |      | 4              |     | 2             |
| ITS      | GBT    |               | 432  |                | 22  | 11            |
| MCH      | GBT    |               | 550  |                | 30  | 11            |
| MFT      | GBT    |               | 304  |                | 11  | 5             |
| MID      | GBT    |               | 32   |                | 2   | 1             |
| PHS      | DDL    | 16            |      | 4              |     | 2             |
| TOF      | GBT    |               | 72   |                | 4   | 2             |
| TPC      | GBT    |               | 5832 |                | 361 | 145           |
| TRD      | Custom |               | 1044 |                | 36  | 12            |
| ZDC      | GBT    |               | 1    |                | 1   | 1             |
| Total    |        | 69            | 9291 | 16             | 472 | 199           |

**Table 10:** FLP readout farm used to transfer the data from the detectors to the  $O^2$  system.

The readout farm consists of 199 nodes, with 488 readout cards (472 new CRUs and 16 legacy C-RORC) to transfer the data from each detector to the $O^2$ system. The number of FLP nodes and readout cards associated with each detector is given in Table 10. The total nominal readout bandwidth amounts to 3.5 TB/s from the detector electronics to the readout cards where it is compressed before the transfer to the memory of the FLP servers. Most of the detectors use the GBT link and the CRU (see Sec. 2.2) adopted for this upgrade. The system is also backward compatible with the Detector Data Link (DDL) [62] and the Common ReadOut Receiver Card (C-RORC) [7] used during the LHC Runs 1 and 2.

The server selected for the FLPs is the Dell PowerEdge R740. The selection was made after numerous hardware and software tests [99] and a competitive tender. Each FLP is equipped with 96 GB of DDR memory and two CPUs. The CPUs are of two different flavours of the Intel Cascade Lake generation (the Silver 4210 or the Gold 6230 with 10 or 20 hardware cores, respectively) depending on the processing needs of the detector. Each FLP hosts up to three CRUs, up to four C-RORCs, and one Infiniband network interface, each using one PCIe Gen3 x16 slot. The readout software performance allows data to be transferred from three CRUs simultaneously to the FLP memory for a total input throughput of 330 Gb/s, corresponding to 85% of the maximum PCIe Gen3 bandwidth. The maximum output bandwidth available to the Infiniband network is 100 Gb/s.

The first layer on top of the PCIe interface of the cards is the PDA (Portable Driver Architecture) UIO (Userspace IO) kernel module [100]. PDA also provides a user space library in C [101] which supports PCIe device enumeration and provides a handle to PCI devices. The readout software includes the readout program and the readoutCard library [102] which orchestrate the simultaneous data transfers from the GBT links to the FLP memory as shown in Fig. 91. The transfer of data to the EPN farm is handled by the  $O^2$  data distribution (see Sec. 5.3).



**Figure 91:** Simultaneous dataflows inside the FLP from the CRUs to the DDR memories and from the memory to the Infiniband network to the EPN farm.

#### 5.2.2 Data quality control

The online execution of the calibration and the reconstruction and the replacement of the raw input data by compressed data make reliable data Quality Control (QC) mandatory. Its main purposes are to quickly identify and overcome problems during data taking and to provide good quality data for physics analyses. It is also crucial to ensure that the data processing behaves as expected, especially when running synchronously with the data taking.

The  $O^2$  QC system [103] includes a distributed software framework as shown in Fig. 92.

Data samples are selected following a pseudo-random sampling and configurable policies at key points in the dataflow and are dispatched to local (on the FLPs and the EPNs) or remote (on QC servers) QC tasks executing detector-specific algorithms. Their results are published as QC objects, for example hit distributions in sub-detectors, which are typically represented as ROOT [104] histograms. The results of the QC tasks running in parallel on many nodes are assembled by the mergers. Checkers evaluate the quality of the objects, resulting in QC qualities that summarise, e.g., whether the hit distributions are good or bad. The QC qualities can optionally be aggregated and are stored together with the QC objects in the QC repository. This database reuses the software developed for the CCDB of ALICE O<sup>2</sup>. The post-processing component encompasses asynchronous tasks such as correlation and trending of data derived from QC objects and qualities. It is triggered periodically, manually, or on certain events (e.g. start of run or end of fill). The Machine Learning component will be a particular type of post-processing. QC and quality objects are accessible to shifters and experts through a web-based QC GUI.
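The chain of sampling, QC tasks, mergers, and checkers can be illustrated with a minimal sketch. All function names, the toy hit data, and the checker rule are hypothetical; the real framework works on ROOT objects and is not reproduced here.

```python
import random
from collections import Counter


def sample(data, fraction, rng):
    """Pseudo-random sampling policy: keep each item with probability `fraction`."""
    return [x for x in data if rng.random() < fraction]


def qc_task(hits):
    """Detector-specific task (hypothetical): histogram of hits per channel."""
    return Counter(hits)


def merge(objects):
    """Merger: sum the histograms produced in parallel on many nodes."""
    total = Counter()
    for obj in objects:
        total.update(obj)
    return total


def check(histogram, n_channels, min_entries=1):
    """Checker (hypothetical rule): 'bad' if any channel has too few entries."""
    ok = all(histogram.get(ch, 0) >= min_entries for ch in range(n_channels))
    return "good" if ok else "bad"


rng = random.Random(42)
n_channels = 4
# Three nodes, each seeing toy hits on channels 0..3.
node_data = [[rng.randrange(n_channels) for _ in range(1000)] for _ in range(3)]
merged = merge(qc_task(sample(d, 0.1, rng)) for d in node_data)
quality = check(merged, n_channels)
print(dict(merged), quality)
```

Only the merged object and its quality would be stored in the repository in this picture; the per-node histograms are intermediate products.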



**Figure 92:** O<sup>2</sup> Quality Control design.

### 5.2.3 Services

**Web User Interface framework** The Web User Interface (Web UI) framework provides the core functionalities and building blocks to easily create rich web applications. The server side features REST and WebSocket API, authentication via CERN Single Sign-On and authorisation using CERN e-groups. The client-side features Cascading Style Sheet building blocks for the user interface, asynchronous data fetching (Ajax), and bidirectional sockets (WebSockets). Several O<sup>2</sup>/FLP GUIs are based on this web interface: AliECS, InfoLogger, QC, and Bookkeeping.

**Control and configuration** The ALICE Experiment Control System (AliECS) [105] integrates the experiment control and configuration, the FLP farm control, and a high-level control interface to the $O^2$/EPN cluster. It implements a distributed state machine to represent the aggregated state of the constituent $O^2$ processes of a data-driven workflow. Furthermore, it allows reconfiguration of running processes and simultaneous operation of multiple workflows, with easy reallocation of resources among workflows. Finally, it reacts to inputs, handling events from the user, the LHC, the trigger system, the DCS, and the FLP cluster itself with a high degree of autonomy.
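The idea of an aggregated state can be sketched as follows. The state names and the aggregation rule are hypothetical and chosen for illustration only; the actual AliECS state machine may differ.

```python
# Hypothetical state names; the real AliECS state machine may differ.
def aggregate_state(task_states):
    """Collapse the states of all tasks of one workflow into a single state."""
    states = set(task_states)
    if not states:
        return "UNDEFINED"
    if "ERROR" in states:
        return "ERROR"        # one failed task marks the whole workflow
    if len(states) == 1:
        return states.pop()   # all tasks agree
    return "MIXED"            # a transition is still in progress


workflow = {"readout": "RUNNING", "stfbuilder": "RUNNING", "stfsender": "CONFIGURED"}
print(aggregate_state(workflow.values()))  # MIXED
```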

Figure 93 shows the architecture of the system. The AliECS core is the control scheduler implementing the distributed state machine. It communicates over the Google Remote Procedure Call (gRPC) protocol with the operator, who uses the GUI and other interactive applications based on the AliECS Command Line Interfaces (CLI). AliECS also uses a variety of communication protocols for the exchanges with other systems: DIP with the LHC, gRPC with the trigger system, and DIM for the communication with the DCS. Apache Mesos [106] is used by AliECS as cluster resource management system for the management of $O^2$/FLP components, resources, and tasks inside the $O^2$/FLP facility, effectively enabling the developer to program against the datacenter (i.e., the $O^2$/FLP facility at LHC Point 2) as if it were a single pool of resources. AliECS supports two $O^2$ Configuration and Control (OCC) interfaces to Mesos agents: either through the OCC library or through an OCC plugin for all the processes based on FairMQ, part of ALFA [107], which is the common $O^2$ transport layer for physics data.

AliECS interfaces with Consul [108], a key-value store which acts as the configuration repository of the system. Once acquired by the AliECS core, configuration information is processed into an in-memory hierarchical key-value store, and from there it is fed into a template system in order to generate task deployment and configuration structures.
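The path from a flat key-value store to a generated task configuration can be sketched in a few lines. The keys, values, and the command template below are hypothetical, and the standard-library `string.Template` stands in for the template system actually used by AliECS.

```python
from string import Template


def to_tree(flat):
    """Turn 'a/b/c' -> value pairs into a nested dictionary."""
    tree = {}
    for key, value in flat.items():
        node = tree
        *parents, leaf = key.split("/")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return tree


# Hypothetical keys and values, as they might be read from the key-value store.
flat = {
    "readout/flp001/links": "12",
    "readout/flp001/buffer_gib": "32",
}
tree = to_tree(flat)

# Hypothetical task template filled from the hierarchical store.
template = Template("readout --links $links --buffer ${buffer_gib}G")
command = template.substitute(tree["readout"]["flp001"])
print(command)  # readout --links 12 --buffer 32G
```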

Most components of AliECS are written in Go [109], a statically typed general purpose programming language in the tradition of C, which is particularly suitable for distributed system development because of its advanced synchronization and threading facilities.



**Figure 93:** AliECS design.

**Monitoring** The monitoring subsystem [110, 111] provides a complete overview of the overall system health and detects performance degradation and component failures by collecting, processing, storing, and visualising values from hardware and software sensors and probes. As presented in Fig. 94, metrics are sent to the system from both Telegraf [112] (for system metrics) and the C++ monitoring library (via Telegraf, for application metrics). These metrics are processed in an Apache Kafka [113] cluster and later written to an InfluxDB [114] time-series database for permanent storage.

The InfluxDB time-series database supports downsampling, which decreases the value resolution over time and thereby reduces the total database size. It is planned to keep high-resolution metrics for several days. After that time, metrics will be downsampled in order to decrease the number of points and will be stored until the end of the calendar year.
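Downsampling amounts to replacing many points by one aggregate per time window. The following pure-Python sketch illustrates the principle with a mean aggregate; it does not use or reproduce the InfluxDB mechanism, and the window length and toy data are arbitrary.

```python
def downsample(points, window):
    """Average (timestamp, value) points into bins of `window` seconds."""
    bins = {}
    for t, v in points:
        bins.setdefault(t - t % window, []).append(v)
    return [(start, sum(vs) / len(vs)) for start, vs in sorted(bins.items())]


# One point per second for two minutes, reduced to one point per minute.
raw = [(t, float(t % 10)) for t in range(120)]
coarse = downsample(raw, 60)
print(len(raw), "->", len(coarse))  # 120 -> 2
```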

The system includes a data visualisation interface based on Grafana [115] and channels for alarms and reporting.

**Logging** The logging system has been adapted from the ALICE Run 2 DAQ software [116]. A new web-based user interface has been developed in addition to the existing GUIs.

**Figure 94:** The $O^2$ computing system monitoring design.

**Bookkeeping** A new bookkeeping system [117] has been developed. It unifies two functionalities: gathering, storing, and presenting metadata associated with the operations of the ALICE detector and tracking the asynchronous processing of the physics data. The front-end and back-end are based on the WebUI framework like the other applications and adapt to various clients such as tablets, mobile devices, and other screens. The back-end includes a relational database and a REST API specified in the OpenAPI standard, which makes it easy to build bindings in various languages (such as C++ and Go).

### 5.2.4 Installation and commissioning

The O<sup>2</sup>/FLP system has replaced the former DAQ system used during the LHC Runs 1 and 2 in the Counting Room 1 (CR1) located in the access shaft of the ALICE experimental cavern at the LHC Point 2. All the optical fibers transferring the data from the detectors to the CRUs were installed in four campaigns: February–June 2019, October–November 2019, February–March 2020, and November–December 2020.

The FLP specification was reviewed in April and May 2019 and the purchase order made in August 2019. The FLPs were delivered in several batches from September to November 2019. The FLPs have then been prepared to house the CRUs which required a mechanical modification of the chassis from September 2019 to February 2020. The connection of fibers to the CRUs (readout and trigger) was performed from April to August 2020 and the cabling to the network from July to September 2020.

The FLP software has been released as a coherent set of packages monthly since July 2019 and weekly from August 2021.

The test of the FLP system with detector electronics started first in the laboratories in June 2018. Its commissioning with (large pieces of) individual detectors started on the surface in June 2019, and in the experimental cavern at the LHC Point 2 in March 2020. The global tests with several detectors began in July 2021 and the first realistic experience with beams was collected during the LHC pilot beam in October 2021.

### 5.3 Event Processing Nodes

The EPN farm is designed to perform a first online data reconstruction pass, extract detector calibration objects, and reduce the data volume in order to fit into the available storage buffer space of about 80 PB. The compression algorithms rely on data reconstruction properties, and the subsequent asynchronous reconstruction passes rely on the calibration objects. During periods when data is not being collected, the EPN farm, in addition to other resources, will be used for the asynchronous reprocessing of the data and will contribute computing resources to the physics analysis of the previously recorded data.

### 5.3.1 EPN farm

Due to the increased Run 3 computing needs and the resulting space and cooling requirements [89], a new data centre for the EPN farm was built on the surface at Point 2 of the LHC, close to the ALICE detector. The new EPN data centre shown in Fig. 95 consists of four modules for standard Information Technology (IT) equipment and one infrastructure module. Each IT module has a cooling capacity of 525 kW and allows for power densities of up to 1 kW computing load per rack height unit (4.5 cm). This design allows the use of highly integrated servers, with the maximum number of supported GPUs per server.



**Figure 95:** The ALICE CR0 data centre which houses the EPN farm.

During data taking, the EPN farm will receive up to  $\sim 900 \text{ GB/s}$  from the FLP farm. This data rate will then need to be reduced to  $\sim 130 \text{ GB/s}$  in real time, in order to write the data to the final storage space, see Fig. 89. The EPN farm is connected to the FLPs via a fast InfiniBand HDR network with a total throughput of 14.4 Tbit/s. Connectivity to the disk buffer in the CERN IT data centre is realized via Ethernet with 100 Gbit/s link speed and a total bandwidth of 2.4 Tbit/s, in a high availability setup.
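The rates quoted above imply a data reduction factor and a network load that can be worked out directly. The sketch below uses only the numbers given in the text and assumes decimal units (1 GB/s = 8 Gbit/s, 1 Tbit/s = 1000 Gbit/s).

```python
FLP_TO_EPN_GB_S = 900.0      # input from the FLP farm (GB/s)
TO_STORAGE_GB_S = 130.0      # output to storage (GB/s)
INFINIBAND_TBIT_S = 14.4     # total InfiniBand throughput (Tbit/s)

reduction = FLP_TO_EPN_GB_S / TO_STORAGE_GB_S
input_tbit_s = FLP_TO_EPN_GB_S * 8 / 1000
network_load = input_tbit_s / INFINIBAND_TBIT_S
print(f"reduction x{reduction:.1f}, input {input_tbit_s:.1f} Tbit/s, "
      f"load {network_load:.0%}")
```

With these inputs the required reduction factor is about 6.9, and the peak FLP-to-EPN traffic of 7.2 Tbit/s corresponds to half of the total InfiniBand throughput.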

The EPN farm consists of 280 EPN servers, which provide the computing power required by the O<sup>2</sup> software for the synchronous Run 3 data processing. The servers were dimensioned by benchmarking the compute performance with simulated data via the O<sup>2</sup> software. This determined the required CPU cores, number of GPUs, and size of the memory per server. In the current state of the computing and software infrastructure, 230 EPNs are needed to process the 50 kHz Pb–Pb data, as determined by replaying simulated data. A 4U Supermicro GPU server was chosen for its capability to house eight double-width GPUs as well as an InfiniBand HDR host adapter. The servers are equipped with two 32-core AMD Rome CPUs (PSE-ROM7452-0057) with 512 GB DDR4-3200 RAM, eight AMD MI50 GPUs with 32 GB memory, a 1 TB NVMe disk, and an InfiniBand high data rate (HDR) host channel adaptor (HCA) operated at 100 Gbit/s.

### 5.3.2 EPN installation

The first containers for the data centre were delivered at the end of September 2018, the last two containers at the end of July 2019. Extensive load tests were performed to commission the control of the cooling system, before installing the IT equipment. The first usage of the data centre was via a test system using old Run 2 servers, called vertical slice, in October 2019. The first batch of the final network was installed and enabled the commissioning of the infrastructure. The vertical slice commissioning allowed the testing of the complete chain, from FLP to EPN to the EOS Open Storage system hosted in the computing center at CERN, in a reduced capacity. In June 2020, the final network for the EPN farm was installed in preparation for the arrival of the EPN servers, and following the Production Readiness Review for the EPN servers in August 2020, the order was prepared. The installation of the servers into the final rack positions was performed in January 2021. Another round of tuning for the cooling system was done with the production servers at the beginning of 2021, to optimize the settings for the final Run 3 system. The EPNs have been used for commissioning at Point 2 since the beginning of 2021. The final part of the network, the gateways from the InfiniBand network to the Ethernet network of the storage facility, was finalised only by June 2022 due to delays in the availability of the gateways and issues with operating multiple gateways in a high-availability cluster.

### 5.3.3 $O^2$ data distribution

The data flow in the O<sup>2</sup> system starts with the CRU performing direct memory access (DMA) transfers of the detector data into the memory of the FLP. The CRU DMA region is mapped as a shared memory region, which allows efficient intra-node communication between readout, data distribution, and local synchronous processing tasks. The zero-copy network transfers are implemented using remote direct memory access (RDMA) protocols of the InfiniBand network interface. Detector data corresponding to the configured number of HeartBeat Frames (HBF), nominally 128 and up to 256, are aggregated from all FLPs on a single EPN node, forming the Time Frame (TF), which is the input for synchronous reconstruction.

**$O^2$ data distribution network** By far the largest data bandwidth requirement in the $O^2$ network comes from the readout data stream from FLPs to EPNs. As the data moves from all FLPs to a single EPN node, as mandated by synchronous processing, the flow must be actively regulated to avoid any points of congestion.



**Figure 96:** Network diagram of the Run 3 O<sup>2</sup> facility.

The network diagram of the entire  $O^2$  facility is shown in Fig. 96. To satisfy individual FLP data rates, which vary from FLP to FLP depending on, e.g. the connected subdetector and number of installed CRUs, a 100 Gb/s InfiniBand network interface was chosen. The same interface is used for EPN nodes. The architecture of the InfiniBand network is implemented using a two-level folded-Clos topology network, often referred to as a fat tree. The network is built using 40 port switches for both core and top-of-the-rack (ToR) level switches. Two nodes interface at 100 Gb/s using a single switch port utilizing copper splitter cables. The core of the network is implemented using fiber optic cables operating at 200 Gb/s. Following the requirements, the network fabric features non-blocking communication from FLPs to EPNs, but implements a high blocking ratio within individual EPN and FLP sub-segments, where the bandwidth is not required. Additionally, three InfiniBand to Ethernet gateways, with 8 links at 100 Gb/s each, provide the required throughput for mass storage and other services for offline processing.
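Two of the numbers in this description can be checked with simple arithmetic: the switch ports needed to attach the nodes via splitter cables, and the aggregate gateway bandwidth. The sketch below takes the node counts quoted elsewhere in this section (199 FLPs, 280 EPNs); the helper name is hypothetical.

```python
import math

N_FLP, N_EPN = 199, 280          # node counts quoted in the text
NODES_PER_PORT = 2               # two 100 Gb/s nodes share one switch port
GATEWAYS, LINKS_PER_GATEWAY, LINK_GBPS = 3, 8, 100


def node_ports(n_nodes, nodes_per_port=NODES_PER_PORT):
    """Switch ports needed to attach n_nodes via splitter cables."""
    return math.ceil(n_nodes / nodes_per_port)


gateway_tbit_s = GATEWAYS * LINKS_PER_GATEWAY * LINK_GBPS / 1000
print(node_ports(N_FLP), node_ports(N_EPN), gateway_tbit_s)  # 100 140 2.4
```

The gateway total of 2.4 Tbit/s agrees with the Ethernet bandwidth to the disk buffer quoted in Sec. 5.3.1.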

**$O^2$ data distribution software** The ALICE detector implements a readout scheme with both continuous and triggered readout where the full online reconstruction is performed during the data acquisition. The data stream from the detectors is grouped into time frames, where one particular TF is processed on a single EPN. The detector data arrive at the cluster of 199 input nodes (FLPs).

Given the large number of FLPs, such a scheme has to implement a deep pipeline, where in the worst-case scenario the number of TFs being built approaches the number of FLP senders. However, as the transfers use RDMA read primitives, the EPNs are able to pull Sub-Time Frames (STFs) close to the line rate of network interfaces, without creating congestion in the network or the receivers. This allows for optimal transfers of the STFs and shortens the TF building pipeline. The data rates across FLPs from different detectors range from 10 kB/s up to 8 GB/s, where the size of STFs depends on the number of HBFs in an STF and can fluctuate in the presence of changing conditions or equipment faults. Both the size and the processing time of time slices can therefore vary strongly in time, and the data distribution scheduling framework has to take such fluctuations into account. There is a trade-off between larger buffers on the processing nodes, which average out some of the fluctuations, and the overall latency of the time-slice processing. Further, the system has to be stable against failures, such as the crash of a processing job or failure of a processing node. In such cases, the data loss is limited to the TFs currently being processed by the affected process or node. The data distribution system is designed to accommodate changing processing requirements by allowing EPNs to be added to or removed from an ongoing data-taking run. This enables the use of processing nodes for offline processing during times when there is a low load on the online system.



**Figure 97:** Data distribution software framework.

The high-bandwidth many-to-one data flow that is needed to assemble full TFs on the EPNs is managed by a data distribution scheduling framework. Figure 97 shows the main components of the developed data distribution system: sub-time-frame builder (StfBuilder) and sender (StfSender) on FLP nodes, time frame builder (TfBuilder) on EPN nodes, and time frame scheduler (TfScheduler) on an EPN infrastructure node which orchestrates the data distribution components and regulates the data flow. Detector data (HBFs) are transmitted to the StfBuilder, which publishes STF objects for the local processing and the quality control. The StfSenders report the availability and information about STFs to the scheduler, which selects a suitable EPN node where the TF can be built, once all STF components are available. Therefore, the scheduler keeps track of the utilization of all EPNs and buffer states of sending nodes. As several TFs are being built at the same time, due to the many-to-one traffic nature of TF aggregation, the scheduler keeps track of network fabric utilization to avoid creating congestion hotspots. Once TfBuilders are given information about all the STFs, they fetch the data from FLPs using remote DMA (RDMA), decreasing the CPU utilization on both the sender and receiver side. In the case when the data cannot be scheduled to any EPN, the StfSenders are instructed to drop the data in order to avoid creating back pressure. The dropping of complete TFs will occur only when there is not enough processing power available, e.g. insufficient number of EPNs in the partition, or in the presence of failures, e.g. frontend or readout misconfiguration or network issues.
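The scheduling logic described above can be condensed into a toy model. The class below is a deliberately simplified sketch: the class and method names are hypothetical, EPN load is reduced to a count of free TF slots, and network fabric utilization is ignored.

```python
class TfScheduler:
    """Toy scheduler: assign a TF to an EPN once all FLPs reported their STF."""

    def __init__(self, flps, epn_free_slots):
        self.flps = set(flps)
        self.free = dict(epn_free_slots)   # EPN name -> free TF slots
        self.reports = {}                  # TF id -> FLPs that reported

    def report_stf(self, tf_id, flp):
        """Called for each STF announcement; returns a decision or None."""
        seen = self.reports.setdefault(tf_id, set())
        seen.add(flp)
        if seen != self.flps:
            return None                    # TF not yet complete
        del self.reports[tf_id]
        epn = max(self.free, key=self.free.get)   # least loaded EPN
        if self.free[epn] == 0:
            return ("drop", None)          # no capacity: senders drop the TF
        self.free[epn] -= 1
        return ("build", epn)              # TfBuilder on `epn` pulls the STFs

    def tf_done(self, epn):
        """An EPN finished processing a TF and frees one slot."""
        self.free[epn] += 1


sched = TfScheduler(["flp1", "flp2"], {"epn1": 1, "epn2": 0})
print(sched.report_stf(0, "flp1"))  # None
print(sched.report_stf(0, "flp2"))  # ('build', 'epn1')
```

In this model a "drop" decision is issued only when no EPN has a free slot, mirroring the statement that complete TFs are dropped rather than creating back pressure.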

The TF scheduling task keeps track of the utilization of TF destination buffers on EPNs and utilization of the shared memory (readout) buffer on FLPs. In case EPN or FLP buffer utilization is reaching the high watermark (configurable, and typically 32 GiB for FLP and 112 GiB for EPN), the TF scheduler throttles the transfers. This ensures the network transfers are not propagating back pressure to the FLP readout processes. Overflowing FLP buffers could be a result of misconfiguration of the readout card or frontend, where an FLP would generate more than 100 Gb/s of raw data or an unstable network link resulting in reduced network bandwidth. Sufficient EPN buffers might not be available if the number of allocated EPNs is not sufficient to process the incoming TFs. In the nominal case, when there is no back pressure in the whole data flow and processing chain, the TF scheduler assigns an EPN for each individual TF, maintaining even EPN and network utilization.
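The high-watermark rule can be written as a single predicate. This is a simplified reading of the paragraph above: the function name is hypothetical, and the thresholds are the typical values quoted in the text.

```python
GIB = 1024 ** 3
# Typical high watermarks quoted in the text (configurable in practice).
HIGH_WATERMARK = {"FLP": 32 * GIB, "EPN": 112 * GIB}


def should_throttle(flp_buffer_used, epn_buffer_used):
    """Throttle transfers when either buffer has reached its high watermark."""
    return (flp_buffer_used >= HIGH_WATERMARK["FLP"]
            or epn_buffer_used >= HIGH_WATERMARK["EPN"])


print(should_throttle(10 * GIB, 50 * GIB))   # False
print(should_throttle(33 * GIB, 50 * GIB))   # True
```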

### 5.4 Physics data processing

### 5.4.1 Asynchronous reconstruction

Pb–Pb and pp data taking with synchronous reconstruction is followed by a period of about four to six weeks during which the final calibration constants are evaluated. For some detectors, this also requires a short calibration pass over CTF data before the full asynchronous production passes can start.

The final calibration is performed during reconstruction passes, in particular targeting full correction of TPC space-charge distortions and nominal resolution. At this stage, all detectors are included in the reconstruction. The TPC tracks are matched to ITS tracks and propagated to the outer detectors. The global tracks are established by combining information from multiple detectors, and improved track fits are performed. Primary vertices are reconstructed and secondary vertices are identified in order to reconstruct V0 and cascade candidates. For long-lived particles decaying at large radius and producing TPC tracks unconstrained by other detectors, continuous readout poses an additional challenge. Every pair of unconstrained TPC tracks needs to be tested for multiple hypotheses of V0s from different primary vertices compatible with the allowed time (or z) range of the tracks. Since the TPC track corrections depend on their z position, this may even require on-the-fly re-calibration and refit of TPC tracks. In a final step, the particle identification hypothesis is assigned, based on combined information from all detectors. For the muon spectrometer system, stand-alone tracking is performed for MFT and MCH, followed by matching of MFT-MCH track-segments and track selection to form global muon tracks. At least two full passes of asynchronous reconstruction are planned to achieve the full performance.

For pp data at full energy, the first reconstruction pass includes an event selection procedure in order to reduce the overall data size. In addition to physics events of interest, such as heavy-flavor, high-multiplicity, or diffractive events, events needed for the TPC distortion calibration are selected. The CTF size is reduced by only keeping the clusters associated to tracks that point to the primary vertex of a selected collision within $\pm 30$ cm in z. The goal is an event rejection factor of 1000 leading to a CTF reduction to 1.2% of the original size. Data from reference runs with pp collisions at the same centre-of-mass energy as the Pb–Pb data taking will not be pre-selected, but fully transferred to mass storage.

When the EPN farm is not (fully) used for synchronous processing, e.g. outside data taking periods, it will be used for asynchronous reconstruction. During asynchronous reconstruction, the number of processing steps is larger than in synchronous reconstruction and, without further code running on the GPU, processing is CPU bound. To make optimal use of the GPU resources on the EPN farm also during asynchronous reconstruction, ALICE aims to offload more processing steps onto the GPU with the final goal to run the complete barrel tracking on GPUs. The reconstruction code is written using generic C++ code and can run on different GPU hardware. This also opens the possibility to run reconstruction efficiently on heterogeneous computing platforms that become available on the GRID.

Reconstruction passes are followed by Monte Carlo (MC) simulation productions. Physics analysis will be performed using GRID computing resources and on dedicated analysis facilities, using AODs from collision data and from MC productions as input, and will produce additional physics objects like fully reconstructed charmed hadrons and jets.

### 5.4.2 Simulation

Physics simulation comprises primary event simulation, the transport of particles through the detector geometry, detector response simulation, and digitisation of the detector signals. The O<sup>2</sup> software framework for simulation [118] has been developed within the ALFA project, an ALICE/FAIR collaborative effort based on common components such as FairRoot [119] and FairMQ [92].

GEANT4 [120] is employed as the main transport engine. As for AliRoot in Runs 1 and 2 [121], the  $O^2$  simulation framework uses external transport codes through the Virtual Monte Carlo (VMC) layer [122]. This allows also the use of GEANT3 [44] and FLUKA [123] with the same user code, for example for studies of systematic uncertainties or radiation calculations. The detector geometry is described using ROOT/TGeo and the detector response using the VMC API and callbacks. As part of new developments for  $O^2$  the VMC interface has been extended to interfacing fast detector simulation components which can replace detailed simulation in parts of the detector or for certain particle types [124]. Preserving the VMC interface has allowed efficient porting of detector code from AliRoot into  $O^2$ .

The  $O^2$  simulation framework has been developed with two main objectives in mind: the possibility to leverage opportunistic resources, in particular High Performance Computing (HPC) facilities, which frequently offer only very short processing time windows, and performance optimization through parallelism on top of the capability of individual parts, going beyond standard event multi-threading of GEANT4. To this end, the simulation process is broken up into individual subprocesses: primary particle generation (event server), detector simulation, and I/O processes. These subprocesses run in parallel and interact with each other via sending and receiving messages. Parallelism is improved by further dividing the event simulation task into the processing of sub-events. Multiple independent detector simulation worker devices are instantiated at the same time. Each of them asks the event server for work chunks to process, where a chunk is either a full event or a sub-event. Hence, the system is able to process multiple events in parallel or collaborate on the simulation of a single event concurrently. A strategy based on so called late forking makes optimal use of common memory between the different processes. Processing speed-up as a function of simulation workers shows ideal strong scaling up to the physical number of cores. By reducing the processing time for a unit of work, the framework naturally supports the usage of opportunistic resources providing short processing time windows. In addition, there is the possibility to combine the results of various smaller transport simulations during digitization, so that a large and costly timeframe simulation can be split across multiple smaller jobs if necessary.
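The division of events into sub-event chunks that are pulled by parallel workers can be sketched as below. This is only an illustration of the work-splitting idea: a thread pool stands in for the separate processes and message passing used by the framework, and the function names and the one-hit-per-primary "transport" are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def make_chunks(events, max_primaries):
    """Event server: split each event into sub-events of at most max_primaries."""
    for event_id, primaries in events.items():
        for i in range(0, len(primaries), max_primaries):
            yield event_id, primaries[i:i + max_primaries]


def transport(chunk):
    """Stand-in for detector simulation: one 'hit' per primary particle."""
    event_id, primaries = chunk
    return event_id, [f"hit({p})" for p in primaries]


def simulate(events, max_primaries=2, n_workers=4):
    """Process all chunks in parallel and merge the hits per event."""
    hits = {event_id: [] for event_id in events}
    with ThreadPoolExecutor(n_workers) as pool:
        # map() preserves chunk order, so hits are merged deterministically.
        for event_id, chunk_hits in pool.map(transport,
                                             make_chunks(events, max_primaries)):
            hits[event_id].extend(chunk_hits)
    return hits


events = {0: ["pi+", "pi-", "K+", "p", "e-"], 1: ["mu+"]}
print(simulate(events))
```

Here event 0 is split into three chunks that can be simulated concurrently, while event 1 forms a single chunk.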

The output of detector response simulations are hits typically containing space-point and energy loss information of particles passing sensitive detectors. They serve as the input to the digitization step in which also the Time Frames are created. Since at the peak Pb–Pb luminosity on average five events can overlap within the drift-time of the TPC, a simulated time frame cannot be assembled from independently digitized events. Hence, the digitisation workflow takes into account the contributions from different collisions to the same digits.
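The need to digitize overlapping collisions together can be seen in a toy example in which hits from different collisions contribute charge to the same digit. The data layout (pad, drift time in time bins, charge) and the function name are hypothetical, and hits falling outside the frame are simply dropped for brevity.

```python
from collections import defaultdict


def digitize(collisions, n_time_bins):
    """Sum the charge of all collisions that land in the same (pad, time bin)."""
    digits = defaultdict(float)
    for t0, hits in collisions:               # t0: collision time in time bins
        for pad, drift_bins, charge in hits:  # hit: (pad, drift time, charge)
            time_bin = t0 + drift_bins
            if time_bin < n_time_bins:        # simplification: drop hits beyond the frame
                digits[(pad, time_bin)] += charge
    return dict(digits)


# Two toy collisions whose signals overlap on pad 7 at time bin 12.
collisions = [
    (0, [(7, 12, 1.5), (3, 4, 2.0)]),
    (10, [(7, 2, 0.5)]),
]
print(digitize(collisions, n_time_bins=100))
```

Digitizing the two collisions independently would give two separate digits on pad 7 instead of one digit with the summed charge.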

Since the full simulation of Pb–Pb collisions is very time consuming, an optimized simulation strategy, named embedding, has been developed for AliRoot and used in production during Run 2. The background events are reused multiple times and overlaid with rare signal events. Owing to the overlapping features mentioned above, the $O^2$ simulation framework supports this strategy naturally. The maximum time gain by embedding is limited by the time spent in digitization, in particular for the TPC. For this reason, effort has been put into reducing digitization time to a minimum. The fraction of digitization time of the total simulation time is $\approx 10\%$ without embedding and reaches 40% with embedding. In addition to strategies such as embedding, an efficient MC workload execution engine based on a directed-acyclic graph scheduler was developed. This engine performs dynamic scheduling of tasks in the MC processing chain with the goal to make optimal use of multi-core GRID resources. Moreover, it naturally brings novel features, such as checkpointing or start-stop-continue possibilities to the processing. This is important for debugging or to split the processing over multiple GRID jobs.
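Why digitization limits the gain from embedding can be illustrated with a simple Amdahl-type model. The model is an assumption made here for illustration and is not taken from the text: the background transport cost is shared among all events that reuse it, the signal transport cost is neglected, and digitization is repeated for every embedded event.

```python
def embedding_model(n_reuse, digi_fraction=0.10):
    """Return (speed-up, digitization share) when each background event is
    reused n_reuse times. Toy model: signal transport is neglected and
    digitization is repeated for every embedded event."""
    time_per_event = (1.0 - digi_fraction) / n_reuse + digi_fraction
    return 1.0 / time_per_event, digi_fraction / time_per_event


for n in (1, 6, 100):
    speedup, share = embedding_model(n)
    print(f"reuse {n:3d}: speed-up {speedup:4.1f}, digitization share {share:.0%}")
```

In this model a digitization share of 40% corresponds to a reuse factor of 6 and a speed-up of 4, and the speed-up can never exceed the inverse of the initial digitization fraction, i.e. a factor of 10.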

### 5.4.3 Analysis

In Run 3, about 4 PB of AODs will be produced per Pb–Pb running period and a total of about 50 PB will be accumulated in Runs 3 and 4. Considering a typical analysis turnaround cycle of a few days for the full data set and assuming that all data is read only once, a data throughput of the order of up to 100 GB/s is required. In order to meet this requirement, an optimized analysis model as well as a new analysis framework have been designed.
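The relation between data volume, throughput, and turnaround time is a one-line calculation. The sketch below assumes decimal units (1 PB = 10^15 bytes) and that each byte is read exactly once, as stated above.

```python
PB = 1e15  # bytes (decimal petabyte)
GB = 1e9   # bytes (decimal gigabyte)


def turnaround_days(data_bytes, throughput_bytes_per_s):
    """Time to read the full data set once at the given aggregate throughput."""
    return data_bytes / throughput_bytes_per_s / 86400


print(f"{turnaround_days(50 * PB, 100 * GB):.1f} days")  # 5.8 days
print(f"{turnaround_days(4 * PB, 100 * GB):.2f} days")   # 0.46 days
```

Reading the full 50 PB at 100 GB/s thus takes just under six days, in line with a turnaround of a few days.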

In order to achieve a fast turnaround cycle for analysis code validation and cut optimization, 10% of each data set (including simulated data) are copied to dedicated analysis facilities with exclusive access for ALICE (at the time of writing GSI, Darmstadt and Wigner, Budapest). Each of these consists of 20000 CPU cores, equipped with fast local storage and an internal network capable of sustaining high rates of data transfer from the storage to the computing nodes. The facility only analyzes local data, to reduce problems due to slow external network connections or remote storage instabilities. Analysis task validation on the analysis facility before running over full data sets on the GRID avoids inefficiencies in the most costly stage of processing. Moreover, a large reduction of processing time is expected from the systematic usage of so-called derived data sets of reduced size. This is in particular the case for analysis of rare processes. Derived data sets can be obtained through event selection (filtering) and/or event data reduction (selection of only those quantities strictly needed for a specific analysis).

The new data analysis framework [125] fully leverages the DPL and is built on top of it, offering an even higher level of abstraction for the benefit of analysis code writers. As in Run 2, analysis is organized in trains consisting of wagons, the individual analysis tasks [126]. In the new framework, wagons correspond to a group of DPL devices, which allows the tasks to be processed in parallel and crashing tasks to be removed from the train. Data are represented in memory as flat tables similar to a relational database and stored as flat ROOT trees. This saves the processing time for de-serialization needed for the nested C++ objects used in the old framework. In order to keep the size on disk small, a number of quantities are recomputed automatically when the data are read from disk. Significant development has been done to perform these operations transparently for the user (building on C++17 extensions). The in-memory tables are implemented using Apache Arrow, an Open Source cross-language development platform [95]. It provides interoperability between external tools like Python Pandas [127], Apache Spark [128], and many others. The compatibility with ROOT is guaranteed by using the TArrowDS data source which allows using Arrow with RDataFrame. Besides I/O efficiency, the new data format naturally allows for optimized vectorized processing and declarative analysis. The framework's API isolates many of the advanced features from the user and data access methods are similar to the ones used in the analysis framework in Runs 1 and 2. This facilitates porting of user analysis code into the new framework.
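The combination of flat, column-wise storage with quantities that are recomputed on read can be illustrated with a small stand-in. The class below is plain Python and does not use Apache Arrow or the framework's API; the class name, the column names, and the derived quantity are hypothetical.

```python
import math


class FlatTable:
    """Column-wise table with stored and on-the-fly ('dynamic') columns."""

    def __init__(self, columns, dynamic=None):
        self.columns = columns          # name -> list of values (stored)
        self.dynamic = dynamic or {}    # name -> function of a row dict

    def __len__(self):
        return len(next(iter(self.columns.values())))

    def rows(self):
        """Iterate over rows, adding the dynamic columns to each one."""
        for i in range(len(self)):
            row = {name: col[i] for name, col in self.columns.items()}
            for name, func in self.dynamic.items():
                row[name] = func(row)   # recomputed when read, never stored
            yield row


# Hypothetical track table: only px and py are stored, pt is derived.
tracks = FlatTable(
    {"px": [3.0, 0.6], "py": [4.0, 0.8]},
    dynamic={"pt": lambda r: math.hypot(r["px"], r["py"])},
)
selected = [r for r in tracks.rows() if r["pt"] > 2.0]
print(selected)
```

Only the stored columns contribute to the size on disk in this picture, while the user code accesses the derived column like any other.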

### 5.5 Central Trigger System

ALICE operates at an interaction rate of 50 kHz for Pb–Pb collisions and up to 1 MHz for pp and p–Pb collisions. The majority of ALICE detectors are read out continuously. A minimum bias trigger signal is recorded along with the continuous data in order to flag collisions. For legacy detectors that are not upgraded to continuous readout, the minimum bias trigger initiates the readout if their readout electronics are available and not busy from a previous readout operation. For pp running, event filtering based on fully reconstructed events will run on the EPN farm. The upgraded central trigger system (CTS) [129] provides clock, timing, and trigger signals.

### 5.5.1 Requirements of the Central Trigger System

The CTS supports continuous and triggered readout. Detectors upgraded to continuous readout use triggered mode for commissioning and dedicated runs only. The CTS governs the continuous readout by sending regular heartbeat (HB) triggers to the front-end of upgraded detectors to synchronise data streams of the detectors and to adjust the data-taking bandwidth by either sending a HB accept (HBa) or a HB reject (HBr) trigger message. The CTS also provides minimum bias triggers at three different latencies, referred to as LM, L0, and L1, depending on the timing requirements of each detector. The CTS operates without dead-time by processing trigger inputs and distributing the corresponding trigger output signal for each bunch crossing. The CTS is connected to the ALICE readout via its own dedicated CRU, such that trigger decisions are recorded together with the detector data. In addition, the system may be used to monitor the status of all CRUs and is able to throttle the readout rate depending on the status of the CRU buffers.
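How a sequence of HB accept and HB reject messages adjusts the data-taking bandwidth can be sketched with a simple generator. The even-spreading rule used here is a hypothetical choice made for illustration; the text does not specify how the CTS selects which heartbeats to accept.

```python
def heartbeat_decisions(n_heartbeats, accept_fraction):
    """Yield 'HBa' or 'HBr' so that about accept_fraction of frames is kept.
    An accumulated credit spreads the rejects evenly (hypothetical rule)."""
    credit = 0.0
    for _ in range(n_heartbeats):
        credit += accept_fraction
        if credit >= 1.0:
            credit -= 1.0
            yield "HBa"
        else:
            yield "HBr"


print(list(heartbeat_decisions(8, 0.5)))
```

With an accept fraction of 0.5 every second heartbeat frame is kept, halving the data volume sent downstream.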

### 5.5.2 Trigger hardware and interfaces

Just like the trigger system for Runs 1 and 2 [1], the CTS system is located in the experimental cavern. It employs a two-stage distribution system, which includes a Central Trigger Processor (CTP, see Fig. 98) and Local Trigger Units (LTU). The CTP receives the LHC timing signals and the trigger input signals, and is connected via bidirectional TTC-PON optical links [5] to up to 18 Local Trigger Units (LTU), one for each detector. The standard CTS timing and trigger signal distribution path is from the CTS via the detector specific CRUs to the detector front-ends via bidirectional, radiation-tolerant GBT links [6]. Detectors that require latency-critical trigger signals receive these trigger signals additionally on a direct path from the CTS to the detector front-ends on GBT links. Legacy detectors not supporting continuous readout are read out via C-RORC readout cards [7, 102] and require a hardware trigger signal to initiate the readout. They receive the clock and trigger signals via the legacy TTC system [4]. In commissioning runs, the LTUs may be decoupled from the CTP and used to emulate CTP signals for testing purposes.

The CTP and LTU are based on identical PCBs. Each board contains one Xilinx Kintex Ultrascale FPGA, two 1 GB DDR4 memories, two Si5345 PLLs, one FMC-HPC connector, two six-fold SFP+ cages, a single SFP+ cage, and two UCD90120A power controllers. The CTP and LTU boards differ only in the installed FPGA: the CTP board is equipped with the more performant XCKU060-2FFVA1156E, while the LTU uses the pin-compatible XCKU040-2FFVA1156E. The boards feature a triple-width 6U VME format using the VME backplane for power supply only, with all data interfaces being available via the front panel.

The CTP board utilises an FPGA mezzanine card with a total of 72 LVDS I/O connections: two differential links are used for clock signals, 48 for trigger inputs (12 LM, 12 L0, and 24 L1), four for BUSY inputs from legacy detectors, and two for direct LM trigger outputs for detectors requiring a minimum latency trigger. Some of the LTUs are equipped with a commercial FMC S-18 card to extend the number of optical connections to the detectors from 12 to 19. The LHC clock and orbit signals are connected via Lemo connections. The cards are remotely programmable via their JTAG ports connected to an Ethernet adapter and are controlled via an IPbus interface [130]. The CTS allows monitoring of internal counters, including trigger inputs, subdetector BUSY signals, and an internal snapshot memory.

### 5.5.3 Trigger protocol and data format

Figure 98: Photograph of a CTP module.

The minimum bias trigger input signals are delivered to the CTP by the FIT detector. The TOF, EMCal, DCal, and PHOS detectors also deliver trigger inputs for dedicated run scenarios. The CTS aligns the trigger inputs and synchronises them to the BC clock. The trigger algorithm is applied using a lookup table and produces the trigger output signals. The latencies for the trigger input signals to reach the CTS are 425 ns, 1200 ns, and 6100 ns for LM, L0, and L1, respectively. The CTS processing and signal propagation time is about 150 ns. The total latency from interaction to output trigger signal is therefore 575 ns for the LM trigger. The CTS can generate internal triggers controlled by software, which are used for debugging and detector calibration.
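The latency budget quoted above is a simple sum of the input latency and the CTS processing time. The short Python sketch below reproduces the 575 ns figure from the 425 ns and 150 ns values in the text. Applying the same 150 ns to L0 and L1 is an assumption made here for illustration; the text does not quote totals for those levels.

```python
# Input latencies (time for trigger inputs to reach the CTS), from the text.
INPUT_LATENCY_NS = {"LM": 425, "L0": 1200, "L1": 6100}

# CTS processing and signal propagation time ("about 150 ns" in the text).
# Assumption: the same value is applied to all three trigger levels.
CTS_PROCESSING_NS = 150

# Total latency from interaction to output trigger signal, per level.
total_latency_ns = {
    level: latency + CTS_PROCESSING_NS
    for level, latency in INPUT_LATENCY_NS.items()
}
```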

The CTS allows grouping of detectors into up to 18 clusters, each forming a data acquisition partition independent from other clusters. Naturally, it also foresees the inclusion and exclusion of individual detectors depending on run conditions. Similar to Runs 1 and 2, the trigger signal distribution for the triggered legacy detectors is protected by a BUSY signal, which communicates whether a detector is ready to receive the trigger signal. The trigger message transmitted to the CRUs and detector front-ends consists of the trigger type (32 bit), the LHC orbit counter (32 bit), and the bunch crossing counter (12 bit). The CTP is read out similarly to a detector and its data are merged into the continuous data stream. The CTS readout contains information on the trigger messages sent to the detectors, the trigger input mask (specifying the active CTS trigger inputs), and the trigger mask (specifying the trigger conditions and the active detectors) for each bunch crossing. The transmission of the status information of all the CRU buffers, sent upstream from the CRUs to the CTP, will be implemented at a later stage.
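To make the field widths of the trigger message concrete, the following Python sketch packs the three fields into a single 76-bit integer and unpacks them again. Only the widths (32, 32, and 12 bits) come from the text; the ordering of the fields within the word and the function names are assumptions made for illustration and do not describe the actual wire format.

```python
# Field widths taken from the text.
TYPE_BITS = 32
ORBIT_BITS = 32
BC_BITS = 12


def pack_trigger_message(trigger_type, orbit, bc):
    """Pack the fields as [type | orbit | bc]; this ordering is hypothetical."""
    fields = ((trigger_type, TYPE_BITS), (orbit, ORBIT_BITS), (bc, BC_BITS))
    for value, bits in fields:
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
    return (trigger_type << (ORBIT_BITS + BC_BITS)) | (orbit << BC_BITS) | bc


def unpack_trigger_message(word):
    """Inverse of pack_trigger_message."""
    bc = word & ((1 << BC_BITS) - 1)
    orbit = (word >> BC_BITS) & ((1 << ORBIT_BITS) - 1)
    trigger_type = word >> (ORBIT_BITS + BC_BITS)
    return trigger_type, orbit, bc


# Round trip with arbitrary example values.
word = pack_trigger_message(trigger_type=0x10, orbit=123456, bc=3563)
fields = unpack_trigger_message(word)
```

A 12-bit bunch crossing counter can hold values up to 4095, which covers the 3564 bunch crossing slots of one LHC orbit.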

A general overview of the trigger system is shown in Fig. 99.

In the continuous readout scheme, the readout rate may be downscaled by introducing periodic HBr triggers. The data rate can be adjusted by changing the ratio between HBa and HBr triggers as required. At present, the system allows the application of a pre-defined sequence. Dynamic modification of the ratio depending on the CRU buffer status will be implemented at a later stage, once the operation of the ALICE system is fully tuned. This functionality uses the transmission of the state of the CRU buffers in the HB acknowledge messages that are sent by the CRUs to the CTS upon reception of a HBa or HBr message. The HB acknowledge messages of all CRUs are assembled into a HB map in the CTS, which is used to assess the buffer states of all CRUs and to decide on modulating the HBa/HBr rate accordingly.
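The effect of a pre-defined HBa/HBr sequence on the recorded data fraction can be sketched as follows. The pattern shape (a block of accepts followed by a block of rejects) and the function names are assumptions for illustration; the actual sequences used by the CTS are not specified in the text.

```python
from itertools import cycle, islice


def hb_sequence(n_accept, n_reject):
    """Endless repeating pattern: n_accept HBa followed by n_reject HBr.

    The block structure of the pattern is a hypothetical choice.
    """
    return cycle(["HBa"] * n_accept + ["HBr"] * n_reject)


def accepted_fraction(n_accept, n_reject):
    """Fraction of heartbeat frames that are read out."""
    return n_accept / (n_accept + n_reject)


# Example: accept three heartbeat frames, then reject one.
first_frames = list(islice(hb_sequence(3, 1), 8))
fraction = accepted_fraction(3, 1)
```

With three accepts per reject, three quarters of the heartbeat frames are read out, so the data rate is reduced by 25% with respect to accepting every frame.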

### 5.6 Detector Control System

The ALICE Detector Control System (DCS) [131] ensures safe and stable operation of the experiment. Its architecture is derived from the previous versions used in Run 1 and Run 2, but significant extensions were developed and deployed for Run 3 to allow for integration of new readout electronics. The internal conditions data flow was also modified, to provide continuous streaming of conditions data in real time [132].

Figure 99: Trigger system overview.

An optimized standalone DCS system is available for each detector. These systems are built by detector teams, following the guidelines prepared by the central DCS team. The central DCS further integrates the detector DCS into one distributed system, which can be operated by a single operator.

During the design phase of the DCS, great attention was given to the selection of hardware components and software tools. All systems are based on the commercial SCADA (Supervisory Control and Data Acquisition) system WINCC OA [133] provided by Siemens. SCADA systems are based on industrial standards and are widely used to efficiently supervise processes by monitoring and controlling the devices. A high level of standardization allows for the deployment of common solutions, which largely simplify the development cycle and reduce deviations from operational standards adopted in ALICE:

- The CERN Joint Control Project (JCOP) provides a common framework for integration of standard devices such as power supplies, Embedded Local Monitor Boards (ELMB) [134], magnetic field sensors, etc. It is developed as a joint effort of LHC experiments. ALICE contributes with a component for integration and control of ISEG power supplies. The JCOP framework also contains, for example, all necessary tools for integration of CERN standard protocols DIM [135] and DIP [136] as well as the State Management Interface (SMI++) [137] used to model the device operation as FSMs.

- The ALICE framework further extends the JCOP framework with tools specific to ALICE, such as a unified user interface or the FRED framework for integration of front-end modules.

Despite its complexity, the whole DCS can be operated by one person using a single user interface. The FSM mechanism guarantees that commands are executed in a correct order and experiment conditions (such as status of cooling or power systems) are always taken into account. Predefined states were implemented as a response to different conditions of the experiment. For example, the requirements for high voltage settings during the beam injections differ from those during data taking.

To minimize the human factor in the operation, most actions are executed by the system without the need for manual intervention. The operator specifies the desired state (for example, move ALICE to a state compatible with beam injection or magnet ramps) and the system determines and executes the corresponding sequence of commands. Direct interaction of the operator with the system is required only in case of exceptions (for example, the recovery from a power cut or detector trip). Almost all operations could be executed automatically; however, certain checkpoints where an operator response is required were introduced to ensure that the shift crew gives sufficient attention to the operation.
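The principle of specifying a desired state and letting the system derive the command sequence can be sketched with a small state machine in Python. This is not SMI++ code, and the state names, command names, and transition table are hypothetical. The sketch uses a breadth-first search to find the shortest sequence of allowed commands.

```python
from collections import deque

# Hypothetical detector states and the commands that move between them.
TRANSITIONS = {
    "OFF": {"GO_STANDBY": "STANDBY"},
    "STANDBY": {"GO_OFF": "OFF", "GO_BEAM_SAFE": "BEAM_SAFE"},
    "BEAM_SAFE": {"GO_STANDBY": "STANDBY", "GO_READY": "READY"},
    "READY": {"GO_BEAM_SAFE": "BEAM_SAFE"},
}


def command_sequence(current, target):
    """Return the shortest list of commands leading from current to target."""
    queue = deque([(current, [])])
    seen = {current}
    while queue:
        state, commands = queue.popleft()
        if state == target:
            return commands
        for command, next_state in TRANSITIONS[state].items():
            if next_state not in seen:
                seen.add(next_state)
                queue.append((next_state, commands + [command]))
    raise ValueError(f"no path from {current} to {target}")


# The operator only names the target state; the sequence is derived.
to_ready = command_sequence("OFF", "READY")
to_standby = command_sequence("READY", "STANDBY")
```

Because only transitions listed in the table can be taken, commands are always issued in an allowed order, which mirrors the role of the FSM mechanism described above.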

### 5.6.1 DCS computing hardware upgrades

The core of DCS computing is installed in ALICE counting room CR 3. Three rows of racks originally hosting the DCS cluster were replaced with 26 new racks in two rows, each equipped with water cooled doors. All DCS computers were replaced with new hardware. In total 200 new servers, mostly running WINCC OA [133], were installed and configured. These servers run the central distributed SCADA system and front-end control servers, and provide services required for DCS operation (DNS, BootP servers providing boot images to diskless controllers, fileservers, databases, etc.). Old servers were kept operational alongside the new servers in order to provide a smooth transition to the new hardware for the detector groups. Using this approach, the DCS maintained almost uninterrupted operation during the whole LS2 period.

Files shared between several systems reside on a new redundant cluster of file servers, which also hosts installation repositories and system backups. The file servers are mirrored in a cluster installed outside of the ALICE experiment network.

The DCS configuration data (device settings, front-end configuration, etc.) and conditions data (temperatures, currents, pressures, system states, etc.) are stored in an ORACLE database. This mission critical service is provided by a new cluster consisting of four servers and highly redundant storage. The whole database is replicated using ORACLE Active Data Guard technology (ADG) [138] to a twin cluster installed outside of the ALICE detector network. The ADG provides data availability and protection by mirroring data to a standby database which can replace the ALICE online cluster in case of severe failure. The standby database further provides ALICE data in read-only mode, to offload heavy load operations from the production cluster.

All computer racks hosting the DCS cluster are equipped with switches with 10 Gigabit Ethernet uplinks to the router. This provides sufficient bandwidth for DCS operation also beyond Run 3. A total of 144 multi-mode and 48 single-mode patch panels are installed in the DCS counting room. Two high-speed (40 Gb/s) Ethernet links connect the DCS cluster with the FLP farm. These links carry the traffic to the front-end electronics and are used for the streaming of DCS data to  $O^2$  as explained below.

The DCS counting room network renewal is part of the overall  $O^2$  network upgrade. The DCS network services counting rooms, control rooms, surface areas (gas system, EPN containers), and the whole cavern. The infrastructure put in operation before Run 1 reached the end of its lifetime and was entirely replaced. All routers and switches were installed in parallel to the installation and commissioning of detectors. New infrastructure was installed alongside the old one and was put in operation in phases, in order to minimize the impact on the commissioning activities of the experiment. The network upgrade process lasted two years and the old network was decommissioned recently.

One of the key factors affecting the selection of communication buses is the distance between the servers installed on the surface and the devices in the cavern. To provide stability in a harsh environment, the Controller Area Network (CANbus) [139] has been adopted. It is based on a serial bus designed for robust performance over long distances. A majority of the devices controlled by CANbus are commercial power supplies or Embedded Local Monitor Boards (ELMB), mainly used for environment monitoring and rack control. CANbus is also used in the ITS controls, where it provides a redundant channel for hardware access and also served the interlock control during the on-surface commissioning. All CANbus controllers, previously based on USB or PCI devices, were replaced with ANAGATE CANbus-Ethernet gateways [140]. This change removed the dependency on a physical connection between the DCS server and the CAN controller. In case of a server failure, a physical intervention on the CANbus network is no longer required.

### 5.6.2 DCS software upgrades

The DCS software is structured into several layers as shown in Fig. 100.

The driver layer is connected directly to controlled hardware. It provides the device specific low-level interface. To integrate the numerous hardware configurations used in ALICE, this layer also provides hardware abstraction, which hides all device-specific details. The devices are presented to the DCS by standardized interfaces. Most commercial devices use the industrial communication standard OPC [141] for this purpose. An OPC server communicates with the hardware and exposes its functionality in a form of standard commands and services carrying the monitored data.

Custom hardware modules developed for ALICE are controlled by software exposing their functionality to WINCC OA using the CERN DIM protocol. Similar to OPC, the DIM servers convert the device-specific interface to a standardized set of commands and services. Both technologies require that either an OPC or a DIM client is deployed on the WINCC OA system; however, this component is common for all detectors.

The controls layer implemented in WINCC OA performs the basic control and monitoring tasks. It sends commands to devices and reads back the responses. Scripted actions allow procedures to be executed based on the received response. In addition, monitored values are compared with predefined settings, and in case of deviations a message can be sent to an alert system. In most cases an automated procedure can regulate the settings without any human intervention.
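The comparison of monitored values with predefined settings can be sketched in a few lines of Python. In the DCS this logic lives in WINCC OA scripts and alert configurations; the function name, the message format, and the example channel below are hypothetical.

```python
def check_value(name, value, nominal, tolerance):
    """Return an alert message if value deviates from nominal by more
    than tolerance, otherwise None. All names here are illustrative."""
    deviation = abs(value - nominal)
    if deviation > tolerance:
        return f"ALERT {name}: {value} deviates from {nominal} by {deviation:.2f}"
    return None


# A hypothetical high-voltage channel with a nominal setting of 1500 V.
ok = check_value("hv_channel_0", 1498.0, nominal=1500.0, tolerance=5.0)
alert = check_value("hv_channel_0", 1490.0, nominal=1500.0, tolerance=5.0)
```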

The logical layer encodes detector-specific operations in the form of a Finite State Machine (FSM) organized in a hierarchical way. This layer is entirely encoded in the SMI++ language, which allows for the definition of stable states and rules for the transitions between them. The FSM logic encodes experience gained over years of operation and defines the reaction of detectors to different experiment conditions (such as beam injections, data taking, ALICE magnet ramps, etc.).

Finally, the user interface layer presents the DCS to the operator using a unified interface. All DCS functionality can be reached from a single panel in an intuitive way. Each action sent from the user interface is verified by high-level scripts, which protects the experiment from incorrect commands sent while detectors are not in a compatible condition.



Figure 100: DCS software architecture.

A significant part of the software upgrade concerned the communication with the devices. In the past years the widely used commercial OPC standard evolved and a new OPC UA technology [141] emerged. It replaces the older OPC DA technology used in Runs 1 and 2. During LS2, the new standard was tested and adopted in ALICE. Currently, a total of 49 OPC servers control commercial hardware (power supplies, PLCs, etc.).

Upgraded front-end electronics in ALICE are read out through GBT links described in Section 2.2. Those are controlled by CRUs, installed in FLP servers. The same links are used to configure, control, and monitor the detector front-end electronics. This grouping of functionality has an impact on DCS, because the FLPs are not part of the DCS domain. As a result, the DCS does not have direct control over the front-end electronics.

A client-server based architecture, named ALFRED, was developed to address the link sharing between the readout and the DCS. The low-level link access is established through the ALF (ALICE Low-level Access) module, which is executed on the FLPs. This software is detector agnostic and transmits data received from FRED (Front-End Device server) [142] to the front-end electronics. If the communication with the front-end electronics is based on the Slow Control Adapter (SCA) protocol, commands are sent through a dedicated communication channel using reserved bits in the transmitted GBT frames. For faster controls (mass configurations), the SCA channel does not provide sufficient throughput. Custom protocols, such as Single Word Transactions (SWT), are mastered by ALFRED. The data are transmitted in dedicated GBT frames in this case. The response produced by the detector front-end electronics is propagated from ALF back to FRED as illustrated in Fig. 101.



Figure 101: Access to the hardware implemented in the ALF-FRED mechanism.

The FRED component is a framework provided by central DCS. Detector specific code is embedded inside the FRED server framework. The framework functionality covers the ALF-to-FRED communication and synchronization. FRED also provides a communication layer to interface the system with WINCC OA. Using this architecture, a large uniformity has been achieved even between the largely heterogeneous detector front-ends. Currently eight detectors use FRED to communicate with electronics through GBT links and one detector uses FRED to control the user part of the CRU firmware.

Profiting from the flexibility of the ALFRED framework, the ITS detector developed an interlock system controlled via CANbus. A modified version of ALF (CANALF) communicates with the devices using the ANAGATE gateway instead of the CRU. The ITS electronics can also use CANbus in place of GBT as a fallback solution, bypassing the FLP. Due to the reduced bandwidth of CANbus, this solution cannot fully replace the standard communication over the GBT link; however, it provides the necessary level of redundancy. Using a command originating in WINCC OA, FRED can redirect communication from ALF to CANALF and continue detector operation still using the same user code.
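The redirection from ALF to CANALF without changes to the user code can be pictured as a swap of the transport behind a fixed interface. The Python sketch below is a schematic illustration only; the class names, method names, and the command string are hypothetical and do not reflect the FRED API.

```python
class Transport:
    """Common interface for a low-level link (hypothetical)."""
    name = "base"

    def send(self, command):
        # A real transport would talk to hardware; here the route is echoed.
        return f"{self.name}:{command}"


class Alf(Transport):
    name = "ALF"      # standard path via CRU and GBT link


class CanAlf(Transport):
    name = "CANALF"   # fallback path via the CANbus-Ethernet gateway


class Fred:
    """Holds both transports and routes commands to the active one."""

    def __init__(self):
        self._transports = {"ALF": Alf(), "CANALF": CanAlf()}
        self._active = "ALF"

    def redirect(self, target):
        if target not in self._transports:
            raise ValueError(f"unknown transport {target}")
        self._active = target

    def execute(self, command):
        return self._transports[self._active].send(command)


def user_code(fred):
    """Detector-specific code; unaware of which transport is active."""
    return fred.execute("READ_TEMPERATURE")


fred = Fred()
via_alf = user_code(fred)
fred.redirect("CANALF")
via_can = user_code(fred)
```

The function `user_code` is identical in both calls; only the transport selected inside `Fred` changes.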

Finally, FRED is a fully scalable and parallel framework. For large detectors, a single instance of FRED may have to process large amounts of data. Depending on the detector size, the deployment of FRED therefore ranges from a single FRED instance on a single server, through multiple FRED instances on the same server, up to several instances distributed across multiple servers.

### 5.6.3 DCS conditions data

Data collected from detectors are stored by WINCC OA systems in an ORACLE database. Part of these data are used in the $O^2$ online processing. During LHC Runs 1 and 2, these data were retrieved by a dedicated process after each run (a period of stable data taking during an LHC fill). The collected data were merged with detector data acquired by the DAQ system and further processed by the offline framework. The Run 3 conditions processing expects continuous data processing, and the DCS conditions data must be provided in real time. This challenge was addressed by the newly developed ADAPOS (Alice DAtaPOint Service) system [143].

The ADAPOS server collects a requested set of conditions data from individual WINCC OA systems and assembles a so-called Full Buffer Image (FBI). Each value stored in the FBI is updated whenever the WINCC OA system provides a new measurement. The FBI represents an up-to-date image of current DCS conditions.

The ADAPOS system is able to periodically send the up-to-date FBI for further processing with an update frequency of 20 Hz. However, most of the values do not usually change significantly, and most records in the FBI remain unchanged for longer periods of time. To save resources, ADAPOS can operate in transparent mode, where the FBI is maintained in the ADAPOS memory, but only parameters that recently changed are forwarded to the consumer. The mode selected for operation is a hybrid one, which complements the transparent mode with a full FBI sent at regular intervals to reinforce data integrity at the point of processing.
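The hybrid mode described above can be sketched as a buffer that forwards only changed parameters on most cycles and the complete image at regular intervals. The Python class below is a minimal illustration under stated assumptions: the class name, the method names, the `full_every` parameter, and the message layout are hypothetical and do not describe the ADAPOS implementation.

```python
class FullBufferImage:
    """Keeps the latest value of every parameter and tracks changes."""

    def __init__(self, full_every):
        self.values = {}              # parameter name -> latest value
        self.changed = set()          # names updated since the last message
        self.full_every = full_every  # send the full image every N cycles
        self.cycle = 0

    def update(self, name, value):
        """Store a new measurement; mark it only if the value changed."""
        if name not in self.values or self.values[name] != value:
            self.values[name] = value
            self.changed.add(name)

    def next_message(self):
        """Return ('full', image) every full_every cycles, else ('delta', changes)."""
        self.cycle += 1
        if self.cycle % self.full_every == 0:
            kind, payload = "full", dict(self.values)
        else:
            kind, payload = "delta", {n: self.values[n] for n in self.changed}
        self.changed.clear()
        return kind, payload


fbi = FullBufferImage(full_every=3)
fbi.update("temperature", 21.5)
fbi.update("current", 0.8)
first = fbi.next_message()     # both parameters are new
fbi.update("temperature", 21.5)  # unchanged, not forwarded
fbi.update("current", 0.9)
second = fbi.next_message()    # only the changed parameter
third = fbi.next_message()     # third cycle: the full image
```

A consumer that missed a delta message recovers a consistent state at the latest with the next full image, which is the data-integrity argument made in the text.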

To ensure stability and high availability, several ADAPOS servers can operate in parallel and provide full system redundancy. Implementation of ADAPOS required significant changes inside the WINCC OA system. In the standard configuration used at CERN, the communication with the ORACLE database is handled by a specialized manager (RDB manager). This connects WINCC OA directly with the database. Part of this data stream requested by ADAPOS had to be duplicated and transferred from WINCC OA to ADAPOS using a DIM protocol.

A new technology developed by Siemens and further extended at CERN allowed a novel approach. The Next Generation Archive (NGA) manager plugs into the core of the system and collects data tagged for archival. The NGA can split the data stream between several backends. A CERN developed ORACLE backend handles all communication with ORACLE and fully replaces the RDB manager. The ADAPOS backend, developed in ALICE, forwards condition parameters to ADAPOS. To add new parameters to the ADAPOS data stream, the detector expert needs to tag the corresponding data, and all related configuration will be executed on the fly by the individual software components. Figure 102 shows the data flow from WINCC OA to ORACLE and ADAPOS using the NGA manager and corresponding backends. Each WINCC OA system is built as a collection of highly specialized managers. The data flow is managed by the Data manager (DM) and Event manager (EM).

Figure 102: DCS conditions data flow to ORACLE and ADAPOS.

The described mechanism represents the main data flow from WINCC OA systems to ORACLE or ADAPOS, respectively. The data streaming, however, is not optimal for large data sets, such as masks, chip configurations, etc., that usually do not change during the run. A dedicated service called FilePush allows the injection of such data directly to the CCDB. A similar service called FilePull moves data from the CCDB to the DCS configuration database. This service was created to handle configuration data produced on the $O^2$ data processing side.

A large part of the data is consumed by various online displays and user interfaces. This part of the data flow is covered by WINCC OA tools. Analysis and mitigation of operational issues and analysis of long term performance require retrieval of large amounts of data from the database. The online cluster in ALICE offers data recorded since 2008. Retrieval of large historical records is, however, inefficient when only WINCC OA built-in tools are used. The DCS was therefore extended by a new system, named DARMA (the DCS ARchive MAnager).

DARMA is a web-based system allowing the retrieval of historical values from the database. To protect the online ORACLE cluster from possible overload, DARMA is configured to access data from the online replica of the ALICE production database. Based on ORACLE RAC ADG, this replica contains up-to-date values in real time. The user of DARMA can choose between a web-based GUI and a scripting interface to configure the requests. Finally, the DARMA system is extended by a plugin that allows the Grafana system to access the DCS ORACLE archive. The full DCS data flow from/to external services is shown in Fig. 103.

### 5.6.4 DCS operator environment

Even if the DCS provides fully automated operation and is available at all times with minimum downtime (about two or three days during the whole year for scheduled updates and backups), human supervision is maintained during detector operation. The main task of the operator is to react to anomalies indicated by the alert system and to assist the experts in mitigation. An intuitive and reliable operator environment is an important part of the DCS design.

**Figure 103:** DCS data exchange with $O^2$ and external consumers.

The complexity of the control system presents a challenge for the system designers when creating user interfaces: about one million parameters can be accessed through DCS tools and the global DCS logic is encoded in 70000 finite state machines. Central operations require large sequences of steps that need to be executed by the operator. To facilitate these tasks and protect the systems from operator errors, an ALICE DCS UI component was developed. It provides a coherent way for accessing all ALICE parameters from one user interface. Navigation is based on detector hierarchy, which makes this process intuitive.

All controlled objects are represented in a visual pane. Navigating through the hierarchy, the main user interface displays panels for selected objects. Instructions for the use of the panel are either available in the operator manual, or are embedded in the alert instructions if an anomaly occurred. Various operator panels embedded in the common user interface are shown in Fig. 104.



Figure 104: ALICE DCS UI with various panels used by the operators.

Complex high-level operations related to detector safety, preparation of the detectors for data taking, or communication with the LHC are further embedded in macro commands. The DCS operator is guided through the operations, and the procedure evaluates the status of all components in parallel and prevents human error. For example, a detector not yet fully recovered from a condition compatible with a magnet ramp operation will not allow the operator to ramp up the high voltage until the recovery steps are executed. Similarly, a detector in a state not compatible with beam injection ('not beam safe') will not allow the operator to grant the injection permits to the LHC and hence blocks the injection until the detector is brought to a state safe for beam operation.

The new ALICE DCS UI component is a key operational tool. More than 300 operators were already trained for the Run 3 period, most of them being colleagues with no prior experience with system controls. Use of a single user interface component hiding as many detector specificities as possible significantly reduces the learning curve and improves the operational stability.

### 6 Physics performance

The physics programme motivating the upgrade was first established in the Letter of Intent [144] and the individual Technical Design Reports and was further expanded upon in the summary report of the heavy-ion physics programme at the LHC in Run 3 and 4 across all the LHC experiments [145]. A precise measurement of the long-wavelength behaviour, i.e. the macroscopic fluid-like evolution of a high-density and high-temperature system, allows the determination of transport properties, such as the viscosity of the QGP, and provides information about the equation of state. Probes that are sensitive to short wavelengths give access to the microscopic parton dynamics in a deconfined QGP state. A further goal is to study particle production in order to assess the validity of the fluid description and the role of collectivity in high-multiplicity pp collisions and in p–Pb collisions. Finally, the precise measurement of nuclear parton densities over a wide  $(x, Q^2)$  range is fundamental to constrain the initial conditions.

In the following, some examples of the physics performance with the upgraded detector will be briefly highlighted. Figure 105 shows the resolution in the distance of closest approach of tracks to the primary vertex, in both the transverse ($r\varphi$, solid circle markers) and longitudinal ($z$, open square markers) directions in Pb–Pb collisions, as simulated with the full $O^2$ simulation framework for the upgraded detector (red points) compared to Run 2 data (blue points). The impact parameter resolution is seen to improve by a factor of approximately three in the transverse direction and a factor of ten or more in the longitudinal direction.

**Figure 105:** Transverse (solid circle markers) and longitudinal (open square markers) impact parameter resolution in Pb–Pb collision data with ALICE 1 (blue points) and simulations of the upgraded detector, ALICE 2 (red points).

**Figure 106:** Performance projections for measurements of the nuclear modification factor $R_{AA}$ and elliptic flow coefficient $v_2$ (see text for details) for charm and beauty hadrons [145].

Heavy quarks are produced in hard scatterings between incoming quarks and gluons, and lose energy as they interact with the quark–gluon plasma while propagating out of the collision zone. To quantify the effect of energy loss, the transverse momentum distributions of produced hadrons are measured both in proton-proton and Pb–Pb collisions at the same centre-of-mass energy per nucleon pair. The ratio of the  $p_{\rm T}$  spectra measured in Pb–Pb collisions to those in pp collisions scaled by the nuclear overlap function  $T_{\rm AA}$ , which is proportional to the number of binary nucleon–nucleon collisions, is then calculated:

$$R_{\rm AA} = \frac{\mathrm{d}N/\mathrm{d}p_{\rm T}|_{\rm AA}}{T_{\rm AA}\,\mathrm{d}\sigma/\mathrm{d}p_{\rm T}|_{\rm pp}}.\tag{3}$$

The ratio $R_{AA}$ is called the nuclear modification factor. The left panel of Fig. 106 shows the expected precision for measurements of the nuclear modification factor for charm and beauty hadrons, which are used to determine the mass dependence of energy loss of heavy quarks propagating through the quark–gluon plasma [145]. The figure shows projections for the nuclear modification factor for the open charm $\mathrm{D}^0$ mesons and for measurements of beauty mesons via full hadronic reconstruction and measurements of production of non-prompt charm particles. The mass dependence of parton energy loss is most pronounced at transverse momenta around or below the quark mass, when the quark is moving through the QGP with a velocity that is significantly smaller than the speed of light. The upgraded detector provides measurements with uncertainties that are small enough to reveal the expected mass dependence in this momentum range.
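As a numerical illustration of Eq. (3), the following Python sketch evaluates $R_{\rm AA}$ for a single $p_{\rm T}$ bin. The input values are invented for illustration and are not measurements; the function name and the choice of units are likewise assumptions.

```python
def nuclear_modification_factor(dn_dpt_aa, t_aa, dsigma_dpt_pp):
    """Evaluate Eq. (3) for one pT bin.

    dn_dpt_aa     : per-event yield in Pb-Pb, in (GeV/c)^-1
    t_aa          : nuclear overlap function, in mb^-1
    dsigma_dpt_pp : pp cross section, in mb (GeV/c)^-1
    """
    return dn_dpt_aa / (t_aa * dsigma_dpt_pp)


# Invented example values, chosen only to make the arithmetic transparent.
r_aa = nuclear_modification_factor(dn_dpt_aa=0.5, t_aa=20.0, dsigma_dpt_pp=0.1)
```

In this example the scaled pp reference is $20 \times 0.1 = 2$ per GeV/$c$, so the Pb–Pb yield of 0.5 corresponds to $R_{\rm AA} = 0.25$, i.e. a suppression by a factor of four relative to the binary-collision-scaled pp expectation. A value of one would indicate no nuclear modification.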

The right panel of Fig. 106 shows the expected precision for measurements of the azimuthal asymmetry for charm mesons and baryons [145], as characterised by the elliptic flow $v_2$, which is the second harmonic coefficient of the Fourier expansion of the distributions of the azimuthal angles $\varphi$ of the hadrons with respect to the reaction plane angle $\Psi$:

$$\frac{\mathrm{d}N}{\mathrm{d}\varphi} \propto 1 + 2v_2 \cos 2(\varphi - \Psi). \tag{4}$$

These measurements are in particular sensitive to the diffusion of charm quarks in the QGP and the path length dependence of parton energy loss effects.
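For a distribution of the form of Eq. (4), the coefficient $v_2$ equals the average of $\cos 2(\varphi - \Psi)$ over the distribution. The Python sketch below checks this numerically on a uniform grid in $\varphi$. It assumes a known reaction plane angle and an input value of $v_2$ chosen for illustration; it is not an analysis method used in the projections.

```python
import math


def v2_from_distribution(weights, phis, psi):
    """Weighted average of cos 2(phi - psi) over the distribution."""
    numerator = sum(
        w * math.cos(2 * (phi - psi)) for w, phi in zip(weights, phis)
    )
    return numerator / sum(weights)


# Illustrative inputs: reaction plane angle and true v2 are assumptions.
n_points = 360
psi = 0.3
v2_true = 0.1

# Uniform grid in phi and the distribution of Eq. (4) evaluated on it.
phis = [2 * math.pi * k / n_points for k in range(n_points)]
weights = [1 + 2 * v2_true * math.cos(2 * (phi - psi)) for phi in phis]

v2 = v2_from_distribution(weights, phis, psi)
```

The recovered `v2` agrees with the input value up to floating-point precision, because the $\cos 2(\varphi - \Psi)$ term averages to zero over a full period while its square averages to one half.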

Figure 107 shows the expected performance for measurements of the beauty baryon-to-meson ratio. These measurements probe the mechanisms for meson and baryon formation. For example, hadronisation via quark coalescence (red and blue curve) is expected to lead to an increased production rate of beauty baryons relative to that of the  $B^+$  meson [146–148]. Performing such measurements for the first time in the beauty sector will enhance the understanding of hadronisation mechanisms.

Figure 108 shows the expected precision of the measurement of electron-positron pair production. The left panel shows the mass distribution of all pairs, and the right panel shows the result after subtracting the calculated contributions from the decays of light hadrons, except the  $\rho$  meson and heavy-flavour decays. The result in the right panel is sensitive to the modification of the  $\rho$  meson spectral function in the high-density collision system (indicated by the red line), as well as thermal emission (orange line) from all stages of the collision, including the QGP phase [149], which provides a unique window on the system before hadronisation.

A further upgrade of the inner tracking system is planned to improve the capabilities of ALICE for measurements of electron-positron pairs as well as heavy-flavour measurements. To achieve this, the three innermost layers of the ITS will be replaced with wafer-scale silicon sensors that are bent into a cylindrical shape, supported by carbon foam. More details can be found in the ITS3 Letter of Intent [150].

**Figure 107:** Performance projection for the ratio of beauty baryon and meson yields in central Pb–Pb collisions with the total expected integrated luminosity of $10 \text{ nb}^{-1}$ (at full magnetic field) for Runs 3 and 4. From [145].

**Figure 108:** Projection of the invariant mass distribution of electron-positron pairs before (left panel) and after (right panel) subtraction of the contribution from decays of light and heavy flavour mesons, except the $\rho$ meson. From [145].

**Figure 109:** Projected yield ratio of $\psi(2S)$ and J/$\psi$ in pp and Pb–Pb collisions, measured with the MFT and muon arm [144]. Model calculations from [151, 152].

**Figure 110:** Projected limits on jet energy loss from the measurement of the yields of jets recoiling from high-$p_T$ trigger particles in different collision systems. From [153].

Figure 109 shows the expected performance for measurements of the production of the two main charmonium states with the MFT and the muon detectors. The MFT provides additional background rejection of non-prompt muons and an improved momentum resolution, which make a precise measurement of the production of the  $\psi(2S)$  possible. The production rates of J/ $\psi$  and  $\psi(2S)$  are compared to gain further insight into the mechanisms for quarkonium dissociation and regeneration in the QGP.

Another important part of the physics program is the investigation of several effects that have been observed in high-multiplicity collisions of protons that are reminiscent of heavy-ion collisions [154], such as the increased production yields of multi-strange baryons with respect to pions and azimuthal asymmetry of produced particles. In particular, these observations raise the question whether quark–gluon plasma is formed in high-multiplicity pp and p–Pb collisions. Parton energy loss is a distinctive signature of QGP formation that has not yet been observed in small collision systems. The search for this effect will continue with the much larger pp and p–Pb samples of Run 3 and also using a short pilot run with oxygen–oxygen collisions, which provides similar multiplicities but larger geometrical size. Figure 110 shows the projected sensitivity limits on energy transported outside the jet cone based on the measurement of jet yield recoiling from high- $p_T$  particles in various smaller collision systems [153] compared to the values determined from existing measurements in p–Pb and Pb–Pb collisions [155, 156]. The large data sample of proton–proton collisions of Runs 3 and 4 (with a target integrated luminosity of 200 pb<sup>-1</sup>) will also enable unique studies of perturbative QCD effects, such as the dead cone in gluon radiation off heavy quarks, of the residual strong interaction among hadron pairs, including multistrange hadrons and light nuclei, and of hypernuclei production [154].

# 7 Conclusions and outlook

After the successful conclusion of data taking in the LHC Runs 1 and 2, the ALICE detector has undergone a major upgrade in order to enable new or more precise measurements in Runs 3 and 4. The inner tracking system was completely replaced and is now fully instrumented with silicon pixel sensors, which provide a much better pointing resolution and support continuous readout. The muon forward tracker uses the same pixel sensors and provides for the first time precise pointing information for forward-rapidity muons reconstructed in the muon chambers and the muon identifier. The time projection chamber is now read out using new detectors based on gas electron multiplication foils that reduce ion backflow and enable continuous readout at much higher interaction rates. The new fast interaction trigger system provides the interaction trigger, as well as multiplicity measurements and time measurements for offline analysis. All other detector systems have undergone consolidation work and several systems have been equipped with upgraded readout electronics and/or new firmware to increase the readout speed or support continuous readout.

All upgrades were completed on schedule and have been commissioned with standalone runs as well as global commissioning runs in 2021 and 2022. The systems were tested with pilot beams in October 2021 before the official start of Run 3 and have been operated successfully for routine data taking with proton–proton collisions at  $\sqrt{s} = 13.6$  TeV since July 2022. High-rate tests are being performed to qualify the systems for the high data rates in Pb–Pb collisions. The upgraded detectors will strongly extend the physics reach of the experiment, in particular in the sector of open and hidden heavy-flavour probes, measurements of thermal radiation from the quark–gluon plasma, as well as measurements of light nuclear states, final-state interactions of hadrons, and the internal structure of jets.

Two additional upgrades are in preparation for Long Shutdown 3 (2026–2028), with the goal of further enhancing the physics reach of the experiment in Run 4: new inner layers for the inner tracking system (ITS3 [150]) and a forward calorimeter with high-granularity readout (FoCal [157]).

The ITS3 upgrade replaces the three innermost layers of the ITS2 with three ultralight, truly cylindrical layers made of curved large-area MAPS sensors [158]. The innermost layer will have a radius of 18 mm and will surround a new beam pipe with reduced radius (16 mm) and thickness. The pointing resolution will be better than that of the ITS2 by a factor of two up to a transverse momentum of 5 GeV/*c*, reaching down to about 12 µm at  $p_T = 1$  GeV/*c*. The low- $p_T$  tracking efficiency will also improve. The ITS3 will strongly enhance the low-mass dielectron, heavy-flavour meson, and baryon production measurements.

The FoCal consists of an electromagnetic calorimeter with high-granularity readout for optimal separation of direct-photon showers from those of neutral pions at forward pseudorapidity ( $3.4 < \eta < 5.8$ ), coupled to a hadronic calorimeter for additional hadron rejection. The required granularity is achieved using a combination of MAPS silicon pixel readout planes and silicon pad readout planes. The main physics goal of the FoCal is the study of gluon parton distribution functions in the lead nucleus at Bjorken x values down to  $10^{-6}$  using the nuclear modification factor of forward direct photons with transverse momentum  $2 < p_T < 20$  GeV/c in p–Pb collisions.

For the LHC Runs 5 and 6, a completely new apparatus, named ALICE 3, is proposed [159]. The ALICE 3 detector consists of a vertexing and tracking system with unique pointing resolution over a large pseudorapidity range ( $-4 < \eta < +4$ ), complemented by multiple sub-detector systems for particle identification, including silicon time-of-flight layers, a ring-imaging Cherenkov detector with high-resolution readout, a muon identification system, and an electromagnetic calorimeter. Unprecedented pointing resolution of 10 µm at  $p_T = 200 \text{ MeV/}c$  at midrapidity in both the transverse and longitudinal directions can be achieved by placing the first layers as close as possible to the interaction point, on a retractable structure to leave sufficient aperture for the beams at injection energy. This next-generation apparatus will, on the one hand, enable novel studies of the QGP and, on the other hand, open up important physics opportunities in other areas of QCD and beyond. The main new studies in the QGP sector focus on low- $p_T$  heavy-flavour production, including beauty hadrons, multi-charm baryons and charm–charm correlations, as well as on precise multi-differential measurements of dielectron emission to probe the mechanism of chiral-symmetry restoration and the time-evolution of the QGP temperature. Besides QGP studies, ALICE 3 can uniquely contribute to hadronic physics, for example with femtoscopic studies of the interaction potentials between charm mesons and searches for nuclei with charm, to fundamental physics, with tests of the Low theorem for ultra-soft photon emission, and to searches for physics beyond the Standard Model, for example in the sector of axion-like particles and the anomalous magnetic moment of  $\tau$  leptons. The programme aims to collect an integrated luminosity of about 35 nb<sup>-1</sup> with Pb–Pb collisions and 18 fb<sup>-1</sup> with pp collisions at top LHC energy. The potential to further increase the luminosity for ion running in the LHC by using smaller ions (e.g. <sup>84</sup>Kr or <sup>128</sup>Xe), as well as further runs with small collision systems to explore the approach to thermal equilibrium, are being explored.

### References

- [1] ALICE Collaboration, K. Aamodt *et al.*, "The ALICE experiment at the CERN LHC", *JINST* **3** (2008) S08002.
- [2] ALICE Collaboration, B. B. Abelev *et al.*, "Performance of the ALICE Experiment at the CERN LHC", *Int. J. Mod. Phys. A* **29** (2014) 1430044, arXiv:1402.4476 [nucl-ex].
- [3] ALICE Collaboration, "The ALICE experiment A journey through QCD", arXiv:2211.04384 [nucl-ex].
- [4] RD12 Collaboration, B. G. Taylor, "Timing distribution at the LHC", https://cds.cern.ch/record/592719.
- [5] E. B. S. Mendes, S. Baron, D. M. Kolotouros, C. Soos, and F. Vasey, "The 10G TTC-PON: challenges, solutions and performance", *JINST* **12** (2017) C02041.
- [6] https://espace.cern.ch/GBT-Project/default.aspx.
- [7] ALICE, ATLAS Collaboration, A. Borga *et al.*, "The C-RORC PCIe card and its application in the ALICE and ATLAS experiments", *JINST* **10** (2015) C02022.
- [8] J. Cachemiche *et al.*, "The PCIe-based readout system for the LHCb experiment", *JINST* **11** (2016) P02013.
- [9] O. Bourrion *et al.*, "Versatile firmware for the Common Readout Unit (CRU) of the LHC ALICE experiment", *JINST* **16** (2021) P05019.
- [10] G. Aglieri Rinella, "The ALPIDE pixel sensor chip for the upgrade of the ALICE Inner Tracking System", Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 845 (2017) 583 – 587. http://www.sciencedirect.com/science/article/pii/S0168900216303825. Proceedings of the Vienna Conference on Instrumentation 2016.
- [11] W. Snoeys, "CMOS monolithic active pixel sensors for high energy physics", Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 765 (2014) 167 – 171. HSTD-9 2013 - Proceedings of the 9th International Hiroshima Symposium on Development and Application of Semiconductor Tracking Detectors, International Conference Center, Hiroshima, Japan, 2-5 September 2013.
- [12] S. Senyukov, J. Baudot, A. Besson, G. Claus, L. Cousin, A. Dorokhov, W. Dulinski, M. Goffe, C. Hu-Guo, and M. Winter, "Charged particle detection performances of CMOS pixel sensors produced in a process with a high resistivity epitaxial layer", *Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment* 730 (2013) 115–118. Proceedings of the 9th International Conference on Radiation Effects on Semiconductor Materials Detectors and Devices, October 9-12 2012, Dipartimento di Fisica e Astronomia, Firenze.
- [13] H. Hernandez, *et al.*, "A monolithic 32-channel front end and dsp asic for gaseous detectors", *IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT* **69** (2020) 2686–2697.
- [14] M. van Elzakker, E. van Tuijl, P. Geraedts, D. Schinkel, E. A. M. Klumperink, and B. Nauta, "A 10-bit charge-redistribution adc consuming 1.9 μw at 1 ms/s", *IEEE Journal of Solid-State Circuits* 45 (2010) 1007–1015.

- [15] ALICE TPC Collaboration, J. Adolfsson *et al.*, "The upgrade of the ALICE TPC with GEMs and continuous readout", JINST 16 (2021) P03022, arXiv:2012.09518 [physics.ins-det].
- [16] ALICE Collaboration, B. Abelev *et al.*, "Technical Design Report for the Upgrade of the ALICE Inner Tracking System", *J. Phys. G* **41** (2014) 087002.
- [17] ALICE Collaboration, M. Mager, "The Monolithic Active Pixel Sensor for the ALICE ITS upgrade", Nucl. Instrum. Meth. A 848 (2016) 434–438.
- [18] **ALICE** Collaboration, A. Szczepankiewicz, "Readout of the upgraded ALICE-ITS", *Nucl. Instrum. Meth. A* **824** (2016) 465–469.
- [19] ALICE Collaboration, J. Schambach *et al.*, "A Radiation-Tolerant Readout System for the ALICE Inner Tracking System Upgrade", *NSS/MIC* (2018) 1–6.
- [20] L. Amaral et al., "The versatile link, a common project for super-LHC", JINST 4 (2009) P12003.
- [21] "Two-wire bus-system comprising a clock wire and a data wire for interconnecting a number of stations." https://patents.google.com/patent/US4689740A/en.
- [22] ALICE Collaboration, F. Reidt, "Upgrade of the alice its detector", Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 1032 (2022) 166632. https://arxiv.org/abs/2111.08301.
- [23] ALICE Collaboration, J. Adam *et al.*, "Technical Design Report for the Muon Forward Tracker", Tech. Rep. CERN-LHCC-2015-001. ALICE-TDR-018, May, 2015. https://cds.cern.ch/record/1981898.
- [24] ALICE Collaboration, "Upgrade of the ALICE Time Projection Chamber", Tech. Rep. CERN-LHCC-2013-020, ALICE-TDR-16, CERN, Geneva, 2013. https://cds.cern.ch/record/1622286.
- [25] ALICE TPC Collaboration, "The ALICE TPC, a large 3-dimensional tracking device with fast readout for ultra-high multiplicity events", *Nucl. Instr. Meth. A* 622 (2010) 316–367.
- [26] F. Sauli, "Gem: A new concept for electron amplification in gas detectors", Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 386 (1997) 531-534.
   https://www.sciencedirect.com/science/article/pii/S0168900296011722.
- [27] A. Deisting, C. Garabatos, A. Szabo, and D. Vranic, "Measurements of ion mobility in argon and neon based gas mixtures", *Nucl. Instrum. Meth. A* 845 (2017) 215–217, arXiv:1603.07638 [physics.ins-det].
- [28] M. Villa, S. Duarte Pinto, M. Alfonsi, I. Brock, G. Croci, E. David, R. de Oliveira, L. Ropelewski, H. Taureg, and M. van Stenis, "Progress on large area gems", *Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment* 628 (2011) 182–186. https://www.sciencedirect.com/science/article/pii/S0168900210015020. VCI 2010.
- [29] M. Ball, B. Ketzer, J. Ottnad, V. Ratza, and S. Urban, "Quality assurance of GEM foils for the upgrade of the ALICE TPC", JINST 12 (Jan, 2017) C01081–C01081. https://doi.org/10.1088%2F1748-0221%2F12%2F01%2Fc01081.

- [30] E. Brucken and T. Hilden, "The GEM QA protocol of the ALICE TPC upgrade project", PoS MPGD2017 (2019) 073.
- [31] E. Brucken and T. Hilden, "GEM foil quality assurance for the ALICE TPC upgrade", EPJ Web Conf. 174 (2018) 03004. https://doi.org/10.1051/epjconf/201817403004.
- [32] T. E. Hilden, J. E. Brucken, D. VARGA, and M. Vargyas, "GEM foil gain prediction", *PoS* MPGD2017 (2019) 010.
- [33] M. Capeáns-Garrido, R. Fortin, L. Linssen, M. Moll, and C. Rembser, "A GIF++ Gamma Irradiation Facility at the SPS H4 Beam Line", Tech. Rep. CERN-SPSC-2009-029. SPSC-P-339, CERN, Geneva, 2009. https://cds.cern.ch/record/1207380.
- [34] M. R. Jäkel, M. Capeáns, I. Efthymiopoulos, A. Fabich, R. Guida, G. Maire, M. Moll,
   D. Pfeiffer, F. Ravotti, and H. Reithler, "CERN-GIF<sup>++</sup>: a new irradiation facility to test large-area particle detectors for the high-luminosity LHC program", *PoS* TIPP2014 (2014) 102.
- [35] P. Moreira, et al., "The GBT Project", Proceedings of the TWEPP09 2 (2009) 342. https://cds.cern.ch/record/1235836.
- [36] ALICE Collaboration, W. H. Trzaska, "New Fast Interaction Trigger for ALICE", Nucl. Instrum. Meth. A845 (2017) 463–466.
- [37] ALICE Collaboration, M. Bondila et al., "ALICE T0 Detector", IEEE Trans. Nucl. Sc. 52 (2005) 1705.
- [38] ALICE Collaboration, E. Abbas et al., "Performance of the ALICE VZERO system", JINST 8 (2013) P10016, arXiv:1306.3130 [nucl-ex].
- [39] M. Broz et al., "Performance of ALICE AD modules in the CERN PS test beam", JINST 16 (2021) P01017, arXiv:2006.14982 [physics.ins-det].
- [40] D. Finogeev, T. Karavicheva, D. Serebryakov, A. Tikhonov, W. Trzaska, and N. Vozniuk, "Readout system of the ALICE Fast Interaction Trigger", *JINST* **15** (2020) C09005.
- [41] ALICE Collaboration, W. H. Trzaska, "New ALICE detectors for Run 3 and 4 at the CERN LHC", Nucl. Instrum. Meth. A 958 (2020) 162116.
- [42] ALICE Collaboration, Y. Melikyan, "Performance of Planacon MCP-PMT photosensors under extreme working conditions", *Nucl. Instrum. Meth. A* 952 (2020) 161689.
- [43] T. Sjöstrand, "The PYTHIA Event Generator: Past, Present and Future", Comput. Phys. Commun. 246 (2020) 106910, arXiv:1907.09874 [hep-ph].
- [44] R. Brun, F. Bruyant, M. Maire, A. C. McPherson, and P. Zanarini, "GEANT3, CERN-DD-EE-84-1", tech. rep., 9, 1987.
- [45] V. Grabski, "New fiber readout design for the large area scintillator detectors: providing good amplitude and time resolutions", arXiv:1909.01184 [physics.ins-det].
- [46] ALICE Collaboration, S. Rojas-Torres, "The Forward Diffractive Detector for ALICE", PoS LHCP2020 (2020) 221.
- [47] R. Arnaldi *et al.*, "Front-end electronics for the rpcs of the alice dimuon trigger", *IEEE Transactions on Nuclear Science* 52 (2005) 1176–1181.

- [48] P. Dupieux *et al.*, "Upgrade of the ALICE muon trigger electronics", *Journal of Instrumentation* 9 (2014) C09013–C09013.
- [49] M. Marchisone, "Performance of a resistive plate chamber equipped with a new prototype of amplified front-end electronics in the ALICE detector", *Journal of Physics: Conference Series* 889 (2017) 012011.
- [50] A. Bianchi *et al.*, "Characterization of tetrafluoropropene-based gas mixtures for the resistive plate chambers of the ALICE muon spectrometer", *Journal of Instrumentation* 14 (2019) P11014–P11014.
- [51] B. Joly *et al.*, "Production readiness review for the upgrade of the muon trigger front-end electronics", 2016. https://edms.cern.ch/document/1728246/1, accessed 2016-10-24.
- [52] G. Blanchard *et al.*, "The local trigger electronics of the alice dimuon trigger", 2003. https://edms.cern.ch/ui/file/406309/1/ALICE-EN-2003-010.pdf, accessed 2003-12-10.
- [53] C. Renard *et al.*, "Mid readout electronics documentation on the web", 2019. https://www-subatech.in2p3.fr/~electro/projets/alice/dimuon/trigger/upgrade, accessed 2019-12-11.
- [54] P. Dupieux et al., "Mid data format and content", 2020. https://twiki.cern.ch/twiki/pub/ALICE/MIDRO/MID-DataFormat-150520.pdf, accessed 2020-05-15.
- [55] ALICE Collaboration, S. Acharya *et al.*, "The ALICE Transition Radiation Detector: construction, operation, and performance", *Nucl. Instrum. Meth. A* 881 (2018) 88–127, arXiv:1709.02743 [physics.ins-det].
- [56] J. Jadlovsky et al., "Communication Architecture of the Detector Control System for the Inner Tracking System", in Proc. of International Conference on Accelerator and Large Experimental Control Systems (ICALEPCS'17), Barcelona, Spain, 8-13 October 2017, no. 16 in International Conference on Accelerator and Large Experimental Control Systems, pp. 1930–1933. JACoW, 2018.
- [57] ALICE Collaboration, G. Dellacasa *et al.*, "ALICE technical design report of the time-of-flight system (TOF)", *CERN-LHCC-2000-012* (2000).
- [58] ALICE Collaboration, G. Dellacasa *et al.*, "ALICE Addendum to the technical design report of the time-of-flight system (TOF)", *CERN-LHCC-2002-016* (2000).
- [59] A. Akindinov *et al.*, "Design aspects and prototype test of a very precise TDC system implemented for the multigap RPC of the ALICE-TOF", *Nucl. Instrum. Meth. A* 533 (2004) 178–182.
- [60] ALICE Collaboration, P. Antonioli, A. Kluge, and W. e. Riegler, "Upgrade of the ALICE Readout & Trigger System", CERN-LHCC-2013-019, ALICE-TDR-015 (2013).
- [61] ALICE Collaboration, D. Falchieri, "DRM2: the readout board for the ALICE TOF upgrade", *PoS* **TWEPP-17** (2018) 081.
- [62] ALICE Collaboration, F. Costa *et al.*, "DDL, the ALICE data transmission protocol and its evolution from 2 to 6 Gb/s", *JINST* **10** (2015) C04008.

- [63] ALICE Collaboration, D. Falchieri, "Radiation tests and production test strategy for the ALICE TOF readout upgrade board", *PoS* TWEPP2018 (2019) 025.
- [64] D. Falchieri, P. Antonioli, C. Baldanza, F. Giorgi, A. Mati, and C. Tintori, "Design and Test of a GBTX-Based Board for the Upgrade of the ALICE TOF Readout Electronics", *IEEE Trans. Nucl. Sci.* 64 (2017) 1357–1362.
- [65] CAEN, "CAENVME Library and VME-PCI optical bridges", https://www.caen.it/products/caenvmelib-library/.
- [66] D. Falchieri, P. Antonioli, C. Baldanza, M. Giacalone, and A. Mati, "Readout board validation setup for the ALICE Time of Flight detector upgrade", in 2019 IEEE Nuclear Science Symposium (NSS) and Medical Imaging Conference (MIC), pp. 1–3. 2019.
- [67] ALICE Collaboration, S. Beole *et al.*, "ALICE technical design report: Detector for high momentum PID", *CERN-LHCC-98-19* (8, 1998).
- [68] ALICE Collaboration, P. Cortese *et al.*, "ALICE electromagnetic calorimeter technical design report", *CERN-LHCC-2008-014*, *CERN-ALICE-TDR-014* (2008).
- [69] J. Allen *et al.*, "ALICE DCal: An Addendum to the EMCal Technical Design Report for Di-Jet and Hadron-Jet correlation measurements in ALICE", *CERN-LHCC-2010-011*, *ALICE-TDR-14-add-1* (2010).
- [70] ALICE Collaboration, "Performance of the ALICE Electromagnetic Calorimeter", arXiv:2209.04216 [physics.ins-det].
- [71] ALICE Collaboration, H. Muller, R. Pimenta, Z.-B. Yin, D.-C. Zhou, X. Cao, Q.-X. Li, Y.-Z. Liu, F.-F. Zou, B. Skaali, and T. Awes, "Configurable electronics with low noise and 14-bit dynamic range for photodiode-based photon detectors", *Nucl. Instrum. Meth. A* 565 (2006) 768–783.
- [72] R. Esteve Bosch, A. Jimenez de Parga, B. Mota, and L. Musa, "The ALTRO chip: A 16-channel A/D converter and digital processor for gas detectors", *IEEE Trans. Nucl. Sci.* 50 (2003) 2460–2469.
- [73] F. Zhang, H. Muller, T. C. Awes, S. Martoiu, J. Kral, D. Silvermyr, A. Tarazona Martinez,
   G. Huang, and D. Zhou, "Point-to-point readout for the ALICE EMCal detector", *Nucl. Instrum. Meth. A* 735 (2014) 157–162.
- [74] J. Christiansen, A. Marchioro, P. Moreira, and A. Sancho, "Receiver ASIC for timing, trigger and control distribution in LHC experiments", *IEEE Trans. Nucl. Sci.* 43 (1996) 1773–1777.
- [75] J. Kral, T. Awes, H. Muller, J. Rak, and J. Schambach, "L0 trigger for the EMCal detector of the ALICE experiment", *Nucl. Instrum. Meth. A* 693 (2012) 261–267.
- [76] ALICE EMCal Collaboration, O. Bourrion, N. Arbor, G. Conesa-Balbastre, C. Furget, R. Guernane, and G. Marcotte, "The ALICE EMCal L1 trigger first year of operation experience", JINST 8 (2013) C01013, arXiv:1210.8078 [physics.ins-det].
- [77] ALICE Collaboration, G. Dellacasa *et al.*, "ALICE technical design report of the photon spectrometer (PHOS)", http://cds.cern.ch/record/381432.
- [78] H. Muller, D. Budnikov, M. Ippolitov, Q. Li, V. Manko, R. Pimenta, D. Rohrich, I. Sibiryak, B. Skaali, and A. Vinogradov, "Front-end electronics for PWO-based PHOS calorimeter of ALICE", *Nucl. Instrum. Meth. A* 567 (2006) 264–267.

- [79] O. Bourrion, R. Guernane, B. Boyer, J. Bouly, and G. Marcotte, "Level-1 jet trigger hardware for the ALICE electromagnetic calorimeter at LHC", JINST 5 (2010) C12048, arXiv:1010.2670 [physics.ins-det].
- [80] ALICE PHOS calorimeter Collaboration, D. Aleksandrov *et al.*, "A high resolution electromagnetic calorimeter based on lead-tungstate crystals", *Nucl. Instrum. Meth. A* 550 (2005) 169–184.
- [81] ALICE Collaboration, C. W. Fabjan *et al.*, "ALICE: Physics performance report, volume II", J. Phys. G32 (2006) 1295–2040.
- [82] ALICE Collaboration, S. Acharya *et al.*, "Calibration of the photon spectrometer PHOS of the ALICE experiment", *JINST* 14 (2019) P05025, arXiv:1902.06145 [physics.ins-det].
- [83] I. A. Pshenichnov, J. P. Bondorf, I. N. Mishustin, A. Ventura, and S. Masetti, "Mutual heavy ion dissociation in peripheral collisions at ultrarelativistic energies", *Physical Review C* 64 (Aug, 2001) 249031–2490319, 0101035. http://link.aps.org/doi/10.1103/PhysRevC.64.024903.
- [84] I. Pshenichnov, "Electromagnetic excitation and fragmentation of ultrarelativistic nuclei", Phys. Part. Nucl. 42 (Mar, 2011) 215–250. http://link.springer.com/10.1134/S1063779611020067.
- [85] ALICE Collaboration, B. Abelev *et al.*, "Measurement of the Cross Section for Electromagnetic Dissociation with Neutron Emission in Pb-Pb Collisions at  $\sqrt{s_{NN}} = 2.76$  TeV", *Phys. Rev. Lett.* **109** (2012) 252302, arXiv:1203.2436 [nucl-ex].
- [86] IOxOS Technologies SA, "ADC\_3112 Four Channel 900 Msps 12-bit ADC", 2018. https://www.ioxos.ch/wp-content/uploads/2018/02/ADC\_3112\_DS\_A1.pdf, accessed 2020-09-30.
- [87] Texas Instruments Inc, "Dual Channel 12-Bit 900Msps Analog-to-Digital Converter, ADS5409", 2014. https://www.ti.com/lit/ds/symlink/ads5409.pdf, accessed 2020-09-30.
- [88] IOxOS Technologies SA, "IFC\_1211 Intelligent FPGA Controller." https://www.ioxos.ch/wp-content/uploads/2018/02/IFC\_1211\_DS\_A2.pdf, accessed 2020-09-30.
- [89] P. Buncic, M. Krzewicki, and P. Vande Vyvre, "Technical Design Report for the Upgrade of the Online-Offline Computing System", Tech. Rep. CERN-LHCC-2015-006. ALICE-TDR-019, Apr, 2015. https://cds.cern.ch/record/2011297.
- [90] A. Peters, E. Sindrilaru, and G. Adde, "EOS as the present and future solution for data storage at CERN", *Journal of Physics: Conference Series* 664 (Dec, 2015) 042042. https://doi.org/10.1088/1742-6596/664/4/042042.
- [91] G. Eulisse, P. Konopka, M. Krzewicki, M. Richter, D. Rohr, and S. Wenzel, "Evolution of the alice software framework for run 3", *EPJ Web Conf.* 214 (2019) 05010. https://doi.org/10.1051/epjconf/201921405010.
- [92] A. Rybalchenko, D. Klein, M. Al-Turany, and T. Kollegger, "Shared Memory Transport for ALFA", *EPJ Web Conf.* **214** (2019) 05029.
- [93] M. Richter *et al.*, "Data Handling in the ALICE O2 Event Processing", in *Proceedings of the conference Computing in High Energy Physics (CHEP'18) in Sofia, Bulgaria.* Nov., 2018.

- [94] R. Brun and F. Rademakers, "ROOT: An object oriented data analysis framework", Nucl. Instrum. Meth. A 389 (1997) 81–86.
- [95] "A cross-language development platform for in-memory analytics." https://arrow.apache.org.
- [96] ALICE Collaboration, M. Lettrich, "Fast and Efficient Entropy Compression of ALICE Data using ANS Coding", EPJ Web Conf. 245 (2020) 01001.
- [97] D. Rohr, "Usage of GPUs in ALICE online and offline processing during LHC run 3", EPJ Web of Conferences 251 (2021) 04026. https://doi.org/10.1051%2Fepjconf%2F202125104026.
- [98] J. Duda, "Asymmetric numeral systems as close to capacity low state entropy coders", CoRR abs/1311.2540 (2013), 1311.2540. http://arxiv.org/abs/1311.2540.
- [99] F. Costa *et al.*, "Assessment of the ALICE O<sup>2</sup> readout servers", in *Proceedings of the conference Computing in High Energy Physics (CHEP'19) in Adelaide, Australia.* Nov., 2019.
- [100] D. Eschweiler and V. Lindenstruth, "The portable driver architecture", in *Proceedings 16th Real-Time Linux Workshop*. Open Source Automation Development Lab (OSADL), Oct., 2014.
- [101] D. Eschweiler, *Efficient device drivers for supercomputers*. PhD thesis, Goethe University Frankfurt, Frankfurt am Main, Germany, 2015. http://d-nb.info/1120712483.
- [102] ALICE Collaboration, K. Alexopoulos and F. Costa, "The ReadoutCard Userspace Driver for the New Alice O2 Computing System", *IEEE Trans. Nucl. Sci.* 68 (2021) 1876–1883, arXiv:2010.16327 [physics.ins-det].
- [103] ALICE Collaboration, P. Konopka and B. von Haller, "The ALICE O2 data quality control system", vol. 245, p. 01027. 2020.
- [104] R. Brun and F. Rademakers, "ROOT: An object oriented data analysis framework", Nucl. Instrum. Meth. A 389 (1997) 81–86.
- [105] T. Mrnjavac, K. Alexopoulos, V. Chibante Barroso, and G. Raduta, "AliECS: a New Experiment Control System for the ALICE Experiment", vol. 245, p. 01033. 2020.
- [106] "Apache mesos", 2020. http://mesos.apache.org/, accessed 2020-10-10.
- [107] M. Al-Turany, A. Rybalchenko, D. Klein, M. Kretz, D. Kresan, R. Karabowicz, A. Lebedev, A. Manafov, T. Kollegger, and F. Uhlig, "ALFA: A framework for building distributed applications", vol. 245, p. 05021. 2020.
- [108] "Consul by hashicorp", 2020. https://www.consul.io/, accessed 2020-10-10.
- [109] "The go programming language", 2020. https://golang.org/, accessed 2020-10-10.
- [110] G. Vino et al., "A Monitoring System for the New ALICE O<sup>2</sup> Farm", in Proceedings of the International Conference on Accelerator and Large Experimental Physics Control Systems (ICALEPCS'19) in New York, USA. Oct., 2019.
- [111] V. Barroso *et al.*, "Towards the integrated ALICE Online-Offline (O<sup>2</sup>) monitoring subsystem", in Proceedings of the conference Computing in High Energy Physics (CHEP'18) in Sofia, Bulgaria. Nov., 2018.

- [112] "Telegraf", 2020. https://www.influxdata.com/time-series-platform/telegraf/, accessed 2020-01-23.
- [113] "Apache kafka", 2018. https://kafka.apache.org/, accessed 2018-11-18.
- [114] "Influxdb downsampling and data retention", 2020. https://docs.influxdata.com/influxdb/v1.8/guides/downsample_and_retain, accessed 2020-09-02.
- [115] "Grafana the open platform for analytics and monitoring", 2020. https://grafana.com/, accessed 2020-01-12.
- [116] ALICE Collaboration, S. Chapeland et al., "The ALICE DAQ infoLogger", J.Phys.Conf.Ser. 513 (2014) 012005.
- [117] M. Teitsma, V. C. Barosso, P. Boeschoten, and P. Hendriks, "Jiskefet, a bookkeeping application for ALICE", vol. 245, p. 04023. 2020. arXiv:2003.05756 [cs.HC].
- [118] S. Wenzel, "A scalable and asynchronous detector simulation system based on alfa", EPJ Web Conf. 214 (2019) 02029. https://doi.org/10.1051/epjconf/201921402029.
- [119] M. Al-Turany, D. Bertini, R. Karabowicz, D. Kresan, P. Malzacher, T. Stockmanns, and F. Uhlig, "The FairRoot framework", J. Phys. Conf. Ser. 396 (2012) 022001.
- [120] J. Allison et al., "Recent developments in Geant4", Nucl. Instrum. Meth. A 835 (2016) 186-225.
- [121] F. Carminati and A. Morsch, "Simulation in ALICE", eConf C0303241 (2003) TUMT004, arXiv:physics/0306092.
- [122] I. Hrivnacova, D. Adamova, V. Berejnoi, R. Brun, F. Carminati, A. Fasso, E. Futo, A. Gheata, I. Gonzalez Caballero, and A. Morsch, "The Virtual Monte Carlo", https://cds.cern.ch/record/619573. Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 8 pages, LaTeX, 6 eps figures. PSN THJT006. See http://root.cern.ch/root/vmc/VirtualMC.html.
- [123] G. Battistoni et al., "Overview of the FLUKA code", Annals Nucl. Energy 82 (2015) 10-18.
- [124] B. Volkel, A. Morsch, I. Hřivnáčová, J. Grosse-Oetringhaus, and S. Wenzel, "Using multiple engines in the virtual monte carlo package", EPJ Web of Conferences 245 (01, 2020) 02008.
- [125] A. Alkin, G. Eulisse, J. F. Grosse-Oetringhaus, P. Hristov, and M. Kabus, "ALICE Run 3 Analysis Framework", *EPJ Web Conf.* 251 (2021) 03063.
- [126] R. Quishpe, J. F. Grosse-Oetringhaus, R. Cruceru, and C. Grigoras, "Hyperloop The ALICE analysis train system for Run 3", PoS LHCP2021 (2021) 250, arXiv:2109.09594 [physics.ins-det].
- [127] Wes McKinney, "Data Structures for Statistical Computing in Python", in *Proceedings of the 9th Python in Science Conference*, Stéfan van der Walt and Jarrod Millman, eds., pp. 56–61. 2010.
- [128] M. Zaharia, *et al.*, "Apache spark: A unified engine for big data processing", *Communications of the ACM* **59** (11, 2016) 56–65.
- [129] J. Kvapil et al., "ALICE Central Trigger System for LHC Run 3", EPJ Web Conf. 251 (2021) 04022.

- [130] C. Ghabrous Larrea, K. Harder, D. Newbold, D. Sankey, A. Rose, A. Thea, and T. Williams, "IPbus: a flexible Ethernet-based control system for xTCA hardware", *JINST* 10 (2015) C02019.
- [131] A. Augustinus, P. Chochula, G. De Cataldo, L. Jirdén, A. Kurepin, M. Lechman, O. Pinazza, P. Rosinský, and A. Moreno, "The wonderland of operating the ALICE experiment", *ICALEPCS* 2011. http://accelconf.web.cern.ch/accelconf/icalepcs2011/papers/thbhaust02.pdf.
- [132] P. Chochula, A. Augustinus, P. Bond, A. Kurepin, M. Lechman, J. Lang, and O. Pinazza, "Challenges of the ALICE Detector Control System for the LHC Run3", 16th International Conference on Accelerator and Large Experimental Physics Control Systems (2017). https://doi.org/10.18429/JACoW-ICALEPCS2017-TUMPL09, 2018.
- [133] "WinCC Open Architecture", https://winccoa.com/.
- [134] H. Boterenbrood and B. Hallgren, "The development of Embedded Local Monitor Board (ELMB)", 9th Workshop on Electronics for LHC Experiments, Amsterdam, The Netherlands (2003).
- [135] C. Gaspar, M. Dönszelmann, and P. Charpentier, "DIM, a portable, light weight package for information publishing, data transfer and inter-process communication", *Computer Physics Communications* 140 (2001). https://doi.org/10.1016/S0010-4655(01)00260-0.
- [136] W. Salter *et al.*, "LHC Data Interchange Protocol (DIP) Definition." https://edms.cern.ch/file/457113/2/DIPDescription.doc.
- [137] B. Franek and C. Gaspar, "SMI++ - object oriented framework for designing control systems", IEEE Nuclear Science Symposium Conference Record (1997). https://ieeexplore.ieee.org/document/672515/.
- [138] https://www.oracle.com/database/data-guard.
- [139] https://www.can-cia.org/can-knowledge.
- [140] https://www.anagate.de/products/can-ethernet-gateways.php.
- [141] https://opcfoundation.org/about/opc-technologies/opc-ua.
- [142] M. Tkáčik, J. Jadlovský, S. Jadlovská, L. Koska, A. Jadlovská, and M. Donadoni, "FRED - Flexible Framework for Frontend Electronics Control in ALICE Experiment at CERN", *Processes* 8 (2020). https://www.mdpi.com/2227-9717/8/5/565.
- [143] P. Chochula, A. Augustinus, P. Bond, A. Kurepin, M. Lechman, J. Lang, and O. Pinazza, "ADAPOS: An architecture for publishing ALICE DCS conditions data", 16th International Conference on Accelerator and Large Experimental Physics Control Systems (2017). https://accelconf.web.cern.ch/icalepcs2017/doi/JACoW-ICALEPCS2017-TUPHA042.html.
- [144] ALICE Collaboration, B. Abelev *et al.*, "Upgrade of the ALICE Experiment: Letter Of Intent", *J. Phys. G* 41 (2014) 087001.
- [145] Z. Citron et al., "Report from Working Group 5: Future physics opportunities for high-density QCD at the LHC with heavy-ion and proton beams", vol. 7, pp. 1159–1410. 2019. arXiv:1812.06772 [hep-ph].
- [146] Y. Oh, C. M. Ko, S. H. Lee, and S. Yasui, "Heavy baryon/meson ratios in relativistic heavy ion collisions", *Phys. Rev. C* 79 (2009) 044905, arXiv:0901.1382 [nucl-th].

- [147] S. Plumari, V. Minissale, S. K. Das, G. Coci, and V. Greco, "Charmed Hadrons from Coalescence plus Fragmentation in relativistic nucleus-nucleus collisions at RHIC and LHC", *Eur. Phys. J. C* 78 (2018) 348, arXiv:1712.00730 [hep-ph].
- [148] M. He and R. Rapp, "Hadronization and Charm-Hadron Ratios in Heavy-Ion Collisions", *Phys. Rev. Lett.* **124** (2020) 042301, arXiv:1905.09216 [nucl-th].
- [149] R. Rapp, "Dilepton Spectroscopy of QCD Matter at Collider Energies", Adv. High Energy Phys. 2013 (2013) 148253, arXiv:1304.2309 [hep-ph].
- [150] ALICE Collaboration, "Letter of Intent for an ALICE ITS Upgrade in LS3, CERN-LHCC-2019-018", tech. rep., CERN, Geneva, 2019. https://cds.cern.ch/record/2703140.
- [151] A. Andronic, P. Braun-Munzinger, K. Redlich, and J. Stachel, "Decoding the phase structure of QCD via particle production at high energy", *Nature* 561 (2018) 321–330, arXiv:1710.09425 [nucl-th].
- [152] X. Du and R. Rapp, "Sequential Regeneration of Charmonia in Heavy-Ion Collisions", Nucl. Phys. A 943 (2015) 147–158, arXiv:1504.00670 [hep-ph].
- [153] ALICE Collaboration, "ALICE physics projections for a short oxygen-beam run at the LHC, ALICE-PUBLIC-2021-004", https://cds.cern.ch/record/2765973.
- [154] ALICE Collaboration, "Future high-energy pp programme with ALICE, ALICE-PUBLIC-2020-005", http://cds.cern.ch/record/2724925.
- [155] ALICE Collaboration, S. Acharya *et al.*, "Constraints on jet quenching in p-Pb collisions at $\sqrt{s_{NN}} = 5.02$ TeV measured by the event-activity dependence of semi-inclusive hadron-jet distributions", *Phys. Lett. B* **783** (2018) 95–113, arXiv:1712.05603 [nucl-ex].
- [156] ALICE Collaboration, J. Adam *et al.*, "Measurement of jet quenching with semi-inclusive hadron-jet distributions in central Pb-Pb collisions at $\sqrt{s_{NN}} = 2.76$ TeV", *JHEP* **09** (2015) 170, arXiv:1506.03984 [nucl-ex].
- [157] ALICE Collaboration, "Letter of Intent: A Forward Calorimeter (FoCal) in the ALICE experiment, CERN-LHCC-2020-009", tech. rep., CERN, Geneva, Jun, 2020. https://cds.cern.ch/record/2719928.
- [158] ALICE ITS project Collaboration, G. A. Rinella *et al.*, "First demonstration of in-beam performance of bent Monolithic Active Pixel Sensors", *Nucl. Instrum. Meth. A* 1028 (2022) 166280, arXiv:2105.13000 [physics.ins-det].
- [159] ALICE Collaboration, "Letter of intent for ALICE 3: A next-generation heavy-ion experiment at the LHC", arXiv:2211.02491 [physics.ins-det].