Towards the integrated ALICE Online-Offline (O2) monitoring subsystem

  • ALICE (A Large Ion Collider Experiment) is preparing for a major upgrade of its detector, readout and computing systems for LHC Run 3. A new facility called O2 (Online-Offline) will play a major role in data compression and event processing. To operate the experiment efficiently, we are designing a monitoring subsystem that will provide a complete overview of the overall health of O2 and detect performance degradation and component failures. The monitoring subsystem will receive and collect performance metrics at a rate of up to 600 kHz. It consists of a custom monitoring library and distributed server-side software covering five main functional tasks: parameter collection, processing, storage, visualisation and alarms. To select the most appropriate tools for these tasks, we evaluated three options: the “Modular Stack”, Zabbix and MonALISA, the tool currently used for ALICE Grid monitoring. The first of these is a toolkit comprising collectd, Apache Flume, Apache Spark, InfluxDB, Grafana and Riemann. This paper describes the functional architecture of the monitoring subsystem. It then presents a complete evaluation of the three options, the selection process, the risk assessment and the justification for the final decision. The in-depth comparison covers functional features and throughput measurements, to verify that the required processing and storage performance can be met.

Metadata
Author:Vasco Miguel Chibante Barroso, Domenico Elia, Costin Grigoraș, Andrés Gómez Ramírez, Gioacchino Vino, Adam Wegrzynek
URN:urn:nbn:de:hebis:30:3-718075
DOI:https://doi.org/10.1051/epjconf/201921403043
ISSN:2100-014X
Parent Title (English):EPJ Web of Conferences
Publisher:EDP Sciences
Place of publication:Les Ulis
Document Type:Article
Language:English
Date of Publication (online):2019/09/17
Date of first Publication:2019/09/17
Publishing Institution:Universitätsbibliothek Johann Christian Senckenberg
Contributing Corporation:International Conference on Computing in High Energy and Nuclear Physics (23. : 2018 : Sofia)
Release Date:2023/02/02
Volume:214
Issue:03043
Page Number:8
HeBIS-PPN:505760770
Institutes:Physics / Physics
Dewey Decimal Classification:5 Natural sciences and mathematics / 53 Physics / 530 Physics
Collections:University publications
Licence (German):Creative Commons - Attribution 4.0