TY  - JOUR
A1  - Chibante Barroso, Vasco Miguel
A1  - Elia, Domenico
A1  - Grigoraș, Costin
A1  - Gómez Ramírez, Andrés
A1  - Vino, Gioacchino
A1  - Wegrzynek, Adami Tadeusz
T1  - Towards the integrated ALICE Online-Offline (O2) monitoring subsystem
T2  - EPJ Web of Conferences
N2  - ALICE (A Large Ion Collider Experiment) is preparing for a major upgrade of the detector, readout and computing systems for LHC Run 3. A new facility called O2 (Online-Offline) will play a major role in data compression and event processing. To operate the experiment efficiently, we are designing a monitoring subsystem that will provide a complete overview of the overall health of O2 and detect performance degradation and component failures. The monitoring subsystem will receive and collect up to 600 kHz of performance metrics. It consists of a custom monitoring library and distributed server-side software covering five main functional tasks: parameter collection, processing, storage, visualisation and alarms. To select the most appropriate tools for these tasks, we evaluated three options: “Modular Stack”, Zabbix and MonALISA, the ALICE Grid monitoring tool currently in use. The first of these is a toolkit comprising collectd, Apache Flume, Apache Spark, InfluxDB, Grafana and Riemann. This paper describes the functional architecture of the monitoring subsystem. It presents a complete evaluation of the three options, the selection process, the risk assessment and the justification for the final decision. The in-depth comparison covers functional features and throughput measurements to ensure the required processing and storage performance.
Y1  - 2019
UR  - http://publikationen.ub.uni-frankfurt.de/frontdoor/index/index/docId/71807
UR  - https://nbn-resolving.org/urn:nbn:de:hebis:30:3-718075
SN  - 2100-014X
VL  - 214
IS  - 03043
PB  - EDP Sciences
CY  - Les Ulis
ER  -