Classifying, evaluating and advancing big data benchmarks

The main contribution of the thesis is in helping to understand which software system parameters mostly affect the performance of Big Data Platforms under realistic workloads. In detail, the main research contributions o
The main contribution of the thesis is in helping to understand which software system parameters mostly affect the performance of Big Data Platforms under realistic workloads. In detail, the main research contributions of the thesis are:
1. Definition of the new concept of heterogeneity for Big Data Architectures (Chapter 2);
2. Investigation of the performance of Big Data systems (e.g. Hadoop) in virtualized environments (Section 3.1);
3. Investigation of the performance of NoSQL databases versus Hadoop distributions (Section 3.2);
4. Execution and evaluation of the TPCx-HS benchmark (Section 3.3);
5. Evaluation and comparison of Hive and Spark SQL engines using benchmark queries (Section 3.4);
6. Evaluation of the impact of compression techniques on SQL-on-Hadoop engine performance (Section 3.5);
7. Extensions of the standardized Big Data benchmark BigBench (TPCx-BB)(Section 4.1 and 4.3);
8. Definition of a new benchmark, called ABench (Big Data Architecture Stack Benchmark), that takes into account the heterogeneity of Big Data architectures (Section 4.5).
The thesis is an attempt to re-define system benchmarking taking into account the new requirements posed by the Big Data applications. With the explosion of Artificial Intelligence (AI) and new hardware computing power, this is a first step towards a more holistic approach to benchmarking.
show moreshow less

Download full text files

Export metadata

  • Export Bibtex
  • Export RIS

Additional Services

    Share in Twitter Search Google Scholar
Metadaten
Author:Todor Ivanov
URN:urn:nbn:de:hebis:30:3-511574
Place of publication:Frankfurt am Main
Referee:Roberto V. Zicari, Carsten Binning
Advisor:Roberto V. Zicari
Document Type:Doctoral Thesis
Language:English
Date of Publication (online):2019/12/09
Year of first Publication:2019
Publishing Institution:Universitätsbibliothek Johann Christian Senckenberg
Granting Institution:Johann Wolfgang Goethe-Universität
Date of final exam:2019/07/23
Release Date:2019/09/19
Tag:Big Data; Big Data Benchmarks
Pagenumber:354
HeBIS PPN:453443508
Institutes:Informatik und Mathematik
Dewey Decimal Classification:004 Datenverarbeitung; Informatik
Sammlungen:Universitätspublikationen
Licence (German):License Logo Veröffentlichungsvertrag für Publikationen

$Rev: 11761 $