Classifying, evaluating and advancing big data benchmarks
- The main contribution of the thesis is in helping to understand which software system parameters most affect the performance of Big Data platforms under realistic workloads. In detail, the main research contributions of the thesis are: 1. Definition of the new concept of heterogeneity for Big Data architectures (Chapter 2); 2. Investigation of the performance of Big Data systems (e.g. Hadoop) in virtualized environments (Section 3.1); 3. Investigation of the performance of NoSQL databases versus Hadoop distributions (Section 3.2); 4. Execution and evaluation of the TPCx-HS benchmark (Section 3.3); 5. Evaluation and comparison of the Hive and Spark SQL engines using benchmark queries (Section 3.4); 6. Evaluation of the impact of compression techniques on SQL-on-Hadoop engine performance (Section 3.5); 7. Extensions of the standardized Big Data benchmark BigBench (TPCx-BB) (Sections 4.1 and 4.3); 8. Definition of a new benchmark, called ABench (Big Data Architecture Stack Benchmark), that takes into account the heterogeneity of Big Data architectures (Section 4.5). The thesis is an attempt to redefine system benchmarking, taking into account the new requirements posed by Big Data applications. With the explosion of Artificial Intelligence (AI) and new hardware computing power, this is a first step towards a more holistic approach to benchmarking.
Author: | Todor Ivanov |
---|---|
URN: | urn:nbn:de:hebis:30:3-511574 |
Place of publication: | Frankfurt am Main |
Referee: | Roberto V. Zicari, Carsten Binnig |
Advisor: | Roberto V. Zicari |
Document Type: | Doctoral Thesis |
Language: | English |
Date of Publication (online): | 2019/12/09 |
Year of first Publication: | 2019 |
Publishing Institution: | Universitätsbibliothek Johann Christian Senckenberg |
Granting Institution: | Johann Wolfgang Goethe-Universität |
Date of final exam: | 2019/07/23 |
Release Date: | 2019/09/19 |
Tag: | Big Data; Big Data Benchmarks |
Page Number: | 354 |
HeBIS-PPN: | 453443508 |
Institutes: | Informatik und Mathematik |
Dewey Decimal Classification: | 0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science |
Collections: | University publications |
Licence (German): | German copyright law (Deutsches Urheberrecht) |