OPUS 4 | Institutes

Masur–Veech volume of the gothic locus (2020)

We calculate the Masur–Veech volume of the gothic locus G in the stratum H(2^3) of genus 4. Our method is based on the use of the formulae for the Euler characteristics of gothic Teichmüller curves to determine the number of lattice points of given area. We also use this method to recalculate the Masur–Veech volumes of the Prym loci P3 ⊂ H(4) and P4 ⊂ H(6) in genus 3 and 4, respectively.

Voting for POS tagging of latin texts: using the flair of FLAIR to better ensemble classifiers by example of latin (2020)

Stoeckel, Manuel ; Henlein, Alexander ; Hemati, Wahed ; Mehler, Alexander

Despite the great importance of the Latin language in the past, there are relatively few resources available today to develop modern NLP tools for this language. Therefore, the EvaLatin Shared Task for Lemmatization and Part-of-Speech (POS) tagging was published in the LT4HALA workshop. In our work, we dealt with the second EvaLatin task, that is, POS tagging. Since most of the available Latin word embeddings were trained on either few or inaccurate data, we trained several embeddings on better data in the first step. Based on these embeddings, we trained several state-of-the-art taggers and used them as input for an ensemble classifier called LSTMVoter. We were able to achieve the best results for both the cross-genre and the cross-time task (90.64% and 87.00%) without using additional annotated data (closed modality). In the meantime, we further improved the system and achieved even better results (96.91% on classical, 90.87% on cross-genre and 87.35% on cross-time).

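Entwicklung und Erprobung eines interaktiven 3D-Stadtmodells am Beispiel des Personennahverkehrsnetzwerks der Stadt Frankfurt [Development and testing of an interactive 3D city model using the example of the local public transport network of the city of Frankfurt] (2020)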

Smajić, Alen

The aim of this thesis is the realistic development of an interactive 3D city model tailored to local public transport. Based on user input and a data source, the program is to generate a three-dimensional visualization of the buildings automatically and to integrate the local public transport network. The public transport network of the city of Frankfurt served as the example. The thesis addresses the difficulties of collecting geoinformation and of processing such complex data. It determines which user groups benefit from such a 3D visualization and what new potential for extension and use the model offers. In particular, the reader is given an insight into the generation of interactive 3D models from raw data alone. The game engine Unity was used as the development environment; it proved to be a very capable and modern tool for creating functional 3D visualizations. The OpenStreetMap project was used as the data source and is discussed in this thesis. Finally, the model was made available to various users and evaluated by means of a questionnaire.

First passage percolation in the mean field limit (2020)

Schertzer, Adrien

This dissertation deals with two classical problems in statistical mechanics: first passage percolation (FPP) on Euclidean spaces in the directed setting and in the undirected setting.

Scalable generation of random graphs (2020)

Penschuck, Manuel

Network models play an important role in various scientific disciplines and serve, among other things, to describe realistic graphs. They are frequently formulated as random graphs and thus represent probability distributions over graphs. Usually the distribution is parameterized and arises implicitly, for instance from a randomized construction rule. An early representative is the G(n,p) model, which is defined over all undirected graphs with n nodes and generates each edge independently with probability p. A graph drawn from G(n,p), however, bears hardly any structural resemblance to the graphs typically observed in applications. Popular models are therefore designed to produce desired topological properties with sufficiently high probability. A common goal, for example, is to reproduce the only vaguely defined class of so-called complex networks, to which many social networks are assigned. Among other things, these graphs usually have a heavy-tailed degree distribution, a small diameter, a dominant connected component, and regions of above-average density, so-called communities. The uses of network models go far beyond the original goal of explaining observed effects. A common use case is the systematic production of data. Such data enable or support experimental investigations, for instance the empirical verification of theoretical predictions or the general evaluation of algorithms and data structures. Especially for large problem instances, this offers advantages over observed networks.
Massive inputs based on real data are often not available in sufficient quantity, are laborious to obtain and manage, are subject to legal restrictions, or are of unclear quality. In this thesis we therefore consider algorithmic aspects of generating massive random graphs. To allow users reproducibility with respect to existing studies, we mostly focus on faithful implementations of established network models, such as preferential attachment processes, LFR, simple graphs with prescribed degree sequences, or graphs with hyperbolic (or similar) embeddings. To this end we develop generators that are efficient both in practice and in analysis. Each of our algorithms is optimized for a suitable machine model. We design, for example, classical sequential generators for register machines, algorithms for the External Memory Model, and parallel approaches for distributed or shared-memory machines on CPUs, GPUs, and other accelerators.
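The G(n,p) model mentioned in this abstract can be illustrated with a minimal sketch. It is a naive textbook sampler that visits every node pair once, not one of the scalable generators developed in the thesis; the function name `sample_gnp` and its `seed` parameter are assumptions made for this illustration.

```python
import random
from itertools import combinations


def sample_gnp(n, p, seed=None):
    """Return the edge list of one undirected graph drawn from G(n, p).

    Hypothetical helper for illustration only; it takes time quadratic
    in n and is therefore unsuitable for massive graphs.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    rng = random.Random(seed)
    # Visit each of the n*(n-1)/2 unordered node pairs exactly once and
    # keep it independently with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


edges = sample_gnp(n=6, p=0.5, seed=1)
print(len(edges), "edges:", edges)
```

Passing the same `seed` reproduces the same graph, which is the kind of reproducibility the abstract asks of generators, although here only at toy scale.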

Requirements engineering and tool-support for security and privacy (2020)

Pape, Sebastian

To address security and privacy problems in practice, a solid elicitation of requirements is needed before trying to solve the problem. In this thesis, specific challenges in the areas of social engineering, security management and privacy enhancing technologies are analyzed. Social Engineering: An overview of existing tools usable for social engineering is provided and defenses against social engineering are analyzed. Serious games are proposed as a more pleasant way to raise employees’ awareness and to train them. Security Management: Specific requirements for small and medium-sized energy providers are analyzed and a set of tools to support them in assessing security risks and improving their security is proposed. Larger enterprises are supported by a method to collect security key performance indicators for different subsidiaries and by a risk assessment method for apps on mobile devices. Furthermore, a method to select a secure cloud provider (currently the most popular form of outsourcing) is provided. Privacy Enhancing Technologies: Relevant factors for the users’ adoption of privacy enhancing technologies are identified, and economic incentives and hindrances for companies are discussed. Privacy by design is applied to integrate privacy into the use cases e-commerce and internet of things.

Bayesian machine learning for financial modeling (2020)

Nirwan, Rajbir Singh

Machine learning (ML) is so pervasive in daily life that we often do not realise we are using systems based on it. It is also evolving faster than ever before. When deploying ML systems that make decisions on their own, we need to think about their ignorance of our uncertain world. The uncertainty might arise from scarcity of data, bias in the data, or even a mismatch between the real world and the ML model. Given all these uncertainties, we need to think about how to build systems that are not totally ignorant of them. Bayesian ML can deal with these problems to some extent. Specifying the model using probabilities provides a convenient way to quantify uncertainties, which can then be included in the decision-making process. In this thesis, we introduce the Bayesian ansatz to modeling and apply Bayesian ML models in finance and economics. In particular, we look more closely at Gaussian processes (GP) and the Gaussian process latent variable model (GPLVM). Applied to the returns of several assets, the GPLVM provides the covariance structure and also a latent space embedding of it. Several financial applications can be built upon the output of the GPLVM. To demonstrate this, we build an automated asset allocation system and a predictor for missing asset prices, and we identify other structure in financial data. It turns out that the GPLVM exhibits a rotational symmetry in the latent space, which makes it harder to fit. Our second publication reports how to deal with that symmetry. We propose another parameterization of the model using Householder transformations, by which the symmetry is broken. Bayesian models are changed by reparameterization if the prior is not changed accordingly. We provide the correct prior distribution of the new parameters, such that the model, i.e. the data density, is not changed under the reparameterization.
After applying the reparameterization to Bayesian PCA, we show that the symmetry of nonlinear models can also be broken in the same way. In our last project, we propose a new method for matching quantile observations, which uses order statistics. The use of order statistics as the likelihood, instead of a Gaussian likelihood, has several advantages. We compare these two models and highlight their advantages and disadvantages. To demonstrate our method, we fit quantiled salary data of several European countries. Given several candidate models for the fit, our method also provides a metric to choose the best option. We hope that this thesis illustrates some benefits of Bayesian modeling (especially Gaussian processes) in finance and economics and its usage when uncertainties are to be quantified.
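As background to the Householder transformations named in this abstract, the sketch below builds a single reflection H = I - 2 v v^T / (v^T v) and checks that it is orthogonal. It shows only the elementary building block; the thesis's actual parameterization and the matching prior are not reproduced here, and the helper names `householder_matrix` and `matmul` are assumptions made for this illustration.

```python
def householder_matrix(v):
    """Return the Householder reflection H = I - 2 v v^T / (v^T v) as a list of rows."""
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0:
        raise ValueError("v must be non-zero")
    n = len(v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq
             for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


v = [1.0, 2.0, 2.0]
H = householder_matrix(v)
# H is symmetric, so H^T H equals H H; for a reflection this is the identity.
HH = matmul(H, H)
for row in HH:
    print([round(x, 12) for x in row])
```

Because each reflection is orthogonal, a product of such reflections is orthogonal as well, which is why they are a common way to parameterize rotations without additional constraints.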

The 𝒮-cone and a primal-dual view on second-order representability (2020)

Naumann, Helen ; Theobald, Thorsten

The 𝒮-cone provides a common framework for cones of polynomials or exponential sums which establish non-negativity upon the arithmetic-geometric inequality, in particular for sums of non-negative circuit polynomials (SONC) or sums of arithmetic-geometric exponentials (SAGE). In this paper, we study the 𝒮-cone and its dual from the viewpoint of second-order representability. Extending results of Averkov and of Wang and Magron on the primal SONC cone, we provide explicit generalized second-order descriptions for rational 𝒮-cones and their duals.

Generic tasks for algorithms (2020)

Milicic, Gregor ; Wetzel, Sina ; Ludwig, Matthias

Due to its links to computer science (CS), teaching computational thinking (CT) often involves the handling of algorithms in activities, such as their implementation or analysis. Although there already exists a wide variety of different tasks for various learning environments in the area of computer science, there is less material available for CT. In this article, we propose so-called Generic Tasks for algorithms inspired by common programming tasks from CS education. Generic Tasks can be seen as a family of tasks with a common underlying structure, format, and aim, and can serve as best-practice examples. They thus bring many advantages, such as facilitating the process of creating new content and supporting asynchronous teaching formats. The Generic Tasks that we propose were evaluated by 14 experts in the field of Science, Technology, Engineering, and Mathematics (STEM) education. Apart from a general estimation in regard to the meaningfulness of the proposed tasks, the experts also rated which and how strongly six core CT skills are addressed by the tasks. We conclude that, even though the experts consider the tasks to be meaningful, not all CT-related skills can be specifically addressed. It is thus important to define additional tasks for CT that are detached from algorithms and programming.

From topic networks to distributed cognitive maps: Zipfian topic universes in the area of volunteered geographic information (2020)

Mehler, Alexander ; Gleim, Rüdiger ; Gaitsch, Regina ; Hemati, Wahed ; Uslu, Tolga

Are nearby places (e.g., cities) described by related words? In this article, we transfer this research question in the field of lexical encoding of geographic information onto the level of intertextuality. To this end, we explore Volunteered Geographic Information (VGI) to model texts addressing places at the level of cities or regions with the help of so-called topic networks. This is done to examine how language encodes and networks geographic information on the aboutness level of texts. Our hypothesis is that the networked thematizations of places are similar, regardless of their distances and the underlying communities of authors. To investigate this, we introduce Multiplex Topic Networks (MTN), which we automatically derive from Linguistic Multilayer Networks (LMN) as a novel model, especially of thematic networking in text corpora. Our study shows a Zipfian organization of the thematic universe in which geographical places (especially cities) are located in online communication. We interpret this finding in the context of cognitive maps, a notion which we extend by so-called thematic maps. According to our interpretation of this finding, the organization of thematic maps as part of cognitive maps results from a tendency of authors to generate shareable content that ensures the continued existence of the underlying media. We test our hypothesis by example of special wikis and extracts of Wikipedia. In this way, we come to the conclusion that geographical places, whether close to each other or not, are located in neighboring semantic places that span similar subnetworks in the topic universe.
