OPUS 4 | 004 Data processing; computer science

Analysis of machine learning prediction quality for automated subgroups within the MIMIC III dataset (2023)

Vanek, Jakob

The motivation for this master’s thesis is to explore the potential of predictive data analytics in the field of medicine. For this, the MIMIC-III dataset offers an extensive foundation for the construction of prediction models, including Random Forest, XGBoost, and deep learning networks. These models were implemented to forecast the mortality of 2,655 stroke patients. The first part of the thesis involved conducting a comprehensive data analysis of the filtered MIMIC-III dataset. Subsequently, the effectiveness and fairness of the predictive models were evaluated. Although the performance levels of the developed models did not match those reported in related research, their potential became evident. The results obtained demonstrated promising capabilities and highlighted the effectiveness of the applied methodologies. Moreover, the feature relevance within the XGBoost model was examined to increase model explainability. Finally, relevant subgroups were identified to perform a comparative analysis of the prediction performance across these subgroups. While this approach can be regarded as a valuable methodology, it was not possible to investigate underlying reasons for potential unfairness across clusters: within the test data, too few instances remained per subgroup for further fairness or feature relevance analysis. In conclusion, the implementation of an alternative use case with a higher patient count is recommended. The code for this analysis is made available via a GitHub repository and includes a frontend to visualize the results.
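The subgroup comparison described above can be illustrated with a minimal sketch. It assumes that each test instance carries a subgroup identifier, a true label, and a predicted label; the function name `accuracy_by_subgroup`, the `min_size` cutoff, and the sample records are hypothetical and are not taken from the thesis or its repository.

```python
from collections import defaultdict

def accuracy_by_subgroup(records, min_size=1):
    """Return {subgroup: accuracy} for subgroups with at least min_size instances.

    records: iterable of (subgroup, true_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for subgroup, y_true, y_pred in records:
        total[subgroup] += 1
        if y_true == y_pred:
            correct[subgroup] += 1
    # Subgroups that are too small are dropped rather than reported.
    return {g: correct[g] / total[g] for g in total if total[g] >= min_size}

# Made-up test instances: (subgroup, died, predicted_died)
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 0, 1), ("A", 1, 1),
    ("B", 0, 0), ("B", 1, 0),
    ("C", 1, 1),
]
per_group = accuracy_by_subgroup(records, min_size=2)
```

The `min_size` filter mirrors the limitation noted in the abstract: once a subgroup holds only a handful of test instances, its score is too noisy to support a fairness comparison.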

BOLD Moments: modeling short visual events through a video fMRI dataset and metadata (2024)

Lahner, Benjamin ; Dwivedi, Kshitij ; Iamshchinina, Polina ; Graumann, Monika ; Lascelles, Alex ; Roig Noguera, Gemma ; Gifford, Alessandro Thomas ; Pan, Bowen ; Jin, SouYoung ; Ratan Murty, N. Apurva ; Kay, Kendrick ; Oliva, Aude ; Cichy, Radoslaw Martin

Studying the neural basis of human dynamic visual perception requires extensive experimental data to evaluate the large swathes of functionally diverse brain neural networks driven by perceiving visual events. Here, we introduce the BOLD Moments Dataset (BMD), a repository of whole-brain fMRI responses to over 1,000 short (3s) naturalistic video clips of visual events across ten human subjects. We use the videos’ extensive metadata to show how the brain represents word- and sentence-level descriptions of visual events and identify correlates of video memorability scores extending into the parietal cortex. Furthermore, we reveal a match in hierarchical processing between cortical regions of interest and video-computable deep neural networks, and we showcase that BMD successfully captures temporal dynamics of visual events at second resolution. With its rich metadata, BMD offers new perspectives and accelerates research on the human brain basis of visual event perception.

Threshold testing and semi-online prophet inequalities (2023)

Hoefer, Martin ; Schewior, Kevin

We study threshold testing, an elementary probing model with the goal of choosing a large value out of n i.i.d. random variables. An algorithm can test each variable X_i once for some threshold t_i, and the test returns binary feedback on whether X_i ≥ t_i or not. Thresholds can be chosen adaptively or non-adaptively by the algorithm. Given the results for the tests of each variable, we then select the variable with the highest conditional expectation. We compare the expected value obtained by the testing algorithm with the expected maximum of the variables. Threshold testing is a semi-online variant of the gambler’s problem and prophet inequalities. Indeed, the optimal performance of non-adaptive algorithms for threshold testing is governed by the standard i.i.d. prophet inequality of approximately 0.745 + o(1) as n → ∞. We show how adaptive algorithms can significantly improve upon this ratio. Our adaptive testing strategy guarantees a competitive ratio of at least 0.869 - o(1). Moreover, we show that there are distributions that admit only a constant ratio c < 1, even when n → ∞. Finally, when each box can be tested multiple times (with n tests in total), we design an algorithm that achieves a ratio of 1 - o(1).
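The testing model can be made concrete with a small sketch. It covers only the simplest special case, as an assumption for illustration: all n variables are Uniform(0,1) and are tested non-adaptively against one common threshold t. This is not the adaptive strategy of the paper; the function names and the values of n and t are hypothetical.

```python
import random

def value_single_threshold(n, t):
    """Expected selected value when n Uniform(0,1) variables share threshold t.

    A variable that passes has conditional mean (1 + t) / 2; one that fails
    has conditional mean t / 2. We pick a passing variable if any exists.
    """
    none_pass = t ** n
    return (1 - none_pass) * (1 + t) / 2 + none_pass * t / 2

def expected_max(n):
    """E[max of n i.i.d. Uniform(0,1)] = n / (n + 1)."""
    return n / (n + 1)

def simulate(n, t, trials, rng):
    """Monte Carlo check of value_single_threshold."""
    total = 0.0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        passed = [x for x in xs if x >= t]
        # Selection may depend only on test outcomes, not on the values.
        total += passed[0] if passed else xs[0]
    return total / trials

n, t = 10, 0.8
analytic = value_single_threshold(n, t)
simulated = simulate(n, t, 50_000, random.Random(0))
ratio = analytic / expected_max(n)  # fraction of the expected maximum obtained
```

Here `ratio` is the quantity the abstract calls the competitive ratio, evaluated for one distribution and one threshold only; the bounds quoted in the abstract are guarantees over all distributions.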

Blockchains in public administration : a RADIUS on blockchain framework for public administration (2023)

Lang, Zeki Nejat Philipp Konstantin

The emergence of blockchain technology has generated a great deal of attention, as reflected in numerous scientific and journalistic articles. However, the implementation of blockchain for public administrations in Germany has encountered a setback owing to unsuccessful initiatives. Initial enthusiasm was followed by disillusionment. Nevertheless, the technology continues to evolve. This paper examines whether the use of a blockchain can still optimize the processes of public administrations. Not only the failed projects are analysed, but also more current applications of the technology and their potential relevance for the administration, especially in the state of Hesse. To answer whether blockchains are promising for administrations, a Design Science Research (DSR) approach is chosen. The DSR method is a research-based approach that aims to create new and innovative solutions to real-world problems through the development and evaluation of artefacts such as models, methods, or prototypes. For this work, the implementation of a framework to realize an Authentication, Authorization, and Accounting (AAA) system on the blockchain was identified as beneficial. The Remote Authentication Dial-In User Service (RADIUS) protocol has been identified as a potential protocol of the AAA system. The goal is to create a way to implement the system either entirely on a blockchain or as a hybrid system. Various blockchain technologies will be considered. The framework developed for this purpose is named AAA-me. The development of AAA-me has shown that the desired framework for implementing RADIUS on the blockchain is possible in various degrees of implementation. Previous work mostly relied on full development. Additionally, it has been shown that AAA-me can be used to perform hybrid integration at different implementation levels. This makes AAA-me stand out from the few previous hybrid approaches.
Furthermore, AAA-me was investigated in different laboratory environments. The aim was to determine the expected resilience against a Single Point of Failure (SPOF). The results of the lab investigation indicated that a RADIUS system on top of a blockchain can provide benefits in terms of security and performance. In the lab environment, times were measured within which a series of authorization requests were processed. In addition, it was illustrated how a RADIUS system implemented using blockchain can protect itself against Man-in-the-Middle (MITM) attacks. Finally, in collaboration with the Hessian Central Office for Data Processing (German: Hessische Zentrale für Datenverarbeitung) (HZD), another test lab demonstrated how a RADIUS system on the blockchain can integrate with the existing IT systems of the German state of Hesse. Based on these findings, this work reevaluated the applicability of blockchain technology for public administration processes. The work has thus shown that the use of a blockchain can still be purposeful. However, it has also been shown that an implementation can entail many problems. The small number of blockchain developers and engineers also poses the risk of not finding people to develop and maintain a system. In addition, one faces the problem of determining an architecture now that will be applied to many projects in the future. However, each project can, in turn, have an impact on the choice of architecture. Once this problem has been solved and a blockchain infrastructure is available, it can be established quickly and be more SPOF resistant, for example, for Public Key Infrastructure (PKI) systems. AAA-me was only applied in lab and test environments. As a result, no real data ran over its own infrastructure. This allowed the necessary flexibility for development. However, system-related properties that could not be detected in this setting may appear in real situations.
Furthermore, AAA-me’s development is still in its infancy. Many manual adjustments need to be made in order for it to integrate with an existing RADIUS system. Also, no dedicated system security measures were carried out in the lab environments. Thus, vulnerabilities can quickly open up on web servers due to misconfigurations and missing updates. For the above reasons, productive use is discouraged unless major further development is carried out.

PolarCAP – A deep learning approach for first motion polarity classification of earthquake waveforms (2022)

Chakraborty, Megha ; Quinteros Cartaya, Claudia ; Li, Wei ; Faber, Johannes ; Rümpker, Georg ; Stöcker, Horst ; Srivastava, Nishtha

Highlights
• We present PolarCAP, a deep learning model that can classify the polarity of a waveform with 98% accuracy.
• The first-motion polarity of seismograms is a useful parameter, but its manual determination can be laborious and imprecise.
• We demonstrate that in several cases the model can assign trace polarity more accurately than a human analyst.
Abstract
The polarity of first P-wave arrivals plays a significant role in the effective determination of focal mechanisms, especially for smaller earthquakes. Manual estimation of polarities is not only time-consuming but also prone to human errors. This warrants an automated algorithm for first-motion polarity determination. We present a deep learning model, PolarCAP, that uses an autoencoder architecture to identify first-motion polarities of earthquake waveforms. PolarCAP is trained in a supervised fashion using more than 130,000 labelled traces from the Italian seismic dataset (INSTANCE) and is cross-validated on 22,000 traces to choose the optimal set of hyperparameters. We obtain an accuracy of 0.98 on a completely unseen test dataset of almost 33,000 traces. Furthermore, we check the model generalizability by testing it on the datasets provided by previous works and show that our model achieves a higher recall on both positive and negative polarities.

K48- and K63-linked ubiquitin chain interactome reveals branch- and chain length-specific ubiquitin interactors (2024)

Waltho, Anita ; Popp, Oliver ; Lenz, Christopher ; Pluska, Lukas ; Dötsch, Volker ; Mertins, Phillip ; Sommer, Thomas

The ubiquitin (Ub) code denotes the complex Ub architectures, including Ub chains of different length, linkage-type and linkage combinations, which enable ubiquitination to control a wide range of protein fates. Although many linkage-specific interactors have been described, how interactors are able to decode more complex architectures is not fully understood. We conducted a Ub interactor screen, in humans and yeast, using Ub chains of varying length, as well as homotypic and heterotypic branched chains of the two most abundant linkage types, K48- and K63-linked Ub. We identified some of the first K48/K63 branch-specific Ub interactors, including histone ADP-ribosyltransferase PARP10/ARTD10, E3 ligase UBR4 and huntingtin-interacting protein HIP1. Furthermore, we revealed the importance of chain length by identifying interactors with a preference for Ub3 over Ub2 chains, including Ub-directed endoprotease DDI2, autophagy receptor CCDC50 and p97-adaptor FAF1. Crucially, we compared datasets collected using two common DUB inhibitors, chloroacetamide and N-ethylmaleimide. This revealed inhibitor-dependent interactors, highlighting the importance of inhibitor consideration during pulldown studies. This dataset is a key resource for understanding how the Ub code is read.

Sampling rare conformational transitions with a quantum computer (2022)

Ghamari, Danial ; Hauke, Philipp ; Covino, Roberto ; Faccioli, Pietro

Structural rearrangements play a central role in the organization and function of complex biomolecular systems. In principle, Molecular Dynamics (MD) simulations enable us to investigate these thermally activated processes with an atomic level of resolution. In practice, an exponentially large fraction of computational resources must be invested to simulate thermal fluctuations in metastable states. Path sampling methods focus the computational power on sampling the rare transitions between states. One of their outstanding limitations is the difficulty of efficiently generating paths that visit significantly different regions of the conformational space. To overcome this issue, we introduce a new algorithm for MD simulations that integrates machine learning and quantum computing. First, using functional integral methods, we derive a rigorous low-resolution spatially coarse-grained representation of the system’s dynamics, based on a small set of molecular configurations explored with machine learning. Then, we use a quantum annealer to sample the transition paths of this low-resolution theory. We provide a proof-of-concept application by simulating a benchmark conformational transition with all-atom resolution on the D-Wave quantum computer. By exploiting the unique features of quantum annealing, we generate uncorrelated trajectories at every iteration, thus addressing one of the challenges of path sampling. Once larger quantum machines are available, the interplay between quantum and classical resources may emerge as a new paradigm of high-performance scientific computing. In this work, we provide a platform to implement this integrated scheme in the field of molecular simulations.

Gradient-consistent enrichment of finite element spaces for the DNS of fluid-particle interaction (2019)

Höllbacher, Susanne ; Wittum, Gabriel

Highlights
• Monolithic scheme for particulate flows preventing an oscillating pressure along the interface.
• The choice of enriching shape functions is driven by the properties of their gradients instead of their values.
• The choice of enriching shape functions inherits a natural stabilization on small cut elements.
Abstract
We present gradient-consistent enriched finite element spaces for the simulation of free particles in a fluid. This involves forces being exchanged between the particles and the fluid at the interface. In an earlier work [23] we derived a monolithic scheme which includes the interaction forces into the Navier-Stokes equations by means of a fictitious-domain-like strategy. Due to an inexact approximation of the interface, oscillations of the pressure along the interface were observed. In multiphase flows, oscillations and spurious velocities are a common issue. The surface force term yields a jump in the pressure, and therefore the oscillations are usually resolved by extending the spaces on cut elements in order to resolve the discontinuity. For the construction of the enriched spaces proposed in this paper we exploit the Petrov-Galerkin formulation of the vertex-centered finite volume method (PG-FVM), as already investigated in [23]. From the perspective of the finite volume scheme we argue that wrong discrete normal directions at the interface are the origin of the oscillations. The new perspective of normal vectors suggests looking at gradients rather than values of the enriching shape functions. The crucial parameter of the enrichment functions therefore is the gradient of the shape functions, and especially that of the test space. The distinguishing feature of our construction therefore is an enrichment that is based on the choice of shape functions with consistent gradients. These derivations finally yield a fitted scheme for the immersed interface.
We further propose a strategy ensuring a well-conditioned system independent of the location of the interface. The enriched spaces can be used within any existing finite element discretization for the Navier-Stokes equations. Our numerical tests were conducted using the PG-FVM. We demonstrate that the enriched spaces are able to eliminate the oscillations.

Rotational test spaces for a fully-implicit FVM and FEM for the DNS of fluid-particle interaction (2019)

Höllbacher, Susanne ; Wittum, Gabriel

The paper presents a fully-implicit and stable finite element and finite volume scheme for the simulation of freely moving particles in a fluid. The developed method is based on the Petrov-Galerkin formulation of a vertex-centered finite volume method (PG-FVM) on unstructured grids. Appropriate extension of the ansatz and test spaces leads to a formulation comparable to a fictitious domain formulation. The purpose of this work is to introduce a new concept of numerical modeling reducing the mathematical overhead which many other methods require. It exploits the identification of the PG-FVM with a corresponding finite element bilinear form. The surface integrals of the finite volume scheme enable a natural incorporation of the interface forces purely based on the original bilinear operator for the fluid. As a result, there is no need to expand the system of equations to a saddle-point problem. As in fictitious domain methods, the extended scheme treats the particles as rigid parts of the fluid. The distinguishing feature compared to most existing fictitious domain methods is that there is no need for an additional Lagrange multiplier or other artificial external forces for the fluid-solid coupling. Consequently, only a single solve of the derived linear system for the fluid together with the particles is necessary, and the proposed method does not require any fractional time stepping scheme to balance the interaction forces between fluid and particles. For the linear Stokes problem we prove the stability of both schemes. Moreover, for the stationary case the conservation of mass and momentum is not violated by the extended scheme, i.e. conservativity is accomplished within the range of the underlying, unconstrained discretization scheme. The scheme is applicable to problems in two and three dimensions.

Uncertainty quantification in the Henry problem using the multilevel Monte Carlo method (2024)

Logashenko, Dmitry ; Litvinenko, Alexander ; Tempone, Raul ; Vasilyeva, Ekaterina ; Wittum, Gabriel

We investigate the applicability of the well-known multilevel Monte Carlo (MLMC) method to the class of density-driven flow problems, in particular the problem of salinisation of coastal aquifers. As a test case, we solve the uncertain Henry saltwater intrusion problem. Unknown porosity, permeability and recharge parameters are modelled by using random fields. The classical deterministic Henry problem is non-linear and time-dependent, and can easily take several hours of computing time. Uncertain settings require the solution of multiple realisations of the deterministic problem, and the total computational cost increases drastically. Instead of computing hundreds of random realisations, typically the mean value and the variance are computed. Standard methods such as Monte Carlo or surrogate-based methods are a good choice, but they compute all stochastic realisations on the same, often very fine, mesh. They also do not balance the stochastic and discretisation errors. These facts motivated us to apply the MLMC method. We demonstrate that by solving the Henry problem on multi-level spatial and temporal meshes, the MLMC method reduces the overall computational and storage costs. To reduce the computing cost further, parallelization is performed in both physical and stochastic spaces. To solve each deterministic scenario, we run the parallel multigrid solver ug4 in a black-box fashion.
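The idea behind MLMC can be sketched on a toy problem. The example below is not the Henry problem and does not use ug4; as a stand-in, it estimates the mean of a trapezoid-rule approximation of the integral of ω·x² over [0, 1] with ω ~ Uniform(0, 2), whose exact mean is 1/3. The mesh hierarchy, sample counts, and function names are assumptions chosen for illustration.

```python
import random

def trapezoid(f, n):
    """Composite trapezoid rule for f on [0, 1] with n intervals."""
    h = 1.0 / n
    return h * (0.5 * f(0.0) + sum(f(i * h) for i in range(1, n)) + 0.5 * f(1.0))

def level_approx(omega, level):
    """Quantity of interest on mesh level `level` (2**(level + 1) intervals)."""
    return trapezoid(lambda x: omega * x * x, 2 ** (level + 1))

def mlmc_estimate(samples_per_level, rng):
    """Telescoping sum: E[P_L] = E[P_0] + sum over l of E[P_l - P_(l-1)]."""
    total = 0.0
    for level, n_samples in enumerate(samples_per_level):
        acc = 0.0
        for _ in range(n_samples):
            omega = rng.uniform(0.0, 2.0)
            fine = level_approx(omega, level)
            # The same random sample is reused on the coarser mesh, so the
            # correction term has a small variance.
            coarse = level_approx(omega, level - 1) if level > 0 else 0.0
            acc += fine - coarse
        total += acc / n_samples
    return total

# Many cheap coarse samples, few expensive fine samples.
estimate = mlmc_estimate([20000, 5000, 1250, 300, 80], random.Random(0))
```

Because the level corrections shrink as the mesh is refined, most samples can be drawn on the coarsest mesh, which is where the cost saving described in the abstract comes from.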

A wholistic view of continual learning with deep neural networks: forgotten lessons and the bridge to active and open world learning (2023)

Mundt, Martin ; Hong, Yongwon ; Pliushch, Iuliia ; Ramesh, Visvanathan

Current deep learning methods are regarded as favorable if they empirically perform well on dedicated test sets. This mentality is seamlessly reflected in the resurfacing area of continual learning, where consecutively arriving data is investigated. The core challenge is framed as protecting previously acquired representations from being catastrophically forgotten. However, comparison of individual methods is performed in isolation from the real world by monitoring accumulated benchmark test set performance. The closed world assumption remains predominant, i.e. models are evaluated on data that is guaranteed to originate from the same distribution as used for training. This poses a massive challenge, as neural networks are well known to provide overconfident false predictions on unknown and corrupted instances. In this work we critically survey the literature and argue that notable lessons from open set recognition, identifying unknown examples outside of the observed set, and the adjacent field of active learning, querying data to maximize the expected performance gain, are frequently overlooked in the deep learning era. Hence, we propose a consolidated view to bridge continual learning, active learning and open set recognition in deep neural networks. Finally, the established synergies are supported empirically, showing joint improvement in alleviating catastrophic forgetting, querying data, and selecting task orders, while exhibiting robust open world application.

Fading memory as inductive bias in residual recurrent networks (2024)

Dubinin, Igor ; Effenberger, Felix

Residual connections have been proposed as an architecture-based inductive bias to mitigate the problem of exploding and vanishing gradients and to increase task performance in both feed-forward and recurrent networks (RNNs) when trained with the backpropagation algorithm. Yet, little is known about how residual connections in RNNs influence their dynamics and fading memory properties. Here, we introduce weakly coupled residual recurrent networks (WCRNNs) in which residual connections result in well-defined Lyapunov exponents and allow for studying properties of fading memory. We investigate how the residual connections of WCRNNs influence their performance, network dynamics, and memory properties on a set of benchmark tasks. We show that several distinct forms of residual connections yield effective inductive biases that result in increased network expressivity. In particular, those are residual connections that (i) result in network dynamics in the proximity of the edge of chaos, (ii) allow networks to capitalize on characteristic spectral properties of the data, and (iii) result in heterogeneous memory properties. In addition, we demonstrate how our results can be extended to non-linear residuals and introduce a weakly coupled residual initialization scheme that can be used for Elman RNNs.

Local homeostatic regulation of the spectral radius of echo-state networks (2020)

Schubert, Fabian ; Gros, Claudius

Recurrent cortical network dynamics plays a crucial role for sequential information processing in the brain. While the theoretical framework of reservoir computing provides a conceptual basis for the understanding of recurrent neural computation, it often requires manual adjustments of global network parameters, in particular of the spectral radius of the recurrent synaptic weight matrix. Being a mathematical and relatively complex quantity, the spectral radius is not readily accessible to biological neural networks, which generally adhere to the principle that information about the network state should either be encoded in local intrinsic dynamical quantities (e.g. membrane potentials), or transmitted via synaptic connectivity. We present two synaptic scaling rules for echo state networks that solely rely on locally accessible variables. Both rules work online, in the presence of a continuous stream of input signals. The first rule, termed flow control, is based on a local comparison between the mean squared recurrent membrane potential and the mean squared activity of the neuron itself. It is derived from a global scaling condition on the dynamic flow of neural activities and requires the separability of external and recurrent input currents. We gained further insight into the adaptation dynamics of flow control by using a mean field approximation on the variances of neural activities that allowed us to describe the interplay between network activity and adaptation as a two-dimensional dynamical system. The second rule that we considered, variance control, directly regulates the variance of neural activities by locally scaling the recurrent synaptic weights. The target set point of this homeostatic mechanism is dynamically determined as a function of the variance of the locally measured external input. This functional relation was derived from the same mean-field approach that was used to describe the approximate dynamics of flow control. 
The effectiveness of the presented mechanisms was tested numerically using different external input protocols. The network performance after adaptation was evaluated by training the network to perform a time-delayed XOR operation on binary sequences. As our main result, we found that flow control can reliably regulate the spectral radius under different input statistics, but precise tuning is negatively affected by interneural correlations. Furthermore, flow control showed a consistent task performance over a wide range of input strengths/variances. Variance control, on the other hand, did not yield the desired spectral radii with the same precision. Moreover, task performance was less consistent across different input strengths. Given the better performance and simpler mathematical form of flow control, we concluded that a local control of the spectral radius via an implicit adaptation scheme is a realistic alternative to approaches using classical “set point” homeostatic feedback controls of neural firing.
Author summary
How can a neural network control its recurrent synaptic strengths such that network dynamics are optimal for sequential information processing? An important quantity in this respect, the spectral radius of the recurrent synaptic weight matrix, is a non-local quantity. Therefore, a direct calculation of the spectral radius is not feasible for biological networks. However, we show that there exists a local and biologically plausible adaptation mechanism, flow control, which makes it possible to control the recurrent weight spectral radius while the network is operating under the influence of external inputs. Flow control is based on a theorem of random matrix theory, which is applicable if inter-synaptic correlations are weak. We apply the new adaptation rule to echo-state networks tasked with performing a time-delayed XOR operation on random binary input sequences. We find that flow-controlled networks can adapt to a wide range of input strengths while retaining essentially constant task performance.
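The benchmark task mentioned above can be sketched as follows. The abstract does not spell out the exact definition, so this example assumes that the target at time t is the XOR of the two inputs at times t − τ and t − τ − 1; the function name `delayed_xor_targets` and the sample sequence are hypothetical.

```python
import random

def delayed_xor_targets(inputs, delay):
    """Targets y_t = x_(t - delay) XOR x_(t - delay - 1) for a binary sequence.

    Targets exist only for t >= delay + 1, so the result is shorter than
    the input by delay + 1 entries.
    """
    return [
        inputs[t - delay] ^ inputs[t - delay - 1]
        for t in range(delay + 1, len(inputs))
    ]

# Small hand-checkable example.
example_targets = delayed_xor_targets([1, 0, 0, 1, 1], delay=1)

# Random binary input sequence of the kind a reservoir would be driven with.
rng = random.Random(0)
sequence = [rng.randint(0, 1) for _ in range(1000)]
targets = delayed_xor_targets(sequence, delay=2)
```

A network solving this task has to retain the last few inputs and combine them non-linearly, which is why it serves as a probe of how well the recurrent dynamics are tuned.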

Ground texture based localization (2023)

Schmid, Jan Fabian

This dissertation is concerned with the task of map-based self-localization, using images of the ground recorded with a downward-facing camera. In this context, map-based (self-)localization is the task of determining the position and orientation of a query image that is to be localized. The map used for this purpose consists of a set of reference images with known positions and orientations in a common coordinate system. For localization, the considered methods determine correspondences between features of the query image and those of the reference images. In comparison with localization approaches that use images of the surrounding environment, we expect that using images of the ground has the advantage that, unlike the surroundings, the visual appearance of the ground is often long-term stable. Also, by using active lighting of the ground, localization becomes independent of external lighting conditions. This dissertation includes content of several published contributions, which present research on the development and testing of methods for feature-based localization of ground images. Our first contribution examines methods for the extraction of image features that have not been designed to be used on ground images. This survey shows that, with appropriate parametrization, several of these methods are well suited for the task. Based on this insight, we develop and examine methods for various subtasks of map-based localization in the following contributions. We examine global localization, where all reference images have to be considered, as well as local localization, where an approximation of the query image position is already known, which allows for disregarding reference images with a large distance to this position. In our second contribution, we present the first systematic comparison of state-of-the-art methods for ground texture based localization.
Furthermore, we present a method characterized by its use of our novel feature matching technique. This technique is called identity matching, as it matches only those features with identical descriptors, in contrast to the state of the art, which also matches features with similar descriptors. We show that our method is well suited for global and local localization, as it has favorable scaling with the number of reference images considered during the localization process. In another contribution, we develop a variant of our localization method that is significantly faster to compute, as it applies a sampling approach to determine the image positions at which local features are extracted, instead of using classical feature detectors. Two further contributions are concerned with global localization. The first one introduces a prediction model for the global localization performance, based on an evaluation of the local localization performance. This allows us to quickly evaluate any considered parameter settings of global localization methods. The second contribution introduces a learning-based method that computes compact descriptors of ground images. These descriptors can be used to retrieve the overlapping reference images of a query image from a large set of reference images with little computational effort. The most recent contribution included in this dissertation presents a new ground image database, which was recorded with a dedicated platform using a downward-facing camera. In addition to the data, we also explain our guidelines for the construction of the platform. In comparison with existing databases, our database contains more images and presents a larger variety of ground textures.
Furthermore, this database enables us to perform the first systematic evaluation of how localization performance is affected by the time interval between the point in time at which the reference images are recorded and the point in time at which the query image is recorded. We find that, for outdoor areas, all ground texture based localization methods have reliability issues if the time interval between the recording of the query and reference images is large, and also when weather conditions differ. These findings point to remaining challenges in ground texture based localization that should be addressed in future work.
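
The identity matching idea described in this abstract (match only features whose descriptors are exactly equal) can be illustrated with a hash lookup. The following is a minimal sketch, not the author's implementation: it assumes descriptors are short binary codes that can be converted to bytes, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def build_index(reference_descriptors):
    """Map each descriptor (as bytes) to the reference feature ids that carry it."""
    index = defaultdict(list)
    for feature_id, descriptor in enumerate(reference_descriptors):
        index[bytes(descriptor)].append(feature_id)
    return index


def identity_match(query_descriptors, index):
    """Return (query_id, reference_id) pairs whose descriptors are exactly equal."""
    matches = []
    for query_id, descriptor in enumerate(query_descriptors):
        # A dictionary lookup replaces the nearest-neighbor search that
        # similarity-based matching would need.
        for reference_id in index.get(bytes(descriptor), []):
            matches.append((query_id, reference_id))
    return matches


# Toy data (assumption): each descriptor is a tuple of two byte values.
reference = [(0b1010, 0b0001), (0b1111, 0b0000), (0b1010, 0b0001)]
query = [(0b1111, 0b0000), (0b0101, 0b0101), (0b1010, 0b0001)]

index = build_index(reference)
matches = identity_match(query, index)
print(matches)
```

Because each query feature costs one hash lookup on average, the matching time in this sketch does not grow with a pairwise comparison against every reference feature, which is one plausible reading of the favorable scaling mentioned in the abstract.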

Prediction of transcription factor binding using epigenomics and genome variation data (2023)

Baumgarten, Nina

A central concern in genetics is to identify mechanisms of transcriptional regulation. The aim is to unravel the mapping between the DNA sequence and gene expression. However, this has turned out to be extremely complex. Gene regulation is highly cell type-specific and even moderate changes in gene expression can have functional consequences. Important contributors to gene regulation are transcription factors (TFs), which are able to interact directly with the DNA. Often, a first step in understanding the effect of a TF on a gene’s regulation is to identify the genomic regions the TF binds to. Therefore, one needs to be aware of the TF’s binding preferences, which are commonly summarized in TF binding motifs. Although for many TFs the binding motif is experimentally validated, there is still a large number of TFs for which no binding motif is known. There exist many tools that link TF binding motifs to TFs. We developed the method Massif that improves the performance of such tools by incorporating a domain score that uses the DNA binding domain of the studied TF as additional information. TF binding sites are often enriched in regulatory elements (REMs) such as promoters or enhancers, where the latter can be located megabases away from their target gene. However, to understand the regulation of a gene it is crucial to know where the REMs of a gene are located. We introduced the EpiRegio webserver that holds REMs associated with target genes predicted across many cell types and tissues using STITCHIT, a previously established method. Our publicly available webserver allows users to query for REMs associated with genes (gene query) and REMs overlapping genomic regions (region query). We illustrated the usefulness of EpiRegio by pointing to a TF that occurs enriched in the REMs of differentially expressed genes in circPLOD2 depleted pericytes.
Further, we highlighted genes that are affected by CRISPR-Cas induced mutations in non-coding genomic regions using EpiRegio’s region query. Non-coding genetic variants within REMs may alter gene expression by modifying TF binding sites, which can lead to various kinds of traits or diseases. To understand the underlying molecular mechanisms, one aims to evaluate the effect of such genetic variations on TF binding sites. We developed an accurate and fast statistical approach that can assess whether a single nucleotide polymorphism (SNP) is regulatory. Further, we combined this approach with epigenetic data and additional analyses in our Sneep workflow. For instance, it enables the identification of TFs whose binding preferences are affected by the analyzed SNPs, which is illustrated on eQTL datasets for different cell types. Additionally, we used our Sneep workflow to highlight cardiovascular disease genes using regulatory SNPs and REM-gene interactions. Overall, the described results allow a better understanding of REM-gene interactions and their interplay with TFs on gene regulation.
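
To make the notion of a SNP changing a TF binding site concrete, a common generic approach is to score the reference and alternative allele against a binding motif and compare the two scores. The sketch below illustrates only this general idea and is not the statistical approach of the thesis or the Sneep workflow; the motif frequencies, the uniform background, and all names are assumptions chosen for illustration.

```python
import math

# Hypothetical 4-bp motif: per-position base frequencies (assumed values).
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background frequency per base


def log_odds_score(sequence, motif=MOTIF):
    """Sum of per-position log2 odds of the motif versus the background."""
    if len(sequence) != len(motif):
        raise ValueError("sequence length must equal motif length")
    return sum(
        math.log2(column[base] / BACKGROUND)
        for column, base in zip(motif, sequence)
    )


def snp_effect(ref_seq, alt_seq):
    """Score difference (alt minus ref); negative values suggest weaker binding."""
    return log_odds_score(alt_seq) - log_odds_score(ref_seq)


# Toy SNP (assumption): C -> T at the second motif position.
delta = snp_effect("ACGT", "ATGT")
print(round(delta, 3))
```

A score difference alone says nothing about significance; a statistical assessment such as the one described in the abstract would additionally need a null distribution for these differences.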

Multidimensional characterization of reactive and neoplastic human lymph nodes using methods from bioinformatics, digital pathology, data analysis, and graph theory (2023)

Wurzel, Patrick

The adaptive immune system protects humans against pathogens and cancer cells arising both outside and inside the body. This process relies on the interaction and cooperation of many different cell types of the body and takes place predominantly within the lymph nodes. If even a single component of this sensitive process is disturbed, the result can be a partial or complete loss of a person’s immunological fitness. The aim of this thesis was therefore to comprehensively detect and define such aberrations of human lymph node tissue by means of digital pathology. To this end, a digital tissue database was first established. It is based on the content management system Digital Tissue Management Suite, which was implemented as part of this work. In addition, the software Feature analysis in tissue histomorphometry was developed, which enables the analysis of two-dimensional whole slide images. It uses methods from computer vision and graph theory to characterize morphological and distributional properties of the cell types of the lymph node. The software also contains plug-ins for visualization and statistical analysis of the data. Building on this purpose-built digital infrastructure, in combination with the software Imaris, reactive and neoplastic tissue samples scanned in two and three dimensions were digitally phenotyped. This revealed new mechanical barriers that compartmentalize the germinal centers. Furthermore, the preservation of the quantitative ratio of individual cell populations within the germinal centers was described. Starting from the reactive phenotypes of the lymph node, pathophysiological aberrations in various lymphatic neoplasms were investigated.
It was shown that structural destruction in particular is frequently accompanied by a morphological change of the fibroblastic reticular cells. Besides structural changes, cytological changes of the tumor microenvironment were also observed. So-called tumor-associated macrophages play a special role here. This work showed that macrophages in the tumor microenvironment of diffuse large B-cell lymphoma and chronic lymphocytic leukemia in particular exhibit specific pathophysiological changes. It was also shown that genetic alterations of neoplastic B cells are accompanied by a general reduction of CD20 antigen density. In summary, the results enabled the generation of a comprehensive digital pathology profile of classical Hodgkin lymphoma. Morphological changes of neoplastic, CD30-positive Hodgkin-Reed-Sternberg cells were validated and described. Pathological changes of the connectome and the tumor microenvironment of these cells were also parameterized and quantified. Finally, the diagnostic power of digital pathology profiles was evaluated and validated using a random forest classifier.

Metahumans in the Unreal Engine for multi-user VR applications (2024)

Völkner, Tobias

Metahumans is an innovative framework for the Unreal Engine that provides highly realistic digital characters. Metahumans come with a complete Control Rig, which allows developers to use ready-made animations and to adapt and extend them as needed. This thesis examines the use of Metahumans in the virtual environment of Unreal Engine 5. The main goal is to investigate whether a Metahuman can be controlled and animated via motion tracking with a conventional virtual reality headset. Particular attention is given to Inverse Kinematics as a method for generating movement sequences that are as natural as possible. A further aim is to enable interaction between different Metahuman avatars in an online session. To analyze the effect on users’ sense of immersion, test participants are invited to evaluate their user experience. For this purpose, two comparable levels are created: one in the Unreal Engine with Metahumans and the other in Unity with the Meta Avatars from Oculus. The study aims to gain a comprehensive understanding of the capabilities of Metahumans, especially in comparison with other avatar systems.

Machine learning sentiment analysis, COVID-19 news and stock market reactions (2023)

Costola, Michele ; Hinz, Oliver ; Nofer, Michael ; Pelizzon, Loriana

The recent COVID-19 pandemic represents an unprecedented worldwide event for studying the influence of related news on the financial markets, especially during the early stage of the pandemic, when information on the new threat arrived rapidly and was complex for investors to process. In this paper, we investigate whether the flow of news on COVID-19 had an impact on forming market expectations. We analyze 203,886 online articles dealing with COVID-19 and published on three news platforms (MarketWatch.com, NYTimes.com, and Reuters.com) in the period from January to June 2020. Using machine learning techniques, we extract the news sentiment through a financial market-adapted BERT model that enables recognizing the context of each word in a given item. Our results show that there is a statistically significant and positive relationship between sentiment scores and the S&P 500 market. Furthermore, we provide evidence that sentiment components and news categories on NYTimes.com were differently related to market returns.

Time constrained verification of analog circuits using model-checking algorithms (2006)

Grabowski, Darius ; Platte, Daniel ; Hedrich, Lars ; Barke, Erich

In this contribution we present algorithms for model checking of analog circuits enabling the specification of time constraints. Furthermore, a methodology for defining time-based specifications is introduced. An already known method for model checking of integrated analog circuits has been extended to take into account time constraints. The method will be presented using three industrial circuits. The results of model checking will be compared to verification by simulation.
