Informatik
The subject of the work presented here is a virtual reality (VR) application that can visualize the structure of any text as a walkable, interactive city. In addition, the program offers a special text search that is not found in this form in conventional word processors. Thanks to structural analysis and the use of several unusual analysis tools from TextImager [2], text2City supports not only searching for specific text patterns but also, for example, determining the text level (word, sentence, paragraph, etc.), among other things. A further feature is the communication link between the TextAnnotator service [1] and text2City, which lets the user annotate and can also make annotations made by other people visible immediately. Running the program requires one of two VR headsets, the Oculus Rift or the HTC Vive, a VR-capable PC, and the Unity software.
Human readers have the ability to infer knowledge from text, even if that particular information is not explicitly stated. In this thesis, we address the phenomenon of text-level implicit information and outline novel automated methods for its recovery.
The main focus of this work is on two types of unexpressed content: content that arises between sentences (implicit discourse relations) and content that arises within sentences (implicit semantic roles).
Traditional approaches mostly rely on costly rich linguistic features, e.g., sentiment or frame-based lexicons, and require heuristics or manual feature engineering.
As an improvement, we propose a collection of generic resource-lean methods, implemented in the form of statistical background knowledge or by means of neural architectures.
Our models are largely language-independent and produce state-of-the-art performance, e.g., in the classification of Chinese implicit discourse relations, or the detection of locally covert predicative arguments in free texts.
In novel experiments, we quantitatively demonstrate that both types of implicit information are mutually dependent insofar as, for instance, some implicit roles directly correlate with implicit discourse relations of similar properties.
We show that implicit information processing further benefits downstream applications and demonstrate its applicability to the higher-level task of narrative story understanding.
In the conclusion of the dissertation, we argue for the need for implicit information processing in order to realize the goal of true natural language understanding.
Relying on the theory of Saward (2010) and Disch (2015), we study political representation through the lens of representative claim-making. We identify a gap between the theoretical concept of claim-making and the empirical (quantitative) assessment of representative claims made in real-world representative contexts. Therefore, we develop a new approach to map and quantify representative claims in order to subsequently measure the reception and validation of the claims by the audience. To test our method, we analyse all the debates of the German parliament concerned with the introduction of the gender quota for German supervisory boards from 2013 to 2017 in a two-step process. First, we assess which constituencies the MPs claim to represent and how they justify their stance. Drawing on multiple correspondence analysis, we identify different claim patterns. Second, using natural language processing techniques and logistic regression on social media data, we measure whether and how the claims asserted in the parliamentary debates are received and validated by the respective audience. We conclude that the constituency as the ultimate judge of legitimacy has not yet been comprehensively conceptualized.
Dancing is an activity that enhances people's mood and consists of feeling the music and expressing it in rhythmic movements of the body. Learning how to dance can be challenging because it requires proper coordination and an understanding of rhythm and beat. In this paper, we present the first implementation of the Dancing Coach (DC), a generic system designed to support the practice of dancing steps, which in its current state supports the practice of basic salsa dancing steps. However, the DC has been designed to allow the addition of more dance styles. We also present the first user evaluation of the DC, which consists of user tests with 25 participants. Results from the user tests show that participants reported having learned, in a fun way, the basic salsa dancing steps, how to move to the beat, and body coordination. The results also point to directions for improving future versions of the DC.
The development of multimodal sensor-based applications designed to support learners in improving their skills is expensive, since most of these applications are tailor-made and built from scratch. In this paper, we show how the Presentation Trainer (PT), a multimodal sensor-based application designed to support the development of public speaking skills, can be modularly extended with a Virtual Reality real-time feedback module (VR module), which makes usage of the PT more immersive and comprehensive. The described study consists of a formative evaluation and has two main objectives. Firstly, a technical objective is concerned with the feasibility of extending the PT with an immersive VR module. Secondly, a user experience objective focuses on the level of satisfaction with interacting with the VR-extended PT. To study these objectives, we conducted user tests with 20 participants. Results from our tests show the feasibility of modularly extending existing multimodal sensor-based applications, and in terms of learning and user experience, results indicate a positive attitude of the participants towards using the application (PT+VR module).
Multi-view microscopy techniques are used to increase the resolution along the optical axis for 3D imaging. Without this, the resolution is insufficient to resolve subcellular events. In addition, parts of the images of opaque specimens are often highly degraded or masked. Both problems motivate scientists to record the same specimen from multiple directions. The images then have to be digitally fused into a single high-quality image. Selective-plane illumination microscopy has proven to be a powerful imaging technique due to its unsurpassed acquisition speed and gentle optical sectioning. However, even in the case of multi-view imaging techniques that illuminate and image the sample from multiple directions, light scattering inside tissues often severely impairs image contrast.
Here we show that for C. elegans embryos, multi-view registration can be achieved based on segmented nuclei. However, segmenting nuclei in a high-density distribution, as in the C. elegans embryo, is challenging. We propose a method that uses a 3D Mexican hat filter for preprocessing and 3D Gaussian curvature in the post-processing step to separate nuclei. We applied this method successfully to 3 data sets of C. elegans embryos in 3 different views. The segmentation results outperform previous methods. Moreover, we provide a simple GUI for manual correction and for adjusting the parameters for different data.
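As a rough illustration of the preprocessing idea only, the following sketch builds a small 3D Mexican hat kernel and applies it at single voxels of a toy volume. It is not the authors' implementation: the kernel form (a negative Laplacian of Gaussian, shifted to zero mean), the function names `mexican_hat_kernel_3d` and `filter_at`, the parameters `sigma` and `radius`, and the toy volume are all assumptions made for this example.

```python
import math


def mexican_hat_kernel_3d(sigma, radius):
    """Sample a 3D Mexican hat (negative Laplacian of Gaussian) on a cube.

    Returns a nested list of shape (2*radius+1)^3. The samples are shifted
    so that they sum to zero, which makes flat background give no response.
    """
    size = 2 * radius + 1
    kernel = [[[0.0] * size for _ in range(size)] for _ in range(size)]
    total = 0.0
    for z in range(size):
        for y in range(size):
            for x in range(size):
                r2 = (x - radius) ** 2 + (y - radius) ** 2 + (z - radius) ** 2
                u = r2 / (sigma * sigma)
                # In 3D the profile is proportional to (3 - r^2/sigma^2) * exp(-r^2 / (2 sigma^2)).
                value = (3.0 - u) * math.exp(-u / 2.0)
                kernel[z][y][x] = value
                total += value
    mean = total / size ** 3
    for z in range(size):
        for y in range(size):
            for x in range(size):
                kernel[z][y][x] -= mean
    return kernel


def filter_at(volume, kernel, cz, cy, cx):
    """Correlate the kernel with the volume at one voxel (zero outside the volume)."""
    radius = len(kernel) // 2
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    response = 0.0
    for dz in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                z, y, x = cz + dz, cy + dy, cx + dx
                if 0 <= z < depth and 0 <= y < height and 0 <= x < width:
                    response += volume[z][y][x] * kernel[dz + radius][dy + radius][dx + radius]
    return response


# Toy volume (hypothetical data): flat background with one bright voxel
# standing in for a nucleus centre.
N = 11
volume = [[[5.0] * N for _ in range(N)] for _ in range(N)]
volume[5][5][5] += 10.0

kernel = mexican_hat_kernel_3d(sigma=1.0, radius=2)
print("response at blob centre:", filter_at(volume, kernel, 5, 5, 5))
print("response on flat background:", filter_at(volume, kernel, 2, 2, 2))
print("response in the surrounding ring:", filter_at(volume, kernel, 5, 4, 3))
```

A filter of this shape responds positively at blob-like centres and negatively in the ring around them, which is why it is commonly used to emphasise roughly spherical objects before segmentation. A practical pipeline would apply it to the whole volume with an array library rather than voxel by voxel.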
We then propose a method that combines point and voxel registration for accurate multi-view registration of C. elegans embryos, which does not need any special experimental preparation. We demonstrate the performance of our approach on data acquired from fixed C. elegans embryos. This multi-step approach is successfully evaluated by comparison with different methods and by using synthetic data. The proposed method could overcome the typically low resolution along the optical axis and enable stitching together the different parts of the embryo available through the different views. A tool for running the code and analyzing the results has been developed.
The impact of columnar file formats on SQL‐on‐Hadoop engine performance: a study on ORC and Parquet
(2019)
Columnar file formats provide an efficient way to store data to be queried by SQL‐on‐Hadoop engines. Related works consider the performance of the processing engine and the file format together, which makes it impossible to predict their individual impact. In this work, we propose an alternative approach: by running each file format on the same processing engine, we compare the different file formats as well as their different parameter settings. We apply our strategy to two processing engines, Hive and SparkSQL, and evaluate the performance of two columnar file formats, ORC and Parquet. We use BigBench (TPCx‐BB), a standardized application‐level benchmark for Big Data scenarios. Our experiments confirm that the file format selection and its configuration significantly affect the overall performance. We show that ORC generally performs better on Hive, whereas Parquet achieves the best performance with SparkSQL. Using ZLIB compression brings up to 60.2% improvement with ORC, while Parquet achieves up to 7% improvement with Snappy. Exceptions are the queries involving text processing, which do not benefit from using any compression.