Retrieving implicit relations from text: Hidden semantics and natural language processing

Human readers can infer knowledge from text even when that information is not explicitly stated. In this thesis, we address the phenomenon of text-level implicit information and outline novel automated methods for its recovery. The main focus of this work is on two types of unexpressed content: that which arises between sentences (implicit discourse relations) and that which arises within sentences (implicit semantic roles). Traditional approaches mostly rely on costly, rich linguistic features, e.g., sentiment or frame-based lexicons, and require heuristics or manual feature engineering. As an improvement, we propose a collection of generic, resource-lean methods, implemented in the form of statistical background knowledge or by means of neural architectures. Our models are largely language-independent and achieve state-of-the-art performance, e.g., in the classification of Chinese implicit discourse relations or the detection of locally covert predicative arguments in free texts. In novel experiments, we quantitatively demonstrate that both types of implicit information are mutually dependent insofar as, for instance, some implicit roles directly correlate with implicit discourse relations of similar properties. We show that implicit information processing further benefits downstream applications and demonstrate its applicability to the higher-level task of narrative story understanding. In the conclusion of the dissertation, we argue for the need for implicit information processing in order to realize the goal of true natural language understanding.
Author: Niko Schenk
Place of publication: Frankfurt am Main
Referee: Christian Chiarcos, Gert Webelhuth
Advisor: Christian Chiarcos
Document Type: Doctoral Thesis
Date of Publication (online): 2019/09/02
Year of first Publication: 2019
Publishing Institution: Universitätsbibliothek Johann Christian Senckenberg
Granting Institution: Johann Wolfgang Goethe-Universität
Date of final exam: 2019/03/20
Release Date: 2019/09/26
Tag: Implicit Discourse Parsing; Implicit Semantic Role Labeling; Information Retrieval; Natural Language Processing; Natural Language Understanding
Page Number: 260
Institutes: Computer Science and Mathematics / Computer Science
Dewey Decimal Classification: 0 Computer science, information and general works / 00 Computer science, knowledge and systems / 004 Data processing; computer science
Licence (German): Deutsches Urheberrecht (German copyright law)