004 Datenverarbeitung; Informatik
Refine
Year of publication
- 2020 (82)
- 2021 (73)
- 2023 (68)
- 2018 (56)
- 2022 (53)
- 2019 (41)
- 2016 (34)
- 2009 (33)
- 2011 (30)
- 2015 (29)
- 2017 (29)
- 2006 (28)
- 2004 (27)
- 2007 (27)
- 2008 (27)
- 2010 (27)
- 2014 (26)
- 2013 (24)
- 2005 (23)
- 2003 (21)
- 2001 (20)
- 2002 (20)
- 2012 (20)
- 1999 (15)
- 1998 (14)
- 2000 (13)
- 2024 (11)
- 1997 (10)
- 1996 (8)
- 1994 (6)
- 1995 (4)
- 1991 (2)
- 1987 (1)
- 1988 (1)
- 1989 (1)
Document Type
- Article (259)
- Doctoral Thesis (147)
- Working Paper (122)
- Bachelor Thesis (53)
- Conference Proceeding (53)
- Diploma Thesis (48)
- Preprint (44)
- Part of a Book (42)
- Contribution to a Periodical (39)
- diplomthesis (31)
Is part of the Bibliography
- no (905)
Keywords
- Lambda-Kalkül (21)
- Inklusion (13)
- Formale Semantik (11)
- Barrierefreiheit (10)
- Digitalisierung (10)
- Operationale Semantik (9)
- artificial intelligence (9)
- data science (9)
- lambda calculus (9)
- machine learning (9)
Institute
- Informatik (475)
- Informatik und Mathematik (104)
- Präsidium (74)
- Frankfurt Institute for Advanced Studies (FIAS) (54)
- Medizin (54)
- Wirtschaftswissenschaften (45)
- Physik (35)
- Hochschulrechenzentrum (24)
- studiumdigitale (24)
- Extern (12)
Natural Language Processing (NLP) for big data requires an efficient and sophisticated infrastructure to complete tasks both fast and correctly. Providing an intuitive and lightweight interaction with a framework that abstracts and simplifies complex tasks assists in reaching this goal. This bachelor thesis extends the NLP framework Docker Unified UIMA Interface (DUUI) by an API and a web-based graphical user interface to control and manage pipelines for automated analysis of large quantities of natural language. The extension aims to reduce the entry barrier into the field as well as to accelerate the creation and management of pipelines according to UIMA standards. Pipelines can be executed in the browser or using the web API directly and then monitored on a document level. The evaluation in usability and user experience indicates that the implementation benefits the framework by making its usage more user friendly, lightweight, and intuitive while also making the management of pipelines more efficient.
Graph4Med: a web application and a graph database for visualizing and analyzing medical databases
(2022)
Background: Medical databases normally contain large amounts of data in a variety of forms. Although they grant significant insights into diagnosis and treatment, implementing data exploration into current medical databases is challenging since these are often based on a relational schema and cannot be used to easily extract information for cohort analysis and visualization. As a consequence, valuable information regarding cohort distribution or patient similarity may be missed. With the rapid advancement of biomedical technologies, new forms of data from methods such as Next Generation Sequencing (NGS) or chromosome microarray (array CGH) are constantly being generated; hence it can be expected that the amount and complexity of medical data will rise and bring relational database systems to a limit.
Description: We present Graph4Med, a web application that relies on a graph database obtained by transforming a relational database. Graph4Med provides a straightforward visualization and analysis of a selected patient cohort. Our use case is a database of pediatric Acute Lymphoblastic Leukemia (ALL). Along routine patients’ health records it also contains results of latest technologies such as NGS data. We developed a suitable graph data schema to convert the relational data into a graph data structure and store it in Neo4j. We used NeoDash to build a dashboard for querying and displaying patients’ cohort analysis. This way our tool (1) quickly displays the overview of patients’ cohort information such as distributions of gender, age, mutations (fusions), diagnosis; (2) provides mutation (fusion) based similarity search and display in a maneuverable graph; (3) generates an interactive graph of any selected patient and facilitates the identification of interesting patterns among patients.
Conclusion: We demonstrate the feasibility and advantages of a graph database for storing and querying medical databases. Our dashboard allows a fast and interactive analysis and visualization of complex medical data. It is especially useful for patients similarity search based on mutations (fusions), of which vast amounts of data have been generated by NGS in recent years. It can discover relationships and patterns in patients cohorts that are normally hard to grasp. Expanding Graph4Med to more medical databases will bring novel insights into diagnostic and research.
Interacting with the environment to process sensory information, generate perceptions, and shape behavior engages neural networks in brain areas with highly varied representations, ranging from unimodal sensory cortices to higher-order association areas. Recent work suggests a much greater degree of commonality across areas, with distributed and modular networks present in both sensory and non-sensory areas during early development. However, it is currently unknown whether this initially common modular structure undergoes an equally common developmental trajectory, or whether such a modular functional organization persists in some areas—such as primary visual cortex—but not others. Here we examine the development of network organization across diverse cortical regions in ferrets of both sexes using in vivo widefield calcium imaging of spontaneous activity. We find that all regions examined, including both primary sensory cortices (visual, auditory, and somatosensory—V1, A1, and S1, respectively) and higher order association areas (prefrontal and posterior parietal cortices) exhibit a largely similar pattern of changes over an approximately 3 week developmental period spanning eye opening and the transition to predominantly externally-driven sensory activity. We find that both a modular functional organization and millimeter-scale correlated networks remain present across all cortical areas examined. These networks weakened over development in most cortical areas, but strengthened in V1. Overall, the conserved maintenance of modular organization across different cortical areas suggests a common pathway of network refinement, and suggests that a modular organization—known to encode functional representations in visual areas—may be similarly engaged in highly diverse brain areas.
Significance Different areas of the mature brain encode vastly different representations of the world. This study shows that a modular functional organization where nearby neurons participate in similar functional networks is shared across different brain areas not only during early development, but also as the brain matures where it remains a shared feature that shapes neural activity. The largely conserved trajectory of developmental changes across brain areas suggests that similar circuit mechanisms may drive this maturation. This implies that the large literature on developing cortical circuits, which is largely focused on sensory areas, may also apply more broadly, and that perturbations during development that impinge on any such shared mechanisms may produce deficits that extend across multiple brain systems.
Background: Prostate cancer is a major health concern in aging men. Paralleling an aging society, prostate cancer prevalence increases emphasizing the need for efcient diagnostic algorithms.
Methods: Retrospectively, 106 prostate tissue samples from 48 patients (mean age,
66 ± 6.6 years) were included in the study. Patients sufered from prostate cancer (n = 38) or benign prostatic hyperplasia (n = 10) and were treated with radical prostatectomy or Holmium laser enucleation of the prostate, respectively. We constructed tissue microarrays (TMAs) comprising representative malignant (n = 38) and benign (n = 68) tissue cores. TMAs were processed to histological slides, stained, digitized and assessed for the applicability of machine learning strategies and open–source tools in diagnosis of prostate cancer. We applied the software QuPath to extract features for shape, stain intensity, and texture of TMA cores for three stainings, H&E, ERG, and PIN-4. Three machine learning algorithms, neural network (NN), support vector machines (SVM), and random forest (RF), were trained and cross-validated with 100 Monte Carlo random splits into 70% training set and 30% test set. We determined AUC values for single color channels, with and without optimization of hyperparameters by exhaustive grid search. We applied recursive feature elimination to feature sets of multiple color transforms.
Results: Mean AUC was above 0.80. PIN-4 stainings yielded higher AUC than H&E and
ERG. For PIN-4 with the color transform saturation, NN, RF, and SVM revealed AUC of 0.93 ± 0.04, 0.91 ± 0.06, and 0.92 ± 0.05, respectively. Optimization of hyperparameters improved the AUC only slightly by 0.01. For H&E, feature selection resulted in no increase of AUC but to an increase of 0.02–0.06 for ERG and PIN-4.
Conclusions: Automated pipelines may be able to discriminate with high accuracy between malignant and benign tissue. We found PIN-4 staining best suited for classifcation. Further bioinformatic analysis of larger data sets would be crucial to evaluate the reliability of automated classifcation methods for clinical practice and to evaluate potential discrimination of aggressiveness of cancer to pave the way to automatic precision medicine.
Unified probabilistic deep continual learning through generative replay and open set recognition
(2022)
Modern deep neural networks are well known to be brittle in the face of unknown data instances and recognition of the latter remains a challenge. Although it is inevitable for continual-learning systems to encounter such unseen concepts, the corresponding literature appears to nonetheless focus primarily on alleviating catastrophic interference with learned representations. In this work, we introduce a probabilistic approach that connects these perspectives based on variational inference in a single deep autoencoder model. Specifically, we propose to bound the approximate posterior by fitting regions of high density on the basis of correctly classified data points. These bounds are shown to serve a dual purpose: unseen unknown out-of-distribution data can be distinguished from already trained known tasks towards robust application. Simultaneously, to retain already acquired knowledge, a generative replay process can be narrowed to strictly in-distribution samples, in order to significantly alleviate catastrophic interference.
Assessing communicative accommodation in the context of large language models : a semiotic approach
(2023)
Recently, significant strides have been made in the ability of transformer-based chatbots to hold natural conversations. However, despite a growing societal and scientific relevancy, there are few frameworks systematically deriving what it means for a chatbot conversation to be natural. The present work approaches this question through the phenomenon of communicative accommodation/interactive alignment. While there is existing research suggesting that humans adapt communicatively to technologies, the aim of this work is to explore the accommodation of AI-chatbots to an interlocutor. Its research interest is twofold: Firstly, the structural ability of the transformer-architecture to support accommodative behavior is assessed using a frame constructed in accordance with existing accommodationtheories.
This results in hypotheses to be tested empirically. Secondly, since effective accommodation produces the same outcomes, regardless of technical implementation, a behavioral experiment is proposed. Existing quantifications of accommodation are reconciled,
extended, and modified to apply them to nonhuman-interlocutors. Thus, a measurement scheme is suggested which evaluates textual data from text-only, double-blind interactions between chatbots and humans, chatbots and chatbots and humans and humans. Using the generated human-to-human convergence data as a reference, the degree of artificial accommodation can be evaluated. Accommodation as a central facet of artificial interactivity can thus be evaluated directly against its theoretical paradigm, i.e. human interaction. In case that subsequent examinations show that chatbots effectively do not accommodate, there may be a new form of algorithmic bias, emerging from the aggregate accommodation towards chatbots but not towards humans. Thus, existing, hegemonic semantics could be cemented through chatbot-learning. Meanwhile, the ability to effectively accommodate would render chatbots vastly more susceptible to misuse.
Detailed feedback on exercises helps learners become proficient but is time-consuming for educators and, thus, hardly scalable. This manuscript evaluates how well Generative Artificial Intelligence (AI) provides automated feedback on complex multimodal exercises requiring coding, statistics, and economic reasoning. Besides providing this technology through an easily accessible web application, this article evaluates the technology’s performance by comparing the quantitative feedback (i.e., points achieved) from Generative AI models with human expert feedback for 4,349 solutions to marketing analytics exercises. The results show that automated feedback produced by Generative AI (GPT-4) provides almost unbiased evaluations while correlating highly with (r = 0.94) and deviating only 6 % from human evaluations. GPT-4 performs best among seven Generative AI models, albeit at the highest cost. Comparing the models’ performance with costs shows that GPT-4, Mistral Large, Claude 3 Opus, and Gemini 1.0 Pro dominate three other Generative AI models (Claude 3 Sonnet, GPT-3.5, and Gemini 1.5 Pro). Expert assessment of the qualitative feedback (i.e., the AI’s textual response) indicates that it is mostly correct, sufficient, and appropriate for learners. A survey of marketing analytics learners shows that they highly recommend the app and its Generative AI feedback. An advantage of the app is its subject-agnosticism—it does not require any subject- or exercise-specific training. Thus, it is immediately usable for new exercises in marketing analytics and other subjects.
The human immune system is determined by the functionality of the human lymph node. With the use of high-throughput techniques in clinical diagnostics, a large number of data is currently collected. The new data on the spatiotemporal organization of cells offers new possibilities to build a mathematical model of the human lymph node - a virtual lymph node. The virtual lymph node can be applied to simulate drug responses and may be used in clinical diagnosis. Here, we review mathematical models of the human lymph node from the viewpoint of cellular processes. Starting with classical methods, such as systems of differential equations, we discuss the values of different levels of abstraction and methods in the range from artificial intelligence techniques formalism.