004 Datenverarbeitung; Informatik
Knowledge discovery in biomedical data using supervised methods assumes that the data contain structure relevant to the class structure if a classifier can be trained to assign a case to the correct class better than by guessing. In this setting, acceptance or rejection of a scientific hypothesis may depend critically on the ability to classify cases better than randomly, without high classification performance being the primary goal. Random forests are often chosen for knowledge-discovery tasks because they are considered a powerful classifier that does not require sophisticated data transformation or hyperparameter tuning and can be regarded as a reference classifier for tabular numerical data. Here, we report a case where the failure of random forests using the default hyperparameter settings in the standard implementations of R and Python would have led to the rejection of the hypothesis that the data contained structure relevant to the class structure. After tuning the hyperparameters, classification performance increased from 56% to 65% balanced accuracy in R, and from 55% to 67% balanced accuracy in Python. More importantly, the 95% confidence intervals in the tuned versions lay entirely above the value of 50% that characterizes guessing-level classification. Thus, tuning provided the desired evidence that the data structure supported the class structure of the data set. In this case, tuning made more than a quantitative difference in the form of slightly better classification accuracy; it substantially changed the interpretation of the data set. Such a change is especially likely when classification performance is low and a small improvement lifts the balanced accuracy clearly above the 50% guessing level.
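The decision rule described in this abstract (is the 95% confidence interval of the balanced accuracy entirely above 50%?) can be illustrated with a minimal sketch. The percentile bootstrap used here is an assumption made for illustration and is not stated to be the authors' procedure; the function names `balanced_accuracy`, `bootstrap_ci`, and `above_chance` are hypothetical.

```python
import random

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls; 0.5 is guessing level for two classes."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)

def bootstrap_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for balanced accuracy (illustrative assumption)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        t = [y_true[i] for i in sample]
        if len(set(t)) < 2:
            continue  # skip resamples in which a class is missing
        p = [y_pred[i] for i in sample]
        stats.append(balanced_accuracy(t, p))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

def above_chance(y_true, y_pred, chance=0.5):
    """True if the whole confidence interval lies above the guessing level."""
    lower, _ = bootstrap_ci(y_true, y_pred)
    return lower > chance
```

With held-out labels `y_true` and predictions `y_pred` from either the default or the tuned classifier, `above_chance(y_true, y_pred)` returns whether the interval excludes guessing under these assumptions.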
Enabling cybersecurity and protecting personal data are crucial challenges in the development and provision of digital service chains. Data and information are the key ingredients in the creation process of new digital services and products. While legal and technical problems are frequently discussed in academia, ethical issues of digital service chains and the commercialization of data are seldom investigated. Thus, based on outcomes of the Horizon2020 PANELFIT project, this work discusses current ethical issues related to cybersecurity. Utilizing expert workshops and encounters as well as a scientific literature review, ethical issues are mapped onto individual steps of digital service chains. Not surprisingly, the results demonstrate that ethical challenges cannot be resolved in a general way, but need to be discussed individually and with respect to the ethical principles that are violated in the specific step of the service chain. Nevertheless, our results support practitioners by providing and discussing a list of ethical challenges to enable legally compliant as well as ethically acceptable solutions in the future.
The human brain achieves visual object recognition through multiple stages of nonlinear transformations operating at a millisecond scale. To predict and explain these rapid transformations, computational neuroscientists employ machine learning modeling techniques. However, state-of-the-art models require massive amounts of data to train properly, and to date there is a lack of large brain datasets that extensively sample the temporal dynamics of visual object recognition. Here we collected a large and rich dataset of high temporal resolution EEG responses to images of objects on a natural background. This dataset includes 10 participants, each with 82,160 trials spanning 16,740 image conditions. Through computational modeling we established the quality of this dataset in five ways. First, we trained linearizing encoding models that successfully synthesized the EEG responses to arbitrary images. Second, we correctly identified the image conditions of the recorded EEG data in a zero-shot fashion, using synthesized EEG responses to hundreds of thousands of candidate image conditions. Third, we show that both the high number of conditions and the trial repetitions of the EEG dataset contribute to the trained models’ prediction accuracy. Fourth, we built encoding models whose predictions generalize well to novel participants. Fifth, we demonstrate full end-to-end training of randomly initialized DNNs that output M/EEG responses for arbitrary input images. We release this dataset as a tool to foster research in visual neuroscience and computer vision.
The human brain achieves visual object recognition through multiple stages of linear and nonlinear transformations operating at a millisecond scale. To predict and explain these rapid transformations, computational neuroscientists employ machine learning modeling techniques. However, state-of-the-art models require massive amounts of data to train properly, and to date there is a lack of large brain datasets that extensively sample the temporal dynamics of visual object recognition. Here we collected a large and rich dataset of high temporal resolution EEG responses to images of objects on a natural background. This dataset includes 10 participants, each with 82,160 trials spanning 16,740 image conditions. Through computational modeling we established the quality of this dataset in five ways. First, we trained linearizing encoding models that successfully synthesized the EEG responses to arbitrary images. Second, we correctly identified the image conditions of the recorded EEG data in a zero-shot fashion, using synthesized EEG responses to hundreds of thousands of candidate image conditions. Third, we show that both the high number of conditions and the trial repetitions of the EEG dataset contribute to the trained models’ prediction accuracy. Fourth, we built encoding models whose predictions generalize well to novel participants. Fifth, we demonstrate full end-to-end training of randomly initialized DNNs that output EEG responses for arbitrary input images. We release this dataset as a tool to foster research in visual neuroscience and computer vision.
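The zero-shot identification step mentioned in this abstract can be sketched as follows: a recorded response is assigned to the candidate image whose synthesized response correlates with it most strongly. This is a simplified illustration, not the authors' pipeline. The use of Pearson correlation as the matching score, the flat feature vectors, and the names `pearson` and `identify` are assumptions.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def identify(recorded, candidates):
    """Index of the synthesized candidate response most correlated with the recording."""
    scores = [pearson(recorded, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)
```

Here `recorded` would stand for a flattened (channels × time) EEG response and `candidates` for the encoding model's synthesized responses to each candidate image; identification is correct when the returned index is that of the image actually shown.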
Meat adulteration is a global problem which undermines market fairness and harms people with allergies or certain religious beliefs. In this study, a novel framework in which a one-dimensional convolutional neural network (1DCNN) serves as a backbone and a random forest regressor (RFR) serves as a regressor, named 1DCNN-RFR, is proposed for the quantitative detection of beef adulterated with pork using electronic nose (E-nose) data. The 1DCNN backbone extracted a sufficient number of features from a multichannel input matrix converted from the raw E-nose data. The RFR improved the regression performance due to its strong prediction ability. The effectiveness of the 1DCNN-RFR framework was verified by comparing it with four other models (support vector regression model (SVR), RFR, backpropagation neural network (BPNN), and 1DCNN). The proposed 1DCNN-RFR framework performed best in the quantitative detection of beef adulterated with pork. This study indicated that the proposed 1DCNN-RFR framework could be used as an effective tool for the quantitative detection of meat adulteration.
Bacteria that are capable of organizing themselves as biofilms are an important public health issue. Knowledge discovery focusing on the ability to swarm and conquer the surroundings to form persistent colonies is therefore very important for microbiological research communities that focus on a clinical perspective. Here, we demonstrate how a machine learning workflow can be used to create useful models that are capable of discriminating the growth behaviors associated with distinct phenotypes. Based on basic gray-scale images, we provide a processing pipeline for binary image generation, making the workflow accessible for imaging data from a wide range of devices and conditions. The workflow includes a locally estimated regression model that easily applies to growth-related data and a shape analysis using identified principal components. Finally, we apply density-based spatial clustering of applications with noise (DBSCAN) to extract and analyze characteristic, general features explained by colony shapes and areas to discriminate distinct Bacillus subtilis phenotypes. Our results suggest that the differences regarding their ability to swarm and subsequently conquer the medium that surrounds them result in characteristic features. The differences along the time scales of the distinct latency for the colony formation give insights into the ability to invade the surroundings and therefore could serve as a useful monitoring tool.
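The clustering step named in this abstract can be illustrated with a compact, self-contained DBSCAN sketch. This is a textbook-style implementation for small inputs, not the authors' code. The idea of feeding it two-dimensional colony features (for example an area value and a shape component) and the parameter names `eps` and `min_pts` are assumptions for illustration.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Label each point with a cluster index (0, 1, ...) or NOISE.

    A point is a core point if at least `min_pts` points (itself included)
    lie within distance `eps`; clusters grow outward from core points.
    """
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nj = neighbors(j)
            if len(nj) >= min_pts:
                queue.extend(nj)  # core point: keep growing the cluster
    return labels
```

Points labeled `NOISE` belong to no dense region, which is what lets a density-based method separate well-populated phenotype groups from outlying colonies without fixing the number of clusters in advance.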
Orientation hypercolumns in the visual cortex are delimited by the repeating pinwheel patterns of orientation selective neurons. We design a generative model for visual cortex maps that reproduces such orientation hypercolumns as well as ocular dominance maps while preserving retinotopy. The model uses a neural placement method based on t-distributed stochastic neighbour embedding (t-SNE) to create maps that order common features in the connectivity matrix of the circuit. We find that, in our model, hypercolumns generally appear with fixed cell numbers independently of the overall network size. These results would suggest that existing differences in absolute pinwheel densities are a consequence of variations in neuronal density. Indeed, available measurements in the visual cortex indicate that pinwheels consist of a constant number of ∼30,000 neurons. Our model is able to reproduce a large number of characteristic properties known for visual cortex maps. We provide the corresponding software in our MAPStoolbox for Matlab.
In computational linguistics, the automatic generation of scenes from text written in natural language has been an important research topic for many decades, with applications in art, teaching, and robotics. New technologies in the field of artificial intelligence (AI) enable developments that simplify this generation, but they also encourage opaque internal decisions made by the model.
The goal of the proposed solution, „ARES: Annotation von Relationen und Eigenschaften zur Szenengenerierung“ (annotation of relations and properties for scene generation), is to design a modular system in which the individual processes remain comprehensible to the user. In addition, it should be possible to incorporate new entities and relations provided by the text analysis into scene generation in three-dimensional space without necessarily requiring code.
The focus is on the syntactically correct representation of the elements in space. Semantic correctness, by contrast, can be increased through further manual adjustments, which are saved for later generations. Finally, the number of annotations needed for the representation should remain as small as possible, and new scene-related annotations should be added through the implemented annotation tools.
Recent advances in artificial neural networks have enabled the rapid development of new learning algorithms, which, among other things, pave the way to novel robotic applications. Traditionally, robots are programmed by human experts to accomplish pre-defined tasks. Such robots must operate in a controlled environment to guarantee repeatability, are designed to solve one unique task, and require costly hours of development. In developmental robotics, researchers try to artificially imitate the way living beings acquire their behavior by learning. Learning algorithms are key to conceiving versatile and robust robots that can adapt to their environment and solve multiple tasks efficiently. In particular, Reinforcement Learning (RL) studies the acquisition of skills through teaching via rewards. In this thesis, we introduce RL and present recent advances in RL applied to robotics. We review Intrinsically Motivated (IM) learning, a special form of RL, and apply in particular the Active Efficient Coding (AEC) principle to the learning of active vision. We also provide an overview of Hierarchical Reinforcement Learning (HRL), another special form of RL, and apply its principle to a robotic manipulation task.
The electrical and computational properties of neurons in our brains are determined by a rich repertoire of membrane-spanning ion channels and elaborate dendritic trees. However, the precise reason for this inherent complexity remains unknown. Here, we generated large stochastic populations of biophysically realistic hippocampal granule cell models, comparing those with all 15 ion channels to their reduced but functional counterparts containing only 5 ion channels. Strikingly, valid parameter combinations in the full models were more frequent and more stable in the face of perturbations to channel expression levels. Artificially scaling up the numbers of ion channels in the reduced models recovered these advantages, confirming the key contribution of the actual number of ion channel types. We conclude that the diversity of ion channels gives a neuron greater flexibility and robustness to achieve target excitability.