We assemble a data set of more than eight million German Twitter posts related to the war in Ukraine. Based on state-of-the-art methods of text analysis, we construct a daily index of uncertainty about the war as perceived by German Twitter. The approach also allows us to separate this index into uncertainty about sanctions against Russia, energy policy and other dimensions. We then estimate a VAR model with daily financial and macroeconomic data and identify an exogenous uncertainty shock. The increase in uncertainty has strong effects on financial markets and causes a significant decline in economic activity as well as an increase in expected inflation. We find the effects of uncertainty to be particularly strong in the first months of the war.
The discussion about the interplay between digital technologies and the process of globalization is often focused around the following question: who has access to global information networks and who benefits from digital communication technologies? These are essential questions and it can hardly be denied that they confront us with a series of political and ethical questions. However, we also need to recognize the ongoing digitalization of the globe, a process where more and more people are put on various kinds of maps...
Background: The categorization of individuals as normosmic, hyposmic, or anosmic from test results of odor threshold, discrimination, and identification may provide a limited view of the sense of smell. The purpose of this study was to expand the clinical diagnostic repertoire by including additional tests. Methods: A random cohort of n = 135 individuals (83 women and 52 men, aged 21 to 94 years) was tested for odor threshold, discrimination, and identification, plus a distance test, in which the odor of peanut butter is perceived, a sorting task of odor dilutions for phenylethyl alcohol and eugenol, a discrimination test for odorant enantiomers, a lateralization test with eucalyptol, a threshold assessment after 10 min of exposure to phenylethyl alcohol, and a questionnaire on the importance of olfaction. Unsupervised methods were used to detect structure in the olfaction-related data, followed by supervised feature selection methods from statistics and machine learning to identify relevant variables. Results: The structure in the olfaction-related data divided the cohort into two distinct clusters with n = 80 and 55 subjects. Odor threshold, discrimination, and identification did not play a relevant role for cluster assignment, which, on the other hand, depended on performance in the two odor dilution sorting tasks, from which cluster assignment was possible with a median 100-fold cross-validated balanced accuracy of 77–88%. Conclusions: The addition of an odor sorting task with the two proposed odor dilutions to the odor test battery expands the phenotype of olfaction and fits seamlessly into the sensory focus of standard test batteries.
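The balanced accuracy reported above is the mean of the per-class recalls, which keeps a larger cluster from dominating the score. A minimal sketch with invented toy labels (not the study data; the function name `balanced_accuracy` is an assumption for illustration):

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of the per-class recalls."""
    hits = defaultdict(int)    # correctly predicted cases per true class
    totals = defaultdict(int)  # number of cases per true class
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Invented toy example: 8 cases in cluster 0, 2 cases in cluster 1.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]  # one of the two cluster-1 cases is missed

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, balanced_accuracy(y_true, y_pred))
```

In this toy case the plain accuracy is higher than the balanced accuracy, because the missed case belongs to the small cluster.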
Bayesian inference is ubiquitous in science and widely used in biomedical research such as cell sorting or “omics” approaches, as well as in machine learning (ML), artificial neural networks, and “big data” applications. However, the calculation is not robust in regions of low evidence. In cases where one group has a lower mean but a higher variance than another group, new cases with larger values are implausibly assigned to the group with typically smaller values. An approach for a robust extension of Bayesian inference is proposed that proceeds in two main steps starting from the Bayesian posterior probabilities. First, cases with low evidence are labeled as “uncertain” class membership. The boundary for low probabilities of class assignment (threshold 𝜀) is calculated using a computed ABC analysis as a data-based technique for item categorization. This leaves a number of cases with uncertain classification (p < 𝜀). Second, cases with uncertain class membership are relabeled based on the distance to neighboring classified cases based on Voronoi cells. The approach is demonstrated on biomedical data typically analyzed with Bayesian statistics, such as flow cytometric data sets or biomarkers used in medical diagnostics, where it increased the class assignment accuracy by 1–10% depending on the data set. The proposed extension of the Bayesian inference of class membership can be used to obtain robust and plausible class assignments even for data at the extremes of the distribution and/or for which evidence is weak.
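The two steps described above can be illustrated with a rough one-dimensional sketch. It rests on several assumptions that are not taken from the paper: two hypothetical Gaussian classes, the prior-weighted density of the winning class as the evidence measure, a fixed threshold `eps` in place of the computed ABC analysis, and a nearest-classified-neighbor rule as a simple stand-in for the Voronoi cell assignment.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


# Hypothetical classes (mean, sd, prior): A has the lower mean but the larger variance.
CLASSES = {"A": (0.0, 3.0, 0.5), "B": (4.0, 1.0, 0.5)}


def bayes_label(x):
    """Return (winning class, its prior-weighted density) for one value."""
    joint = {c: prior * normal_pdf(x, mean, sd)
             for c, (mean, sd, prior) in CLASSES.items()}
    best = max(joint, key=joint.get)
    return best, joint[best]


def robust_labels(xs, eps):
    """Step 1: flag cases with evidence below eps. Step 2: relabel them by nearest confident case."""
    plain = [bayes_label(x) for x in xs]
    confident = [(x, c) for x, (c, p) in zip(xs, plain) if p >= eps]
    labels = []
    for x, (c, p) in zip(xs, plain):
        if p >= eps or not confident:
            labels.append(c)  # enough evidence (or nothing to borrow from): keep Bayes label
        else:
            nearest = min(confident, key=lambda xc: abs(xc[0] - x))
            labels.append(nearest[1])  # take the label of the closest confident case
    return labels


xs = [-3.0, 0.0, 2.0, 3.5, 5.0, 10.0]
print([bayes_label(x)[0] for x in xs])  # plain Bayes labels
print(robust_labels(xs, eps=0.01))      # labels after the two-step correction
```

With these made-up parameters, plain Bayes assigns the extreme value 10.0 to the wide, low-mean class A, while the two-step rule hands it to class B, the class of its nearest confidently classified neighbor.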
Introduction: Affective disorders are a major global burden, with approximately 15% of people worldwide suffering from some form of affective disorder. In patients experiencing their first depressive episode, in most cases it cannot be distinguished whether this is due to bipolar disorder (BD) or major depressive disorder (MDD). Valid fluid biomarkers able to discriminate between the two disorders in a clinical setting are not yet available.
Material and Methods: Seventy depressed patients suffering from BD (bipolar I and II subtypes) and 42 patients with MDD were recruited, and blood samples were taken for proteomic analyses after 8 h of fasting. Proteomic profiles were analyzed using the Multiplex Immunoassay platform from Myriad Rules Based Medicine (Myriad RBM; Austin, Texas, USA). Human DiscoveryMAP™ was used to measure the concentration of various proteins, peptides, and small molecules. A multivariate predictive model was then constructed to differentiate between BD and MDD.
Results: Based on the various proteomic profiles, the algorithm could discriminate depressed BD patients from MDD patients with an accuracy of 67%.
Discussion: The results of this preliminary study suggest that future discrimination between bipolar and unipolar depression in a single case could be possible, using predictive biomarker models based on blood proteomic profiling.
This paper investigates how biases in macroeconomic forecasts are associated with economic surprises and market responses across asset classes around US data announcements. We find that the skewness of the distribution of economic forecasts is a strong predictor of economic surprises, suggesting that forecasters behave strategically (rational bias) and possess private information. Our results also show that consensus forecasts of US macroeconomic releases embed anchoring. Under these conditions, both economic surprises and the returns of assets that are sensitive to macroeconomic conditions are predictable. Our findings indicate that local equities and bond markets are more predictable than foreign markets, currencies and commodities. Economic surprises are found to link to asset returns very distinctively through the stages of the economic cycle, whereas they strongly depend on economic releases being inflation- or growth-related. Yet, when forecasters fail to correctly forecast the direction of economic surprises, regret becomes a relevant cognitive bias to explain asset price responses. We find that the behavioral and rational biases encountered in US economic forecasting also exist in Continental Europe, the United Kingdom and Japan, albeit to a lesser extent.
When we browse via WiFi on our laptop or mobile phone, we receive data over a noisy channel. The received message may differ from the one that was originally sent. Luckily, it is often possible to reconstruct the original message, but it may take a lot of time. That is because decoding the received message is a complex problem, NP-hard to be exact. As we continue browsing, new information is sent to us at a high frequency. So if lags are to be avoided, and since memory is finite, there is not much time left for decoding. Coding theory tackles this problem by creating models of the channels we use to communicate and tailoring codes to the channel properties. A well-known family of codes is that of Low-Density Parity-Check codes (LDPC codes), which are widely used in standards like WiFi and DVB-T2. In practical settings, the complexity of decoding a received message can be heavily reduced by using LDPC codes and approximate decoding algorithms. This thesis lays out the basic construction of LDPC codes and a proper decoding using the sum-product algorithm. On this basis, a neural network to improve decoding is introduced. To this end, the sum-product algorithm is transformed into a neural network decoder. This approach was first presented by Nachmani et al. and treated in detail by Navneet Agrawal in 2017. To find out how machine learning can improve the codes, the bit error rates of the trained neural network decoder are compared with the bit error rates of the classic sum-product algorithm approach. Experiments with static and dynamic training datasets of diverse sizes, various signal-to-noise ratios, and a feed-forward as well as a recurrent architecture show how to tune the neural network decoder even further. Results of the experiments are used to verify statements made in Agrawal’s work. In addition, corrections and improvements in the area of metrics are presented. An implementation of the neural network will be made available to the public to facilitate access for others.
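Two building blocks mentioned above, the parity check and the bit error rate, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the small (7,4) Hamming parity-check matrix as a stand-in (a real LDPC matrix is much larger and sparse), it does not implement the sum-product algorithm or the neural network decoder, and the names `syndrome` and `bit_error_rate` are chosen here for illustration.

```python
# Parity-check matrix of the (7,4) Hamming code, used as a small stand-in
# for an LDPC parity-check matrix.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(parity_matrix, word):
    """Evaluate every parity check modulo 2; all zeros means the word is a codeword."""
    return [sum(h * bit for h, bit in zip(row, word)) % 2 for row in parity_matrix]


def bit_error_rate(sent, decoded):
    """Fraction of bit positions in which the two words differ."""
    errors = sum(s != d for s, d in zip(sent, decoded))
    return errors / len(sent)


sent = [1, 1, 1, 0, 0, 0, 0]      # a valid codeword of this code
received = [1, 1, 1, 0, 1, 0, 0]  # the same word with one bit flipped by the channel

print(syndrome(H, sent))               # no parity check is violated
print(syndrome(H, received))           # violated checks reveal the error
print(bit_error_rate(sent, received))  # one wrong bit out of seven
```

A decoder such as the sum-product algorithm iterates until the syndrome is zero or an iteration limit is reached; comparing decoders then comes down to comparing bit error rates like the one computed here over many transmitted words.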
The KMT2A (MLL) gene rearrangements (KMT2A-r) are associated with a diverse spectrum of acute leukemias. Although most KMT2A-r are restricted to nine partner genes, we have recently revealed that KMT2A-USP2 fusions are often missed during FISH screening of these genetic alterations. Therefore, complementary methods are important for appropriate detection of any KMT2A-r. Here we use a machine learning model to unravel the most appropriate markers for prediction of KMT2A-r in various types of acute leukemia. Random Forest and LightGBM classifiers were trained to predict KMT2A-r in patients with acute leukemia. Our results revealed a set of 20 genes capable of accurately estimating KMT2A-r. SKIDA1 (AUC: 0.839; CI: 0.799–0.879) and LAMP5 (AUC: 0.746; CI: 0.685–0.806) overexpression were better markers of KMT2A-r than CSPG4 (also named NG2; AUC: 0.722; CI: 0.659–0.784), regardless of the type of acute leukemia. Of importance, high expression levels of LAMP5 estimated the occurrence of all KMT2A-USP2 fusions. Also, we performed drug sensitivity analysis using IC50 data from 345 drugs available in the GDSC database to identify which ones could be used to treat KMT2A-r leukemia. We observed that KMT2A-r cell lines were more sensitive to 5-Fluorouracil (5FU), Gemcitabine (both antimetabolite chemotherapy drugs), WHI-P97 (JAK-3 inhibitor), Foretinib (MET/VEGFR inhibitor), SNX-2112 (Hsp90 inhibitor), AZD6482 (PI3Kβ inhibitor), KU-60019 (ATM kinase inhibitor), and Pevonedistat (NEDD8-activating enzyme (NAE) inhibitor). Moreover, IC50 data from analyses of ex-vivo drug sensitivity to small-molecule inhibitors reveal that Foretinib is a promising drug option for AML patients carrying FLT3 activating mutations. Thus, we provide novel and accurate options for the diagnostic screening and therapy of KMT2A-r leukemia, regardless of leukemia subtype.
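The AUC values quoted for the single-gene markers can be read as the probability that a randomly chosen rearranged case shows a higher expression value than a randomly chosen non-rearranged case. A minimal sketch of that pairwise definition, using invented expression values rather than the study data (the function name `auc` is an assumption for illustration):

```python
def auc(pos_scores, neg_scores):
    """Probability that a positive case scores above a negative one (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Invented marker expression values for illustration only.
rearranged = [0.9, 0.8, 0.4]      # hypothetical KMT2A-r cases
not_rearranged = [0.7, 0.3, 0.2]  # hypothetical cases without KMT2A-r

print(auc(rearranged, not_rearranged))
```

An AUC of 0.5 corresponds to a marker that ranks cases no better than chance, and 1.0 to a marker that separates the two groups perfectly.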
In Lower Saxony, about 50% of forest sites have been mapped at a scale of 1:25,000 using a relatively complex procedure. Each mapped unit consists of classes for the terrain water balance (WHZ; 43 classes), the nutrient supply (NZ; 16 classes), and the substrate and bedding conditions (SLZ; 105 classes). The aim of this work was to predict the WHZ and NZ classes of the Lower Saxony forest site mapping for unmapped areas. Two RandomForest models were calibrated using stratified random samples of the WHZ and NZ classes from the mapping. The model correctly classified about 77% of the test sample for the WHZ. The F1 scores of the individual classes ranged from 50 to 95%. Incorrect predictions accumulated at transitions between neighboring WHZ (e.g., the transition from valleys to slopes) and for WHZ with similar terrain properties but gradations in water supply. However, some model errors apparently also stem from imprecision within the underlying mapping. In addition, compared with the field mapping, the model predicts much finer-scale patterns, which appear plausible given the underlying terrain but are not mapped in the field at this level of detail. About 66% of the test data set for the NZ was classified correctly. Here, incorrect predictions occurred mainly in directly adjacent nutrient supply classes. Uncertainties point, on the one hand, to less suitable covariates, but may also result from temporal changes in the soil properties themselves and from inaccuracies in the mapping, which specifies few rules for assigning the nutrient class. Overall, we judge the models to be well suited for statewide application. However, a local calibration of the models for individual growth regions can be expected to increase model quality considerably. Grouping similar classes into silviculturally relevant higher-level groups may achieve the same.
Music listening has become a highly individualized activity with smartphones and music streaming services providing listeners with absolute freedom to listen to any kind of music in any situation. Until now, little has been written about the processes underlying the selection of music in daily life. The present study aimed to disentangle some of the complex processes among the listener, situation, and functions of music listening involved in music selection. Utilizing the experience sampling method, data were collected from 119 participants using a smartphone application. For 10 consecutive days, participants received 14 prompts using stratified-random sampling throughout the day and reported on their music-listening behavior. Statistical learning procedures on multilevel regression models and multilevel structural equation modeling were used to determine the most important predictors and analyze mediation processes between person, situation, functions of listening, and music selection. Results revealed that the features of music selected in daily life were predominantly determined by situational characteristics, whereas consistent individual differences were of minor importance. Functions of music listening were found to act as a mediator between characteristics of the situation and music-selection behavior. We further observed several significant random effects, which indicated that individuals differed in how situational variables affected their music selection behavior. Our findings suggest a need to shift the focus of music-listening research from individual differences to situational influences, including potential person-situation interactions.