Reproducible annotations
(2022)
This bachelor thesis presents a software solution that implements reproducible annotations in the context of the UIMA framework. It does so by automatically containerizing arbitrary analysis engines and recording every analysis engine configuration in the processed CAS document. Any CAS document created by this solution is self-sufficient and can reproduce the exact environment in which it was created.
A review of the state-of-the-art software in the field of UIMA reveals many implementations that try to increase reproducibility for a given application relying on UIMA, but no publication that tries to increase the reproducibility of UIMA itself. This thesis addresses that gap and closes with a thorough analysis, which shows a negligible overhead in memory consumption but a significant performance regression that depends on the complexity of the analysis engine examined.
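The idea of a self-sufficient document can be illustrated with a minimal Python sketch. It is not the thesis's implementation and does not use the UIMA API: the document is a plain dictionary, and the names `config_fingerprint`, `annotate_document`, the `pipeline` key, and the image name `example/tokenizer:1.0` are all hypothetical. It only shows how each processing step might store its container image and configuration next to the annotations it produced.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Hash a configuration in a canonical form so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def annotate_document(document: dict, engine: str, image: str, config: dict) -> dict:
    """Append a provenance record for one analysis step to the document.

    The record layout is an assumption made for illustration only.
    """
    record = {
        "engine": engine,
        "image": image,  # hypothetical container image reference
        "config": config,
        "fingerprint": config_fingerprint(config),
    }
    document.setdefault("pipeline", []).append(record)
    return document


doc = {"text": "Some input text."}
annotate_document(
    doc,
    engine="tokenizer",
    image="example/tokenizer:1.0",
    config={"language": "en", "lowercase": True},
)
print(json.dumps(doc["pipeline"], indent=2))
```

Because the records travel with the document, a later reader could in principle rebuild the same sequence of containers and settings from the document alone.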
Recent advances in natural language processing have contributed to the development of market sentiment measures through text content analysis in news providers and social media. The effectiveness of these sentiment variables depends on the implemented techniques and the type of source on which they are based. In this paper, we investigate the impact of the release of public financial news on the S&P 500. Using automatic labeling techniques based on either stock index returns or dictionaries, we set up a classification problem and use long short-term memory neural networks to extract alternative proxies of investor sentiment. Our findings provide evidence that these sentiments affect the market over a 20-minute time frame. We find that dictionary-based sentiment provides meaningful results compared with sentiment based on stock index returns, which partly fails in the mapping process between news and financial returns.
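The two automatic labeling schemes named in the abstract can be sketched roughly as follows. This is an illustrative Python sketch, not the paper's method: the word lists, the return threshold, and the function names `label_by_dictionary` and `label_by_return` are assumptions, and the paper's actual dictionaries and labeling rules are not given here.

```python
# Tiny illustrative word lists; a real sentiment dictionary would be far larger.
POSITIVE = {"gain", "growth", "beat", "surge"}
NEGATIVE = {"loss", "decline", "miss", "plunge"}


def label_by_dictionary(headline: str) -> int:
    """Return 1, -1, or 0 from the balance of positive and negative words."""
    tokens = headline.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return (score > 0) - (score < 0)


def label_by_return(price_at_release: float, price_after: float,
                    threshold: float = 0.0005) -> int:
    """Return 1, -1, or 0 from the index return after the news release.

    The threshold is an assumed cutoff for treating a move as neutral.
    """
    ret = price_after / price_at_release - 1.0
    if ret > threshold:
        return 1
    if ret < -threshold:
        return -1
    return 0


print(label_by_dictionary("Earnings beat expectations on strong growth"))
print(label_by_return(4000.0, 4004.0))
```

Labels produced this way would serve as training targets for a classifier. The return-based variant depends on how well a given news item maps to the index move that follows it, which is the mapping step the abstract reports as partly failing.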