Extracting event-centric document collections from large-scale web archives
Web archives created by the Internet Archive (IA) (https://archive.org), national libraries, and other archiving services contain large amounts of information collected over a period of more than twenty years. These archives are a valuable source for research in many disciplines, including the digital humanities and the historical sciences, because they offer a unique opportunity to look into past events and their representation on the Web. Most Web archive services aim to capture either the entire Web (IA) or national top-level domains; they are therefore broad in scope and diverse in the topics they contain and the time intervals they cover. Because of this size and breadth, and because search facilities are very limited, it is difficult for interested researchers to locate relevant information in the archives. Many users are more interested in studying smaller, topically coherent, event-centric collections of documents contained in a Web archive [1,2]. Such collections can reflect specific events such as elections or natural disasters, e.g. the German federal elections or the Fukushima nuclear disaster (2011).
Author: | Gerhard Gossen, Elena Demidova, Thomas Risse |
---|---|
URN: | urn:nbn:de:hebis:30:3-542504 |
URL: | https://pro.europeana.eu/page/issue-8-tpdl |
Parent Title (English): | EuropeanaTech Insight |
Publisher: | Europeana Foundation |
Place of publication: | Den Haag, Netherlands |
Document Type: | Conference Proceeding |
Language: | English |
Year of Completion: | 2017 |
Year of first Publication: | 2017 |
Publishing Institution: | Universitätsbibliothek Johann Christian Senckenberg |
Creating Corporation: | TPDL (21. : 2017 : Thessaloniki) |
Release Date: | 2020/06/24 |
Issue: | 8: TPDL (2017) |
Page Number: | 4 |
Note: | All texts are CC BY-SA, images and media licensed individually. |
HeBIS-PPN: | 466726910 |
Institutes: | Zentrale Einrichtung / Universitätsbibliothek |
Dewey Decimal Classification: | 0 Computer science, information and general works / 02 Library and information sciences / 020 Library and information sciences |
Collections: | University publications |
Licence (German): | Creative Commons - Namensnennung-Weitergabe unter gleichen Bedingungen |