Informatik
Refine
Year of publication
Document Type
- Preprint (748)
- Article (400)
- Working Paper (119)
- Doctoral Thesis (92)
- Diploma Thesis (46)
- Conference Proceeding (41)
- Book (37)
- Bachelor Thesis (35)
- diplomthesis (29)
- Report (25)
- Part of a Book (13)
- Contribution to a Periodical (10)
- Master's Thesis (6)
- Habilitation (1)
- Lecture (1)
- Review (1)
Has Fulltext
- yes (1604)
Is part of the Bibliography
- no (1604)
Keywords
Institute
- Informatik (1604)
- Frankfurt Institute for Advanced Studies (FIAS) (1002)
- Physik (982)
- Mathematik (55)
- Präsidium (41)
- Medizin (25)
- Biowissenschaften (21)
- Exzellenzcluster Makromolekulare Komplexe (8)
- Psychologie (8)
- Deutsches Institut für Internationale Pädagogische Forschung (DIPF) (5)
This thesis contributes to the field of machine learning with a specific focus on the methods for learning relations between the inputs. Learning relationships between images is the most common primitive in vision. There are many vision tasks in which relationships across images play an important role. Some of them are motion estimation, activity recognition, stereo vision, multi-view geometry and visual odometry. Many of such tasks mainly depend on motion and disparity cues, which are inferred based on the relations across multiple image pairs. The approaches presented in this thesis mainly deal with, but are not limited to, learning of the representations for motion and depth. This thesis by articles consists of five articles which present relational feature learning models along with their applications in computer vision. In the first article, we present an approach for encoding motion in videos. To this end, we show that the detection of spatial transformations can be viewed as detection of coincidence or synchrony between the given sequence of frames and a sequence of features which are related by the transformation we wish to detect. Learning to detect synchrony is possible by introducing "multiplicative interactions'' into the hidden units of single layered sparse coding models.
We show that the learned motion representations employed for the task of activity recognition achieve competitive performance on multiple benchmarks. Stereo vision is an important challenge in computer vision and useful for many applications in that field. In the second article, we extend the energy based learning models, which were previously used for motion encoding, to the context of depth perception. Given the common architecture of the models for encoding motion and depth, we show that it is possible to define a single model for learning a unified representation for both the cues. Our experimental results show that learning a combined representation for depth and motion makes it possible to achieve state-of-the-art performance at the task of 3-D activity analysis, and to perform better than the existing hand-engineered 3-D motion features. Autoencoder is a popular unsupervised learning method for learning efficient encoding for a given set of data samples. Typically, regularized autoencoders which are used to learn over-complete and sparse representations for the input data, were shown to fail on intrinsically high dimensional data like videos. In the third article, we investigate the reason for such a behavior. It can be observed that the regularized autoencoders typically learn negative hidden unit biases. We show that the learning of negative biases is the result of hidden units being responsible for both the sparsity and the representation of the input data. It is shown that, as a result, the behavior of the model resembles clustering methods which would require exponentially large number of features to model intrinsically high dimensional data. Based on this understanding, we propose a new activation function which decouples the roles of hidden layer and uses linear encoding. This allows to learn representations on data with very high intrinsic dimensionality. We also show that gating connections in the bi-linear models and the single layer models from articles one and two of this thesis can be thought of as a way to attain a linear encoding scheme which allows them to learn good representations on videos. Visual odometry is the task of inferring egomotion of a moving object from visual information such as images and videos. It can primarily be used for the task of localization and has many applications in the fields of robotics and navigation. The work in article four was motivated by the idea of using deep learning techniques, which are successful methods for many vision tasks, for visual odometry. The visual odometry task mainly requires inference of motion and depth information from visual input which can then be mapped to velocity and change in direction. We use relational feature models presented in the articles one and two for inferring a combined motion and depth representation from stereo video sequences. The combined representation is then mapped to discrete velocity and change in direction labels using convolutional neural networks. Our approach is an end-to-end deep learning-based architecture which uses a single type of computational model and learning rule. Preliminary results show that the architecture is capable of learning the mapping from input video to egomotion. Activity recognition is a challenging computer vision task with many real world applications. It is well know that it is a hard task to use computer vision research for real-time applications. In the fifth article of this thesis, we present a real-time activity recognition system based on deep learning based methods. Our approach uses energy based relational feature learning models for the computation of local motion features directly from videos. A bag-of-words over the local motion features is used for the analysis of activity in a given video sequence. We implement this system on a distributed computational platform and demonstrate its performance on the iCub robot. Using GPUs we demonstrate real time performance which makes the deployment of activity recognition systems in real world scenarios possible.
Network graphs have become a popular tool to represent complex systems composed of many interacting subunits; especially in neuroscience, network graphs are increasingly used to represent and analyze functional interactions between multiple neural sources. Interactions are often reconstructed using pairwise bivariate analyses, overlooking the multivariate nature of interactions: it is neglected that investigating the effect of one source on a target necessitates to take all other sources as potential nuisance variables into account; also combinations of sources may act jointly on a given target. Bivariate analyses produce networks that may contain spurious interactions, which reduce the interpretability of the network and its graph metrics. A truly multivariate reconstruction, however, is computationally intractable because of the combinatorial explosion in the number of potential interactions. Thus, we have to resort to approximative methods to handle the intractability of multivariate interaction reconstruction, and thereby enable the use of networks in neuroscience. Here, we suggest such an approximative approach in the form of an algorithm that extends fast bivariate interaction reconstruction by identifying potentially spurious interactions post-hoc: the algorithm uses interaction delays reconstructed for directed bivariate interactions to tag potentially spurious edges on the basis of their timing signatures in the context of the surrounding network. Such tagged interactions may then be pruned, which produces a statistically conservative network approximation that is guaranteed to contain non-spurious interactions only. We describe the algorithm and present a reference implementation in MATLAB to test the algorithm’s performance on simulated networks as well as networks derived from magnetoencephalographic data. We discuss the algorithm in relation to other approximative multivariate methods and highlight suitable application scenarios. Our approach is a tractable and data-efficient way of reconstructing approximative networks of multivariate interactions. It is preferable if available data are limited or if fully multivariate approaches are computationally infeasible.
In dieser Arbeit werden Verfahren vorgestellt, mit dem sich hochaufgelöste wissenschaftliche Illustrationen in einem interaktiven Vorgang erstellen lassen. Die Basis dafür bildet die neu eingeführte GPU-basierte Illustrations-Pipeline, in der auf Grundlage eines 3D-Modells Bildebenen frei angelegt und miteinander kombiniert werden können. In einer Ebene wird ein bestimmter Aspekt der Illustration mit einer auswählbaren Technik gezeigt. Die Parameter der Technik sind interaktiv editierbar. Um Effizienz zu gewährleisten ist das gesamte Verfahren so konzipiert, dass es soweit wie möglich die Berechnungen auf der GPU durchführt. So ist es möglich, dass die Illustrationen mit interaktiven Frameraten gerendert werden.
Detectors of modern high-energy physics experiments generate huge data rates during operation. The efficient read-out of this data from the front-end electronics is a sophisticated task, the main challenges, however, may vary from experiment to experiment. The Compressed Baryonic Matter (CBM) experiment that is currently under construction at the Facility for Antiproton and Ion Research (FAIR) in Darmstadt/Germany foresees a novel approach for data acquisition.
Unlike previous comparable experiments that organize data read-out based on global, hierarchical trigger decisions, CBM is based on free-running and self-triggered front-end electronics. Data is pushed to the next stage of the read-out chain rather than pulled from the buffers of the previous stage. This new paradigm requires a completely new development of read-out electronics.
As one part of this thesis, a firmware for a read-out controller to interface such a free-running and self-triggered front-end ASIC, the GET4 chip, was implemented. The firmware in question was developed to run on a Field Programmable Gate Array (FPGA). An FPGA is an integrated circuit whose behavior can be reconfigured "in the field" which offers a lot of flexibility, bugs can be fixed and also completely new features can be added, even after the hardware has already been installed. Due to these general advantages, the usage of FPGAs is desired for the final experiment. However, there is also a drawback to the usage of FPGAs. The only affordable FPGAs today are based on either SRAM or Flash technology and both cannot easily be operated in a radiation environment.
SRAM-based devices suffer severely from Single Event Upsets (SEUs) and Flash-based FPGAs deteriorate too fast from Total Ionizing Dose (TID) effects.
Several radiation mitigation techniques exist for SRAM-based FPGAs, but careful evaluation for each use case is required. For CBM it is not clear if the higher resource consumption of added redundancy, that more or less directly translates in to additional cost, outweighs the advantaged of using FPGAs. In addition, it is even not clear if radiation mitigation techniques (e.g. scrubbing) that were already successfully put into operation in space applications also work as efficiently at the much higher particle rates expected at CBM.
In this thesis, existing radiation mitigation techniques have been analyzed and eligible techniques have been implemented for the above-mentioned read-out controller. To minimize additional costs, redundancy was only implemented for selected parts of the design.
Finally, the radiation mitigated read-out controller was tested by mounting the device directly into a particle beam at Forschungszentrum Jülich. The tests show that the radiation mitigation effect of the implemented techniques remains sound, even at a very high particle flux and with only part of the design protected by costly redundancy.
The promising results of the in-beam tests suggest to use FPGAs in the read-out chain of the CBM-ToF detector.
The number of multilingual texts in the World Wide Web (WWW) is increasing dramatically and a multilingual economic zone like the European Union (EU) requires the availability of multilingual Natural Language Processing (NLP) tools. Due to a rapid development of NLP tools, many lexical, syntactic, semantic and other linguistic features have been used in different NLP applications. However, there are some situations where these features can not be used due the application type or unavailability of NLP resources for some of the languages. That is why an application that is intended to handle multilingual texts must have features that are not dependent on a particular language and specific linguistic tools. In this thesis, we will focus on two such applications: text readability and source and translation classification.
In this thesis, we provide 18 features that are not only suitable for both applications, but are also language and linguistic tools independent. In order to build a readability classifier, we use texts from three different languages: English, German and Bangla. Our proposed features achieve a classification accuracy that is comparable with a classifier using 40 linguistic features. The readability classifier achieves a classification F-score of 74.21% on the English Wikipedia corpus, an F-score of 75.47% on the English textbook corpus, an F-score of 86.46% on the Bangla textbook corpus and an F-score of 86.26% on the German GEO/GEOLino corpus.
We used more than two million sentence pairs from 21 European languages in order to build the source and translation classifier. The classifier using the same eighteen features achieves a classification accuracy of 86.63%. We also used the same features to build a classifier that classifies translated texts based on their origin. The classifier achieves classification accuracy of 75% for texts from 10 European languages. In this thesis, we also provide four different corpora, three for text readability analysis and one for corpus based translation studies.
Local protein synthesis has re-defined our ideas on the basic cellular mechanisms that underlie synaptic plasticity and memory formation. The population of messenger RNAs that are localised to dendrites, however, remains sparsely identified. Furthermore, neuronal morphological complexity and spatial compartmentalisation require efficient mechanisms for messenger RNA localisation and control over translational efficiency or transcript stability. 3’ untranslated regions, downstream from stop codons, are recognised for providing binding platforms for many regulatory units, thus encoding the processing of the above processes. The hippocampus, a part of the brain involved in the formation, organisation and storage of memories, provides a natural platform to investigate patterns of RNA localisation. The hippocampus comprises tissue layers, which naturally separate the principle neuronal cell bodies from their processes (axons and dendrites). Identifying the full-complement of localised transcripts and associated 3’UTR isoforms is of great importance to understand both basic neuronal functions and principles of synaptic plasticity. These findings can be used to study the properties of neuronal networks as well as to understand how these networks malfunction in neuronal diseases.
Here, deep sequencing is used to identify the mRNAs resident in the synaptic neuropil in the hippocampus. Analysis of a neuropil data set yields a list of 8,379 transcripts of which 2,550 are localised in dendrites and/or axons. Using a fluorescent barcode strategy to label individual mRNAs shows that the relative abundance of different mRNAs in the neuropil varies over 5 orders of magnitude. High-resolution in situ hybridisation validated the presence of mRNAs in both cultured neurons and hippocampal slices. Among the many mRNAs identified, a large fraction of known synaptic proteins including signaling molecules, scaffolds and receptors is discovered. These results reveal a previously unappreciated enormous potential for the local protein synthesis machinery to supply, maintain and modify the dendritic and synaptic proteome.
Using advances in library preparation for next generation sequencing experiments, the diversity of 3’UTR isoforms present in localised transcripts from the rat hippocampus is examined. The obtained results indicate that there is an increase in 3’UTR heterogeneity and 3’UTR length in neuronal tissue. The evolutionary importance of the 3’UTR diversity and correlation with changes in species,tissue and cell complexity is investigated. The conducted analysis reveals the population of 3’UTR isoforms required for transcript localisation in overall neuronal transcriptome as well as the regulatory elements and binding sites specific for neuronal compartments. The configuration of poly(A) signals is correlated with gene function and can be further exploit to determine similar mechanisms for alternative polyadenylation.
Usage of custom specified methods for next-generation sequencing as well as novel approaches for RNA quantification and visualisation necessitate the development and implementation of new downstream analytic methods. Library methods for data-mining transcripts annotation, expression and ontology relations is provided. Usage of a specialised search engine targeting key features of previous experiments is proposed. A processing pipeline for NanoString technology, defining experimental quality and exploiting methods for data normalisation is developed. High-resolution in situ images are analysed by custom application, showing a correlation between RNA quantity and spatial distribution. The vast variety of bioinformatic methods included in this work indicates the importance of downstream analysis to reach biological conclusions. Maintaining the integrability and modularity of our implementations is of great priority, as the dynamic nature of many experimental techniques requires constant improvement in computational analysis.
Quarks and gluons are the building blocks of all hadronic matter, like protons and neutrons. Their interaction is described by Quantum Chromodynamics (QCD), a theory under test by large scale experiments like the Large Hadron Collider (LHC) at CERN and in the future at the Facility for Antiproton and Ion Research (FAIR) at GSI. However, perturbative methods can only be applied to QCD for high energies. Studies from first principles are possible via a discretization onto an Euclidean space-time grid. This discretization of QCD is called Lattice QCD (LQCD) and is the only ab-initio option outside of the high-energy regime. LQCD is extremely compute and memory intensive. In particular, it is by definition always bandwidth limited. Thus—despite the complexity of LQCD applications—it led to the development of several specialized compute platforms and influenced the development of others. However, in recent years General-Purpose computation on Graphics Processing Units (GPGPU) came up as a new means for parallel computing. Contrary to machines traditionally used for LQCD, graphics processing units (GPUs) are a massmarket product. This promises advantages in both the pace at which higher-performing hardware becomes available and its price. CL2QCD is an OpenCL based implementation of LQCD using Wilson fermions that was developed within this thesis. It operates on GPUs by all major vendors as well as on central processing units (CPUs). On the AMD Radeon HD 7970 it provides the fastest double-precision D= kernel for a single GPU, achieving 120GFLOPS. D=—the most compute intensive kernel in LQCD simulations—is commonly used to compare LQCD platforms. This performance is enabled by an in-depth analysis of optimization techniques for bandwidth-limited codes on GPUs. Further, analysis of the communication between GPU and CPU, as well as between multiple GPUs, enables high-performance Krylov space solvers and linear scaling to multiple GPUs within a single system. LQCD calculations require a sampling of the phase space. The hybrid Monte Carlo (HMC) algorithm performs this. For this task, a single AMD Radeon HD 7970 GPU provides four times the performance of two AMD Opteron 6220 running an optimized reference code. The same advantage is achieved in terms of energy-efficiency. In terms of normalized total cost of acquisition (TCA), GPU-based clusters match conventional large-scale LQCD systems. Contrary to those, however, they can be scaled up from a single node. Examples of large GPU-based systems are LOEWE-CSC and SANAM. On both, CL2QCD has already been used in production for LQCD studies.
Acceleration of Biomedical Image Processing and Reconstruction with FPGAs
Increasing chip sizes and better programming tools have made it possible to increase the boundaries of application acceleration with reconfigurable computer chips. In this thesis the potential of acceleration with Field Programmable Gate Arrays (FPGAs) is examined for applications that perform biomedical image processing and reconstruction. The dataflow paradigm was used to port the analysis of image data for localization microscopy and for 3D electron tomography from an imperative description towards the FPGA for the first time.
After the primitives of image processing on FPGAs are presented, a general workflow is given for analyzing imperative source code and converting it to a hardware pipeline where every node processes image data in parallel. The theoretical foundation is then used to accelerate both example applications. For localization microscopy, an acceleration of 185 compared to an Intel i5 450 CPU was achieved, and electron tomography could be sped up by a factor of 5 over an Nvidia Tesla C1060 graphics card while maintaining full accuracy in both cases.
Diese Arbeit zeigt, dass die optimale Verteilung von Individuen in Gruppen unter Beachtung ihrer Zeitpräferenzen ein NP-schweres Problem ist. Daher liegt es Nahe, dass für große Teilnehmerzahlen eine optimale Lösung nicht in praxistauglicher Zeit berechnet werden kann. Hier kann eine geeignete Heuristik Abhilfe schaffen.
Da dieses Problem in Universitäten für Studierende bei der Zuteilung der Übungsgruppen für Hunderte von Teilnehmern jedes Semester aufs Neue gelöst werden muss, macht es Sinn, dabei eine rechnergestützte Lösung einzusetzen.
In dieser Arbeit werden die gängigsten in Deutschland und insbesondere an der Goethe-Universität Frankfurt am Main verwendeten Gruppeneinteilungssysteme untersucht. Alle aktuell eingesetzten Lösungen weisen offensichtliche Mängel auf. In dieser Arbeit wird analysiert, weshalb es dazu kommt und gezeigt, wie diese Mängel vermieden werden können. Außerdem werden Kriterien entwickelt und diskutiert, die ein gutes Gruppeneinteilungssystem erfüllen sollte.
Es wird beschrieben, inwiefern eine gute mit einer Heuristik schnell berechenbare approximative Lösung des Gruppeneinteilungsproblems besser als eine perfekte Lösung sein könnte. Mehrere Heuristiken werden verglichen und eine für dieses Problem gut passende wird entwickelt und implementiert. Mithilfe der Beispielimplementierung und anhand anonymisierter Anmeldedaten für die Veranstaltungen aus vergangenen Jahren wird gezeigt, welche Ergebnisse bei dem Wechsel zu einem solchen System erreicht werden können.
Weiterhin wird analysiert, wie ein solches Gruppeneinteilungssystem an die anderen an Universitäten eingesetzten digitalen Systeme angekoppelt werden kann. Das ist notwendig, um zu vermeiden, dass die redundanten Studierendendaten doppelt gepflegt werden müssen. Somit werden Konsistenz und Korrektheit der Daten bei dem Einsatz eines neuen Systems gefördert.
Abschließend wird ein Ausblick in die Zukunft der Gruppeneinteilungssysteme gegeben und beschrieben, welche Aspekte in diesem Bereich weiterhin wichtig sein könnten.
1D-3D hybrid modeling : from multi-compartment models to full resolution models in space and time
(2014)
Investigation of cellular and network dynamics in the brain by means of modeling and simulation has evolved into a highly interdisciplinary field, that uses sophisticated modeling and simulation approaches to understand distinct areas of brain function. Depending on the underlying complexity, these models vary in their level of detail, in order to cope with the attached computational cost. Hence for large network simulations, single neurons are typically reduced to time-dependent signal processors, dismissing the spatial aspect of each cell. For single cell or networks with relatively small numbers of neurons, general purpose simulators allow for space and time-dependent simulations of electrical signal processing, based on the cable equation theory. An emerging field in Computational Neuroscience encompasses a new level of detail by incorporating the full three-dimensional morphology of cells and organelles into three-dimensional, space and time-dependent, simulations. While every approach has its advantages and limitations, such as computational cost, integrated and methods-spanning simulation approaches, depending on the network size could establish new ways to investigate the brain. In this paper we present a hybrid simulation approach, that makes use of reduced 1D-models using e.g., the NEURON simulator—which couples to fully resolved models for simulating cellular and sub-cellular dynamics, including the detailed three-dimensional morphology of neurons and organelles. In order to couple 1D- and 3D-simulations, we present a geometry-, membrane potential- and intracellular concentration mapping framework, with which graph- based morphologies, e.g., in the swc- or hoc-format, are mapped to full surface and volume representations of the neuron and computational data from 1D-simulations can be used as boundary conditions for full 3D simulations and vice versa. Thus, established models and data, based on general purpose 1D-simulators, can be directly coupled to the emerging field of fully resolved, highly detailed 3D-modeling approaches. We present the developed general framework for 1D/3D hybrid modeling and apply it to investigate electrically active neurons and their intracellular spatio-temporal calcium dynamics.
Ein Ansatz für semantisches Selbstmanagement von verteilten Anwendungen im privaten Lebensumfeld
(2014)
Die Anreicherung des privaten Lebensumfelds mit intelligenten technischen Assistenzsystemen wird in den nächsten Jahrzehnten stark zunehmen. Als Teil dieser Entwicklung wird die Nutzung von externen und hauseigenen IT-Diensten steigen, wodurch sich auch die Komplexität der entstehenden Gesamtsysteme erhöht. Hier sind Ansätze gefordert, diese Systeme auch für technisch nicht versierte Benutzer produktiv nutzbar und beherrschbar zu gestalten, um eine Überforderung zu vermeiden. Im Umfeld häuslicher Dienstplattformen, die eine zentrale Rolle in solchen Systemen übernehmen, nimmt seit ein paar Jahren die Bedeutung der semantischen Modellierung von Diensten stark zu. Diese dient zum einen der formalen Repräsentation von zugehörigen Kontextinformationen, die durch Interaktion mit Sensoren und Aktoren entstehen, und zum anderen der Verbesserung der Interoperabilität zwischen Systemen unterschiedlicher Hersteller. Bisherige Ansätze beschränken sich jedoch auf den Einsatz eines zentralen Rechenknotens zur Ausführung der Dienstplattform und nutzen Semantik – wenn überhaupt – nur zur Verarbeitung von Kontextinformationen. Ein technisches Management des Gesamtsystems findet i.d.R. nicht statt.
Vor diesem Hintergrund ist das Ziel dieser Arbeit die Entwicklung eines Ansatzes für semantisches Selbstmanagement von verteilten dienstbasierten Anwendungen speziell im Umfeld häuslicher Dienstplattformen.
Die vorliegende Arbeit definiert zunächst formale Ontologien für Dienste, Dienstgütemanagement, Selbstmanagement und zugehörige Managementregeln, die zur Laufzeit mit konkreten Diensten und deren erfassten Leistungskenngrößen integriert werden. Durch einen modellgetriebenen Architekturansatz (Model Driven Architecture, MDA) wird ein technologieunabhängiges Management auf abstrakter Ebene ermöglicht, das die Wiederverwendbarkeit von Managementregeln in anderen Szenarien erlaubt.
Dieser Ansatz wird zunächst in eine Architektur für einen hochverfügbaren autonomen Manager überführt, der die Überwachung und Steuerung von Diensten und zugehörigen Dienstplattformen übernehmen kann und auf der aus dem Autonomic Computing bekannten MAPE-K-Kontrollschleife (Monitor, Analyze, Plan, Execute, Knowledge) basiert.
Den Abschluss der Arbeit bildet eine qualitative und quantitative Evaluation (mittels einer OSGi-basierten prototypischen Umsetzung) der erreichten Ergebnisse, die einen Einsatz über die Grenzen des privaten Lebensumfelds hinaus nahelegen.
Die Simulation von Strömung in geklüftet porösen Medien ist von entscheidender Bedeutung in Hinblick auf viele hydrogeologische Anwendungsgebiete, wie beispielsweise der Vorbeugung einer Grundwasserverschmutzung in der Nähe einer Mülldeponie oder einer Endlagerstätte für radioaktive Abfälle, der Förderung fossiler Brennstoffe oder der unterirdischen Speicherung von Kohlendioxid. Aufgrund ihrer Beschaffenheit und insbesondere der großen Permeabilität innerhalb der Klüfte, stellen diese bevorzugte Transportwege dar und können das Strömungsprofil entscheidend beeinflussen. Allerdings stellt die anisotrope Geometrie der Klüfte in Zusammenhang mit den enormen Sprüngen in Parametern wie der Permeabilität auf kleinstem Raum große Anforderungen an die numerischen Verfahren.
Deswegen werden in dieser Arbeit zwei Ansätze zur Modellierung der Klüfte verfolgt. Ein niederdimensionaler Ansatz motiviert durch die anisotrope Geometrie mit sehr geringer Öffnungsweite und sehr langer Erstreckung der Klüfte und ein volldimensionaler Ansatz, der alle Vorgänge innerhalb der Kluft auflöst. Es werden die Ergebnisse dieser Ansätze für Benchmark-Probleme untersucht, mit dem Ergebnis, dass nur bei sehr dünnen Klüften der numerisch günstigere niederdimensionale Ansatz zufriedenstellende Ergebnisse liefert. Weiterhin wird ein Kriterium eingeführt, dass während der Laufzeit anhand von Eigenschaften der Kluft und Strömungsparametern angibt, ob der niederdimensionale Ansatz ausreichende Gültigkeit besitzt. Es wird ein dimensions-adaptiver Ansatz präsentiert, der dann entsprechend dieses Kriteriums einen Wechsel zum volldimensionalen Modell durchführt. Die Ergebnisse zeigen, dass so wesentlich genauere Ergebnisse erzielt werden können, ohne dass eine volle Auflösung in jedem Fall und über den gesamten Rechenzeitraum erforderlich ist.
Alternative polyadenylation (APA) is a widespread mechanism that contributes to the sophisticated dynamics of gene regulation. Approximately 50% of all protein-coding human genes harbor multiple polyadenylation (PA) sites; their selective and combinatorial use gives rise to transcript variants with differing length of their 3' untranslated region (3'UTR). Shortened variants escape UTR-mediated regulation by microRNAs (miRNAs), especially in cancer, where global 3'UTR shortening accelerates disease progression, dedifferentiation and proliferation. Here we present APADB, a database of vertebrate PA sites determined by 3' end sequencing, using massive analysis of complementary DNA ends. APADB provides (A)PA sites for coding and non-coding transcripts of human, mouse and chicken genes. For human and mouse, several tissue types, including different cancer specimens, are available. APADB records the loss of predicted miRNA binding sites and visualizes next-generation sequencing reads that support each PA site in a genome browser. The database tables can either be browsed according to organism and tissue or alternatively searched for a gene of interest. APADB is the largest database of APA in human, chicken and mouse. The stored information provides experimental evidence for thousands of PA sites and APA events. APADB combines 3' end sequencing data with prediction algorithms of miRNA binding sites, allowing to further improve prediction algorithms. Current databases lack correct information about 3'UTR lengths, especially for chicken, and APADB provides necessary information to close this gap. Database URL: http://tools.genxpro.net/apadb/
Mathematical modeling of Arabidopsis thaliana with focus on network decomposition and reduction
(2014)
Systems biology has become an important research field during the last decade. It focusses on the understanding of the systems which emit the measured data. An important part of this research field is the network analysis, investigating biological networks. An essential point of the inspection of these network models is their validation, i.e., the successful comparison of predicted properties to measured data. Here especially Petri nets have shown their usefulness as modeling technique, coming with sound analysis methods and an intuitive representation of biological network data.
A very important tool for network validation is the analysis of the Transition-invariants (TI), which represent possible steady-state pathways, and the investigation of the liveness property. The computational complexity of the determination of both, TI and liveness property, often hamper their investigation.
To investigate this issue, a metabolic network model is created. It describes the core metabolism of Arabidopsis thaliana, and it is solely based on data from the literature. The model is too complex to determine the TI and the liveness property.
Several strategies are followed to enable an analysis and validation of the network. A network decomposition is utilized in two different ways: manually, motivated by idea to preserve the integrity of biological pathways, and automatically, motivated by the idea to minimize the number of crossing edges. As a decomposition may not be preserving important properties like the coveredness, a network reduction approach is suggested, which is mathematically proven to conserve these important properties. To deal with the large amount of data coming from the TI analysis, new organizational structures are proposed. The liveness property is investigated by reducing the complexity of the calculation method and adapting it to biological networks.
The results obtained by these approaches suggest a valid network model. In conclusion, the proposed approaches and strategies can be used in combination to allow the validation and analysis of highly complex biological networks.
A concurrent implementation of software transactional memory in Concurrent Haskell using a call-by-need functional language with processes and futures is given. The description of the small-step operational semantics is precise and explicit, and employs an early abort of conflicting transactions. A proof of correctness of the implementation is given for a contextual semantics with may- and should-convergence. This implies that our implementation is a correct evaluator for an abstract specification equipped with a big-step semantics.
Towards correctness of program transformations through unification and critical pair computation
(2011)
Correctness of program transformations in extended lambda calculi with a contextual semantics is usually based on reasoning about the operational semantics which is a rewrite semantics. A successful approach to proving correctness is the combination of a context lemma with the computation of overlaps between program transformations and the reduction rules, and then of so-called complete sets of diagrams. The method is similar to the computation of critical pairs for the completion of term rewriting systems.We explore cases where the computation of these overlaps can be done in a first order way by variants of critical pair computation that use unification algorithms. As a case study we apply the method to a lambda calculus with recursive let-expressions and describe an effective unification algorithm to determine all overlaps of a set of transformations with all reduction rules. The unification algorithm employs many-sorted terms, the equational theory of left-commutativity modelling multi-sets, context variables of different kinds and a mechanism for compactly representing binding chains in recursive let-expressions.
The interactive verification system VeriFun is based on a polymorphic call-by-value functional language and on a first-order logic with initial model semantics w.r.t. constructors. It is designed to perform automatic induction proofs and can also deal with partial functions. This paper provides a reconstruction of the corresponding logic and semantics using the standard treatment of undefinedness which adapts and improves the VeriFun-logic by allowing reasoning on nonterminating expressions and functions. Equality of expressions is defined as contextual equivalence based on observing termination in all closing contexts. The reconstruction shows that several restrictions of the VeriFun framework can easily be removed, by natural generalizations: mutual recursive functions, abstractions in the data values, and formulas with arbitrary quantifier prefix can be formulated. The main results of this paper are: an extended set of deduction rules usable in VeriFun under the adapted semantics is proved to be correct, i.e. they respect the observational equivalence in all extensions of a program. We also show that certain classes of theorems are conservative under extensions, like universally quantified equations. Also other special classes of theorems are analyzed for conservativity.
The interactive verification system VeriFun is based on a polymorphic call-by-value functional language and on a first-order logic with initial model semantics w.r.t. constructors. This paper provides a reconstruction of the corresponding logic when partial functions are permitted. Typing is polymorphic for the definition of functions but monomorphic for terms in formulas. Equality of terms is defined as contextual equivalence based on observing termination in all contexts. The reconstruction also allows several generalizations of the functional language like mutual recursive functions and abstractions in the data values. The main results are: Correctness of several program transformations for all extensions of a program, which have a potential usage in a deduction system. We also proved that universally quantified equations are conservative, i.e. if a universally quantified equation is valid w.r.t. a program P, then it remains valid if the program is extended by new functions and/or new data types.
Motivated by the question of correctness of a specific implementation of concurrent buffers in the lambda calculus with futures underlying Alice ML, we prove that concurrent buffers and handled futures can correctly encode each other. Correctness means that our encodings preserve and reflect the observations of may- and must-convergence, and as a consequence also yields soundness of the encodings with respect to a contextually defined notion of program equivalence. While these translations encode blocking into queuing and waiting, we also describe an adequate encoding of buffers in a calculus without handles, which is more low-level and uses busy-waiting instead of blocking. Furthermore we demonstrate that our correctness concept applies to the whole compilation process from high-level to low-level concurrent languages, by translating the calculus with buffers, handled futures and data constructors into a small core language without those constructs.
We investigate methods and tools for analyzing translations between programming languages with respect to observational semantics. The behavior of programs is observed in terms of may- and mustconvergence in arbitrary contexts, and adequacy of translations, i.e., the reflection of program equivalence, is taken to be the fundamental correctness condition. For compositional translations we propose a notion of convergence equivalence as a means for proving adequacy. This technique avoids explicit reasoning about contexts, and is able to deal with the subtle role of typing in implementations of language extensions.
We investigate methods and tools for analyzing translations between programming languages with respect to observational semantics. The behavior of programs is observed in terms of may- and mustconvergence in arbitrary contexts, and adequacy of translations, i.e., the reflection of program equivalence, is taken to be the fundamental correctness condition. For compositional translations we propose a notion of convergence equivalence as a means for proving adequacy. This technique avoids explicit reasoning about contexts, and is able to deal with the subtle role of typing in implementations of language extensions.
We investigate methods and tools for analysing translations between programming languages with respect to observational semantics. The behaviour of programs is observed in terms of may- and mustconvergence in arbitrary contexts, and adequacy of translations, i.e., the reflection of program equivalence, is taken to be the fundamental correctness condition. For compositional translations we propose a notion of convergence equivalence as a means for proving adequacy. This technique avoids explicit reasoning about contexts, and is able to deal with the subtle role of typing in implementations of language extensions.
We investigate methods and tools for analysing translations between programming languages with respect to observational semantics. The behaviour of programs is observed in terms of may- and mustconvergence in arbitrary contexts, and adequacy of translations, i.e., the reflection of program equivalence, is taken to be the fundamental correctness condition. For compositional translations we propose a notion of convergence equivalence as a means for proving adequacy. This technique avoids explicit reasoning about contexts, and is able to deal with the subtle role of typing in implementations of language extensions.
The paper proposes a variation of simulation for checking and proving contextual equivalence in a non-deterministic call-by-need lambda-calculus with constructors, case, seq, and a letrec with cyclic dependencies. It also proposes a novel method to prove its correctness. The calculus’ semantics is based on a small-step rewrite semantics and on may-convergence. The cyclic nature of letrec bindings, as well as nondeterminism, makes known approaches to prove that simulation implies contextual equivalence, such as Howe’s proof technique, inapplicable in this setting. The basic technique for the simulation as well as the correctness proof is called pre-evaluation, which computes a set of answers for every closed expression. If simulation succeeds in finite computation depth, then it is guaranteed to show contextual preorder of expressions.
This paper proves several generic variants of context lemmas and thus contributes to improving the tools for observational semantics of deterministic and non-deterministic higher-order calculi that use a small-step reduction semantics. The generic (sharing) context lemmas are provided for may- as well as two variants of must-convergence, which hold in a broad class of extended process- and extended lambda calculi, if the calculi satisfy certain natural conditions. As a guide-line, the proofs of the context lemmas are valid in call-by-need calculi, in callby-value calculi if substitution is restricted to variable-by-variable and in process calculi like variants of the π-calculus. For calculi employing beta-reduction using a call-by-name or call-by-value strategy or similar reduction rules, some iu-variants of ciu-theorems are obtained from our context lemmas. Our results reestablish several context lemmas already proved in the literature, and also provide some new context lemmas as well as some new variants of the ciu-theorem. To make the results widely applicable, we use a higher-order abstract syntax that allows untyped calculi as well as certain simple typing schemes. The approach may lead to a unifying view of higher-order calculi, reduction, and observational equality.
This paper proves several generic variants of context lemmas and thus contributes to improving the tools to develop observational semantics that is based on a reduction semantics for a language. The context lemmas are provided for may- as well as two variants of mustconvergence and a wide class of extended lambda calculi, which satisfy certain abstract conditions. The calculi must have a form of node sharing, e.g. plain beta reduction is not permitted. There are two variants, weakly sharing calculi, where the beta-reduction is only permitted for arguments that are variables, and strongly sharing calculi, which roughly correspond to call-by-need calculi, where beta-reduction is completely replaced by a sharing variant. The calculi must obey three abstract assumptions, which are in general easily recognizable given the syntax and the reduction rules. The generic context lemmas have as instances several context lemmas already proved in the literature for specific lambda calculi with sharing. The scope of the generic context lemmas comprises not only call-by-need calculi, but also call-by-value calculi with a form of built-in sharing. Investigations in other, new variants of extended lambda-calculi with sharing, where the language or the reduction rules and/or strategy varies, will be simplified by our result, since specific context lemmas are immediately derivable from the generic context lemma, provided our abstract conditions are met.
We present a higher-order call-by-need lambda calculus enriched with constructors, case-expressions, recursive letrec-expressions, a seq-operator for sequential evaluation and a non-deterministic operator amb that is locally bottom-avoiding. We use a small-step operational semantics in form of a single-step rewriting system that defines a (nondeterministic) normal order reduction. This strategy can be made fair by adding resources for bookkeeping. As equational theory we use contextual equivalence, i.e. terms are equal if plugged into any program context their termination behaviour is the same, where we use a combination of may- as well as must-convergence, which is appropriate for non-deterministic computations. We show that we can drop the fairness condition for equational reasoning, since the valid equations w.r.t. normal order reduction are the same as for fair normal order reduction. We evolve different proof tools for proving correctness of program transformations, in particular, a context lemma for may- as well as mustconvergence is proved, which restricts the number of contexts that need to be examined for proving contextual equivalence. In combination with so-called complete sets of commuting and forking diagrams we show that all the deterministic reduction rules and also some additional transformations preserve contextual equivalence.We also prove a standardisation theorem for fair normal order reduction. The structure of the ordering <=c a is also analysed: Ω is not a least element, and <=c already implies contextual equivalence w.r.t. may-convergence.
We present a higher-order call-by-need lambda calculus enriched with constructors, case-expressions, recursive letrec-expressions, a seq-operator for sequential evaluation and a non-deterministic operator amb that is locally bottom-avoiding. We use a small-step operational semantics in form of a single-step rewriting system that defines a (nondeterministic) normal order reduction. This strategy can be made fair by adding resources for bookkeeping. As equational theory we use contextual equivalence, i.e. terms are equal if plugged into any program context their termination behaviour is the same, where we use a combination of may- as well as must-convergence, which is appropriate for non-deterministic computations. We show that we can drop the fairness condition for equational reasoning, since the valid equations w.r.t. normal order reduction are the same as for fair normal order reduction. We evolve different proof tools for proving correctness of program transformations, in particular, a context lemma for may- as well as mustconvergence is proved, which restricts the number of contexts that need to be examined for proving contextual equivalence. In combination with so-called complete sets of commuting and forking diagrams we show that all the deterministic reduction rules and also some additional transformations preserve contextual equivalence.We also prove a standardisation theorem for fair normal order reduction. The structure of the ordering <=c a is also analysed: Ω is not a least element, and <=c already implies contextual equivalence w.r.t. may-convergence.
We develop a proof method to show that in a (deterministic) lambda calculus with letrec and equipped with contextual equivalence the call-by-name and the call-by-need evaluation are equivalent, and also that the unrestricted copy-operation is correct. Given a let-binding x = t, the copy-operation replaces an occurrence of the variable x by the expression t, regardless of the form of t. This gives an answer to unresolved problems in several papers, it adds a strong method to the tool set for reasoning about contextual equivalence in higher-order calculi with letrec, and it enables a class of transformations that can be used as optimizations. The method can be used in different kind of lambda calculi with cyclic sharing. Probably it can also be used in non-deterministic lambda calculi if the variable x is “deterministic”, i.e., has no interference with non-deterministic executions. The main technical idea is to use a restricted variant of the infinitary lambda-calculus, whose objects are the expressions that are unrolled w.r.t. let, to define the infinite developments as a reduction calculus on the infinite trees and showing a standardization theorem.
The goal of this report is to prove correctness of a considerable subset of transformations w.r.t. contextual equivalence in an extended lambda-calculus LS with case, constructors, seq, let, and choice, with a simple set of reduction rules; and to argue that an approximation calculus LA is equivalent to LS w.r.t. the contextual preorder, which enables the proof tool of simulation. Unfortunately, a direct proof appears to be impossible.
The correctness proof is by defining another calculus L comprising the complex variants of copy, case-reduction and seq-reductions that use variable-binding chains. This complex calculus has well-behaved diagrams and allows a proof of correctness of transformations, and that the simple calculus LS, the calculus L, and the calculus LA all have an equivalent contextual preorder.
The goal of this report is to prove correctness of a considerable subset of transformations w.r.t. contextual equivalence in an extended lambda-calculus LS with case, constructors, seq, let, and choice, with a simple set of reduction rules; and to argue that an approximation calculus LA is equivalent to LS w.r.t. the contextual preorder, which enables the proof tool of simulation. Unfortunately, a direct proof appears to be impossible.
The correctness proof is by defining another calculus L comprising the complex variants of copy, case-reduction and seq-reductions that use variable-binding chains. This complex calculus has well-behaved diagrams and allows a proof of correctness of transformations, and that the simple calculus LS, the calculus L, and the calculus LA all have an equivalent contextual preorder.
The calculus CHF models Concurrent Haskell extended by concurrent, implicit futures. It is a process calculus with concurrent threads, monadic concurrent evaluation, and includes a pure functional lambda-calculus which comprises data constructors, case-expressions, letrec-expressions, and Haskell’s seq. Futures can be implemented in Concurrent Haskell using the primitive unsafeInterleaveIO, which is available in most implementations of Haskell. Our main result is conservativity of CHF, that is, all equivalences of pure functional expressions are also valid in CHF. This implies that compiler optimizations and transformations from pure Haskell remain valid in Concurrent Haskell even if it is extended by futures. We also show that this is no longer valid if Concurrent Haskell is extended by the arbitrary use of unsafeInterleaveIO.
This paper shows equivalence of applicative similarity and contextual approximation, and hence also of bisimilarity and contextual equivalence, in LR, the deterministic call-by-need lambda calculus with letrec extended by data constructors, case-expressions and Haskell's seqoperator. LR models an untyped version of the core language of Haskell. Bisimilarity simplifies equivalence proofs in the calculus and opens a way for more convenient correctness proofs for program transformations.
The proof is by a fully abstract and surjective transfer of the contextual approximation into a call-by-name calculus, which is an extension of Abramsky's lazy lambda calculus. In the latter calculus equivalence of similarity and contextual approximation can be shown by Howe's method. Using an equivalent but inductive definition of behavioral preorder we then transfer similarity back to the calculus LR.
The translation from the call-by-need letrec calculus into the extended call-by-name lambda calculus is the composition of two translations. The first translation replaces the call-by-need strategy by a call-by-name strategy and its correctness is shown by exploiting infinite tress, which emerge by unfolding the letrec expressions. The second translation encodes letrec-expressions by using multi-fixpoint combinators and its correctness is shown syntactically by comparing reductions of both calculi. A further result of this paper is an isomorphism between the mentioned calculi, and also with a call-by-need letrec calculus with a less complex definition of reduction than LR.
Motivated by the question whether sound and expressive applicative similarities for program calculi with should-convergence exist, this paper investigates expressive applicative similarities for the untyped call-by-value lambda-calculus extended with McCarthy's ambiguous choice operator amb. Soundness of the applicative similarities w.r.t. contextual equivalence based on may-and should-convergence is proved by adapting Howe's method to should-convergence. As usual for nondeterministic calculi, similarity is not complete w.r.t. contextual equivalence which requires a rather complex counter example as a witness. Also the call-by-value lambda-calculus with the weaker nondeterministic construct erratic choice is analyzed and sound applicative similarities are provided. This justifies the expectation that also for more expressive and call-by-need higher-order calculi there are sound and powerful similarities for should-convergence.
The pi-calculus is a well-analyzed model for mobile processes and mobile computations.
While a lot of other process and lambda calculi that are core languages of higher-order concurrent and/or functional programming languages use a contextual semantics observing the termination behavior of programs in all program contexts, traditional program equivalences in the pi-calculus are bisimulations and barbed testing equivalences, which observe the communication capabilities of processes under reduction and in contexts.
There is a distance between these two approaches to program equivalence which makes it hard to compare the pi-calculus with other languages. In this paper we contribute to bridging this gap by investigating a contextual semantics of the synchronous pi-calculus with replication and without sums.
To transfer contextual equivalence to the pi-calculus we add a process Stop as constant which indicates success and is used as the base to define and analyze the contextual equivalence which observes may- and should-convergence of processes.
We show as a main result that contextual equivalence in the pi-calculus with Stop conservatively extends barbed testing equivalence in the (Stop-free) pi-calculus. This implies that results on contextual equivalence can be directly transferred to the (Stop-free) pi-calculus with barbed testing equivalence.
We analyze the contextual ordering, prove some nontrivial process equivalences, and provide proof tools for showing contextual equivalences. Among them are a context lemma, and new notions of sound applicative similarities for may- and should-convergence.
Motivated by our experience in analyzing properties of translations between programming languages with observational semantics, this paper clarifies the notions, the relevant questions, and the methods, constructs a general framework, and provides several tools for proving various correctness properties of translations like adequacy and full abstractness. The presented framework can directly be applied to the observational equivalences derived from the operational semantics of programming calculi, and also to other situations, and thus has a wide range of applications.
Our motivation is the question whether the lazy lambda calculus, a pure lambda calculus with the leftmost outermost rewriting strategy, considered under observational semantics, or extensions thereof, are an adequate model for semantic equivalences in real-world purely functional programming languages, in particular for a pure core language of Haskell. We explore several extensions of the lazy lambda calculus: addition of a seq-operator, addition of data constructors and case-expressions, and their combination, focusing on conservativity of these extensions. In addition to untyped calculi, we study their monomorphically and polymorphically typed versions. For most of the extensions we obtain non-conservativity which we prove by providing counterexamples. However, we prove conservativity of the extension by data constructors and case in the monomorphically typed scenario.
Our motivation is the question whether the lazy lambda calculus, a pure lambda calculus with the leftmost outermost rewriting strategy, considered under observational semantics, or extensions thereof, are an adequate model for semantic equivalences in real-world purely functional programming languages, in particular for a pure core language of Haskell. We explore several extensions of the lazy lambda calculus: addition of a seq-operator, addition of data constructors and case-expressions, and their combination, focusing on conservativity of these extensions. In addition to untyped calculi, we study their monomorphically and polymorphically typed versions. For most of the extensions we obtain non-conservativity which we prove by providing counterexamples. However, we prove conservativity of the extension by data constructors and case in the monomorphically typed scenario.
Diese Diplomarbeit beschäftigt sich mit dem Entwurf und der Implementierung der Auslesefirmware für den GET4 Chip im Rahmen des CBM (Compressed Baryonic Matter) Experiments.
Physikalische Experimente mit Teilchenbeschleunigern, wie das geplante CBM Experiment, erzeugen große Mengen an Sensordaten, die aufbereitet, gespeichert und ausgewertet werden müssen. Der GET4 Chip ist ein möglicher TDC (Time to Digital Converter) Chip mit 4 Kanälen für den TOF (Time of Flight) Detektor des CBM Experiments. Der GET4 Chip erzeugt einen Zeitstempel für die Treffer der TOF RPC (Resistive Plate Chambers) Sensoren.
Die Auslese der GET4 Daten wird von einem ROC (Readout Controller Board) übernommen, das mit einem FPGA (Field Programmable Gate Array) ausgestattet ist. Das ROC Board lässt sich über die ROC Firmware für verschiedene Einsatzzwecke flexibel konfigurieren, so dass damit auch andere Frontend Elektronik, wie z.B. der n-XYTER Chip, ausgelesen werden kann. Während dieser Diplomarbeit wurde das SysCore2 mit einem Virtex-4 FX20 als ROC verwendet. Inzwischen ist das SysCore3 verfügbar, das einen deutlich leistungsfähigeren FPGA besitzt. Die bisher verfügbare Firmware für den GET4 war für den Betrieb im 24 Bit Modus konzipiert.
Der bevorzugte Auslesemodus für den GET4 ist der 32 Bit Modus, da der 24 Bit Modus einige Beschränkungen aufweist und die volle Funktionalität des GET4s nur im 32 Bit Modus zur Verfügung steht. Im Rahmen dieser Diplomarbeit wurde die vorhandene 24 Bit Firmware für den 32 Bit Modus erweitert, so dass der zweite Datenkanal des GET4s, sowie der DDR (Double Data Rate) Modus für die Datenübertragung zur Verfügung steht. Neben der Erhöhung der Datenübertragungsraten stand die mögliche Skalierbarkeit der Firmware für die Auslese möglichst vieler GET4 Chips im Fokus. Dazu musste die Firmware umgestaltet werden, um die begrenzten BRAM Ressourcen des FPGAs nicht unnötig zu belegen.
Eine weitere wichtige Neuerung der 32 Bit Firmware ist die automatische Synchronisation der Epochen zwischen ROC und GET4. Dazu werden die Epochennachrichten der GET4s mit der Epoche des ROCs verglichen und bei Bedarf Synchronisationsnachrichten mit der korrekten Epoche an die GET4s gesendet.
Visualisierung von E-Mail-Traffic mit Schwerpunkt auf eine inhaltliche Analyse von Wortmustern
(2010)
E-Mail hat sich zu einem sehr wichtigen Kommunikationsmittel entwickelt, leidet aber aktuell unter einer massiven Verbreitung unerwünschter und unverlangter Inhalte. Diese können für einen Anwender nicht nur lästig sein, sondern auch die vorhandene Netz- und Speicher-Infrastruktur enorm belasten.
Die Notwendigkeit einer Filterung des E-Mail-Traffic hat zu einer Reihe recht unterschiedlicher Methoden geführt, die computergesteuert eine E-Mail auf ihren Spam-Gehalt untersuchen.
Die Motivation hinter dieser Arbeit ist zu prüfen, ob die besonderen Eigenschaften der visuellen Wahrnehmung eines Menschen als unterstützendes Mittel eingesetzt werden können, um E-Mail-Inhalte zu überprüfen und eventuell vorhandene Wort-Muster, die auf Spam deuten, sichtbar zu machen.
Um dieses Ziel zu erreichen musste zuerst eine geeignete Auswahl spamspezifischer Merkmale getroffen werden. Danach wurden Methoden des Text Minings angewendet, um aus dem Inhalt einer E-Mail strukturierte Daten zu gewinnen, die sich zur Repräsentation einer Nachricht eignen und als Grundlage für eine Visualisierung herangezogen werden können. Basierend auf den vorab ausgewählten Spam-Charakteristika wurdenWorteigenschaften mit Hilfe extern angebundener Wortlisten, regulärer Ausdrücke und unter Einsatz eines Wörterbuches überprüft, und die erhaltenen Ergebnisse flossen neben einer einfachen Gewichtung von Worthäufigkeiten in Form einer anwendungsspezifischen Gewichtung mit ein.
Es wurden anschließend zwei verschiedene Sichten konzipiert, um einem Anwender einen Einblick in die extrahierten Daten zu ermöglichen. Es hat sich herausgestellt, dass besonders Treemaps geeignet sind um die anfallenden Datenmengen kompakt abzubilden, aber gleichzeitig einen notwendigen Detailgrad auf einzelne Worteigenschaften gewährleisten.
Das Konzept wurde prototypisch unter Verwendung des Mailservers Mercury/32 sowie einer MySQL-Datenbank implementiert und konnte teilweise aufzeigen, dass es anhand der von der Engine generierten Strukturen möglich ist, spamspezifische Merkmale einer E-Mail unter Verwendung der gewählten Visualisierungstechniken auf eine Weise sichtbar zu machen, die einem Anwender eine Mustererkennung erlauben.
Die Diplomarbeit wurde als Gemeinschaftsarbeit angefertigt und konnte sinnvoll in zwei Bereiche aufgeteilt werden: Die Engine und die Visualisierung. Die konzeptuellen Überlegungen für das Thema sind größtenteils gemeinsam erfolgt, jedoch liegt der Schwerpunkt von Pouneh Khayat Pour im Bereich der Analyse und der von Yvonne Neidert in der Visualisierung.
The human brain is an unparalleled system: Through millions of years of evolution and during a lifespan of learning, our brains have developed remarkable abilities for dealing with incoming sensory data, extracting structure and useful information, and finally drawing the conclusions that result in the actions we take. Understanding the principles behind this machinery and building artificial systems that mimic at least some of these capabilities is a long standing goal in both the scientific and the engineering communities. While this goal still seems unreachable, we have seen tremendous progress when it comes to training data-driven algorithms on vast amounts of training data, e.g. to learn an optimal data model and its parameters in order to accomplish some task. Such algorithms are now omnipresent: they are part of recommender systems, they perform speech recognition and generally build the foundation for many semi-autonomous systems. They start to be integral part of many technical systems modern technical societies rely on for their everyday functioning. Many of these algorithms were originally inspired by biological systems or act as models for sensory data processing in mammalian brains. The response properties of a certain population of neurons in the first stages of the mammalian visual pathway, for example, can be modeled by algorithms such as Sparse Coding (SC), Independent Component Analysis (ICA) or Factor Analysis (FA). These well established learning algorithms typically assume linear interactions between the variables of the model. Most often these relationships are expressed in the form of a matrix-vector products between a matrix with learned dictionary-elements (basis vectors as column vectors) and the latent variables of these models. While on the one hand this linear interaction can sometimes be justified by the physical process for which the machine learning model is proposed, it is on the other hand often chosen just because of its mathematical and practical convenience. From an optimal coding point of view though, one would generally expect that the ideal model closely reflect the core interactions of the system it is modeling. In vision for example, one of the dominant processes giving rise to our sensory percepts are occlusions. Occluding objects are omnipresent in visual scenes and it would not be surprising if the mammalian visual system would be optimized to process occluding structures in the visual data stream. Yet, the established mathematical models of the first stages of the visual processing path (like, e.g., SC, ICA or FA) all assume linear interactions between the active image components. In this thesis we will discuss new models that aim to approximate the effects of occluding components by assuming nonlinear interactions between their activated dictionary elements. We will present learning algorithms that infer optimal parameters for these models given data. In the experiments, we will validate the algorithms on artificial ground truth data and demonstrate their ability to recover the correct model parameters. We will show that the predictions made by these nonlinear models correspond better to the experimental data measured in-vivo than the predictions made by the established linear models. Furthermore, we systematically explore and compare a large space of plausible combinations of hyperparameters and preprocessing schemes in order to eliminate any effects of artefacts on the observed results. Training nonlinear sparse coding models is computationally more demanding than training linear models. In order to perform the numerical experiments described in this thesis we developed a software framework that facilitates the implementation of massive parallel expectation maximization (EM) based learning algorithms. This infrastructure was used for all experiments described in here, as well as by collaborators in projects we will not discuss. Some of the experiments required more than 1017 floating point operations and were run on a computer cluster running on up to 5000 CPU Cores in parallel. Our parallel framework enabled these experiments to be performed.
Frankfurt, an einem gewöhnlichen Morgen gegen 8:00 Uhr: Von Osten strömen zahlreiche Pendler über die A 66 in Richtung Innenstadt. Spätestens »Am Erlenbruch« kommt es zu Staus und zäh fließendem Verkehr. Frankfurter Informatiker können diese Staus mithilfe eines Simulationssystems vorhersagen. Mehr noch: Sie berechnen den Ausstoß von Schadstoffen und deren Verteilung über das Stadtgebiet. Ziel ist die Optimierung von Verkehrsleitstrategien.
Aging of biological systems is controlled by various processes which have a potential impact on gene expression. Here we report a genome-wide transcriptome analysis of the fungal aging model Podospora anserina. Total RNA of three individuals of defined age were pooled and analyzed by SuperSAGE (serial analysis of gene expression). A bioinformatics analysis identified different molecular pathways to be affected during aging. While the abundance of transcripts linked to ribosomes and to the proteasome quality control system were found to decrease during aging, those associated with autophagy increase, suggesting that autophagy may act as a compensatory quality control pathway. Transcript profiles associated with the energy metabolism including mitochondrial functions were identified to fluctuate during aging. Comparison of wild-type transcripts, which are continuously down-regulated during aging, with those down-regulated in the long-lived, copper-uptake mutant grisea, validated the relevance of age-related changes in cellular copper metabolism. Overall, we (i) present a unique age-related data set of a longitudinal study of the experimental aging model P. anserina which represents a reference resource for future investigations in a variety of organisms, (ii) suggest autophagy to be a key quality control pathway that becomes active once other pathways fail, and (iii) present testable predictions for subsequent experimental investigations.
Time-critical applications process a continuous stream of input data and have to meet specific timing constraints. A common approach to ensure that such an application satisfies its constraints is over-provisioning: The application is deployed in a dedicated cluster environment with enough processing power to achieve the target performance for every specified data input rate. This approach comes with a drawback: At times of decreased data input rates, the cluster resources are not fully utilized. A typical use case is the HLT-Chain application that processes physics data at runtime of the ALICE experiment at CERN. From a perspective of cost and efficiency it is desirable to exploit temporarily unused cluster resources. Existing approaches aim for that goal by running additional applications. These approaches, however, a) lack in flexibility to dynamically grant the time-critical application the resources it needs, b) are insufficient for isolating the time-critical application from harmful side-effects introduced by additional applications or c) are not general because application-specific interfaces are used. In this thesis, a software framework is presented that allows to exploit unused resources in a dedicated cluster without harming a time-critical application. Additional applications are hosted in Virtual Machines (VMs) and unused cluster resources are allocated to these VMs at runtime. In order to avoid resource bottlenecks, the resource usage of VMs is dynamically modified according to the needs of the time-critical application. For this purpose, a number of previously not combined methods is used. On a global level, appropriate VM manipulations like hot migration, suspend/resume and start/stop are determined by an informed search heuristic and applied at runtime. Locally on cluster nodes, a feedback-controlled adaption of VM resource usage is carried out in a decentralized manner. The employment of this framework allows to increase a cluster’s usage by running additional applications, while at the same time preventing negative impact towards a time-critical application. This capability of the framework is shown for the HLT-Chain application: In an empirical evaluation the cluster CPU usage is increased from 49% to 79%, additional results are computed and no negative effect towards the HLT-Chain application are observed.
G-CSC Report 2010
(2011)
The present report gives a short summary of the research of the Goethe Center for Scientific Computing (G-CSC) of the Goethe University Frankfurt. G-CSC aims at developing and applying methods and tools for modelling and numerical simulation of problems from empirical science and technology. In particular, fast solvers for partial differential equations (i.e. pde) such as robust, parallel, and adaptive multigrid methods and numerical methods for stochastic differential equations are developed. These methods are highly adanvced and allow to solve complex problems..
The G-CSC is organised in departments and interdisciplinary research groups. Departments are localised directly at the G-CSC, while the task of interdisciplinary research groups is to bridge disciplines and to bring scientists form different departments together. Currently, G-CSC consists of the department Simulation and Modelling and the interdisciplinary research group Computational Finance.
The economic success of the World Wide Web makes it a highly competitive environment for web businesses. For this reason, it is crucial for web business owners to learn what their customers want. This thesis provides a conceptual framework and an implementation of a system that helps to better understand the behavior and potential interests of web site visitors by accounting for both explicit and implicit feedback. This thesis is divided into two parts.
The first part is rooted in computer science and information systems and uses graph theory and an extended click-stream analysis to define a framework and a system tool that is useful for analyzing web user behavior by calculating the interests of the users.
The second part is rooted in behavioral economics, mathematics, and psychology and is investigating influencing factors on different types of web user choices. In detail, a model for the cognitive process of rating products on the Web is defined and an importance hierarchy of the influencing factors is discovered.
Both parts make use of techniques from a variety of research fields and, therefore, contribute to the area of Web Science.
Driven by rapid technological advancements, the amount of data that is created, captured, communicated, and stored worldwide has grown exponentially over the past decades. Along with this development it has become critical for many disciplines of science and business to being able to gather and analyze large amounts of data. The sheer volume of the data often exceeds the capabilities of classical storage systems, with the result that current large-scale storage systems are highly distributed and are comprised of a high number of individual storage components. As with any other electronic device, the reliability of storage hardware is governed by certain probability distributions, which in turn are influenced by the physical processes utilized to store the information. The traditional way to deal with the inherent unreliability of combined storage systems is to replicate the data several times. Another popular approach to achieve failure tolerance is to calculate the block-wise parity in one or more dimensions. With better understanding of the different failure modes of storage components, it has become evident that sophisticated high-level error detection and correction techniques are indispensable for the ever-growing distributed systems. The utilization of powerful cyclic error-correcting codes, however, comes with a high computational penalty, since the required operations over finite fields do not map very well onto current commodity processors. This thesis introduces a versatile coding scheme with fully adjustable fault-tolerance that is tailored specifically to modern processor architectures. To reduce stress on the memory subsystem the conventional table-based algorithm for multiplication over finite fields has been replaced with a polynomial version. This arithmetically intense algorithm is better suited to the wide SIMD units of the currently available general purpose processors, but also displays significant benefits when used with modern many-core accelerator devices (for instance the popular general purpose graphics processing units). A CPU implementation using SSE and a GPU version using CUDA are presented. The performance of the multiplication depends on the distribution of the polynomial coefficients in the finite field elements. This property has been used to create suitable matrices that generate a linear systematic erasure-correcting code which shows a significantly increased multiplication performance for the relevant matrix elements. Several approaches to obtain the optimized generator matrices are elaborated and their implications are discussed. A Monte-Carlo-based construction method allows it to influence the specific shape of the generator matrices and thus to adapt them to special storage and archiving workloads. Extensive benchmarks on CPU and GPU demonstrate the superior performance and the future application scenarios of this novel erasure-resilient coding scheme.
Paging is one of the most prominent problems in the field of online algorithms. We have to serve a sequence of page requests using a cache that can hold up to k pages. If the currently requested page is in cache we have a cache hit, otherwise we say that a cache miss occurs, and the requested page needs to be loaded into the cache. The goal is to minimize the number of cache misses by providing a good page-replacement strategy. This problem is part of memory-management when data is stored in a two-level memory hierarchy, more precisely a small and fast memory (cache) and a slow but large memory (disk). The most important application area is the virtual memory management of operating systems. Accessed pages are either already in the RAM or need to be loaded from the hard disk into the RAM using expensive I/O. The time needed to access the RAM is insignificant compared to an I/O operation which takes several milliseconds.
The traditional evaluation framework for online algorithms is competitive analysis where the online algorithm is compared to the optimal offline solution. A shortcoming of competitive analysis consists of its too pessimistic worst-case guarantees. For example LRU has a theoretical competitive ratio of k but in practice this ratio rarely exceeds the value 4.
Reducing the gap between theory and practice has been a hot research issue during the last years. More recent evaluation models have been used to prove that LRU is an optimal online algorithm or part of a class of optimal algorithms respectively, which was motivated by the assumption that LRU is one of the best algorithms in practice. Most of the newer models make LRU-friendly assumptions regarding the input, thus not leaving much room for new algorithms.
Only few works in the field of online paging have introduced new algorithms which can compete with LRU as regards the small number of cache misses.
In the first part of this thesis we study strongly competitive randomized paging algorithms, i.e. algorithms with optimal competitive guarantees. Although the tight bound for the competitive ratio has been known for decades, current algorithms matching this bound are complex and have high running times and memory requirements. We propose the algorithm OnlineMin which processes a page request in O(log k/log log k) time in the worst case. The best previously known solution requires O(k^2) time.
Usually the memory requirement of a paging algorithm is measured by the maximum number of pages that the algorithm keeps track of. Any algorithm stores information about the k pages in the cache. In addition it can also store information about pages not in cache, denoted bookmarks. We answer the open question of Bein et al. '07 whether strongly competitive randomized paging algorithms using only o(k) bookmarks exist or not. To do so we modify the Partition algorithm of McGeoch and Sleator '85 which has an unbounded bookmark complexity, and obtain Partition2 which uses O(k/log k) bookmarks.
In the second part we extract ideas from theoretical analysis of randomized paging algorithms in order to design deterministic algorithms that perform well in practice. We refine competitive analysis by introducing the attack rate
parameter r, which ranges between 1 and k. We show that r is a tight bound on the competitive ratio of deterministic algorithms.
We give empirical evidence that r is usually much smaller than k and thus r-competitive algorithms have a reasonable performance on real-world traces. By introducing the r-competitive priority-based algorithm class OnOPT we obtain a collection of promising algorithms to beat the LRU-standard. We single out the new algorithm RDM and show that it outperforms LRU and some of its variants on a wide range of real-world traces.
Since RDM is more complex than LRU one may think at first sight that the gain in terms of lowering the number of cache misses is ruined by high runtime for processing pages. We engineer a fast implementation of RDM, and compare it
to LRU and the very fast FIFO algorithm in an overall evaluation scheme, where we measure the runtime of the algorithms and add penalties for each cache miss.
Experimental results show that for realistic penalties RDM still outperforms these two algorithms even if we grant the competitors an idealistic runtime of 0.
An der Universität Frankfurt entwickelte Online-Self-Assessment-Verfahren für die Studiengänge Psychologie und Informatik sollen Studieninteressierten noch vor Studienbeginn auf der Basis von Selbsterkundungsmaßnahmen und Tests eine Rückmeldung über ihre eigenen Fähigkeiten, Motive, personalen Kompetenzen und Interessen mit Blick auf den jeweiligen Studiengang geben. Sowohl die Befunde zur psychometrischen Güte der Verfahren als auch jene zur prognostischen Validität lassen ihren Einsatz zur Feststellung studienrelevanter Kompetenzen als geeignet erscheinen. Da die erfassten Kompetenzen und Merkmale substanzielle Beziehun-gen zu Studienleistungen aufweisen, könnten die Informationen über individuelle Stärken zur Wahl eines geeigneten Studienganges genutzt werden; Schwächen hingegen könnten frühzeitig Hinweise für geeignete Fördermaßnahmen liefern.
We introduce tree-width for first order formulae φ, fotw(φ). We show that computing fotw is fixed-parameter tractable with parameter fotw. Moreover, we show that on classes of formulae of bounded fotw, model checking is fixed parameter tractable, with parameter the length of the formula. This is done by translating a formula φ with fotw(φ)<k into a formula of the k-variable fragment Lk of first order logic. For fixed k, the question whether a given first order formula is equivalent to an Lk formula is undecidable. In contrast, the classes of first order formulae with bounded fotw are fragments of first order logic for which the equivalence is decidable. Our notion of tree-width generalises tree-width of conjunctive queries to arbitrary formulae of first order logic by taking into account the quantifier interaction in a formula. Moreover, it is more powerful than the notion of elimination-width of quantified constraint formulae, defined by Chen and Dalmau (CSL 2005): for quantified constraint formulae, both bounded elimination-width and bounded fotw allow for model checking in polynomial time. We prove that fotw of a quantified constraint formula φ is bounded by the elimination-width of φ, and we exhibit a class of quantified constraint formulae with bounded fotw, that has unbounded elimination-width. A similar comparison holds for strict tree-width of non-recursive stratified datalog as defined by Flum, Frick, and Grohe (JACM 49, 2002). Finally, we show that fotw has a characterization in terms of a cops and robbers game without monotonicity cost.
Poster presentation: Twenty Second Annual Computational Neuroscience Meeting: CNS*2013. Paris, France. 13-18 July 2013.
The synaptic cleft is an extracellular domain that is capable of relaying a presynaptically received electrical signal by diffusive neurotransmitters to the postsynaptic membrane. The cleft is trans-synaptically bridged by ring-like shaped clusters of pre- and postsynaptically localized calcium-dependent adhesion proteins of the N-Cadherin type and is possibly the smallest intercircuit in nervous systems [1]. The strength of association between the pre- and postsynaptic membranes can account for synaptic plasticity such as long-term potentiation [2]. Through neuronal activity the intra- and extracellular calcium levels are modulated through calcium exchangers embedded in the pre- and postsynaptic membrane. Variations of the concentration of cleft calcium induces changes in the N-Cadherin-zipper, that in synaptic resting states is rigid and tightly connects the pre- and postsynaptic domain. During synaptic activity calcium concentrations are hypothesized to drop below critical thresholds which leads to loosening of the N-Cadherin connections and subsequently "unzips" the Cadherin-mediated connection. These processes may result in changes in synaptic strength [2]. In order to investigate the calcium-mediated N-Cadherin dynamics at the synaptic cleft, we developed a three-dimensional model including the cleft morphology and all prominent calcium exchangers and corresponding density distributions [3-6]. The necessity for a fully three-dimensional model becomes apparent, when investigating the effects of the spatial architecture of the synapse [7], [8]. Our data show, that the localization of calcium channels with respect to the N-Cadherin ring has substantial effects on the time-scales on which the Cadherin-zipper switches between states, ranging from seconds to minutes. This will have significant effects on synaptic signaling. Furthermore we see, that high-frequency action potential firing can only be relayed to the Calcium/N-Cadherin-system at a synapse under precise spatial synaptic reorganization.
To truly appreciate the myriad of events which relate synaptic function and vesicle dynamics, simulations should be done in a spatially realistic environment. This holds true in particular in order to explain as well the rather astonishing motor patterns which we observed within in vivo recordings which underlie peristaltic contractionsas well as the shape of the EPSPs at different forms of long-term stimulation, presented both here, at a well characterized synapse, the neuromuscular junction (NMJ) of the Drosophila larva (c.f. Figure 1). To this end, we have employed a reductionist approach and generated three dimensional models of single presynaptic boutons at the Drosophila larval NMJ. Vesicle dynamics are described by diffusion-like partial differential equations which are solved numerically on unstructured grids using the uG platform. In our model we varied parameters such as bouton-size, vesicle output probability (Po), stimulation frequency and number of synapses, to observe how altering these parameters effected bouton function. Hence we demonstrate that the morphologic and physiologic specialization maybe a convergent evolutionary adaptation to regulate the trade off between sustained, low output, and short term, high output, synaptic signals. There seems to be a biologically meaningful explanation for the co-existence of the two different bouton types as previously observed at the NMJ (characterized especially by the relation between size and Po), the assigning of two different tasks with respect to short- and long-time behaviour could allow for an optimized interplay of different synapse types. We can present astonishing similar results of experimental and simulation data which could be gained in particular without any data fitting, however based only on biophysical values which could be taken from different experimental results. As a side product, we demonstrate how advanced methods from numerical mathematics could help in future to resolve also other difficult experimental neurobiological issues.
Functional modules of metabolic networks are essential for understanding the metabolism of an organism as a whole. With the vast amount of experimental data and the construction of complex and large-scale, often genome-wide, models, the computer-aided identification of functional modules becomes more and more important. Since steady states play a key role in biology, many methods have been developed in that context, for example, elementary flux modes, extreme pathways, transition invariants and place invariants. Metabolic networks can be studied also from the point of view of graph theory, and algorithms for graph decomposition have been applied for the identification of functional modules. A prominent and currently intensively discussed field of methods in graph theory addresses the Q-modularity. In this paper, we recall known concepts of module detection based on the steady-state assumption, focusing on transition-invariants (elementary modes) and their computation as minimal solutions of systems of Diophantine equations. We present the Fourier-Motzkin algorithm in detail. Afterwards, we introduce the Q-modularity as an example for a useful non-steady-state method and its application to metabolic networks. To illustrate and discuss the concepts of invariants and Q-modularity, we apply a part of the central carbon metabolism in potato tubers (Solanum tuberosum) as running example. The intention of the paper is to give a compact presentation of known steady-state concepts from a graph-theoretical viewpoint in the context of network decomposition and reduction and to introduce the application of Q-modularity to metabolic Petri net models.
Finding motifs in biological, social, technological, and other types of networks has become a widespread method to gain more knowledge about these networks’ structure and function. However, this task is very computationally demanding, because it is highly associated with the graph isomorphism which is an NP problem (not known to belong to P or NP-complete subsets yet). Accordingly, this research is endeavoring to decrease the need to call NAUTY isomorphism detection method, which is the most time-consuming step in many existing algorithms. The work provides an extremely fast motif detection algorithm called QuateXelero, which has a Quaternary Tree data structure in the heart. The proposed algorithm is based on the well-known ESU (FANMOD) motif detection algorithm. The results of experiments on some standard model networks approve the overal superiority of the proposed algorithm, namely QuateXelero, compared with two of the fastest existing algorithms, G-Tries and Kavosh. QuateXelero is especially fastest in constructing the central data structure of the algorithm from scratch based on the input network.
Integer point sets minimizing average pairwise L1 distance: What is the optimal shape of a town?
(2010)
An n-town, n[is an element of]N , is a group of n buildings, each occupying a distinct position on a 2-dimensional integer grid. If we measure the distance between two buildings along the axis-parallel street grid, then an n-town has optimal shape if the sum of all pairwise Manhattan distances is minimized. This problem has been studied for cities, i.e., the limiting case of very large n. For cities, it is known that the optimal shape can be described by a differential equation, for which no closed-form solution is known. We show that optimal n-towns can be computed in O(n[superscript 7.5]) time. This is also practically useful, as it allows us to compute optimal solutions up to n=80.
Das Ziel dieser Arbeit ist es, eine authentische Verdeckung eingebetteter virtueller 3D-Objekte in augmentierten Bilderwelten bei einer geringen Anzahl an Fotos innerhalb der Bilderwelt zu erreichen. Für die Verdeckung von realen und virtuellen Anteilen einer Augmented Reality-Szene sind Tiefeninformationen notwendig. Diese stammen üblicherweise aus einer 3D-Rekonstruktion, für deren Erstellung sehr viele Eingangsbilder notwendig sind. Im Gegensatz dazu wurde in dieser Arbeit ein System entwickelt, das eine vollständige 3D-Rekonstruktion umgeht. Dieses beruht auf einem direkten bildbasierten Rendering-Ansatz, welcher auch mit unvollständigen Tiefeninformationen eine hohe Bildqualität in Bezug auf eine authentische Verdeckung erreicht. Daraus erschließen sich neue Anwendungsgebiete, wie z.B. die automatisierte Visualisierung von 3D-Planungsdaten und 3D-Produktpräsentationen in Bildern bzw. Bilderwelten, da in diesen Bereichen oftmals nicht genügend große Bildmengen vorhanden sind. Gerade für diese Anwendungsgebiete sind authentische Verdeckungen für die Nutzerakzeptanz der Augmentierung wichtig. Unter authentischer Verdeckung wird die entsprechend der menschlichen Wahrnehmung visuell korrekte Überlagerung zwischen virtuellen Objekten und einzelnen Bildanteilen eines oder mehrerer Fotos verstanden. Das Ergebnis wird in Form einer Bilderwelt (eine bildbasierte 3D-Welt, die die Fotos entsprechend der Bildinhalte räumlich anordnet) präsentiert, die mit virtuellen Objekten erweitert wurde. Folglich ordnet sich diese Arbeit in das Fachgebiet der Augmented Reality ein. Im Rahmen dieser Arbeit wurde ein Verfahren für die bildbasierte Darstellung mit authentischen Verdeckungen auf der Basis von unvollständigen Tiefeninformationen sowie unterschiedliche Verfahren für die notwendige Berechnung der Tiefeninformationen entwickelt und gegenübergestellt. Das Sliced-Image-Rendering-Verfahren rendert mithilfe unvollständiger Tiefeninformationen ein Bild ohne 3D-Geometrie als dreidimensionale Darstellung und realisiert auf diese Weise eine authentische Verdeckung. Das Berechnen der dafür notwendigen Tiefeninformationen eines 2D-Bildes stellt eine gesonderte Herausforderung dar, da die Bilderwelt nur wenige und unvollständige 3D-Informationen der abgebildeten Szene bereitstellt. Folglich kann eine qualitativ hochwertige 3D-Rekonstruktion nicht durchgeführt werden. Die Fragestellung ist daher, wie einzelne Tiefeninformationen berechnet und diese anschließend größeren Bildbereichen zugeordnet werden können. Für diese Tiefenzuordnung wurden im Rahmen der vorliegenden Arbeit drei verschiedene Verfahren konzipiert, die sich in Bezug auf genutzte Daten und deren Verarbeitung unterscheiden. Das Segment-Depth-Matching-Verfahren ordnet Segmenten eines Bildes mithilfe der 3D-Szeneninformationen der Bilderwelt eine Tiefe zu. Hierfür werden Segmentbilder vorausgesetzt. Als Ergebnis liegt für jedes Foto eine Depth-Map vor. Um eine Tiefenzuordnung auch ohne eine vorangehende Segmentierung zu ermöglichen, wurde das Key-Point-Depth-Matching-Verfahren entwickelt. Bei diesem Verfahren werden die 3D-Szeneninformationen der Bilderwelt auf die Bildebene als kreisförmige Sprites projiziert. Die Distanz zur Kamera wird dabei als Tiefenwert für das Sprite verwendet. Alle projizierten Sprites einer Kamera ergeben die Depth-Map. Beide Verfahren liefern Flächen mit Tiefeninformationen, aber keine pixelgenauen Depth-Maps. Um pixelgenaue Depth-Maps zu erzeugen, wurde das Geometry-Depth-Matching-Verfahren entwickelt. Bei diesem Verfahren wird eine Szenengeometrie des abgebildeten Szenenausschnittes erzeugt und dadurch eine pixelgenaue Depth-Map erstellt. Hierfür wird ein semiautomatischer Skizzierungsschritt vorausgesetzt. Die erzeugte Szenengeometrie stellt keine vollständige 3D-Rekonstruktion der Bilderweltenszene dar, da nur ein Szenenausschnitt aus Sicht einer Kamera rekonstruiert wird. Anhand einer technischen Umsetzung erfolgte eine Validierung der konzeptionellen Verfahren. Die daraus resultierenden Ergebnisse wurden anhand verschiedener Bilderweltenszenen mit unterschiedlichen Eigenschaften (Außen- und Innenraumszenen, detailreich und -arm, unterschiedliche Bildmengen) evaluiert. Die Evaluierung des Sliced-Image-Renderings zeigt, dass mithilfe unvollständiger Tiefeninformationen der entwickelten Depth-Matching-Verfahren und unter Einhaltung der gestellten Anforderungen (wenig Eingabefotos, kleine Szenen, keine 3D-Rekonstruktion) eine authentische Verdeckung eingebetteter virtueller 3D-Objekte in Bilderwelten realisiert werden kann. Mithilfe des entwickelten Systems können bildbasierte Anwendungen auch mit kleinen Fotomengen Augmentierungen mit hoher Bildqualität in Bezug auf eine authentische Verdeckung realisieren.
Paging is one of the prominent problems in the field of on-line algorithms. While in the deterministic setting there exist simple and efficient strongly competitive algorithms, in the randomized setting a tradeoff between competitiveness and memory is still not settled. Bein et al. [4] conjectured that there exist strongly competitive randomized paging algorithms, using o(k) bookmarks, i.e. pages not in cache that the algorithm keeps track of. Also in [4] the first algorithm using O(k) bookmarks (2k more precisely), Equitable2, was introduced, proving in the affirmative a conjecture in [7].
We prove tighter bounds for Equitable2, showing that it requires less than k bookmarks, more precisely ≈ 0.62k. We then give a lower bound for Equitable2 showing that it cannot both be strongly competitive and use o(k) bookmarks. Nonetheless, we show that it can trade competitiveness for space. More precisely, if its competitive ratio is allowed to be (Hk + t), then it requires k/(1 + t) bookmarks.
Our main result proves the conjecture that there exist strongly competitive paging algorithms using o(k) bookmarks. We propose an algorithm, denoted Partition2, which is a variant of the Partition algorithm byMcGeoch and Sleator [13]. While classical Partition is unbounded in its space requirements, Partition2 uses θ(k/ log k) bookmarks. Furthermore, we show that this result is asymptotically tight when the forgiveness steps are deterministic.
In der modernen Hochschullehre haben sich eLearning-Elemente als ein Teil des Lehrrepertoires etabliert. Der Einsatz interaktiver webbasierter Selbstlernmodule (Web Based Trainings (WBT)) ist dabei eine Option. Hochschulen und Unternehmen versprechen sich dadurch neue Möglichkeiten des Lehrens und Lernens, um z. B. einen Ausgleich heterogener Vorerfahrungen sowie eine stärkere aktive Beteiligung der Lernenden zu bewirken. Damit die Erstellung und Strukturierung dieser Inhalte mit möglichst geringem Aufwand erfolgen kann, bieten Autorensysteme Unterstützung.
Zu den Grundfunktionen von Autorensystemen gehören unter anderem, das Einbinden gebräuchlicher Medienformate, die einfache Erstellung von Fragen sowie verschiedene Auswertungs- und Feedbackmöglichkeiten. Obwohl Autorensysteme schon vor vielen Jahren ihre erste praktische Anwendung fanden, gibt es nach wie vor Schwachstellen, die sich auf den gesamten Erstellungsprozess wie auch auf einzelne Funktionen beziehen. Im Detail wird bemängelt, dass die Werkzeuge zu komplex und unflexibel sind. Darüber hinaus fehlt häufig eine zufriedenstellende Verknüpfung der vielen Werkzeuge entlang der Prozesskette zu einer Gesamtlösung.
Des Weiteren wird die Konzentration auf die Produktionsphase kritisiert, wodurch andere wichtige Prozesse in den Hintergrund treten bzw. außer Acht gelassen werden.
Im Rahmen der Zusammenarbeit mit einem Automobilhersteller, für den die erste Version des Autorensystems LernBar weiterentwickelt wurde, spielte der Begriff „Lean Production“ inhaltlich in der Umsetzung der WBTs eine wesentliche Rolle. Die Lean Production, die über viele Jahre für die Automobilindustrie entwickelt, verbessert und angepasst wurde, liefert Optimierungsansätze für den Produktionsbereich. Ein wirtschaftlicher Nutzen des Lean-Ansatzes wird auch in anderen Bereichen gesehen wie z. B. in der Softwareentwicklung („Lean Software Development“) oder im Management („Lean Management“). Dabei bietet die Wertschöpfungsorientierung Lösungen für die widersprüchlichen Ziele mehr Leistungen zu geringeren Kosten, schneller und in höherer Qualität zugleich zu liefern. Aus der Grundidee der Lean Production entwickelte sich vorliegendes Dissertationsthema in Bezug darauf, inwiefern sich diese Prinzipien auf den WBT-Produktionsprozess übertragen lassen und die LernBar (das hierfür weiterentwickelnde Autorensystem) dabei Unterstützung bieten kann.
Zunächst wurde analysiert, welche Werkzeuge und Hilfestellungen benötigt werden, um unter dem Aspekt der Lean Production WBTs im universitären Umfeld erstellen zu können. In diesem Zusammenhang wurden Merkmale einer „Lean Media Production“ definiert sowie konzeptionell und technisch umgesetzt. Zur Verbesserung der Prozesse flossen Ergebnisse aus empirischer und praktischer Forschung ein. Im Vergleich zu anderen Entwicklungen bei denen häufig das Hauptziel eine umfangreiche Funktionalität ist, werden u.a. folgende übertragbare Ziele bei der Umsetzung verfolgt: Verschwendung vermeiden, eine starke Einbeziehung der Kunden, Werkzeuge die nahtlos ineinandergreifen, eine hohe Flexibilität und eine stetige Qualitätsverbesserung.
Zur Erreichung dieser Zielsetzungen wurden alle Prozesse kontinuierlich verbessert, sich auf das Wesentliche und die Wertschöpfung konzentriert sowie überflüssige Schritte eliminiert. Demnach ist unter dem Begriff „Lean Media Production“ ein skalierbarer, effizienter und effektiver Produktionsprozess zu verstehen, in dem alle Werkzeuge ineinandergreifen.
Die Realisierung der „Lean Media Production“ erfolgte anhand des Autorensystems LernBar, wobei die typischen Softwareentwicklungsphasen Entwurf, Implementierung und Evaluierung mehrfach durchlaufen wurden. Ausschlaggebend dabei war, dass der „Lean“-Aspekt berücksichtigt wurde und dies somit eine neue Vorgehensweise bei der Umsetzung eines Autorensystems darstellt. Im Verlauf der Entwicklungen ergaben sich, durch eine formative Evaluation, den Einsatz in Projekten und eine empirische Begleitforschung, neue Anforderungen an das System. Ein Vergleich der zwei Produktionssysteme, Automobil vs. WBT-Produktion, zeigt und bestätigt die Erwartung, dass nicht alle Prinzipien der Lean Production übertragbar sind.
Dennoch war diese Untersuchung notwendig, da sie Denkanstöße zur Entwicklung und Optimierung des Erstellungsprozesses eines WBTs gab. Auch die Ergebnisse der abschließenden Online-Befragung ergaben, dass die Ziele der Arbeit erreicht wurden, dass aber weiterer Optimierungsbedarf besteht. Die LernBar Release 3 bietet für alle Produktionsphasen Werkzeuge an, durch die eine effektive und effiziente Erstellung von WBTs von der Idee bis zur Distribution möglich ist.
Stand noch vor fünf Jahren zu Beginn dieser Arbeit das Endprodukt bei der LernBar Entwicklung im Vordergrund, verlagerte sich durch den Einfluss dieser Dissertation der Schwerpunkt auf den gesamten Produktionsprozess. Unter Berücksichtigung der in diesem Zusammenhang entwickelten Prinzipien einer „Lean Media Production“, nehmen bspw. die Wirtschaftlichkeit und die starke Kundenorientierung während des Produktionsprozesses einen wichtigen Stellenwert ein. Dieser Ansatz ist eine neue Vorgehensweise im Bereich der Entwicklung von Autorensystemen, der seine Anerkennung und Professionalität durch die Ergebnisse des selbstentwickelten Evaluationsbogens sowie dem stetig wachsenden Einsatz in Schulen, Hochschulen und Unternehmen belegen kann.
In weiteren Forschungsarbeiten ist zu untersuchen, welche Lean Production Prinzipien zu verwenden oder anzupassen sind, wenn z. B. in größeren Teams oder mobil produziert wird. Des Weiteren sollte überprüft werden, inwieweit die Lernenden mit dem Endprodukt zufrieden sind und in ihrem Lernprozess unterstützt werden. Durch diese Forschungsarbeit wurde ein Beitrag dazu geleistet, die Lehre und Ausbildung zu optimieren, indem die Autoren/Lehrende in der Erstellung ihrer digitalen Lerninhalte im gesamten Prozess von aufeinander abgestimmten Werkzeugen unterstützt werden.
With increasing heterogeneity of modern hardware, different requirements for 3d applications arise. Despite the fact that real-time rendering of photo-realistic images is possible using today’s graphics cards, still large computational effort is required. Furthermore, smart-phones or computers with older, less powerful graphics cards may not be able to reproduce these results. To retain interactive rendering, usually the detail of a scene is reduced, and so less data needs to be processed. This removal of data, however, may introduce errors, so called artifacts. These artifacts may be distracting for a human spectator when gazing at the display. Thus, the visual quality of the presented scene is reduced. This is counteracted by identifying features of an object that can be removed without introducing artifacts. Most methods utilize geometrical properties, such as distance or shape, to rate the quality of the performed reduction. This information used to generate so called Levels Of Detail (LODs), which are made available to the rendering system. This reduces the detail of an object using the precalculated LODs, e.g. when it is moved into the back of the scene. The appropriate LOD is selected using a metric, and it is replaced with the current displayed version. This exchange must be made smoothly, requiring both LOD-versions to be drawn simultaneously during a transition. Otherwise, this exchange will introduce discontinuities, which are easily discovered by a human spectator. After completion of the transition, only the newly introduced LOD-version is drawn and the previous overhead removed. These LOD-methods usually operate with discrete levels and exploit limitations of both the display and the spectator: the human.
Humans are limited in their vision. This ranges from being unable to distinct colors at varying illumination scenarios to the limitation to focus only at one location at a time. Researchers have developed many applications to exploit these limitations to increase the quality of an applied compression. Some popular methods of vision-based compression are MPEG or JPEG. For example, a JPEG compression exploits the reduced sensitivity of humans regarding color and so encodes colors with a lower resolution. Also, other fields, such as auditive perception, allow the exploitation of human limitations. The MP3 compression, for example, reduces the quality of stored frequencies if other frequencies are masking it. For representation of perception various computer models exist. In our rendering scenario, a model is advantageous that cannot be influenced by a human spectator, such as the visual salience or saliency.
Saliency is a notion from psycho-physics that determines how an object “pops out” of its surrounding. These outstanding objects (or features) are important for the human vision and are directly evaluated by our Human Visual System (HVS). Saliency combines multiple parts of the HVS and allows an identification of regions where humans are likely to look at. In applications, saliency-based methods have been used to control recursive or progressive rendering methods. Especially expensive display methods, such as pathtracing or global illumination calculations, benefit from a perceptual representation as recursions or calculations can be aborted if only small or unperceivable errors are expected to occur. Yet, saliency is commonly applied to 2d images, and an extension towards 3d objects has only partially been presented. Some issues need to be addressed to accomplish a complete transfer.
In this work, we present a smart rendering system that not only utilizes a 3d visual salience model but also applies the reduction in detail directly during rendering. As opposed to normal LOD-methods, this detail reduction is not limited to a predefined set of levels, but rather a dynamic and continuous LOD is created. Furthermore, to apply this reduction in a human-oriented way, a universal function to compute saliency of a 3d object is presented. The definition of this function allows to precalculate and store object-related visual salience information. This stored data is then applicable in any illumination scenario and allows to identify regions of interest on the surface of a 3d object. Unlike preprocessed methods, which generate a view-independent LOD, this identification includes information of the scene as well. Thus, we are able to define a perception-based, view-specific LOD. Performance measures of a prototypical implementation on computers with modern graphic cards achieved interactive frame rates, and several tests have proven the validity of the reduction.
The adaptation of an object is performed with a dynamic data structure, the TreeCut. It is designed to operate on hierarchical representations, which define a multi-resolution object. In such a hierarchy, the leaf nodes contain the highest detail while inner nodes are approximations of their respective subtree. As opposed to classical hierarchical rendering methods, a cut is stored and re-traversal of a tree during rendering is avoided. Due to the explicit cut representation, the TreeCut can be altered using only two core operations: refine and coarse. The refine-operation increases detail by replacing a node of the tree with its children while the coarse-operation removes the node along with its siblings and replaces them with their parent node. These operations do not rely on external information and can be performed in a local manner. These only require direct successor or predecessor information. Different strategies to evolve the TreeCut are presented, which adapt the representation using only information given by the current cut. These evaluate the cut by assigning either a priority or a target-level (or bucket) to each cut-node. The former is modelled as an optimization problem that increases the average priority of a cut while being restricted in some way, e.g. in size. The latter evolves the cut to match a certain distribution. This is applied in cases where a prioritization of nodes is not applicable. Both evaluation strategies operate with linear time complexity with respect to the size of the current TreeCut.
The data layout is chosen to separate rendering data and hierarchy to enable multi-threaded evaluation and display. The object is adapted over multiple frames while the rendering is not interrupted by the used evaluation strategy. Therefore, we separate the representation of the hierarchy from the rendering data. Due to its design, this overhead imposed to the TreeCut data structure does not influence rendering performance, and a linear time complexity for rendering is retained. The TreeCut is not only limited to alter geometrical detail of an object. The TreeCut has successfully been applied to create a non-photo-realistic stippling display, which draws the object with equal sized points in varying density. In this case the bucket-based evaluation strategy is utilized, which determines the distribution of the cut based on local illumination information. As an alternative, an attention drawing mechanism is proposed, which applies the TreeCut evaluation strategies to define the display style of a notification icon. A combination of external priorities is used to derive the appropriate icon version. An application for this mechanism is a messaging system that accounts for the current user situation.
When optimizing an object or scene, perceptual methods allow to account for or exploit human limitations. Therefore, visual salience approaches derive a saliency map, which encodes regions of interest in a 2d map. Rendering algorithms extract importance from such a map and adapt the rendering accordingly, e.g. abort a recursion when the current location is unsalient. The visual salience depends on multiple factors including the view and the illumination of the scene. We extend the existing definition of the 2d saliency and propose a universal function for 3d visual salience: the Bidirectional Saliency Weight Distribution Function (BSWDF). Instead of extracting the saliency from 2d image and approximate 3d information, we directly compute this information using the 3d data. We derive a list of equivalent features for the 3d scenario and add them to the BSWDF. As the BSWDF is universal, also 2d images are covered with the BSWDF, and the calculation of the important regions within images is possible.
To extract the individual features that contribute to visual salience, capabilities of modern graphics card in combination with an accumulation method for rendering is utilized. Inspired from point-based rendering methods local features are summed up in a single surface element (surfel) and are compared with their surround to determine whether they “pop out”. These operations are performed with a shader-program that is executed on the Graphics Processing Unit (GPU) and has direct access to the 3d data. This increases processing speed because no transfer of the data is required. After computation, each of these object-specific features can be combined to derive a saliency map for this object. Surface specific information, e.g. color or curvature, can be preprocessed and stored onto disk. We define a sampling scheme to determine the views that need to be evaluated for each object. With these schemes, the features can be interpolated for any view that occurs during rendering, and the according surface data is reconstructed. These sampling schemes compose a set of images in form of a lookup table. This is similar to existing rendering techniques, which extract illumination information from a lookup. The size of the lookup table increases only with the number of samples or the image size used for creation as the images are of equal size. Thus, the quality of the saliency data is independent of the object’s geometrical complexity. The computation of a BSWDF can be performed either on a Central Processing Unit (CPU) or a GPU, and an implementation requires only a few instructions when using a shader program. If the surface features have been stored during a preprocess, a reprojection of the data is performed and combined with the current information of the object. Once the data is available, the computation of the saliency values is done using a specialized illumination model, and a priority for each primitive is extracted. If the GPU is used, the calculated data has to be transferred from the graphics card. We therefore use the “transform feedback” capabilities, which allow high transfer rates and preserve the order of processed primitives. So, an identification of regions of interest based on the currently used primitives is achieved. The TreeCut evaluation strategies are then able to optimize the representation in an perception-based manner.
As the adaptation utilizes information of the current scene, each change to an object can result in new visual salience information. So, a self-optimizing system is defined: the Feedback System. The output generated by this system converges towards a perception-optimized solution. To proof the saliency information to be useful, user tests have been performed with the results generated by the proposed Feedback System. We compared a saliency-enhanced object compression to a pure geometrical approach, common for LOD-generation. One result of the tests is that saliency information allows to increase compression even further as possible with the pure geometrical methods. The participants were not able to distinguish between objects even if the saliency-based compression had only 60% of the size of the geometrical reduced object. If the size ratio is greater, saliency-based compression is rated, on average, with higher score and these results have a high significance using statistical tests. The Feedback System extends an 3d object with the capability of self-optimization. Not only geometrical detail but also other properties can be limited and optimized using the TreeCut in combination with a BSWDF. We present a dynamic animation, which utilizes a Software Development Kit (SDK) for physical simulations. This was chosen, on the one hand, to show the universal applicability of the proposed system, and on the other hand, to focus on the connection between the TreeCut and the SDK. We adapt the existing framework, and include the SDK within our design. In this case, the TreeCut-operations not only alter geometrical but also simulation detail. This increases calculation performance because both the rendering and the SDK operate on less data after the reduction has been completed.
The selected simulation type is a soft-body simulation. Soft-bodies are deformable in a certain degree but retain their internal connection. An example is a piece of cloth that smoothly fits the underlying surface without tearing apart. Other types are rigid bodies, i.e. idealistic objects that cannot be deformed, and fluids or gaseous materials, which are well suited for point-based simulations. Any of these simulations scales with the number of simulation nodes used, and a reduction of detail increases performance significantly. We define a specialized BSWDF to evaluate simulation specific features, such as motion. The Feedback System then increases detail in highly salient regions, e.g. those with large motion, and saves computation time by reducing detail in static parts of the simulation. So, detail of the simulation is preserved while less nodes are simulated.
The incorporation of perception in real-time rendering is an important part of recent research. Today, the HVS is well understood, and valid computer models have been derived. These models are frequently used in commercial and free software, e.g. JPEG compression. Within this thesis, the Tree-Cut is presented to change the LOD of an object in a dynamic and continuous manner. No definition of the individual levels in advance is required, and the transitions are performed locally. Furthermore, in combination with an identification of important regions by the BSWDF, a perceptual evaluation of a 3d object is achieved. As opposed to existing methods, which approximate data from 2d images, the perceptual information is directly acquired from 3d data. Some of this data can be preprocessed if necessary, to defer additional computations during rendering. The Feedback System, created by the TreeCut and the BSWDF, optimizes the representation and is not limited to visual data alone. We have shown with our prototype that interactive frame rates can be achieved with modern hardware, and we have proven the validity of the reductions by performing several user tests. However, the presented system only focuses on specific aspects, and more research is required to capture even more capabilities that a perception-based rendering system can provide.
This volume contains the proceedings of the 12th International Workshop on Termination (WST 2012), to be held February 19–23, 2012 in Obergurgl, Austria. The goal of the Workshop on Termination is to be a venue for presentation and discussion of all topics in and around termination. In this way, the workshop tries to bridge the gaps between different communities interested and active in research in and around termination. The 12th International Workshop on Termination in Obergurgl continues the successful workshops held in St. Andrews (1993), La Bresse (1995), Ede (1997), Dagstuhl (1999), Utrecht (2001), Valencia (2003), Aachen (2004), Seattle (2006), Paris (2007), Leipzig (2009), and Edinburgh (2010). The 12th International Workshop on Termination did welcome contributions on all aspects of termination and complexity analysis. Contributions from the imperative, constraint, functional, and logic programming communities, and papers investigating applications of complexity or termination (for example in program transformation or theorem proving) were particularly welcome. We did receive 18 submissions which all were accepted. Each paper was assigned two reviewers. In addition to these 18 contributed talks, WST 2012, hosts three invited talks by Alexander Krauss, Martin Hofmann, and Fausto Spoto.
The diagram-based method to prove correctness of program transformations consists of computing
complete set of (forking and commuting) diagrams, acting on sequences of standard reductions
and program transformations. In many cases, the only missing step for proving correctness of a
program transformation is to show the termination of the rearrangement of the sequences. Therefore
we encode complete sets of diagrams as term rewriting systems and use an automated tool
to show termination, which provides a further step in the automation of the inductive step in
correctness proofs.
A concurrent implementation of software transactional memory in Concurrent Haskell using a call-by-need functional language with processes and futures is given. The description of the small-step operational semantics is precise and explicit, and employs an early abort of conflicting transactions. A proof of correctness of the implementation is given for a contextual semantics with may- and should-convergence. This implies that our implementation is a correct evaluator for an abstract specification equipped with a big-step semantics.
This paper shows equivalence of applicative similarity and contextual approximation, and hence also of bisimilarity and contextual equivalence, in LR, the deterministic call-by-need lambda calculus with letrec extended by data constructors, case-expressions and Haskell's seqoperator. LR models an untyped version of the core language of Haskell. Bisimilarity simplifies equivalence proofs in the calculus and opens a way for more convenient correctness proofs for program transformations.
The proof is by a fully abstract and surjective transfer of the contextual approximation into a call-by-name calculus, which is an extension of Abramsky's lazy lambda calculus. In the latter calculus equivalence of similarity and contextual approximation can be shown by Howe's method. Using an equivalent but inductive definition of behavioral preorder we then transfer similarity back to the calculus LR.
The translation from the call-by-need letrec calculus into the extended call-by-name lambda calculus is the composition of two translations. The first translation replaces the call-by-need strategy by a call-by-name strategy and its correctness is shown by exploiting infinite tress, which emerge by unfolding the letrec expressions. The second translation encodes letrec-expressions by using multi-fixpoint combinators and its correctness is shown syntactically by comparing reductions of both calculi. A further result of this paper is an isomorphism between the mentioned calculi, and also with a call-by-need letrec calculus with a less complex definition of reduction than LR.
Poster presentation: Calcium plays a pivotal role in relaying electrical signals of the cell to subcellular compartments, such as the nucleus. Since this one ion type is used by the cell for many processes a neuron needs to establish finely tuned calcium pathways in order to be able to differentiate multiple tasks, [1-3].
While it is known that neurons can actively change their shape upon neuronal activity, [4-7], we here present novel findings of activity-regulated nuclear morphology, [8,9]. With the help of an experimental and computational modeling approach, we show that hippocampal neurons can change the previously spherical shape of their nuclei to complex and infolded morphologies. This morphology regulation is demonstrated to be regulated by NMDA-receptor gated calcium, while synaptic and extra-synaptic NMDA-receptors elicit opposing effects on nuclear morphology, [8].
The structural alterations of the cell nucleus have significant effects on nuclear calcium dynamics. Compartmentalization of the nucleus, due to membrane infoldings, changes calcium frequencies, amplitudes and spatial distributions, [8,10]. Since these parameters have been shown to control downstream events towards gene transcription, [11,12], the results elucidate the cellular control of nuclear function with the help of morphology modulation. With respect to processes downstream of calcium, we show that histone H3 phosphorylation is closely linked to nuclear morphology. Investigating the nuclear morphologies of hippocampal neurons, two major classes were identified [9,10]. One class contains non-infolded nuclei that have the function of calcium signal integrators, while the other class contains highly infolded nuclei, which function as frequency detectors of nuclear calcium, [10].
Extending this interdisciplinary approach of investigating structure/function relationships in neurons, the effects of cellular morphology – as well as the morphology of the endoplasmic reticulum and other organelles – on neuronal calcium signals is currently being investigated. This endeavor makes use of highly detailed, three-dimensional models of neuronal calcium dynamics, including the three-dimensional morphology of the cell and its organelles.
Conceptual design of an ALICE Tier-2 centre integrated into a multi-purpose computing facility
(2012)
This thesis discusses the issues and challenges associated with the design and operation of a data analysis facility for a high-energy physics experiment at a multi-purpose computing centre. At the spotlight is a Tier-2 centre of the distributed computing model of the ALICE experiment at the Large Hadron Collider at CERN in Geneva, Switzerland. The design steps, examined in the thesis, include analysis and optimization of the I/O access patterns of the user workload, integration of the storage resources, and development of the techniques for effective system administration and operation of the facility in a shared computing environment. A number of I/O access performance issues on multiple levels of the I/O subsystem, introduced by utilization of hard disks for data storage, have been addressed by the means of exhaustive benchmarking and thorough analysis of the I/O of the user applications in the ALICE software framework. Defining the set of requirements to the storage system, describing the potential performance bottlenecks and single points of failure and examining possible ways to avoid them allows one to develop guidelines for selecting the way how to integrate the storage resources. The solution, how to preserve a specific software stack for the experiment in a shared environment, is presented along with its effects on the user workload performance. The proposal for a flexible model to deploy and operate the ALICE Tier-2 infrastructure and applications in a virtual environment through adoption of the cloud computing technology and the 'Infrastructure as Code' concept completes the thesis. Scientific software applications can be efficiently computed in a virtual environment, and there is an urgent need to adapt the infrastructure for effective usage of cloud resources.
This thesis will first introduce in more detail the Bayesian theory and its use in integrating multiple information sources. I will briefly talk about models and their relation to the dynamics of an environment, and how to combine multiple alternative models. Following that I will discuss the experimental findings on multisensory integration in humans and animals. I start with psychophysical results on various forms of tasks and setups, that show that the brain uses and combines information from multiple cues. Specifically, the discussion will focus on the finding that humans integrate this information in a way that is close to the theoretical optimal performance. Special emphasis will be put on results about the developmental aspects of cue integration, highlighting experiments that could show that children do not perform similar to the Bayesian predictions. This section also includes a short summary of experiments on how subjects handle multiple alternative environmental dynamics. I will also talk about neurobiological findings of cells receiving input from multiple receptors both in dedicated brain areas but also primary sensory areas. I will proceed with an overview of existing theories and computational models of multisensory integration. This will be followed by a discussion on reinforcement learning (RL). First I will talk about the original theory including the two different main approaches model-free and model-based reinforcement learning. The important variables will be introduced as well as different algorithmic implementations. Secondly, a short review on the mapping of those theories onto brain and behaviour will be given. I mention the most in uential papers that showed correlations between the activity in certain brain regions with RL variables, most prominently between dopaminergic neurons and temporal difference errors. I will try to motivate, why I think that this theory can help to explain the development of near-optimal cue integration in humans. The next main chapter will introduce our model that learns to solve the task of audio-visual orienting. Many of the results in this section have been published in [Weisswange et al. 2009b,Weisswange et al. 2011]. The model agent starts without any knowledge of the environment and acts based on predictions of rewards, which will be adapted according to the reward signaling the quality of the performed action. I will show that after training this model performs similarly to the prediction of a Bayesian observer. The model can also deal with more complex environments in which it has to deal with multiple possible underlying generating models (perform causal inference). In these experiments I use di#erent formulations of Bayesian observers for comparison with our model, and find that it is most similar to the fully optimal observer doing model averaging. Additional experiments using various alterations to the environment show the ability of the model to react to changes in the input statistics without explicitly representing probability distributions. I will close the chapter with a discussion on the benefits and shortcomings of the model. The thesis continues whith a report on an application of the learning algorithm introduced before to two real world cue integration tasks on a robotic head. For these tasks our system outperforms a commonly used approximation to Bayesian inference, reliability weighted averaging. The approximation is handy because of its computational simplicity, because it relies on certain assumptions that are usually controlled for in a laboratory setting, but these are often not true for real world data. This chapter is based on the paper [Karaoguz et al. 2011]. Our second modeling approach tries to address the neuronal substrates of the learning process for cue integration. I again use a reward based training scheme, but this time implemented as a modulation of synaptic plasticity mechanisms in a recurrent network of binary threshold neurons. I start the chapter with an additional introduction section to discuss recurrent networks and especially the various forms of neuronal plasticity that I will use in the model. The performance on a task similar to that of chapter 3 will be presented together with an analysis of the in uence of different plasticity mechanisms on it. Again benefits and shortcomings and the general potential of the method will be discussed. I will close the thesis with a general conclusion and some ideas about possible future work.
The Symposium on Theoretical Aspects of Computer Science (STACS) is held alternately in France and in Germany. The conference of February 26-28, 2009, held in Freiburg, is the 26th in this series. Previous meetings took place in Paris (1984), Saarbr¨ucken (1985), Orsay (1986), Passau (1987), Bordeaux (1988), Paderborn (1989), Rouen (1990), Hamburg (1991), Cachan (1992), W¨urzburg (1993), Caen (1994), M¨unchen (1995), Grenoble (1996), L¨ubeck (1997), Paris (1998), Trier (1999), Lille (2000), Dresden (2001), Antibes (2002), Berlin (2003), Montpellier (2004), Stuttgart (2005), Marseille (2006), Aachen (2007), and Bordeaux (2008). ...
From 12.12.2010 to 17.12.2010, the Dagstuhl Seminar 10501 "Advances and Applications of Automata on Words and Trees" was held in Schloss Dagstuhl - Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.
Seminar: 10501 - Advances and Applications of Automata on Words and Trees. The aim of the seminar was to discuss and systematize the recent fast progress in automata theory and to identify important directions for future research. For this, the seminar brought together more than 40 researchers from automata theory and related fields of applications. We had 19 talks of 30 minutes and 5 one-hour lectures leaving ample room for discussions. In the following we describe the topics in more detail.
This article shows that there exist two particular linear orders such that first-order logic with these two linear orders has the same expressive power as first-order logic with the Bit-predicate FO(Bit). As a corollary we obtain that there also exists a built-in permutation such that first-order logic with a linear order and this permutation is as expressive as FO(Bit).
The calculus CHF models Concurrent Haskell extended by concurrent, implicit futures. It is a process calculus with concurrent threads, monadic concurrent evaluation, and includes a pure functional lambda-calculus which comprises data constructors, case-expressions, letrec-expressions, and Haskell’s seq. Futures can be implemented in Concurrent Haskell using the primitive unsafeInterleaveIO, which is available in most implementations of Haskell. Our main result is conservativity of CHF, that is, all equivalences of pure functional expressions are also valid in CHF. This implies that compiler optimizations and transformations from pure Haskell remain valid in Concurrent Haskell even if it is extended by futures. We also show that this is no longer valid if Concurrent Haskell is extended by the arbitrary use of unsafeInterleaveIO.
We show how Sestoft’s abstract machine for lazy evaluation of purely functional programs can be extended to evaluate expressions of the calculus CHF – a process calculus that models Concurrent Haskell extended by imperative and implicit futures. The abstract machine is modularly constructed by first adding monadic IO-actions to the machine and then in a second step we add concurrency. Our main result is that the abstract machine coincides with the original operational semantics of CHF, w.r.t. may- and should-convergence.
This paper considers the logic FOcard, i.e., first-order logic with cardinality predicates that can specify the size of a structure modulo some number. We study the expressive power of FOcard on the class of languages of ranked, finite, labelled trees with successor relations. Our first main result characterises the class of FOcard-definable tree languages in terms of algebraic closure properties of the tree languages. As it can be effectively checked whether the language of a given tree automaton satisfies these closure properties, we obtain a decidable characterisation of the class of regular tree languages definable in FOcard. Our second main result considers first-order logic with unary relations, successor relations, and two additional designated symbols < and + that must be interpreted as a linear order and its associated addition. Such a formula is called addition-invariant if, for each fixed interpretation of the unary relations and successor relations, its result is independent of the particular interpretation of < and +. We show that the FOcard-definable tree languages are exactly the regular tree languages definable in addition-invariant first-order logic. Our proof techniques involve tools from algebraic automata theory, reasoning with locality arguments, and the use of logical interpretations. We combine and extend methods developed by Benedikt and Segoufin (ACM ToCL, 2009) and Schweikardt and Segoufin (LICS, 2010).
Diese Arbeit untersucht den Einfluss des Game-Design auf ausgelöste Lernprozesse und den Erfolg von Serious Games. Hierzu werden Game-Design Paradigmen entwickelt, die als Richtlinien für Konzeption und Umsetzung eines Serious Game dienen. Als Serious Games werden Videospiele bezeichnet, die zur Wissensvermittlung konzipiert worden sind. Dabei sollen die motivationalen Faktoren eines Videospiels genutzt werden, um einen intrinsisch motivierten Lernprozess auszulösen. Das Bewertungkriterium für den Erfolg einer Spielmechanik ist somit die Erfüllung der Lernziele. Damit dieses Erfolgskriterium genauer untersucht werden kann, werden die ausgelösten Lernprozesse differenziert betrachtet. In der Literatur werden folgende Lernprozesse hervorgehoben: Der Prozess des Erfahrungslernens und metakognitive Prozesse. Darüber hinaus sind Eigenschaften der Zielgruppe, wie Alter oder Geschlecht weitere wichtige Faktoren. Das dieser Arbeit zu Grunde liegende Forschungsframework setzt sich wie folgt zusammen: Lernszenario, Lernprozess und Lernerfolg. Das Lernszenario ist durch folgende Faktoren charakterisiert: Game Characteristics (Eigenschaften des Serious Game), Instructional Content (Arbeitsanweisungen und Trainingsetting) sowie Player Characteristics (Eigenschaften der Zielgruppe). Diese Parameter bedingen den Lernprozess, welcher unter dem Aspekt des Erfahrungslernens und der Metakognition analysiert wird. Eine besondere Problemstellung in den Player Characteristics ergibt sich aus dem sogenannten Net-Generation Konflikt. Mit Net-Generation wird die Generation bezeichnet, welche mit neuen Medien wie Internet und mobiler Kommunikation aufgewachsen ist. Diese besitzt im Unterschied zu älteren Generationen ein anderes Lernverhalten. Um die Aspekte des Net-Generation Konflikts und die Auswirkungen auf den Lernprozesses untersuchen zu können, wird ein Serious Game entwickelt, dessen Spielmechanik sich an folgenden Game-Design Paradigmen ausrichtet: Akzeptanz, Leichte Zugänglichkeit, Spielspaß und die Unterstützung des Lernprozesses. Dieses Serious Game FISS (Fertigungs- und Instandhaltungs-Strategie Simulation) wird bei der Daimler AG seit 2008 zur Ausbildung von Ingenieuren eingesetzt. FISS simuliert eine Fertigungslinie, die mit Hilfe geeigneter Wartungsstrategien und effizientem Personaleinsatz erfolgreich geführt werden soll. Die Spielmechanik orientiert sich an dem Genre der Rundenstrategie und wird in einem Anwesenheitstraining im Team durchgeführt. Hervorzuheben ist, dass die Zielgruppe bezüglich des Alters inhomogen ist und deshalb der Net-Generation Konflikt berücksichtigt werden muss. Im Anschluss wird FISS unter folgenden Aspekten untersucht: Der Prozess des Erfahrungslernens, metakognitive Prozesse und die Integration der Non-Net-Generation. Die Ergebnisse zeigen, dass die Eigenschaften des Game-Design einen signifikanten Einfluss auf den Prozess des Erfahrungslernens und die Lernerfolge besitzen. Spieler mit einem praktischen Zugang zu Lerninhalten (Concrete Experience) erzielten einen signifikant größeren Wissenzuwachs. Zudem profitierten alle Spieler von FISS, jedoch konnte in einer Vorstudie kein Einfluss metakognitiver Fähigkeiten auf den Wissenzuwachs nachgewiesen werden. Die weitere zentrale Studie dieser Arbeit fokussiert den Net-Generation Konflikt und evaluiert den Erfolg der eingangs aufgestellten Game-Design Paradigmen. Hierzu werden die Teilnehmer nach drei Altersgruppen getrennt betrachtet: Non-Net-Generation, Net-Generation und die dazwischen liegende Crossover-Generation. Es zeigt sich, dass der Lern- und Spielerfolg aller Generationen gleichermaßen signifikant ist und nur innerhalb des zu erwartenden Standardfehlers abweicht. FISS eignet sich folglich für alle Generationen. Diese Ergebnisse können stellvertretend für Serious Games im Genre der Rundenstrategie gesehen werden. Die in dieser Arbeit erzielten Ergebnisse ermöglichen ein besseres Verständnis der Auswirkungen des Game-Design auf den Lernerfolg. Hiermit können potentielle Schwachstellen eines Serious Game erkannt und vermieden werden. Die Erkenntnisse im Bereich des Erfahrungslernens ermöglichen zudem eine bessere Anpassungen an die Zielgruppe. Für die zukünftige Forschung wurde mit dem in dieser Arbeit entwickelten Framework eine Grundlage geschaffen.
This thesis combines behavioral and cognitive approaches regarding the Web for analyzing users' behavior and supposed interests.
The work is placed in a new field of research called Web Science, which includes, but is not restricted to, the analysis of the World Wide Web. The term Web Science is affected by Tim Berners-Lee et al., who invited the researchers to "create a science of the web" [BLHH+06a]. The thesis is structured in two parts, reflecting the intersection of disciplines that is required for Web Science.
The first part is related to computer science and information systems. This part defines the Gugubarra concepts and algorithms for web user profiling and builds upon the results by Mushtaq et al. [MWTZ04]. This profiling aims at understanding the behavior and supposed interests of users. Based on these concepts, a framework was implemented to support the needs of web site owners. The core technologies used are Java, Spring, Hibernate, and content management systems. The design principles, architecture, implementation, and tests of the prototype are reported.
The second part is directly related to behavioral economics and is connected to the areas of economics, mathematics, and psychology. This part contributes to behavior models, as was claimed by Tim Berners-Lee et al.: "Though individual users may or may not be rational, it has long been noted that en masse people behave as utility maximisers. In that case, understanding the incentives that are available to web users should provide methods for generating models of behaviour..."[BLHH+06b]. The focus here is on studies that investigate the user's choice of online information services in a multi-attribute context. The introduced research framework takes into account background and local context effects and builds upon theoretical foundations by Tversky and Kahneman [TK86]. The findings provide useful insights to behavioral scientists and to practitioners on how to use framing strategies to alter the user's choice.
Succinctness is a natural measure for comparing the strength of different logics. Intuitively, a logic L_1 is more succinct than another logic L_2 if all properties that can be expressed in L_2 can be expressed in L_1 by formulas of (approximately) the same size, but some properties can be expressed in L_1 by (significantly) smaller formulas.
We study the succinctness of logics on linear orders. Our first theorem is concerned with the finite variable fragments of first-order logic. We prove that:
(i) Up to a polynomial factor, the 2- and the 3-variable fragments of first-order logic on linear orders have the same succinctness. (ii) The 4-variable fragment is exponentially more succinct than the 3-variable fragment. Our second main result compares the succinctness of first-order logic on linear orders with that of monadic second-order logic. We prove that the fragment of monadic second-order logic that has the same expressiveness as first-order logic on linear orders is non-elementarily more succinct than first-order logic.