- Energy- and cost-efficient Lattice-QCD computations using graphics processing units (2014)
- Quarks and gluons are the building blocks of all hadronic matter, such as protons and neutrons. Their interaction is described by Quantum Chromodynamics (QCD), a theory under test by large-scale experiments like the Large Hadron Collider (LHC) at CERN and, in the future, at the Facility for Antiproton and Ion Research (FAIR) at GSI. However, perturbative methods can only be applied to QCD at high energies. Studies from first principles are possible via a discretization onto a Euclidean space-time grid. This discretization of QCD is called Lattice QCD (LQCD) and is the only ab-initio option outside of the high-energy regime. LQCD is extremely compute and memory intensive. In particular, it is by definition always bandwidth limited. Thus, despite the complexity of LQCD applications, it led to the development of several specialized compute platforms and influenced the development of others. However, in recent years General-Purpose computation on Graphics Processing Units (GPGPU) has emerged as a new means for parallel computing. Contrary to machines traditionally used for LQCD, graphics processing units (GPUs) are a mass-market product. This promises advantages in both the pace at which higher-performing hardware becomes available and its price. CL2QCD is an OpenCL-based implementation of LQCD using Wilson fermions that was developed within this thesis. It runs on GPUs from all major vendors as well as on central processing units (CPUs). On the AMD Radeon HD 7970 it provides the fastest double-precision D-slash kernel for a single GPU, achieving 120 GFLOPS. D-slash, the most compute-intensive kernel in LQCD simulations, is commonly used to compare LQCD platforms. This performance is enabled by an in-depth analysis of optimization techniques for bandwidth-limited codes on GPUs. Further, analysis of the communication between GPU and CPU, as well as between multiple GPUs, enables high-performance Krylov space solvers and linear scaling to multiple GPUs within a single system.
LQCD calculations require a sampling of the phase space. The hybrid Monte Carlo (HMC) algorithm performs this. For this task, a single AMD Radeon HD 7970 GPU provides four times the performance of two AMD Opteron 6220 running an optimized reference code. The same advantage is achieved in terms of energy-efficiency. In terms of normalized total cost of acquisition (TCA), GPU-based clusters match conventional large-scale LQCD systems. Contrary to those, however, they can be scaled up from a single node. Examples of large GPU-based systems are LOEWE-CSC and SANAM. On both, CL2QCD has already been used in production for LQCD studies.

- Acceleration of biomedical image processing and reconstruction with FPGAs (2014)
- Increasing chip sizes and better programming tools have made it possible to push the boundaries of application acceleration with reconfigurable computer chips. In this thesis the potential of acceleration with Field Programmable Gate Arrays (FPGAs) is examined for applications that perform biomedical image processing and reconstruction. The dataflow paradigm was used to port the analysis of image data for localization microscopy and for 3D electron tomography from an imperative description to the FPGA for the first time. After the primitives of image processing on FPGAs are presented, a general workflow is given for analyzing imperative source code and converting it to a hardware pipeline in which every node processes image data in parallel. The theoretical foundation is then used to accelerate both example applications. For localization microscopy, a speedup by a factor of 185 compared to an Intel i5 450 CPU was achieved, and electron tomography could be sped up by a factor of 5 over an Nvidia Tesla C1060 graphics card, while maintaining full accuracy in both cases.

- 1D-3D hybrid modeling: from multi-compartment models to full resolution models in space and time (2014)
- Investigation of cellular and network dynamics in the brain by means of modeling and simulation has evolved into a highly interdisciplinary field that uses sophisticated modeling and simulation approaches to understand distinct areas of brain function. Depending on the underlying complexity, these models vary in their level of detail in order to cope with the attached computational cost. Hence, for large network simulations, single neurons are typically reduced to time-dependent signal processors, dismissing the spatial aspect of each cell. For single cells or networks with relatively small numbers of neurons, general-purpose simulators allow for space- and time-dependent simulations of electrical signal processing, based on cable equation theory. An emerging field in Computational Neuroscience encompasses a new level of detail by incorporating the full three-dimensional morphology of cells and organelles into three-dimensional, space- and time-dependent simulations. While every approach has its advantages and limitations, such as computational cost, integrated and methods-spanning simulation approaches, chosen depending on the network size, could establish new ways to investigate the brain. In this paper we present a hybrid simulation approach that makes use of reduced 1D-models (using, e.g., the NEURON simulator) and couples them to fully resolved models for simulating cellular and sub-cellular dynamics, including the detailed three-dimensional morphology of neurons and organelles. In order to couple 1D- and 3D-simulations, we present a geometry-, membrane potential- and intracellular concentration mapping framework, with which graph-based morphologies, e.g., in the swc- or hoc-format, are mapped to full surface and volume representations of the neuron, and computational data from 1D-simulations can be used as boundary conditions for full 3D-simulations and vice versa.
Thus, established models and data, based on general purpose 1D-simulators, can be directly coupled to the emerging field of fully resolved, highly detailed 3D-modeling approaches. We present the developed general framework for 1D/3D hybrid modeling and apply it to investigate electrically active neurons and their intracellular spatio-temporal calcium dynamics.
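
To make the "graph-based morphologies" in the abstract above concrete, here is a minimal sketch of reading an swc file into a parent/child graph. It assumes the common seven-column swc layout (id, type, x, y, z, radius, parent, with parent `-1` marking the root). The names `SwcNode`, `parse_swc`, `children`, and the toy data are hypothetical and not taken from the paper's framework, which additionally maps such graphs to surface and volume representations.

```python
from typing import Dict, List, NamedTuple


class SwcNode(NamedTuple):
    node_id: int
    kind: int       # structure type code, e.g. 1 = soma, 3 = dendrite
    x: float
    y: float
    z: float
    radius: float
    parent: int     # id of the parent node, -1 for the root


def parse_swc(text: str) -> Dict[int, SwcNode]:
    """Parse swc text into a dict of nodes, skipping blank and comment lines."""
    nodes: Dict[int, SwcNode] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        n, t, x, y, z, r, p = line.split()
        nodes[int(n)] = SwcNode(int(n), int(t), float(x), float(y),
                                float(z), float(r), int(p))
    return nodes


def children(nodes: Dict[int, SwcNode]) -> Dict[int, List[int]]:
    """Build the adjacency map parent id -> list of child ids."""
    tree: Dict[int, List[int]] = {}
    for node in nodes.values():
        tree.setdefault(node.parent, []).append(node.node_id)
    return tree


# Hypothetical toy morphology: a soma with one dendrite that branches once.
TOY_SWC = """# toy neuron
1 1 0.0 0.0 0.0 5.0 -1
2 3 10.0 0.0 0.0 1.0 1
3 3 10.0 10.0 0.0 0.5 2
4 3 10.0 -10.0 0.0 0.5 2
"""

toy_nodes = parse_swc(TOY_SWC)
toy_tree = children(toy_nodes)
```

Each node carries a position and a radius, which is the information a 1D-to-3D geometry mapping would need as input.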

- Mathematical modeling of Arabidopsis thaliana with focus on network decomposition and reduction (2014)
- Systems biology has become an important research field during the last decade. It focusses on understanding the systems that generate the measured data. An important part of this research field is network analysis, the investigation of biological networks. An essential step in inspecting these network models is their validation, i.e., the successful comparison of predicted properties to measured data. Here, Petri nets in particular have shown their usefulness as a modeling technique, offering sound analysis methods and an intuitive representation of biological network data. Very important tools for network validation are the analysis of the transition invariants (TI), which represent possible steady-state pathways, and the investigation of the liveness property. The computational complexity of determining both the TI and the liveness property often hampers their investigation. To investigate this issue, a metabolic network model is created. It describes the core metabolism of Arabidopsis thaliana and is based solely on data from the literature. The model is too complex for the TI and the liveness property to be determined directly. Several strategies are followed to enable an analysis and validation of the network. A network decomposition is applied in two different ways: manually, motivated by the idea of preserving the integrity of biological pathways, and automatically, motivated by the idea of minimizing the number of crossing edges. Because a decomposition may not preserve important properties such as coveredness, a network reduction approach is suggested, which is mathematically proven to conserve these properties. To deal with the large amount of data produced by the TI analysis, new organizational structures are proposed. The liveness property is investigated by reducing the complexity of the calculation method and adapting it to biological networks. The results obtained by these approaches suggest a valid network model.
In conclusion, the proposed approaches and strategies can be used in combination to allow the validation and analysis of highly complex biological networks.
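
As a small illustration of the transition invariants mentioned in the abstract above: a T-invariant of a Petri net is a non-negative integer vector x, not all zero, with C·x = 0, where C is the place-by-transition incidence matrix. The sketch below only checks a given candidate vector. It does not compute all invariants, which is the expensive step the thesis is concerned with. The function name and the toy net are hypothetical and not taken from the thesis.

```python
from typing import Sequence


def is_t_invariant(incidence: Sequence[Sequence[int]], x: Sequence[int]) -> bool:
    """Return True if x is a non-trivial, non-negative solution of C * x = 0."""
    if any(v < 0 for v in x) or not any(x):
        return False
    # Every place (row) must see a net token change of zero.
    return all(sum(c * v for c, v in zip(row, x)) == 0 for row in incidence)


# Hypothetical toy net: t0 moves a token from place A to place B, t1 moves it back.
# Rows are places (A, B); columns are transitions (t0, t1).
C = [
    [-1, 1],
    [1, -1],
]

cycle = [1, 1]    # firing t0 and t1 once each restores the marking
only_t0 = [1, 0]  # firing t0 alone changes the marking
```

Here `is_t_invariant(C, cycle)` holds because firing both transitions once reproduces the initial marking, which is the sense in which T-invariants represent possible steady-state pathways.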

- Correctness of an STM Haskell implementation (2013)
- A concurrent implementation of software transactional memory in Concurrent Haskell using a call-by-need functional language with processes and futures is given. The description of the small-step operational semantics is precise and explicit, and employs an early abort of conflicting transactions. A proof of correctness of the implementation is given for a contextual semantics with may- and should-convergence. This implies that our implementation is a correct evaluator for an abstract specification equipped with a big-step semantics.

- Towards correctness of program transformations through unification and critical pair computation (2011)
- Correctness of program transformations in extended lambda calculi with a contextual semantics is usually based on reasoning about the operational semantics, which is a rewrite semantics. A successful approach to proving correctness is the combination of a context lemma with the computation of overlaps between program transformations and the reduction rules, and then of so-called complete sets of diagrams. The method is similar to the computation of critical pairs for the completion of term rewriting systems. We explore cases where the computation of these overlaps can be done in a first-order way by variants of critical pair computation that use unification algorithms. As a case study we apply the method to a lambda calculus with recursive let-expressions and describe an effective unification algorithm to determine all overlaps of a set of transformations with all reduction rules. The unification algorithm employs many-sorted terms, the equational theory of left-commutativity modelling multi-sets, context variables of different kinds, and a mechanism for compactly representing binding chains in recursive let-expressions.
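
The overlap computation described above rests on unification. As a deliberately simplified illustration, the following sketch implements plain syntactic first-order unification with an occurs check. It is not the paper's algorithm, which additionally handles many-sorted terms, left-commutativity, context variables, and binding chains. The term encoding (a variable is a string, an application is a tuple `(symbol, *args)`) and all names are assumptions made for this example.

```python
from typing import Dict, Optional, Tuple, Union

Term = Union[str, Tuple]  # a variable is a str; an application is (symbol, *args)
Subst = Dict[str, Term]


def walk(t: Term, s: Subst) -> Term:
    """Follow variable bindings until an unbound variable or an application is reached."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t


def occurs(v: str, t: Term, s: Subst) -> bool:
    """Occurs check: does variable v appear in t under substitution s?"""
    t = walk(t, s)
    if isinstance(t, str):
        return t == v
    return any(occurs(v, arg, s) for arg in t[1:])


def unify(a: Term, b: Term, s: Optional[Subst] = None) -> Optional[Subst]:
    """Return a unifier in triangular form extending s, or None if none exists."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if isinstance(a, str) and isinstance(b, str) and a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None  # clash of function symbols or arities
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


# Hypothetical overlap: unify the term app(lam(X), Y) with app(F, c).
left = ("app", ("lam", "X"), "Y")
right = ("app", "F", ("c",))
overlap = unify(left, right)
```

A non-`None` result corresponds to an overlap between two left-hand sides. A `None` result means the two patterns cannot match the same term, so no critical pair arises from them.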

- Reconstruction of a logic for inductive proofs of properties of functional programs (2010)
- The interactive verification system VeriFun is based on a polymorphic call-by-value functional language and on a first-order logic with initial model semantics w.r.t. constructors. It is designed to perform automatic induction proofs and can also deal with partial functions. This paper provides a reconstruction of the corresponding logic and semantics using the standard treatment of undefinedness, which adapts and improves the VeriFun logic by allowing reasoning on nonterminating expressions and functions. Equality of expressions is defined as contextual equivalence based on observing termination in all closing contexts. The reconstruction shows that several restrictions of the VeriFun framework can easily be removed by natural generalizations: mutually recursive functions, abstractions in the data values, and formulas with arbitrary quantifier prefix can be formulated. The main results of this paper are as follows: an extended set of deduction rules usable in VeriFun under the adapted semantics is proved to be correct, i.e., the rules respect the observational equivalence in all extensions of a program. We also show that certain classes of theorems, such as universally quantified equations, are conservative under extensions. Other special classes of theorems are also analyzed for conservativity.

- Reconstruction of a logic for inductive proofs of properties of functional programs (2010)
- The interactive verification system VeriFun is based on a polymorphic call-by-value functional language and on a first-order logic with initial model semantics w.r.t. constructors. This paper provides a reconstruction of the corresponding logic when partial functions are permitted. Typing is polymorphic for the definition of functions but monomorphic for terms in formulas. Equality of terms is defined as contextual equivalence based on observing termination in all contexts. The reconstruction also allows several generalizations of the functional language, such as mutually recursive functions and abstractions in the data values. The main results are as follows: several program transformations, which have potential use in a deduction system, are correct for all extensions of a program. We also prove that universally quantified equations are conservative, i.e., if a universally quantified equation is valid w.r.t. a program P, then it remains valid if the program is extended by new functions and/or new data types.

- On correctness of buffer implementations in a concurrent lambda calculus with futures (2009)
- Motivated by the question of correctness of a specific implementation of concurrent buffers in the lambda calculus with futures underlying Alice ML, we prove that concurrent buffers and handled futures can correctly encode each other. Correctness means that our encodings preserve and reflect the observations of may- and must-convergence, which, as a consequence, also yields soundness of the encodings with respect to a contextually defined notion of program equivalence. While these translations encode blocking into queuing and waiting, we also describe an adequate encoding of buffers in a calculus without handles, which is more low-level and uses busy-waiting instead of blocking. Furthermore, we demonstrate that our correctness concept applies to the whole compilation process from high-level to low-level concurrent languages by translating the calculus with buffers, handled futures, and data constructors into a small core language without those constructs.