Refine
Year of publication
Document Type
- Doctoral Thesis (54) (remove)
Language
- English (54) (remove)
Has Fulltext
- yes (54) (remove)
Is part of the Bibliography
- no (54)
Keywords
- FPGA (2)
- ALICE (1)
- Abfrageverarbeitung (1)
- Abstraction (1)
- Agent <Künstliche Intelligenz> (1)
- Agenten (1)
- Agents (1)
- Analog (1)
- Analog Circuits (1)
- Analog Verification (1)
Institute
- Informatik (54) (remove)
A framework for the analysis and visualization of multielectrode spike trains / von Ovidiu F. Jurjut
(2009)
The brain is a highly distributed system of constantly interacting neurons. Understanding how it gives rise to our subjective experiences and perceptions depends largely on understanding the neuronal mechanisms of information processing. These mechanisms are still poorly understood and a matter of ongoing debate remains the timescale on which the coding process evolves. Recently, multielectrode recordings of neuronal activity have begun to contribute substantially to elucidating how information coding is implemented in brain circuits. Unfortunately, analysis and interpretation of multielectrode data is often difficult because of their complexity and large volume. Here we propose a framework that enables the efficient analysis and visualization of multielectrode spiking data. First, using self-organizing maps, we identified reoccurring multi-neuronal spike patterns that evolve on various timescales. Second, we developed a color-based visualization technique for these patterns. They were mapped onto a three-dimensional color space based on their reciprocal similarities, i.e., similar patterns were assigned similar colors. This innovative representation enables a quick and comprehensive inspection of spiking data and provides a qualitative description of pattern distribution across entire datasets. Third, we quantified the observed pattern expression motifs and we investigated their contribution to the encoding of stimulus-related information. An emphasis was on the timescale on which patterns evolve, covering the temporal scales from synchrony up to mean firing rate. Using our multi-neuronal analysis framework, we investigated data recorded from the primary visual cortex of anesthetized cats. We found that cortical responses to dynamic stimuli are best described as successions of multi-neuronal activation patterns, i.e., trajectories in a multidimensional pattern space. Patterns that encode stimulus-specific information are not confined to a single timescale but can span a broad range of timescales, which are tightly related to the temporal dynamics of the stimuli. Therefore, the strict separation between synchrony and mean firing rate is somewhat artificial as these two represent only extreme cases of a continuum of timescales that are expressed in cortical dynamics. Results also indicate that timescales consistent with the time constants of neuronal membranes and fast synaptic transmission (~10-20 ms) appear to play a particularly salient role in coding, as patterns evolving on these timescales seem to be involved in the representation of stimuli with both slow and fast temporal dynamics.
In this dissertation a non-deterministic lambda-calculus with call-by-need evaluation is treated. Call-by-need means that subexpressions are evaluated at most once and only if their value must be known to compute the overall result. Also called "sharing", this technique is inevitable for an efficient implementation. In the lambda-ND calculus of chapter 3 sharing is represented explicitely by a let-construct. Above, the calculus has function application, lambda abstractions, sequential evaluation and pick for non-deterministic choice. Non-deterministic lambda calculi play a major role as a theoretical foundation for concurrent processes or side-effected input/output. In this work, non-determinism additionally makes visible when sharing is broken. Based on the bisimulation method this work develops a notion of equality which respects sharing. Using bisimulation to establish contextual equivalence requires substitutivity within contexts, i.e., the ability to "replace equals by equals" within every program or term. This property is called congruence or precongruence if it applies to a preorder. The open similarity of chapter 4 represents a new concept, insofar that the usual definition of a bisimulation is impossible in the lambda-ND calculus. So in section 3.2 a further calculus lambda-Approx has to be defined. Section 3.3 contains the proof of the so-called Approximation Theorem which states that the evaluation in lambda-ND and lambda-Approx agrees. The foundation for the non-trivial precongruence proof is set out in chapter 2 where the trailblazing method of Howe is extended to be capable with sharing. By the use of this (extended) method, the Precongruence Theorem proves open similarity to be a precongruence, involving the so-called precongruence candidate relation. Joining with the Approximation Theorem we obtain the Main Theorem which says that open similarity of the lambda-Approx calculus is contained within the contextual preorder of the lambda-ND calculus. However, this inclusion is strict, a property whose non-trivial proof involves the notion of syntactic continuity. Finally, chapter 6 discusses possible extensions of the base calculus such as recursive bindings or case and constructors. As a fundamental study the calculus lambda-ND provides neither of these concepts, since it was intentionally designed to keep the proofs as simple as possible. Section 6.1 illustrates that the addition case and constructors could be accomplished without big hurdles. However, recursive bindings cannot be represented simply by a fixed point combinator like Y, thus further investigations are necessary.
Acceleration of Biomedical Image Processing and Reconstruction with FPGAs
Increasing chip sizes and better programming tools have made it possible to increase the boundaries of application acceleration with reconfigurable computer chips. In this thesis the potential of acceleration with Field Programmable Gate Arrays (FPGAs) is examined for applications that perform biomedical image processing and reconstruction. The dataflow paradigm was used to port the analysis of image data for localization microscopy and for 3D electron tomography from an imperative description towards the FPGA for the first time.
After the primitives of image processing on FPGAs are presented, a general workflow is given for analyzing imperative source code and converting it to a hardware pipeline where every node processes image data in parallel. The theoretical foundation is then used to accelerate both example applications. For localization microscopy, an acceleration of 185 compared to an Intel i5 450 CPU was achieved, and electron tomography could be sped up by a factor of 5 over an Nvidia Tesla C1060 graphics card while maintaining full accuracy in both cases.
Efficient algorithms for object recognition are crucial for the newly robotics and computer vision applications that demand real-time and on-line methods. Some examples are autonomous systems, navigating robots, autonomous driving. In this work, we focus on efficient semantic segmentation, which is the problem of labeling each pixel of an image with a semantic class.
Our aim is to speed-up all of the parts of the semantic segmentation pipeline. We also aim at delivering a labeling solution on a time budget, that can be decided on-the-fly. For this purpose, we analyze all the components of the semantic segmentation pipeline, and identify the computational bottleneck of each of them. The different components of the pipeline are over-segmenting the image with local regions, extracting features and classify the local regions, and the final inference of the image labeling with semantic classes. We focus on each of these steps.
First, we introduce a new superpixel algorithm to over-segment the image. Our superpixel method runs in real-time and can deliver a solution at any time budget. Then, for feature extraction, we focus on the framework that computes descriptors and encodes them, followed by a pooling step. We see that the encoding step is the bottleneck, for computational efficiency and performance. We present a novel assignment-based encoding formulation, that allows for the design of a new, very efficient, encoding. Finally, the image labeling output is obtained modeling the dependencies with a Conditional Random Field (CRF). In semantic image segmentation, the computational cost of instantiating the potentials is much higher than MAP inference. We introduce Active MAP inference to on-the-fly select a subset of potentials to be instantiated in the energy function, leaving the rest as unknown, and to estimate the MAP labeling from such incomplete energy function.
We perform experiments on all proposed methods for the different parts of the semantic segmentation pipeline. We show that our superpixel extraction achieves higher accuracy than state-of-the-art on standard superpixel benchmark, while it runs in real-time. We test our feature encoding on standard image classification and segmentation benchmarks, and we show that our method achieves competitive results with the state-of-the-art, and requires less time and memory. Finally, results for semantic segmentation benchmark show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains.
Magnetoencephalography (MEG) measures neural activity non-invasively and at an excellent temporal resolution. Since its invention (Cohen, 1968, 1972), MEG has proven a most valuable tool in neurocognitive (Salmelin et al., 1994) and clinical research (Stufflebeam et al., 2009; Van ’t Ent et al., 2003). MEG is able to measure rapid changes in electrophysiological neural signals related to sensory and cognitive processes. The magnetic fields measured outside the head by MEG directly reflect the cortical currents generated by the synchronised activity of thousands of neuronal sources. This distinguishes MEG from functional magnetic resonance imaging (fMRI), where measurements are only indirectly related to electrophysiological activity through neurovascular coupling...
Driven by rapid technological advancements, the amount of data that is created, captured, communicated, and stored worldwide has grown exponentially over the past decades. Along with this development it has become critical for many disciplines of science and business to being able to gather and analyze large amounts of data. The sheer volume of the data often exceeds the capabilities of classical storage systems, with the result that current large-scale storage systems are highly distributed and are comprised of a high number of individual storage components. As with any other electronic device, the reliability of storage hardware is governed by certain probability distributions, which in turn are influenced by the physical processes utilized to store the information. The traditional way to deal with the inherent unreliability of combined storage systems is to replicate the data several times. Another popular approach to achieve failure tolerance is to calculate the block-wise parity in one or more dimensions. With better understanding of the different failure modes of storage components, it has become evident that sophisticated high-level error detection and correction techniques are indispensable for the ever-growing distributed systems. The utilization of powerful cyclic error-correcting codes, however, comes with a high computational penalty, since the required operations over finite fields do not map very well onto current commodity processors. This thesis introduces a versatile coding scheme with fully adjustable fault-tolerance that is tailored specifically to modern processor architectures. To reduce stress on the memory subsystem the conventional table-based algorithm for multiplication over finite fields has been replaced with a polynomial version. This arithmetically intense algorithm is better suited to the wide SIMD units of the currently available general purpose processors, but also displays significant benefits when used with modern many-core accelerator devices (for instance the popular general purpose graphics processing units). A CPU implementation using SSE and a GPU version using CUDA are presented. The performance of the multiplication depends on the distribution of the polynomial coefficients in the finite field elements. This property has been used to create suitable matrices that generate a linear systematic erasure-correcting code which shows a significantly increased multiplication performance for the relevant matrix elements. Several approaches to obtain the optimized generator matrices are elaborated and their implications are discussed. A Monte-Carlo-based construction method allows it to influence the specific shape of the generator matrices and thus to adapt them to special storage and archiving workloads. Extensive benchmarks on CPU and GPU demonstrate the superior performance and the future application scenarios of this novel erasure-resilient coding scheme.
High-energy physics experiments aim to deepen our understanding of the fundamental structure of matter and the governing forces. One of the most challenging aspects of the design of new experiments is data management and event selection. The search for increasingly rare and intricate physics events asks for high-statistics measurements and sophisticated event analysis. With progressively complex event signatures, traditional hardware-based trigger systems reach the limits of realizable latency and complexity. The Compressed Baryonic Matter experiment (CBM) employs a novel approach for data readout and event selection to address these challenges. Self-triggered, free-streaming detectors push all data to a central compute cluster, called First-level Event Selector (FLES), for software-based event analysis and selection. While this concept solves many issues present in classical architectures, it also sets new challenges for the design of the detector readout systems and online event selection.
This thesis presents an efficient solution to the data management challenges presented by self-triggered, free-streaming particle detectors. The FLES must receive asynchronously streamed data from a heterogeneous detector setup at rates of up to 1 TB/s. The real-time processing environment implies that all components have to deliver high performance and reliability to record as much valuable data as possible. The thesis introduces a time-based data model to partition the input streams into containers of fixed length in experiment time for efficient data management. These containers provide all necessary metadata to enable generic, detector-subsystem-agnostic data distribution across the entire cluster. An analysis shows that the introduced data overhead is well below 1 % for a wide range of system parameters.
Furthermore, a concept and the implementation of a detector data input interface for the CBM FLES, optimized for resource-efficient data transport, are presented. The central element of the architecture is an FPGA-based PCIe extension card for the FLES entry nodes. The hardware designs developed in the thesis enable interfacing with a diverse set of detector systems. A custom, high-throughput DMA design structures data in a way that enables low-overhead access and efficient software processing. The ability to share the host DMA buffers with other devices, such as an InfiniBand HCA, allows for true zero-copy data distribution between the cluster nodes. The discussed FLES input interface is fully implemented and has already proven its reliability in production operation in various physics experiments.
The economic success of the World Wide Web makes it a highly competitive environment for web businesses. For this reason, it is crucial for web business owners to learn what their customers want. This thesis provides a conceptual framework and an implementation of a system that helps to better understand the behavior and potential interests of web site visitors by accounting for both explicit and implicit feedback. This thesis is divided into two parts.
The first part is rooted in computer science and information systems and uses graph theory and an extended click-stream analysis to define a framework and a system tool that is useful for analyzing web user behavior by calculating the interests of the users.
The second part is rooted in behavioral economics, mathematics, and psychology and is investigating influencing factors on different types of web user choices. In detail, a model for the cognitive process of rating products on the Web is defined and an importance hierarchy of the influencing factors is discovered.
Both parts make use of techniques from a variety of research fields and, therefore, contribute to the area of Web Science.