On the self-organization of a hierarchical memory for compositional object representation in the visual cortex
- At present, there is a huge lag between the artificial and the biological information processing systems in terms of their capability to learn. This lag could be certainly reduced by gaining more insight into the higher functions of the brain like learning and memory. For instance, primate visual cortex is thought to provide the long-term memory for the visual objects acquired by experience. The visual cortex handles effortlessly arbitrary complex objects by decomposing them rapidly into constituent components of much lower complexity along hierarchically organized visual pathways. How this processing architecture self-organizes into a memory domain that employs such compositional object representation by learning from experience remains to a large extent a riddle. The study presented here approaches this question by proposing a functional model of a self-organizing hierarchical memory network. The model is based on hypothetical neuronal mechanisms involved in cortical processing and adaptation. The network architecture comprises two consecutive layers of distributed, recurrently interconnected modules. Each module is identified with a localized cortical cluster of fine-scale excitatory subnetworks. A single module performs competitive unsupervised learning on the incoming afferent signals to form a suitable representation of the locally accessible input space. The network employs an operating scheme where ongoing processing is made of discrete successive fragments termed decision cycles, presumably identifiable with the fast gamma rhythms observed in the cortex. The cycles are synchronized across the distributed modules that produce highly sparse activity within each cycle by instantiating a local winner-take-all-like operation. Equipped with adaptive mechanisms of bidirectional synaptic plasticity and homeostatic activity regulation, the network is exposed to natural face images of different persons. The images are presented incrementally one per cycle to the lower network layer as a set of Gabor filter responses extracted from local facial landmarks. The images are presented without any person identity labels. In the course of unsupervised learning, the network creates simultaneously vocabularies of reusable local face appearance elements, captures relations between the elements by linking associatively those parts that encode the same face identity, develops the higher-order identity symbols for the memorized compositions and projects this information back onto the vocabularies in generative manner. This learning corresponds to the simultaneous formation of bottom-up, lateral and top-down synaptic connectivity within and between the network layers. In the mature connectivity state, the network holds thus full compositional description of the experienced faces in form of sparse memory traces that reside in the feed-forward and recurrent connectivity. Due to the generative nature of the established representation, the network is able to recreate the full compositional description of a memorized face in terms of all its constituent parts given only its higher-order identity symbol or a subset of its parts. In the test phase, the network successfully proves its ability to recognize identity and gender of the persons from alternative face views not shown before. An intriguing feature of the emerging memory network is its ability to self-generate activity spontaneously in absence of the external stimuli. In this sleep-like off-line mode, the network shows a self-sustaining replay of the memory content formed during the previous learning. Remarkably, the recognition performance is tremendously boosted after this off-line memory reprocessing. The performance boost is articulated stronger on those face views that deviate more from the original view shown during the learning. This indicates that the off-line memory reprocessing during the sleep-like state specifically improves the generalization capability of the memory network. The positive effect turns out to be surprisingly independent of synapse-specific plasticity, relying completely on the synapse-unspecific, homeostatic activity regulation across the memory network. The developed network demonstrates thus functionality not shown by any previous neuronal modeling approach. It forms and maintains a memory domain for compositional, generative object representation in unsupervised manner through experience with natural visual images, using both on- ("wake") and off-line ("sleep") learning regimes. This functionality offers a promising departure point for further studies, aiming for deeper insight into the learning mechanisms employed by the brain and their consequent implementation in the artificial adaptive systems for solving complex tasks not tractable so far.
Explicit influence of water microsolvation on charge transfer and dynamics in ground and excited electronic states of molecular systems
- Modern computational molecular quantum chemical studies, such as the present one, typically employ a wide range of theoretical techniques. The latter are often rather complicated and one should not generally expect that an experimental scientist in the area of physical chemistry, a potential reader of this work, should be familiar with all these techniques. To simplify the reading of the Thesis and to make it self-sufficient, it is supplied with an overview of the employed theoretical methodologies (Chapter 1). The overview explains basic quantum-chemical terminology referred to throughout the Thesis, introduces theoretical foundations of the methods and outlines their properties and limitations. In Part 1.1 of Chapter 1, methods for the solution of the molecular Schrödinger equation are introduced, while in the subsequent Parts 1.2 and 1.3 methods for the solution of the electronic Schrödinger equation are presented to find the ground and excited states, respectively. Part 1.4 is dedicated to basis-set effects which are omnipresent in electronic-structure calculations. It contains a number of unusual insights and concepts proposed by the author and, thus, may be insightful also to experts in quantum chemistry.
In Chapter 2, the phenomenon of acetone-water proton exchange catalyzed by tubular as well as amorphous aggregates of calixhydroquinone (CHQ) macromolecules, which has been observed previously in NMR experiments (Ref. D1D), is investigated by means of correlated quantum-chemical methods. The first part of the study (Section 2.3-2.7) considers concerted proton transfer, assisted by several initially neutral OH-groups in the hydrogen-bonded networks of CHQ aggregates. The second part of the study (Section 2.8-2.13) is dedicated to a second mechanism of proton exchange: step-wise proton transfer via formation of ionic intermediates resulting from CHQ pre-dissociation. CHQ application-specific as well as general conclusions, relevant to the main topic of the Thesis (i.e. influence of specific microsolvation on the considered proton transfer processes), are presented in Section 2.14.
The phenomenon of dual fluorescence observed in clusters of methyl 4-N,N-dimethylaminobenzoate ester (DMABME) and two water molecules in the gas phase, is studied in Chapter 3. Experimentally, the dual fluorescence was detected in experiments combining optical and ground-state ion-depletion infrared spectroscopies in ultracold molecular beams (Ref. D2D). In Section 3.3, calculated ground-state infrared spectra are presented that allow to identify the structures of those isomers, which are present in the gas-phase, as well as the structure of the isomer responsible for dual fluorescence. To further understand the reaction mechanism of dual fluorescence, excited-state potential energy surfaces of the identified isomers were computed along the relevant twisted intermolecular charge-transfer formation coordinate and the mechanism of energy dissipation in these complexes was investigated (Section 3.4-3.5) (Ref. D3D). A brief summary of the main results of this chapter and conclusions are given in Section 3.6. Finally, in Section 3.7 a complementary benchmark study of the quality of ground-state potential energy surfaces of prototypical hydrogen-bonded systems (ammonia-water and formic acid-water dimers) obtained at the level of BSSE-corrected MP2 combined with moderate basis sets, has been conducted. The quality of potential energy surfaces was tested with respect to basis-set size, level of electron correlation and anharmonicity effects and the applied methodology to identify the IR spectrum of hydrated DMABME complexes (Section 3.3) has been found to be sufficient to uniquely assign the IR spectra.
Essays in behavioral economics - evidence on self-selection into jobs, social networks and leniency
Development of decoration and preferential-etching methods for delineation of crystal defects in semiconductor materials
- Silicon wafers such as Silicon on Insulator (SOI) and strained silicon on Insulator (sSOI) are the essential and basic materials of advanced microelectronic devices. However, they often show various kinds of crystal defects which impair the function of these devices. The most efficient method to date, for detecting such defects and for determining their density, is to delineate them by etching the wafers with a suitable etching solution and characterise them via light optical microscopy. Etch pits are formed at defect sites which are etched at a faster rate than at the perfect lattice. The standard etching solution used for SOI and sSOI is a dilute version of Secco. As Secco contains carcinogenic and environmentally hazardous chromium (VI), the use of which is or will be restricted by law in many countries, suitable chromium (VI)-free etching solutions like Organic Peracid Etches (OPE), modified Chemical Polishing Etches (CP) like CP4 mod and mixtures with organic oxidizing agents like chloranil (CA) have been developed for the successful delineation of various types of crystal defects.
However there are still nanometer-sized defects which are hard to detect or escape detection by this method. Copper decoration is a well known method to magnify these defects. It consists in applying a copper nitrate solution to the back of the SOI or sSOI wafer. On annealing, copper diffuses through the substrate and the BOX (buried oxide) to the SOI/sSOI film and on quenching to room temperature, copper precipitates as copper silicide, SiCu3, foremost at crystal defects where the lattice strain is greater than at perfect lattice sites. These silicides increase the volume in these parts of the crystal lattice and defect magnification occurs. A considerable disadvantage of this method is its tendency for artefact formation, when the copper concentration used is too high, with the copper precipitating at the film surface. The consequence is a higher density of etch pits whereby true defect etch pits cannot be differentiated from those caused by artefacts.
The aim of this thesis is to show that the processes of decorating and etching can be combined successfully to delineate all crystal defects in SOI and sSOI. An ideal result would have been to find a copper decoration procedure that decorates all existing crystal defects at a copper concentration that avoids artefact formation.
The regulation of endocannabinoids after neuronal damage and the neuroprotective impact of GPR55 in organotypic hippocampal slice cultures
- Endocannabinoids (eCB) are signaling lipids and became known for their importance in the central nervous system as well as in immune defense. Beneficial effects of eCB are shown in processes of excitotoxic lesion, secondary damage and neuronal plasticity throughout the last years. Two canabinoid receptors, type 1 (CB1) and type 2 (CB2) as the respective endogenous ligands belong to the endocannabinoid system (eCBS). In 1990, the CB1 could be cloned and was localised mainly on neurons. Shortly thereafter in 1993, the CB2 was characterised and found primarily on cells belonging to the immune system. N-arachidonoylethanolamide (AEA), often called anandamide, and 2-arachidonoylglycerol (2-AG) are the best characterised eCB. N-palmitylethanolamide (PEA) and N-oleoylethanolamide (OEA) have no or only low affinity to CB1 but enhance the affinity of AEA significantly. This group is therefore often summarized as N-ethanolamides (NEA). ECB are derivates of arachidonic acid and are stored in membranes where they become hydrolysed on demand by specific enzymes. Traumatic brain injury altered the levels of eCB in the blood in vivo and when applied in vitro after neuronal damage, eCB could reduce the damaging burden. Further studies demonstrated that eCB are potent to down-regulate pro-inflammatory cytokines and most important to decrease neuronal excitation.
In the present study, the intrinsic regulation of the endocannabinoid system after neuronal damage over time was investigated in rat Organotypic Hippocampal Slice Cultures (OHSC). Temporal and spatial dynamics of eCB levels were analysed after transection of the perforant pathway (PPT) in originating neurons (enthorhinal cortex, EC), areas of deafferentiation/anterograde axonal degeneration (dentate gyrus, DG) and of the synaptically linked cornu ammonis region 1 (CA1) as well as after excitotoxic lesion in the respective regions.
A strong increase of all eCB was observed only in the denervation zone of the DG 24 hours post PPT. In excitotoxic lesioned OHSC all eCB were elevated, in the investigated regions up to 72 hours post lesion (hpl). The responsible enzyme for biosynthesis of the NEA, NAPE-PLD protein, was increased during the early timepoints of measurement (1-6 hpl). The responsible catabolizing enzyme, FAAH, and the CB1 receptor were up-regulated at a later timepoint, 48 hpl, explaining the eCB levels. In the present model, the inhibition of the enzyme responsible for 2-AG hydrolysis (MAGL) was neuroprotective as previously shown and a re-distribution within neurons and astrocytes during neuronal damage could be observed. In primary cell cultures microglia expressed the regulating enzymes of 2-AG and the enzyme responsible for NEA down-regulation, FAAH. Astrocytes expressed mainly the catalyzing enzymes, indicating the role for eCB break-down. All these findings together demonstrate the great capacity of the eCBS to control inflammatory processes and consequently neuronal cell death.
All effects of the known eCB could not be clarified by CB1/CB2 deficient mice. Several G-protein coupled receptors (GPR) are recently in discussion whether they might and should belong to the endocannabinoid system. The GPR55, the not yet cloned abnormal cannabidiol receptor and further GPRs are candidates as potential endocannabinoid receptors. Recently GPR55 has been discussed as a putative cannabinoid receptor type 3 (CB3). Quantitative PCR revealed that Gpr55 is present in primary microglia and the brain, but the exact regional and cellular distribution and the physiological/pathological effects downstream of GPR55 activation in the CNS still remain open. Therefore, the excitotoxic rat OHSC model, previously used to investigate the neuroprotective potency of eCB, was now used to investigate the neuroprotective potency of GPR55. Activation of GPR55 protected dentate gyrus granule cells in vitro after excitotoxic lesion, induced by NMDA. In parallel, GPR55 activation was able to reduce the number of microglia in the dentate gyrus. These neuroprotective effects vanished however in microglia depleted OHSCs as well as in OHSC transfected with Gpr55 siRNA, indicating a strong involvement of microglia in GPR55 mediated neuroprotection.
In summary, the present study found a strong time-dependent and anterograde mechanism of action of eCB after long-range projection damage and provided further evidence for the neuroprotective properties of eCB. The potential cannabinoid receptor 3 (GPR55) mediates neuronal protection on behalf of microglia.
Cell-free expression and molecular modeling of the γ-secretase complex and G-protein-coupled receptors
- Alzheimer’s disease (AD), which was first reported more than a century ago by Alhzeimer, is one of the commonest forms of dementia which affects >30 million people globally (>8 million in Europe). The origin and pathogenesis of AD is poorly understood and there is no cure available for the disease. AD is characterized by the accumulation of senile plaques composed of amyloid beta peptides (Ab 37-43) which is formed by the gamma secretase (GS) complex by cleaving amyloid precursor protein. Therefore GS can be an attractive drug target. Since GS processes several other substrates like Notch, CD44 and Cadherins, nonspecific inhibition of GS has many side effects. Due to the lack of crystal structure of GS, which is attributed to the extreme difficulties in purifying it, molecular modeling can be useful to understand its architecture. So far only low resolution cryoEM structures of the complex has been solved which only provides a rough structure of the complex at low 12-15 A resolution Furthermore the activity of GS in vitro can be achieved by means of cell-free (CF) expression.
GS comprises catalytic subunits namely presenilins and supporting elements containing Pen-2, Aph-1 and Nicastrin. The origin of AD is hidden in the regulated intramembrnae proteolysis (RIP) which is involved in various physiological processes and also in leukemia. So far growth factors, cytokines, receptors, viral proteins, cell adhesion proteins, signal peptides and GS has been shown to undergo RIP. During RIP, the target proteins undergo extracellular shredding and intramembrane proteolysis.
This thesis is based on molecular modeling, molecular dynamics (MD) simulations, cell-free (CF) expression, mass spectrometry, NMR, crystallization, activity assay etc of the components of GS complex and G-protein coupled receptors (GPCRs).
First I validated the NMR structure of PS1 CTF in detergent micelles and lipid bilayers using coarse-grained MD simulations using MARTINI forcefield implemented in Gromacs. CTF was simulated in DPC micelles, DPPC and DLPC lipid bilayer. Starting from random configuration of detergent and lipids, micelle and lipid bilyer were formed respectively in presence of CTF and it was oriented properly to the micelle and bilyer during the simulation. Around DPC molecules formed micelle around CTF in agreement of the experimental results in which 80-85 DPC molecules are required to form micelles. The structure obtained in DPC was similar to that of NMR structure but differed in bilayer simulations showed the possibility of substrate docking in the conserved PAL motif. Simulations of CTF in implicit membrane (IMM1) in CHAMM yielded similar structure to that from coarse grained MD.
I performed cell-free expression optimization, crystallization and NMR spectroscopy of Pen-2 in various detergent micelles. Additionally Pen-2 was modeled by a combination of rosetta membrane ab-initio method, HHPred distant homology modeling and incorporating NMR constraints. The models were validated by all atom and coarse grained MD simulations both in detergent micelles and POPC/DPPC lipid bilayers using MARTINI forcefield.
GS operon consisting of all four subunits was co-expressed in CF and purified. The presence of of GS subunits after pull-down with Aph-1 was determined by western blotting (Pen-2) and mass spectrometry (Presenilin-1 and Aph-1). I also studied interactions of especially PS1 CTF, APP and NTF by docking and MD.
I also made models and interfaces of Pen-2 with PS1 NTF and checked their stability by MD simulations and compared with experimental results. The goal is to model the interfaces between GS subunits using molecular modeling approaches based on available experimental data like cross-linking, mutations and NMR structure of C-terminal fragment of PS1 and transmembrane part of APP. The obtained interfaces of GS subunits may explain its catalysis mechanism which can be exploited for novel lead design. Due to lack of crystal/NMR structure of the GS subunits except the PS1 CTF, it is not possible to predict the effect of mutations in terms of APP cleavage. So I also developed a sequence based approach based on machine learning using support vector machine to predict the effect of PS1 CTF L383 mutations in terms of Aβ40/Aβ42 ratio with 88% accuracy. Mutational data derived from the Molgen database of Presenilin 1 mutations was using for training.
GPCRs (also called 7TM receptors) form a large superfamily of membrane proteins, which can be activated by small molecules, lipids, hormones, peptides, light, pain, taste and smell etc. Although 50% of the drugs in market target GPCRs , only few are targeted therapeutically. Such wide range of targets is due to involvement of GPCRs in signaling pathways related to many diseases i.e. dementia (like Alzheimer's disease), metabolic (like diabetes) including endocrinological disorders, immunological including viral infections, cardiovascular, inflammatory, senses disorders, pain and cancer.
Cannabinoid and adrenergic receptors belong to the class A (similar to rhodopsin) GPCRs. Docking of agonists and antagonists to CB1 and CB2 cannabinoid receptors revealed the importance of a centrally located rotamer toggle switch, and its possible role in the mechanism of agonist/antagonist recognition. The switch is composed of two residues, F3.36 and W6.48, located on opposite transmembrane helices TM3 and TM6 in the central part of the membranous domain of cannabinoid receptors. The CB1 and CB2 receptor models were constructed based on the adenosine A2A receptor template. The two best scored conformations of each receptor were used for the docking procedure. In all poses (ligand-receptor conformations) characterized by the lowest ligand-receptor intermolecular energy and free energy of binding the ligand type matched the state of the rotamer toggle switch: antagonists maintained an inactive state of the switch, whereas agonists changed it. In case of agonists of β2AR, the (R,R) and (S,S) stereoisomers of fenoterol, the molecular dynamics simulations provided evidence of different binding modes while preserving the same average position of ligands in the binding site. The (S,S) isomer was much more labile in the binding site and only one stable hydrogen bond was created. Such dynamical binding modes may also be valid for ligands of cannabinoid receptors because of the hydrophobic nature of their ligand-receptor interactions. However, only very long molecular dynamics simulations could verify the validity of such binding modes and how they affect the process of activation.
Human N-formyl peptide receptors (FPRs) are G protein-coupled receptors (GPCRs) involved in many physiological processes, including host defense against bacterial infection and resolving inflammation. The three human FPRs (FPR1, FPR2 and FPR3) share significant sequence homology and perform their action via coupling to Gi protein. Activation of FPRs induces a variety of responses, which are dependent on the agonist, cell type, receptor subtype, and also species involved. FPRs are expressed mainly by phagocytic leukocytes. Together, these receptors bind a large number of structurally diverse groups of agonistic ligands, including N-formyl and nonformyl peptides of different composition, that chemoattract and activate phagocytes. For example, N-formyl-Met-Leu-Phe (fMLF), an FPR1 agonist, activates human phagocyte inflammatory responses, such as intracellular calcium mobilization, production of cytokines, generation of reactive oxygen species, and chemotaxis. This ligand can efficiently activate the major bactericidal neutrophil functions and it was one of the first characterized bacterial chemotactic peptides. Whereas fMLF is by far the most frequently used chemotactic peptide in studies of neutrophil functions, atomistic descriptions for fMLF-FPR1 binding mode are still scarce mainly because of the absence of a crystal structure of this receptor. Elucidating the binding modes may contribute to designing novel and more efficient non-peptide FPR1 drug candidates. Molecular modeling of FPR1, on the other hand, can provide an efficient way to reveal details of ligand binding and activation of the receptor. However, recent modelings of FPRs were confined only to bovine rhodopsin as a template.
To locate specific ligand-receptor interactions based on a more appropriate template than rhodopsin we generated the homology models of FPR1 using the crystal structure of the chemokine receptor CXCR4, which shares over 30% sequence identity with FPR1 and is located in the same γ branch of phylogenetic tree of GPCRs (rhodopsin is located in α branch). Docking and model refinement procedures were pursued afterward. Finally, 40 ns full-atom MD simulations were conducted for the Apo form as well as for complexes of fMLF (agonist) and tBocMLF (antagonist) with FPR1 in the membrane. Based on locations of the N- and C-termini of the ligand the FPR1 extracellular pocket can be divided into two zones, namely, the anchor and activation regions. The formylated M1 residue of fMLF bound to the activation region led to a series of conformational changes of conserved residues. Internal water molecules participating in extended hydrogen bond networks were found to play a crucial role in transmitting the agonist-receptor interactions. A mechanism of initial steps of the activation concurrent with ligand binding is proposed.
I accurately predicted the structure and ligand binding pose of dopamine receptor 3 (RMSD to the crystal structure: 2.13 Å) and chemokine receptor 4 (CXCR4, RMSD to the crystal structure 3.21 Å) in GPCR-Dock 2010 competition. The homology model of the dopamine receptor 3 was 8 th best overall in the competition.
- The first aim of this thesis is to give a Hadwiger-type theorem for the exceptional Lie group Spin(9). The space of Spin(9)-invariant k-homogeneous valuations is studied through the construction of an exact sequence involving some spaces of differential forms. We present then a description of the spin representation using the properties of the 8-dimensional division algebra of the octonions. Using this description as well as representation-theoretic formulas, we can compute the dimensions of the spaces of differential forms appearing in the exact sequence. Hence we obtain the dimensions of the spaces of k-homogeneous Spin(9)-invariant valuations for k=0,1,...,16.
In the second part of this work, we construct one new element for a basis of one of these spaces. It is clear, that the k-th intrinsic volume is also Spin(9)-invariant. The last chapter of this work presents the construction of a new 2-homogeneous Spin(9)-invariant valuation. On a Riemannian manifold (M,g), we construct a valuation by integrating the curvature tensor over the disc bundle. We associate to this valuation on M a family of valuations on the tangent spaces. We show that these valuations are even and homogeneous of degree 2. Moreover, since the valuation on M is invariant under the action of the isometry group of M, the induced valuation on the tangent space in a point p in M is invariant under the action of the stabilisator of p for all p in M. In the special case where M is the octonionic projective plane, this construction yields an even, homogeneous of degree 2, Spin(9)-invariant valuation, whose Klain function is not constant, i.e. which is linearly independent of the second intrinsic volume.
Design of competitive paging algorithms with good behaviour in practice
- Paging is one of the most prominent problems in the field of online algorithms. We have to serve a sequence of page requests using a cache that can hold up to k pages. If the currently requested page is in cache we have a cache hit, otherwise we say that a cache miss occurs, and the requested page needs to be loaded into the cache. The goal is to minimize the number of cache misses by providing a good page-replacement strategy. This problem is part of memory-management when data is stored in a two-level memory hierarchy, more precisely a small and fast memory (cache) and a slow but large memory (disk). The most important application area is the virtual memory management of operating systems. Accessed pages are either already in the RAM or need to be loaded from the hard disk into the RAM using expensive I/O. The time needed to access the RAM is insignificant compared to an I/O operation which takes several milliseconds.
The traditional evaluation framework for online algorithms is competitive analysis where the online algorithm is compared to the optimal offline solution. A shortcoming of competitive analysis consists of its too pessimistic worst-case guarantees. For example LRU has a theoretical competitive ratio of k but in practice this ratio rarely exceeds the value 4.
Reducing the gap between theory and practice has been a hot research issue during the last years. More recent evaluation models have been used to prove that LRU is an optimal online algorithm or part of a class of optimal algorithms respectively, which was motivated by the assumption that LRU is one of the best algorithms in practice. Most of the newer models make LRU-friendly assumptions regarding the input, thus not leaving much room for new algorithms.
Only few works in the field of online paging have introduced new algorithms which can compete with LRU as regards the small number of cache misses.
In the first part of this thesis we study strongly competitive randomized paging algorithms, i.e. algorithms with optimal competitive guarantees. Although the tight bound for the competitive ratio has been known for decades, current algorithms matching this bound are complex and have high running times and memory requirements. We propose the algorithm OnlineMin which processes a page request in O(log k/log log k) time in the worst case. The best previously known solution requires O(k^2) time.
Usually the memory requirement of a paging algorithm is measured by the maximum number of pages that the algorithm keeps track of. Any algorithm stores information about the k pages in the cache. In addition it can also store information about pages not in cache, denoted bookmarks. We answer the open question of Bein et al. '07 whether strongly competitive randomized paging algorithms using only o(k) bookmarks exist or not. To do so we modify the Partition algorithm of McGeoch and Sleator '85 which has an unbounded bookmark complexity, and obtain Partition2 which uses O(k/log k) bookmarks.
In the second part we extract ideas from theoretical analysis of randomized paging algorithms in order to design deterministic algorithms that perform well in practice. We refine competitive analysis by introducing the attack rate
parameter r, which ranges between 1 and k. We show that r is a tight bound on the competitive ratio of deterministic algorithms.
We give empirical evidence that r is usually much smaller than k and thus r-competitive algorithms have a reasonable performance on real-world traces. By introducing the r-competitive priority-based algorithm class OnOPT we obtain a collection of promising algorithms to beat the LRU-standard. We single out the new algorithm RDM and show that it outperforms LRU and some of its variants on a wide range of real-world traces.
Since RDM is more complex than LRU one may think at first sight that the gain in terms of lowering the number of cache misses is ruined by high runtime for processing pages. We engineer a fast implementation of RDM, and compare it
to LRU and the very fast FIFO algorithm in an overall evaluation scheme, where we measure the runtime of the algorithms and add penalties for each cache miss.
Experimental results show that for realistic penalties RDM still outperforms these two algorithms even if we grant the competitors an idealistic runtime of 0.
Analyzing user feedback of on-line communities
- The economic success of the World Wide Web makes it a highly competitive environment for web businesses. For this reason, it is crucial for web business owners to learn what their customers want. This thesis provides a conceptual framework and an implementation of a system that helps to better understand the behavior and potential interests of web site visitors by accounting for both explicit and implicit feedback. This thesis is divided into two parts.
The first part is rooted in computer science and information systems and uses graph theory and an extended click-stream analysis to define a framework and a system tool that is useful for analyzing web user behavior by calculating the interests of the users.
The second part is rooted in behavioral economics, mathematics, and psychology and is investigating influencing factors on different types of web user choices. In detail, a model for the cognitive process of rating products on the Web is defined and an importance hierarchy of the influencing factors is discovered.
Both parts make use of techniques from a variety of research fields and, therefore, contribute to the area of Web Science.
An erasure-resilient and compute-efficient coding scheme for storage applications
- Driven by rapid technological advancements, the amount of data that is created, captured, communicated, and stored worldwide has grown exponentially over the past decades. Along with this development it has become critical for many disciplines of science and business to being able to gather and analyze large amounts of data. The sheer volume of the data often exceeds the capabilities of classical storage systems, with the result that current large-scale storage systems are highly distributed and are comprised of a high number of individual storage components. As with any other electronic device, the reliability of storage hardware is governed by certain probability distributions, which in turn are influenced by the physical processes utilized to store the information. The traditional way to deal with the inherent unreliability of combined storage systems is to replicate the data several times. Another popular approach to achieve failure tolerance is to calculate the block-wise parity in one or more dimensions. With better understanding of the different failure modes of storage components, it has become evident that sophisticated high-level error detection and correction techniques are indispensable for the ever-growing distributed systems. The utilization of powerful cyclic error-correcting codes, however, comes with a high computational penalty, since the required operations over finite fields do not map very well onto current commodity processors. This thesis introduces a versatile coding scheme with fully adjustable fault-tolerance that is tailored specifically to modern processor architectures. To reduce stress on the memory subsystem the conventional table-based algorithm for multiplication over finite fields has been replaced with a polynomial version. This arithmetically intense algorithm is better suited to the wide SIMD units of the currently available general purpose processors, but also displays significant benefits when used with modern many-core accelerator devices (for instance the popular general purpose graphics processing units). A CPU implementation using SSE and a GPU version using CUDA are presented. The performance of the multiplication depends on the distribution of the polynomial coefficients in the finite field elements. This property has been used to create suitable matrices that generate a linear systematic erasure-correcting code which shows a significantly increased multiplication performance for the relevant matrix elements. Several approaches to obtain the optimized generator matrices are elaborated and their implications are discussed. A Monte-Carlo-based construction method allows it to influence the specific shape of the generator matrices and thus to adapt them to special storage and archiving workloads. Extensive benchmarks on CPU and GPU demonstrate the superior performance and the future application scenarios of this novel erasure-resilient coding scheme.