Conference Proceeding
Refine
Year of publication
Document Type
- Conference Proceeding (41) (remove)
Has Fulltext
- yes (41)
Is part of the Bibliography
- no (41)
Keywords
Institute
- Informatik (41) (remove)
Retiming is a widely investigated technique for performance optimization. In general, it performs extensive modifications on a circuit netlist, leaving it unclear, whether the achieved performance improvement will still be valid after placement has been performed. This paper presents an approach for integrating retiming into a timing-driven placement environment. The experimental results show the benefit of the proposed approach on circuit performance in comparison with design flows using retiming only as a pre- or postplacement optimization method.
We study threshold testing, an elementary probing model with the goal to choose a large value out of n i.i.d. random variables. An algorithm can test each variable X_i once for some threshold t_i, and the test returns binary feedback whether X_i ≥ t_i or not. Thresholds can be chosen adaptively or non-adaptively by the algorithm. Given the results for the tests of each variable, we then select the variable with highest conditional expectation. We compare the expected value obtained by the testing algorithm with expected maximum of the variables. Threshold testing is a semi-online variant of the gambler’s problem and prophet inequalities. Indeed, the optimal performance of non-adaptive algorithms for threshold testing is governed by the standard i.i.d. prophet inequality of approximately 0.745 + o(1) as n → ∞. We show how adaptive algorithms can significantly improve upon this ratio. Our adaptive testing strategy guarantees a competitive ratio of at least 0.869 - o(1). Moreover, we show that there are distributions that admit only a constant ratio c < 1, even when n → ∞. Finally, when each box can be tested multiple times (with n tests in total), we design an algorithm that achieves a ratio of 1 - o(1).
In the last decade, much effort went into the design of robust third-person pronominal anaphor resolution algorithms. Typical approaches are reported to achieve an accuracy of 60-85%. Recent research addresses the question of how to deal with the remaining difficult-toresolve anaphors. Lappin (2004) proposes a sequenced model of anaphor resolution according to which a cascade of processing modules employing knowledge and inferencing techniques of increasing complexity should be applied. The individual modules should only deal with and, hence, recognize the subset of anaphors for which they are competent. It will be shown that the problem of focusing on the competence cases is equivalent to the problem of giving precision precedence over recall. Three systems for high precision robust knowledge-poor anaphor resolution will be designed and compared: a ruleset-based approach, a salience threshold approach, and a machine-learning-based approach. According to corpus-based evaluation, there is no unique best approach. Which approach scores highest depends upon type of pronominal anaphor as well as upon text genre.
This paper proposes a new approach for the encoding of images by only a few important components. Classically, this is done by the Principal Component Analysis (PCA). Recently, the Independent Component Analysis (ICA) has found strong interest in the neural network community. Applied to images, we aim for the most important source patterns with the highest occurrence probability or highest information called principal independent components (PIC). For the example of a synthetic image composed by characters this idea selects the salient ones. For natural images it does not lead to an acceptable reproduction error since no a-priori probabilities can be computed. Combining the traditional principal component criteria of PCA with the independence property of ICA we obtain a better encoding. It turns out that this definition of PIC implements the classical demand of Shannon’s rate distortion theory.
Interest to become a data scientist or related professions in data science domain is rapidly growing. To meet such a demand, we propose a novel educational service that aims to provide tailored learning paths for data science. Our target user is one who aims to be an expert in data science. Our approach is to analyze the background of the practitioner and match the learning units. A critical feature is that we use gamification to reinforce the practitioner engagement. We believe that our work provides a practical guideline for those who want to learn data science.
This paper shows the equivalence of applicative similarity and contextual approximation, and hence also of bisimilarity and contextual equivalence, in the deterministic call-by-need lambda calculus with letrec. Bisimilarity simplifies equivalence proofs in the calculus and opens a way for more convenient correctness proofs for program transformations. Although this property may be a natural one to expect, to the best of our knowledge, this paper is the first one providing a proof. The proof technique is to transfer the contextual approximation into Abramsky’s lazy lambda calculus by a fully abstract and surjective translation. This also shows that the natural embedding of Abramsky’s lazy lambda calculus into the call-by-need lambda calculus with letrec is an isomorphism between the respective term-models. We show that the equivalence property proven in this paper transfers to a call-by-need letrec calculus developed by Ariola and Felleisen. 1998 ACM Subject Classification: F.4.2, F.3.2, F.3.3, F.4.1. Key words and phrases: semantics, contextual equivalence, bisimulation, lambda calculus, call-by-need, letrec.
In order to promote the accessibility of biodiversity data in historic and contemporary literature, we introduce a new interdisciplinary project called BIOfid (FID=Fachinformationsdienst, a service for providing specialized information). The project aims at a mobilization of data available in print only by combining digitization of scientific biodiversity literature with the development of innovative text mining tools for complex, eventually semantic searches throughout the complete text corpus. A major prerequisite for the development of such search tools is the provision of sophisticated anatomy ontologies on the one hand, and of complete lists of species names (currently considered valid as well as all synonyms) at a global scale on the other hand. In the initial stage, we chose examples from German publications of the past 250 years dealing with the geographic distribution and ecology of vascular plants (Tracheophyta), birds (Aves), as well as moths and butterflies (Lepidoptera) in Germany. These taxa have been prioritized according to current demands of German research groups (about 50 sites) aiming at analyses and modeling of distribution patterns and their changes through time. In the long term, we aim at providing data and open source software applicable for any taxon and geographic region. For this purpose, a platform for open access journals for long-term availability of professional e-journals will be established. All generated data will also be made accessible through GFBio (German Federation for Biological Data). BIOfid is supported by the LIS-Scientific Library Services and Information Systems program of the German Research Foundation (DFG).
This paper describes the ongoing efforts of the authors to present ancient Greek and Roman numismatic data on the public internet, with an emphasis on efforts to integrate information from multiple sources using Linked Data and Semantic Web techniques. By way of very modern metaphor, it is useful to think of coins as intentionally created packages of 'named entities'. Each coin was struck by a particular authority, often at a known site, and coins often make reference to familiar concepts such as deities, historical events, or symbols that were widely recognized in the ancient world. The institutions represented among the authors have deployed search interfaces that allow users to take advantage of this aspect of numismatic databases. The American Numismatic Society's database provides faceted search to its collection of over 550,000 objects. The Portable Antiquities Scheme (PAS) in the UK presents individual finds (and hoards) recorded throughout the country. The Römisch-Germanische Kommission and the University of Frankfurt (DBIS) are developing a prototype metaportal (INTERFACE) that accesses national databases of coin finds held in in Frankfurt, Vienna and Utrecht. Each of these resources is beginning to explore Semantic Web/Linked data approaches so that the role of numismatic standards is immediately coming to the fore. DBIS and INTERFACE are developing a numismatic ontology. At the ANS and PAS, the public database already presents RDF serializations based on Dublin Core. Together, the authors have begun to explore standardization of conceptual names on the basis of the vocabulary presented at the site http://nomisma.org . Nomisma.org is a collaborative effort to provide stable digital representations of numismatic concepts and entities. It provides URIs for such basic concepts as 'coin', 'mint', 'axis'. All of these are defined within the scope of numismatics but are already being linked to other stable resources where available. This is particularly the case for mints. For example, the URI http://nomisma.org/id/corinth is intended to represent that ancient city in its role as a minter/issuer of coins. The URI is linked via the SKOS ontology to the Pleiades Gazetteer of ancient places. This allows Nomisma to be the basis for a common representation of the concept that an object is a coin minted at Corinth. The ANS has already deployed such relationships in its public database. The work of all these projects is very much in progress so that this paper hopes to generate discussion on how multiple large projects can move forward in their own work while encouraging sufficient commonality to support large scale research questions undertaken by diverse audiences.
Exhaustive, automatic testing of dataflow (esp. mapreduce) programs has emerged as an important challenge. Past work demonstrated effective ways to generate small example data sets that exercise operators in the Pig platform, used to generate Hadoop map-reduce programs. Although such prior techniques attempt to cover all cases of operator use, in practice they often fail. Our SEDGE system addresses these completeness problems: for every dataflow operator, we produce data aiming to cover all cases that arise in the dataflow program (e.g., both passing and failing a filter). SEDGE relies on transforming the program into symbolic constraints, and solving the constraints using a symbolic reasoning engine (a powerful SMT solver), while using input data as concrete aids in the solution process. The approach resembles dynamic-symbolic (a.k.a. "concolic") execution in a conventional programming language, adapted to the unique features of the dataflow domain.
In third-party benchmarks, SEDGE achieves higher coverage than past techniques for 5 out of 20 PigMix benchmarks and 7 out of 11 SDSS benchmarks and (with equal coverage for the rest of the benchmarks). We also show that our targeting of the high-level dataflow language pays off: for complex programs, state-of-the-art dynamic-symbolic execution at the level of the generated map-reduce code (instead of the original dataflow program) requires many more test cases or achieves much lower coverage than our approach.
Syntactic coindexing restrictions are by now known to be of central importance to practical anaphor resolution approaches. Since, in particular due to structural ambiguity, the assumption of the availability of a unique syntactic reading proves to be unrealistic, robust anaphor resolution relies on techniques to overcome this deficiency. In this paper, two approaches are presented which generalize the verification of coindexing constraints to de cient descriptions. At first, a partly heuristic method is described, which has been implemented. Secondly, a provable complete method is specified. It provides the means to exploit the results of anaphor resolution for a further structural disambiguation. By rendering possible a parallel processing model, this method exhibits, in a general sense, a higher degree of robustness. As a practically optimal solution, a combination of the two approaches is suggested.
This paper considers the logic FOcard, i.e., first-order logic with cardinality predicates that can specify the size of a structure modulo some number. We study the expressive power of FOcard on the class of languages of ranked, finite, labelled trees with successor relations. Our first main result characterises the class of FOcard-definable tree languages in terms of algebraic closure properties of the tree languages. As it can be effectively checked whether the language of a given tree automaton satisfies these closure properties, we obtain a decidable characterisation of the class of regular tree languages definable in FOcard. Our second main result considers first-order logic with unary relations, successor relations, and two additional designated symbols < and + that must be interpreted as a linear order and its associated addition. Such a formula is called addition-invariant if, for each fixed interpretation of the unary relations and successor relations, its result is independent of the particular interpretation of < and +. We show that the FOcard-definable tree languages are exactly the regular tree languages definable in addition-invariant first-order logic. Our proof techniques involve tools from algebraic automata theory, reasoning with locality arguments, and the use of logical interpretations. We combine and extend methods developed by Benedikt and Segoufin (ACM ToCL, 2009) and Schweikardt and Segoufin (LICS, 2010).
Retiming is a widely investigated technique for performance optimization. It performs powerful modifications on a circuit netlist. However, often it is not clear, whether the predicted performance improvement will still be valid after placement has been performed. This paper presents a new retiming algorithm using a highly accurate timing model taking into account the effect of retiming on capacitive loads of single wires as well as fanout systems. We propose the integration of retiming into a timing-driven standard cell placement environment based on simulated annealing. Retiming is used as an optimization technique throughout the whole placement process. The experimental results show the benefit of the proposed approach. In comparison with the conventional design flow based on standard FEAS our approach achieved an improvement in cycle time of up to 34% and 17% on the average.
This paper is a contribution to exploring and analyzing space-improvements in concurrent programming languages, in particular in the functional process-calculus CHF. Space-improvements are defined as a generalization of the corresponding notion in deterministic pure functional languages. The main part of the paper is the O(n ·logn) algorithm SPOPTN for offline space optimization of several parallel independent processes. Applications of this algorithm are: (i) affirmation of space improving transformations for particular classes of program transformations; (ii) support of an interpreter-based method for refuting space-improvements; and (iii) as a stand-alone offline-optimizer for space (or similar resources) of parallel processes.
Given x small epsilon, Greek Rn an integer relation for x is a non-trivial vector m small epsilon, Greek Zn with inner product <m,x> = 0. In this paper we prove the following: Unless every NP language is recognizable in deterministic quasi-polynomial time, i.e., in time O(npoly(log n)), the ℓinfinity-shortest integer relation for a given vector x small epsilon, Greek Qn cannot be approximated in polynomial time within a factor of 2log0.5 − small gamma, Greekn, where small gamma, Greek is an arbitrarily small positive constant. This result is quasi-complementary to positive results derived from lattice basis reduction. A variant of the well-known L3-algorithm approximates for a vector x small epsilon, Greek Qn the ℓ2-shortest integer relation within a factor of 2n/2 in polynomial time. Our proof relies on recent advances in the theory of probabilistically checkable proofs, in particular on a reduction from 2-prover 1-round interactive proof-systems. The same inapproximability result is valid for finding the ℓinfinity-shortest integer solution for a homogeneous linear system of equations over Q.
In its first part, this contribution reviews shortly the application of neural network methods to medical problems and characterizes its advantages and problems in the context of the medical background. Successful application examples show that human diagnostic capabilities are significantly worse than the neural diagnostic systems. Then, paradigm of neural networks is shortly introduced and the main problems of medical data base and the basic approaches for training and testing a network by medical data are described. Additionally, the problem of interfacing the network and its result is given and the neuro-fuzzy approach is presented. Finally, as case study of neural rule based diagnosis septic shock diagnosis is described, on one hand by a growing neural network and on the other hand by a rule based system. Keywords: Statistical Classification, Adaptive Prediction, Neural Networks, Neurofuzzy, Medical Systems
In the last years, much effort went into the design of robust anaphor resolution algorithms. Many algorithms are based on antecedent filtering and preference strategies that are manually designed. Along a different line of research, corpus-based approaches have been investigated that employ machine-learning techniques for deriving strategies automatically. Since the knowledge-engineering effort for designing and optimizing the strategies is reduced, the latter approaches are considered particularly attractive. Since, however, the hand-coding of robust antecedent filtering strategies such as syntactic disjoint reference and agreement in person, number, and gender constitutes a once-for-all effort, the question arises whether at all they should be derived automatically. In this paper, it is investigated what might be gained by combining the best of two worlds: designing the universally valid antecedent filtering strategies manually, in a once-for-all fashion, and deriving the (potentially genre-specific) antecedent selection strategies automatically by applying machine-learning techniques. An anaphor resolution system ROSANA-ML, which follows this paradigm, is designed and implemented. Through a series of formal evaluations, it is shown that, while exhibiting additional advantages, ROSANAML reaches a performance level that compares with the performance of its manually designed ancestor ROSANA.
This paper gives a brief overview of computation models for data stream processing, and it introduces a new model for multi-pass processing of multiple streams, the so-called mp2s-automata. Two algorithms for solving the set disjointness problem with these automata are presented. The main technical contribution of this paper is the proof of a lower bound on the size of memory and the number of heads that are required for solving the set disjointness problem with mp2s-automata.