Refine
Year of publication
- 2003 (12) (remove)
Document Type
- Working Paper (5)
- Article (2)
- Conference Proceeding (2)
- Part of a Book (1)
- Preprint (1)
- Review (1)
Language
- English (12) (remove)
Has Fulltext
- yes (12)
Is part of the Bibliography
- no (12)
Keywords
Institute
- Informatik (12) (remove)
The Internet as the biggest human library ever assembled keeps on growing. Although all kinds of information carriers (e.g. audio/video/hybrid file formats) are available, text based documents dominate. It is estimated that about 80% of all information worldwide stored electronically exists in (or can be converted into) text form. More and more, all kinds of documents are generated by means of a text processing system and are therefore available electronically. Nowadays, many printed journals are also published online and may even discontinue to appear in print form tomorrow. This development has many convincing advantages: the documents are both available faster (cf. prepress services) and cheaper, they can be searched more easily, the physical storage only needs a fraction of the space previously necessary and the medium will not age. For most people, fast and easy access is the most interesting feature of the new age; computer-aided search for specific documents or Web pages becomes the basic tool for information-oriented work. But this tool has problems. The current keyword based search machines available on the Internet are not really appropriate for such a task; either there are (way) too many documents matching the specified keywords are presented or none at all. The problem lies in the fact that it is often very difficult to choose appropriate terms describing the desired topic in the first place. This contribution discusses the current state-of-the-art techniques in content-based searching (along with common visualization/browsing approaches) and proposes a particular adaptive solution for intuitive Internet document navigation, which not only enables the user to provide full texts instead of manually selected keywords (if available), but also allows him/her to explore the whole database.
In bioinformatics, biochemical pathways can be modeled by many differential equations. It is still an open problem how to fit the huge amount of parameters of the equations to the available data. Here, the approach of systematically learning the parameters is necessary. In this paper, for the small, important example of inflammation modeling a network is constructed and different learning algorithms are proposed. It turned out that due to the nonlinear dynamics evolutionary approaches are necessary to fit the parameters for sparse, given data. Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence - ICTAI 2003
We investigate a restricted one-way cellular automaton (OCA) model where the number of cells is bounded by a constant number k, so-called kC-OCAs. In contrast to the general model, the generative capacity of the restricted model is reduced to the set of regular languages. A kC-OCA can be algorithmically converted to a deterministic finite automaton (DFA). The blow-up in the number of states is bounded by a polynomial of degree k. We can exhibit a family of unary languages which shows that this upper bound is tight in order of magnitude. We then study upper and lower bounds for the trade-off when converting DFAs to kC-OCAs. We show that there are regular languages where the use of kC-OCAs cannot reduce the number of states when compared to DFAs. We then investigate trade-offs between kC-OCAs with different numbers of cells and finally treat the problem of minimizing a given kC-OCA.
The effect of adding two-way communication to k cells one-way cellular automata (kC-OCAs) on their size of description is studied. kC-OCAs are a parallel model for the regular languages that consists of an array of k identical deterministic finite automata (DFAs), called cells, operating in parallel. Each cell gets information from its right neighbor only. In this paper, two models with different amounts of two-way communication are investigated. Both models always achieve quadratic savings when compared to DFAs. When compared to a one-way cellular model, the result is that minimum two-way communication can achieve at most quadratic savings whereas maximum two-way communication may provide savings bounded by a polynomial of degree k.
The descriptional complexity of iterative arrays (lAs) is studied. Iterative arrays are a parallel computational model with a sequential processing of the input. It is shown that lAs when compared to deterministic finite automata or pushdown automata may provide savings in size which are not bounded by any recursive function, so-called non-recursive trade-offs. Additional non-recursive trade-offs are proven to exist between lAs working in linear time and lAs working in real time. Furthermore, the descriptional complexity of lAs is compared with cellular automata (CAs) and non-recursive trade-offs are proven between two restricted classes. Finally, it is shown that many decidability questions for lAs are undecidable and not semidecidable.
We investigate unary regular languages and compare deterministic finite automata (DFA’s), nondeterministic finite automata (NFA’s) and probabilistic finite automata (PFA’s) with respect to their size. Given a unary PFA with n states and an e-isolated cutpoint, we show that the minimal equivalent DFA has at most n exp 1/2e states in its cycle. This result is almost optimal, since for any alpha < 1 a family of PFA’s can be constructed such that every equivalent DFA has at least n exp alpha/2e states. Thus we show that for the model of probabilistic automata with a constant error bound, there is only a polynomial blowup for cyclic languages. Given a unary NFA with n states, we show that efficiently approximating the size of a minimal equivalent NFA within the factor sqrt(n)/ln n is impossible unless P = NP. This result even holds under the promise that the accepted language is cyclic. On the other hand we show that we can approximate a minimal NFA within the factor ln n, if we are given a cyclic unary n-state DFA.
Coreference-Based Summarization and Question Answering: a Case for High Precision Anaphor Resolution
(2003)
Approaches to Text Summarization and Question Answering are known to benefit from the availability of coreference information. Based on an analysis of its contributions, a more detailed look at coreference processing for these applications will be proposed: it should be considered as a task of anaphor resolution rather than coreference resolution. It will be further argued that high precision approaches to anaphor resolution optimally match the specific requirements. Three such approaches will be described and empirically evaluated, and the implications for Text Summarization and Question Answering will be discussed.
We present a novel practical algorithm that given a lattice basis b1, ..., bn finds in O(n exp 2 *(k/6) exp (k/4)) average time a shorter vector than b1 provided that b1 is (k/6) exp (n/(2k)) times longer than the length of the shortest, nonzero lattice vector. We assume that the given basis b1, ..., bn has an orthogonal basis that is typical for worst case lattice bases. The new reduction method samples short lattice vectors in high dimensional sublattices, it advances in sporadic big jumps. It decreases the approximation factor achievable in a given time by known methods to less than its fourth-th root. We further speed up the new method by the simple and the general birthday method. n2
We enhance the security of Schnorr blind signatures against the novel one-more-forgery of Schnorr [Sc01] andWagner [W02] which is possible even if the discrete logarithm is hard to compute. We show two limitations of this attack. Firstly, replacing the group G by the s-fold direct product G exp(×s) increases the work of the attack, for a given number of signer interactions, to the s-power while increasing the work of the blind signature protocol merely by a factor s. Secondly, we bound the number of additional signatures per signer interaction that can be forged effectively. That fraction of the additional forged signatures can be made arbitrarily small.