In recent years, much effort has gone into the design of robust anaphor resolution algorithms. Many algorithms are based on manually designed antecedent filtering and preference strategies. Along a different line of research, corpus-based approaches have been investigated that employ machine-learning techniques to derive strategies automatically. Since they reduce the knowledge-engineering effort for designing and optimizing the strategies, the latter approaches are considered particularly attractive. However, since hand-coding robust antecedent filtering strategies such as syntactic disjoint reference and agreement in person, number, and gender is a one-time effort, the question arises whether they should be derived automatically at all. This paper investigates what might be gained by combining the best of both worlds: designing the universally valid antecedent filtering strategies manually, once and for all, and deriving the (potentially genre-specific) antecedent selection strategies automatically by applying machine-learning techniques. An anaphor resolution system, ROSANA-ML, which follows this paradigm, is designed and implemented. A series of formal evaluations shows that, while exhibiting additional advantages, ROSANA-ML reaches a performance level comparable to that of its manually designed ancestor ROSANA.
Robuste Anaphernresolution
(2004)
In an age of ever-growing mobility demands, flexible, decentralized access to data of all kinds is becoming increasingly important. When access via the Internet is not available, a mobile phone offers an alternative. On the basis of the WAP protocol, elementary graphical access interfaces can be built; their capabilities, however, are limited: compared with stationary computer terminals, the display is generally small, and browsing is correspondingly laborious. Current technology offers only low bandwidth. Users find navigation via keys cumbersome. There are usage contexts that rule out keyboard-based interaction a priori. Spoken-language interfaces offer an alternative, in which the user conducts a human-machine dialogue with a telephone-based voice portal. Such applications build on hardware and software technology for computer telephony integration, speech recognition, and speech synthesis. These basic technological components alone are not sufficient, however: depending on the specific requirements of the application at hand, suitable specifications must be provided that enable the computer to conduct the dialogue with its human counterpart in a manner adequate to the problem. Important requirements are: naturalness: shaping the spoken interaction in a way that meets the user's expectations for the application at hand; flexibility: adapting to the characteristics of the individual user (novice or experienced user, etc.); robustness: suitable handling of misunderstandings, incomplete user input, and shortcomings of machine speech processing (in particular speech recognition errors), etc.
Formal specifications of machine dialogue behaviour are called dialogue models. With regard to the generic reusability of the dialogue software, it makes sense to write such descriptions in a standardized formalism, a dialogue modelling language, which can thus be regarded, to a first approximation, as a "programming language" for a generic dialogue machine. This raises the question of what a suitable dialogue modelling language might look like. For web-based voice portals, the W3C has developed the XML-based dialogue modelling language VoiceXML as a proposed standard ([7]). This document first evaluates the scope and limits of VoiceXML. Based on this evaluation, strategic recommendations are derived for companies that want to act as application developers in the innovative market for telephone-based voice portals. The central questions are: 1. What are the central problems in developing telephone-based voice portals? 2. To what extent does VoiceXML solve these problems? 3. To what extent is it therefore worthwhile to rely on VoiceXML technology (e.g., to establish a unique selling point)? 4. What alternatives exist? In which other areas should core competencies be developed, if appropriate?
Black box cryptanalysis applies to hash algorithms consisting of many small boxes, connected by a known graph structure, so that the boxes can be evaluated forwards and backwards by given oracles. We study attacks that work for any choice of the black boxes, i.e., we scrutinize the given graph structure. For example, we analyze the graph of the fast Fourier transform (FFT). We present optimal black box inversions of FFT-compression functions and black box constructions of collisions. This determines the minimal depth of FFT-compression networks for collision-resistant hashing. We propose the concept of multipermutation, which is a pair of orthogonal Latin squares, as a new cryptographic primitive that generalizes the boxes of the FFT. Our examples of multipermutations are based on the operations circular rotation, bitwise xor, addition and multiplication.
We study the following problem: given $x \in \mathbb{R}^n$, either find a short integer relation $m \in \mathbb{Z}^n$, so that $\langle x, m \rangle = 0$ holds for the inner product $\langle \cdot, \cdot \rangle$, or prove that no short integer relation exists for $x$. Hastad, Just, Lagarias and Schnorr (1989) give a polynomial time algorithm for the problem. We present a stable variation of the HJLS-algorithm that preserves lower bounds on $\lambda(x)$ for infinitesimal changes of $x$. Given $x \in \mathbb{R}^n$ and $\alpha \in \mathbb{N}$, this algorithm finds a nearby point $x'$ and a short integer relation $m$ for $x'$. The nearby point $x'$ is 'good' in the sense that no very short relation exists for points $\bar{x}$ within half the $x'$-distance from $x$. On the other hand, if $x' = x$ then $m$ is, up to a factor $2^{n/2}$, a shortest integer relation for $x$. Our algorithm uses, for arbitrary real input $x$, at most $O(n^4(n+\log \alpha))$ arithmetical operations on real numbers. If $x$ is rational, the algorithm operates on integers having at most $O(n^5+n^3 (\log \alpha)^2 + \log (\|q x\|^2))$ bits, where $q$ is the common denominator for $x$.
We modify the concept of LLL-reduction of lattice bases in the sense of Lenstra, Lenstra, Lovász [LLL82] towards a faster reduction algorithm. We organize LLL-reduction in segments of the basis. Our SLLL-bases approximate the successive minima of the lattice in nearly the same way as LLL-bases. For integer lattices of dimension $n$ given by a basis of length $2^{O(n)}$, SLLL-reduction runs in $O(n^{5+\varepsilon})$ bit operations for every $\varepsilon > 0$, compared to $O(n^{7+\varepsilon})$ for the original LLL and to $O(n^{6+\varepsilon})$ for the LLL-algorithms of Schnorr (1988) and Storjohann (1996). We present an even faster algorithm for SLLL-reduction via iterated subsegments running in $O(n^3 \log n)$ arithmetic steps.
We present a practical algorithm that, given an LLL-reduced lattice basis of dimension $n$, runs in time $O(n^3 (k/6)^{k/4} + n^4)$ and approximates the length of the shortest non-zero lattice vector to within a factor $(k/6)^{n/(2k)}$. This result is based on reasonable heuristics. Compared to previous practical algorithms, the new method reduces the proven approximation factor achievable in a given time to less than its fourth root. We also present a sieve algorithm inspired by Ajtai, Kumar, Sivakumar [AKS01].
Let G be a finite cyclic group with generator \alpha and with an encoding so that multiplication is computable in polynomial time. We study the security of bits of the discrete log x when given \exp_{\alpha}(x), assuming that the exponentiation function \exp_{\alpha}(x) = \alpha^x is one-way. We reduce the general problem to the case that G has odd order q. If G has odd order q the security of the least-significant bits of x and of the most significant bits of the rational number \frac{x}{q} \in [0,1) follows from the work of Peralta [P85] and Long and Wigderson [LW88]. We generalize these bits and study the security of consecutive shift bits lsb(2^{-i}x mod q) for i=k+1,...,k+j. When we restrict \exp_{\alpha} to arguments x such that some sequence of j consecutive shift bits of x is constant (i.e., not depending on x) we call it a 2^{-j}-fraction of \exp_{\alpha}. For groups of odd group order q we show that every two 2^{-j}-fractions of \exp_{\alpha} are equally one-way by a polynomial time transformation: Either they are all one-way or none of them. Our key theorem shows that arbitrary j consecutive shift bits of x are simultaneously secure when given \exp_{\alpha}(x) iff the 2^{-j}-fractions of \exp_{\alpha} are one-way. In particular this applies to the j least-significant bits of x and to the j most-significant bits of \frac{x}{q} \in [0,1). For one-way \exp_{\alpha} the individual bits of x are secure when given \exp_{\alpha}(x) by the method of Hastad, N\"aslund [HN98]. For groups of even order 2^{s}q we show that the j least-significant bits of \lfloor x/2^s\rfloor, as well as the j most-significant bits of \frac{x}{q} \in [0,1), are simultaneously secure iff the 2^{-j}-fractions of \exp_{\alpha'} are one-way for \alpha' := \alpha^{2^s}. We use and extend the models of generic algorithms of Nechaev (1994) and Shoup (1997). We determine the generic complexity of inverting fractions of \exp_{\alpha} for the case that \alpha has prime order q.
As a consequence, arbitrary segments of (1-\varepsilon)\lg q consecutive shift bits of random x are for constant \varepsilon >0 simultaneously secure against generic attacks. Every generic algorithm using $t$ generic steps (group operations) for distinguishing bit strings of j consecutive shift bits of x from random bit strings has at most advantage O((\lg q) j\sqrt{t} (2^j/q)^{\frac14}).
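To make the notion of consecutive shift bits concrete, the following minimal sketch computes lsb(2^{-i}x mod q) for i = k+1, ..., k+j with a toy odd modulus. The function name `shift_bits` and the choice of the representative in [0, q) are assumptions made for illustration; the sketch says nothing about the security results themselves.

```python
def shift_bits(x, q, k, j):
    """Return [lsb(2^{-i} * x mod q) for i = k+1, ..., k+j].

    Assumes q is odd, so that 2 is invertible modulo q, and takes the
    residue in the range [0, q) before reading its least-significant bit.
    """
    if q % 2 == 0:
        raise ValueError("q must be odd so that 2 is invertible mod q")
    bits = []
    for i in range(k + 1, k + j + 1):
        inverse = pow(2, -i, q)  # modular inverse of 2^i (Python 3.8+)
        bits.append((inverse * x % q) & 1)
    return bits


# Toy example: q = 11, x = 7, the first three shift bits (k = 0, j = 3).
print(shift_bits(7, 11, 0, 3))  # [1, 0, 1]
```

For q = 11 the inverses of 2, 4 and 8 are 6, 3 and 7, so the residues are 9, 10 and 5, whose least-significant bits are 1, 0 and 1.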
Correction to: C.P. Schnorr: Security of 2^t-Root Identification and Signatures, Proceedings CRYPTO '96, Springer LNCS 1109 (1996), pp. 143-156, page 148, section 3, line 5 of the proof of Theorem 3. The correction was presented as "Factoring N via proper 2^t-Roots of 1 mod N" at the Eurocrypt '97 rump session.
We present techniques to prove termination of cycle rewriting, that is, string rewriting on cycles, which are strings in which the start and end are connected. Our main technique is to transform cycle rewriting into string rewriting and then apply state-of-the-art techniques to prove termination of the string rewrite system. We present three such transformations and prove that each of them is sound and complete. As a result, termination of the transformed string rewrite system implies termination of the original cycle rewrite system, and a similar conclusion can be drawn for non-termination. Apart from this transformational approach, we present a uniform framework of matrix interpretations, covering most of the earlier approaches to automatically proving termination of cycle rewriting. All our techniques serve both for proving termination and relative termination. We present several experiments showing the power of our techniques.
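As an illustration of what a single cycle rewrite step looks like, the sketch below represents a cycle by its lexicographically smallest rotation and applies a rule wherever the left-hand side occurs, including across the point where start and end are joined. The function names and the choice of representative are assumptions for this example; it is not one of the three transformations of the paper, and it assumes a non-empty left-hand side.

```python
def canonical(cycle):
    """Lexicographically smallest rotation, used to represent a cycle."""
    if not cycle:
        return cycle
    return min(cycle[i:] + cycle[:i] for i in range(len(cycle)))


def cycle_rewrite_step(cycle, lhs, rhs):
    """All cycles reachable from `cycle` by one application of lhs -> rhs.

    A rule applies if some rotation of the cycle starts with lhs; the
    result is the cycle of rhs followed by the rest of that rotation.
    """
    results = set()
    for i in range(len(cycle)):
        rotation = cycle[i:] + cycle[:i]
        if rotation.startswith(lhs):
            results.add(canonical(rhs + rotation[len(lhs):]))
    return results


# "ca" does not occur in the string "abc", but it does occur on the cycle.
print(cycle_rewrite_step("abc", "ca", "d"))  # {'bd'}
```

The example shows why termination of cycle rewriting differs from termination of string rewriting: the rule ca -> d is not applicable to the string abc, yet it is applicable to the cycle.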
Given a real vector $\alpha = (\alpha_1, \ldots, \alpha_d)$ and a real number $\varepsilon > 0$, a good Diophantine approximation to $\alpha$ is a number $Q$ such that $\|Q\alpha \bmod \mathbb{Z}\|_\infty \le \varepsilon$, where $\|\cdot\|_\infty$ denotes the $\ell_\infty$-norm $\|x\|_\infty := \max_{1 \le i \le d} |x_i|$ for $x = (x_1, \ldots, x_d)$. Lagarias [12] proved the NP-completeness of the corresponding decision problem, i.e., given a vector $\alpha \in \mathbb{Q}^d$, a rational number $\varepsilon > 0$ and a number $N \in \mathbb{N}_+$, decide whether there exists a number $Q$ with $1 \le Q \le N$ and $\|Q\alpha \bmod \mathbb{Z}\|_\infty \le \varepsilon$. We prove that, unless ...
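The decision problem can be made concrete with a brute-force check: try every Q from 1 to N and test whether each coordinate of Q times the vector lies within the given tolerance of an integer. The sketch below uses exact rationals; the function names are hypothetical, and the exhaustive loop is only meant for toy inputs (its running time is exponential in the bit length of N).

```python
from fractions import Fraction
import math


def dist_to_integer(y):
    """Distance from the rational y to the nearest integer."""
    frac = y - math.floor(y)
    return min(frac, 1 - frac)


def good_approximation(alpha, eps, N):
    """Smallest Q in 1..N whose multiples Q * alpha_i all lie within eps
    of an integer, or None if no such Q exists."""
    for Q in range(1, N + 1):
        if max(dist_to_integer(Q * a) for a in alpha) <= eps:
            return Q
    return None


alpha = [Fraction(1, 3), Fraction(1, 2)]
print(good_approximation(alpha, Fraction(1, 10), 10))  # 6
print(good_approximation(alpha, Fraction(1, 10), 5))   # None
```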
We study the approximability of the following NP-complete (in their feasibility recognition forms) number theoretic optimization problems: 1. Given $n$ numbers $a_1, \ldots, a_n \in \mathbb{Z}$, find a minimum gcd set for $a_1, \ldots, a_n$, i.e., a subset $S \subseteq \{a_1, \ldots, a_n\}$ with minimum cardinality satisfying $\gcd(S) = \gcd(a_1, \ldots, a_n)$. 2. Given $n$ numbers $a_1, \ldots, a_n \in \mathbb{Z}$, find an $\ell_\infty$-minimum gcd multiplier for $a_1, \ldots, a_n$, i.e., a vector $x \in \mathbb{Z}^n$ with minimum $\max_{1 \le i \le n} |x_i|$ satisfying $\sum^{n}$ ...
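For small inputs, the minimum gcd set problem can be solved by exhaustive search over subsets in order of increasing size. The sketch below is such a brute-force baseline, with a hypothetical function name; it assumes a non-empty input list and takes exponential time in the worst case.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def min_gcd_set(numbers):
    """Smallest subset of `numbers` whose gcd equals the gcd of all numbers.

    Subsets are tried in order of increasing size, so the first match has
    minimum cardinality. Assumes `numbers` is non-empty.
    """
    target = reduce(gcd, numbers)
    for size in range(1, len(numbers) + 1):
        for subset in combinations(numbers, size):
            if reduce(gcd, subset) == target:
                return subset
    return ()


print(min_gcd_set([4, 6, 9]))    # (4, 9)
print(min_gcd_set([6, 10, 15]))  # (6, 10, 15)
```

In the second example every pair shares a common factor (2, 3 or 5), so all three numbers are needed to reach gcd 1.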
We address the problem of factoring a large composite number by lattice reduction algorithms. Schnorr has shown that under reasonable number-theoretic assumptions this problem can be reduced to a simultaneous Diophantine approximation problem. The latter in turn can be solved by finding sufficiently many $\ell_1$-short vectors in a suitably defined lattice. Using lattice basis reduction algorithms, Schnorr and Euchner applied Schnorr's reduction technique to 40-bit integers. Their implementation needed several hours to compute a 5% fraction of the solution, i.e., 6 out of the 125 congruences that are necessary to factorize the composite. In this report we describe a more efficient implementation using stronger lattice basis reduction techniques incorporating ideas of Schnorr, Hoerner and Ritter. For 60-bit integers our algorithm yields a complete factorization in less than 3 hours.
In this report, the method "GenDurchschnitt" introduced in [Pae02] was applied to the symbolic data of two databases of septic shock patients. In each case, generalization rules were generated that, in addition to a robust classification of the patients into the classes "survived" and "deceased", also allowed an interpretation of the data. A comparison with the current methods A-priori and FP-tree confirmed the good usability of the algorithm. The heuristics led to runtime improvements. In particular, the ability to compute the importance of variables per class led to a reduction of variables in the input space and to the identification of important items. Some example rules were given for each data set. The earliness of rules yielded different results for the two databases: for the ASK data, the rules for the class "deceased" occur earlier than those for the class "survived"; for the MEDAN clinical data, it is the other way round. One explanation could be that, compared with the MEDAN clinical data, the ASK data represent a patient population with a different, special patient characteristic. Based on the similarity of the rules, a manageable number of reliable rules could be output for the user that are as dissimilar to one another as possible and are thus, taken together, of interest to a physician. Association rules and FP-tree rules do produce shorter rules, but these are too numerous and not sufficient (cf. [Pae02, Section 4]). In addition to the analysis of the symbolic data, the analysis of the metric MEDAN clinical data of the septic shock patients is also of interest. A combination of the analyses of the metric and symbolic data is likewise worthwhile. Such analyses have also been carried out; their results will be presented elsewhere.
Further applications of the generalization rules are conceivable. An improvement of the theoretical foundation (cf. [Pae02]) also appears worthwhile, since only the interplay of theoretical and practical efforts leads to the goal.
Paging is one of the prominent problems in the field of on-line algorithms. While in the deterministic setting there exist simple and efficient strongly competitive algorithms, in the randomized setting a tradeoff between competitiveness and memory is still not settled. Bein et al. [4] conjectured that there exist strongly competitive randomized paging algorithms, using o(k) bookmarks, i.e. pages not in cache that the algorithm keeps track of. Also in [4] the first algorithm using O(k) bookmarks (2k more precisely), Equitable2, was introduced, proving in the affirmative a conjecture in [7].
We prove tighter bounds for Equitable2, showing that it requires less than k bookmarks, more precisely ≈ 0.62k. We then give a lower bound for Equitable2 showing that it cannot both be strongly competitive and use o(k) bookmarks. Nonetheless, we show that it can trade competitiveness for space. More precisely, if its competitive ratio is allowed to be (H_k + t), then it requires k/(1 + t) bookmarks.
Our main result proves the conjecture that there exist strongly competitive paging algorithms using o(k) bookmarks. We propose an algorithm, denoted Partition2, which is a variant of the Partition algorithm by McGeoch and Sleator [13]. While classical Partition is unbounded in its space requirements, Partition2 uses Θ(k/log k) bookmarks. Furthermore, we show that this result is asymptotically tight when the forgiveness steps are deterministic.
Mobile Virtual Assistants are rising in popularity and come preinstalled on most new smartphones, yet theoretical work in the field is hard to come by, so a test is in order to establish the current state of development. We manually tested six Mobile Virtual Assistants in the categories Voice Recognition, Online Search, Phone Control and Natural Conversation. The results show that Siri is currently the best Mobile Virtual Assistant on the market, with an average success rate of 65.8% over all four categories.
We present an efficient variant of LLL-reduction of lattice bases in the sense of Lenstra, Lenstra, Lovász [LLL82]. We organize LLL-reduction in segments of size $k$. Local LLL-reduction of segments is done using local coordinates of dimension $2k$. Strong segment LLL-reduction yields bases of the same quality as LLL-reduction but the reduction is $n$-times faster for lattices of dimension $n$. We extend segment LLL-reduction to iterated subsegments. The resulting reduction algorithm runs in $O(n^3 \log n)$ arithmetic steps for integer lattices of dimension $n$ with basis vectors of length $2^{O(n)}$, compared to $O(n^5)$ steps for LLL-reduction.
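For orientation, the baseline that the segment variant is compared against can be sketched as follows. This is the textbook LLL-reduction with exact rational arithmetic, not the segment algorithm of the abstract; all function names and the parameter delta = 3/4 are assumptions for this sketch. It recomputes the Gram-Schmidt data from scratch in every step, which keeps the code short but is far slower than any of the step counts quoted above, and it assumes linearly independent integer basis vectors.

```python
from fractions import Fraction


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Return the orthogonalized vectors b*_i and coefficients mu[i][j]."""
    ortho, mu = [], []
    for b in basis:
        v = [Fraction(x) for x in b]
        coeffs = []
        for o in ortho:
            c = dot(b, o) / dot(o, o)
            coeffs.append(c)
            v = [vi - c * oi for vi, oi in zip(v, o)]
        ortho.append(v)
        mu.append(coeffs)
    return ortho, mu


def lll(basis, delta=Fraction(3, 4)):
    """Textbook LLL-reduction of a list of integer row vectors."""
    b = [list(row) for row in basis]
    k = 1
    while k < len(b):
        # Size-reduce b[k] against b[k-1], ..., b[0].
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(b)
            q = round(mu[k][j])
            if q:
                b[k] = [x - q * y for x, y in zip(b[k], b[j])]
        ortho, mu = gram_schmidt(b)
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:  # Lovász condition holds, move on
            k += 1
        else:           # swap and step back
            b[k], b[k - 1] = b[k - 1], b[k]
            k = max(k - 1, 1)
    return b


def is_lll_reduced(basis, delta=Fraction(3, 4)):
    """Check size-reduction and the Lovász condition for every index."""
    ortho, mu = gram_schmidt(basis)
    size_ok = all(abs(c) <= Fraction(1, 2) for row in mu for c in row)
    lovasz_ok = all(
        dot(ortho[k], ortho[k])
        >= (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        for k in range(1, len(basis))
    )
    return size_ok and lovasz_ok


reduced = lll([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])
print(reduced, is_lll_reduced(reduced))
```

The helper `is_lll_reduced` is included so that the output of `lll` can be checked against the two defining conditions instead of being trusted.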
We generalize the concept of block reduction for lattice bases from the $\ell_2$-norm to arbitrary norms. This extends the results of Schnorr. We give algorithms for block reduction and apply the resulting enumeration concept to solve subset sum problems. The deterministic algorithm solves all subset sum problems. For up to 66 weights it needs on average less than two hours on an HP 715/50 under HP-UX 9.05.