OPUS 4 | Search

A machine learning approach to preference strategies for anaphor resolution (2005)

In recent years, much effort has gone into the design of robust anaphor resolution algorithms. Many algorithms are based on manually designed antecedent filtering and preference strategies. Along a different line of research, corpus-based approaches have been investigated that employ machine-learning techniques to derive strategies automatically. Because they reduce the knowledge-engineering effort for designing and optimizing the strategies, the latter approaches are considered particularly attractive. However, since the hand-coding of robust antecedent filtering strategies such as syntactic disjoint reference and agreement in person, number, and gender constitutes a once-for-all effort, the question arises whether they should be derived automatically at all. This paper investigates what might be gained by combining the best of both worlds: designing the universally valid antecedent filtering strategies manually, in a once-for-all fashion, and deriving the (potentially genre-specific) antecedent selection strategies automatically by applying machine-learning techniques. An anaphor resolution system, ROSANA-ML, which follows this paradigm, is designed and implemented. A series of formal evaluations shows that, while exhibiting additional advantages, ROSANA-ML reaches a performance level comparable to that of its manually designed ancestor ROSANA.

A stable integer relation algorithm (1994)

Schnorr, Claus Peter ; Rössner, Carsten

We study the following problem: given $x \in \mathbb{R}^n$, either find a short integer relation $m \in \mathbb{Z}^n$, so that $\langle x, m \rangle = 0$ holds for the inner product $\langle \cdot, \cdot \rangle$, or prove that no short integer relation exists for $x$. Håstad, Just, Lagarias and Schnorr (1989) give a polynomial time algorithm for the problem. We present a stable variation of the HJLS algorithm that preserves lower bounds on $\lambda(x)$ for infinitesimal changes of $x$. Given $x \in \mathbb{R}^n$ and $\alpha \in \mathbb{N}$, this algorithm finds a nearby point $x'$ and a short integer relation $m$ for $x'$. The nearby point $x'$ is 'good' in the sense that no very short relation exists for points $\bar{x}$ within half the $x'$-distance from $x$. On the other hand, if $x' = x$ then $m$ is, up to a factor $2^{n/2}$, a shortest integer relation for $x$. Our algorithm uses, for arbitrary real input $x$, at most $O(n^4(n+\log \alpha))$ arithmetical operations on real numbers. If $x$ is rational, the algorithm operates on integers having at most $O(n^5+n^3 (\log \alpha)^2 + \log (\|q x\|^2))$ bits, where $q$ is the common denominator for $x$.
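To make the problem statement concrete, the following is a minimal sketch of what an integer relation is. It is not the HJLS algorithm or its stable variant: it is an exponential brute-force search over a bounded box, restricted to integer input so that the inner product test is exact. The function name `shortest_integer_relation` and the parameter `bound` are illustrative assumptions.

```python
from itertools import product


def shortest_integer_relation(x, bound):
    """Brute-force illustration (assumed helper, not HJLS).

    Searches all nonzero m in [-bound, bound]^n with <x, m> = 0 and
    returns one of minimal Euclidean length, or None if none exists
    inside the box. x must contain integers so the test is exact.
    """
    best = None  # (squared length, relation)
    for m in product(range(-bound, bound + 1), repeat=len(x)):
        if not any(m):
            continue  # the zero vector is not a relation
        if sum(xi * mi for xi, mi in zip(x, m)) == 0:
            norm2 = sum(mi * mi for mi in m)
            if best is None or norm2 < best[0]:
                best = (norm2, m)
    return None if best is None else best[1]


relation = shortest_integer_relation((1, 2, 3), bound=2)
```

The search cost grows as (2·bound + 1)^n, which is exactly why polynomial-time methods such as the one described in the abstract matter.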

Approximating good simultaneous diophantine approximations is almost NP-hard (1997)

Rössner, Carsten ; Seifert, Jean-Pierre

Given a real vector $\alpha = (\alpha_1, \ldots, \alpha_d)$ and a real number $\varepsilon > 0$, a good Diophantine approximation to $\alpha$ is a number $Q$ such that $\|Q\alpha \bmod \mathbb{Z}\|_\infty \le \varepsilon$, where $\|\cdot\|_\infty$ denotes the $\ell_\infty$-norm $\|x\|_\infty := \max_{1 \le i \le d} |x_i|$ for $x = (x_1, \ldots, x_d)$. Lagarias [12] proved the NP-completeness of the corresponding decision problem, i.e., given a vector $\alpha \in \mathbb{Q}^d$, a rational number $\varepsilon > 0$ and a number $N \in \mathbb{N}_+$, decide whether there exists a number $Q$ with $1 \le Q \le N$ and $\|Q\alpha \bmod \mathbb{Z}\|_\infty \le \varepsilon$. We prove that, unless ...
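The decision problem can be illustrated with a naive search, assuming the condition is read as: the largest distance of any Q·alpha_i to its nearest integer is at most the given tolerance. The sketch below tries every Q from 1 to N using exact rational arithmetic; it only illustrates the definition and says nothing about the hardness result. The names `dist_to_nearest_integer` and `good_approximation` are illustrative assumptions.

```python
from fractions import Fraction


def dist_to_nearest_integer(t):
    """Distance from the rational t to the nearest integer."""
    frac = t - (t.numerator // t.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def good_approximation(alpha, eps, N):
    """Return the smallest Q in 1..N whose multiples Q*alpha_i all lie
    within eps of an integer, or None if no such Q exists.

    Assumed helper for illustration; runs in time linear in N, i.e.
    exponential in the bit length of N.
    """
    for Q in range(1, N + 1):
        if max(dist_to_nearest_integer(Q * a) for a in alpha) <= eps:
            return Q
    return None


alpha = [Fraction(1, 3), Fraction(1, 2)]
exact_Q = good_approximation(alpha, Fraction(0), 10)
loose_Q = good_approximation(alpha, Fraction(1, 3), 10)
```

With a tolerance of zero, the search returns the least common denominator of the entries; loosening the tolerance admits smaller values of Q.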

Black Box Cryptanalysis of Hash Networks based on Multipermutations (1994)

Schnorr, Claus Peter ; Vaudenay, Serge

Black box cryptanalysis applies to hash algorithms consisting of many small boxes, connected by a known graph structure, so that the boxes can be evaluated forward and backwards by given oracles. We study attacks that work for any choice of the black boxes, i.e. we scrutinize the given graph structure. For example, we analyze the graph of the fast Fourier transform (FFT). We present optimal black box inversions of FFT-compression functions and black box constructions of collisions. This determines the minimal depth of FFT-compression networks for collision-resistant hashing. We propose the concept of multipermutation, which is a pair of orthogonal Latin squares, as a new cryptographic primitive that generalizes the boxes of the FFT. Our examples of multipermutations are based on the operations of circular rotation, bitwise XOR, addition and multiplication.
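The notion of a pair of orthogonal Latin squares can be checked mechanically. The sketch below is a generic illustration, not one of the constructions from the paper: it builds two squares over the integers modulo 5 from the assumed maps (r, c) -> r + c and (r, c) -> r + 2c, then verifies that each is a Latin square and that superimposing them yields every ordered pair exactly once. All function and variable names are illustrative assumptions.

```python
from itertools import product


def is_latin_square(square):
    """True if every row and every column contains 0..n-1 exactly once."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = rows_ok and all(
        {square[r][c] for r in range(n)} == symbols for c in range(n)
    )
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """True if the superimposed cells (a[r][c], b[r][c]) are all distinct."""
    n = len(a)
    pairs = {(a[r][c], b[r][c]) for r, c in product(range(n), repeat=2)}
    return len(pairs) == n * n


n = 5  # assumed odd modulus so that doubling is invertible
square_add = [[(r + c) % n for c in range(n)] for r in range(n)]
square_mix = [[(r + 2 * c) % n for c in range(n)] for r in range(n)]

pair_is_orthogonal = are_orthogonal(square_add, square_mix)
```

Orthogonality means the combined map from (r, c) to the pair of cell values is a bijection, which is the property that makes changing one input alter both outputs.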

Block reduction for arbitrary norms (1994)

Kaib, Michael ; Ritter, Harald

We generalize the concept of block reduction for lattice bases from the l2-norm to arbitrary norms. This extends the results of Schnorr. We give algorithms for block reduction and apply the resulting enumeration concept to solve subset sum problems. The deterministic algorithm solves all subset sum problems. For up to 66 weights it needs on average less than two hours on an HP 715/50 under HP-UX 9.05.

Credit card fraud detection by adaptive neural data mining (1999)

Brause, Rüdiger W. ; Langsdorf, Timm Sebastian ; Hepp, Hans-Michael

The prevention of credit card fraud is an important application for prediction techniques. One major obstacle to using neural network training techniques is the high diagnostic quality required: since only one financial transaction in a thousand is invalid, no prediction success rate below 99.9% is acceptable. Because of these proportions in credit card transactions, completely new concepts had to be developed and tested on real credit card data. This paper shows how advanced data mining techniques and neural network algorithms can be combined successfully to obtain high fraud coverage combined with a low false alarm rate.
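The 99.9% threshold follows from simple arithmetic: with one invalid transaction in a thousand, a detector that never raises an alarm is already right 999 times out of 1000 while catching no fraud at all. The sketch below works this out with exact fractions. The counts are hypothetical round numbers chosen to match that ratio, not data from the paper, and the function name is an illustrative assumption.

```python
from fractions import Fraction


def accuracy_and_coverage(n_legit, n_fraud, frauds_caught, false_alarms):
    """Return (accuracy, fraud coverage, false alarm rate) as fractions.

    Hypothetical helper: all four arguments are raw transaction counts.
    """
    correct = (n_legit - false_alarms) + frauds_caught
    accuracy = Fraction(correct, n_legit + n_fraud)
    coverage = Fraction(frauds_caught, n_fraud)
    false_alarm_rate = Fraction(false_alarms, n_legit)
    return accuracy, coverage, false_alarm_rate


# A detector that flags nothing: 999 legitimate, 1 fraudulent transaction.
baseline = accuracy_and_coverage(999, 1, frauds_caught=0, false_alarms=0)

# A detector that catches the fraud but raises 5 false alarms is less
# accurate overall, even though it is the more useful of the two.
alerting = accuracy_and_coverage(999, 1, frauds_caught=1, false_alarms=5)
```

This is why the abstract states the goal in terms of fraud coverage and false alarm rate rather than raw accuracy.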

Diophantine approximation of a plane (1997)

Rössner, Carsten ; Schnorr, Claus Peter

Intersection-based generalization rules, Part 1: Fundamentals [Durchschnittsbasierte Generalisierungsregeln : Teil 1., Grundlagen] (2002)

Paetz, Jürgen

Intersection-based generalization rules, Part 2: Analysis of data from septic shock patients [Durchschnittsbasierte Generalisierungsregeln : Teil 2., Analyse von Daten septischer Schock-Patienten] (2002)

Paetz, Jürgen ; Brause, Rüdiger W.

In this report, the "GenDurchschnitt" method introduced in [Pae02] was applied to the symbolic data of two databases of septic shock patients. For each database, generalization rules were generated that allowed not only a robust classification of the patients into the classes "survived" and "deceased" but also an interpretation of the data. A comparison with the current methods A-priori and FP-tree confirmed the good usability of the algorithm. The heuristics led to runtime improvements. In particular, the ability to compute the importance of variables per class led to a reduction of variables in the input space and to the identification of important items. Some example rules were given for each data set. The earliness of rules yielded a different result for the two databases: in the ASK data, the rules for the class "deceased" occur earlier than those for the class "survived"; in the MEDAN clinical data, it is the other way round. One explanation could be that, compared with the MEDAN clinical data, the ASK data represent a patient population with different, special patient characteristics. Based on the similarity of the rules, a manageable number of reliable rules could be output for the user that are as dissimilar to one another as possible and are therefore, taken together, of interest to a physician. Association rules and FP-tree rules do produce shorter rules, but these are too numerous and not sufficient (cf. [Pae02, Section 4]). In addition to the analysis of the symbolic data, the analysis of the metric MEDAN clinical data of the septic shock patients is also of interest. A combination of the analyses of the metric and symbolic data is likewise useful. Such analyses were also carried out; their results will be presented elsewhere.
Further applications of the generalization rules are conceivable. An improvement of the theoretical foundation (cf. [Pae02]) also appears worthwhile, since only the interplay of theoretical and practical efforts leads to the goal.

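Subject-specific appendix to the SPoL (Part III): Computer science as a subject in the L3 degree programme [Fachspezifischer Anhang zur SPoL (Teil III) : Studienfach Informatik im Studiengang L3] (2006)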

Open Access

25 search hits