For the efficient management of large image databases, the automated characterization of images and the use of that characterization for search and ordering tasks are highly desirable. The purpose of the project SEMACODE is to combine the still unsolved problem of content-oriented characterization of images with scale-invariant object recognition and model-based compression methods. To achieve this goal, existing techniques as well as new concepts related to pattern matching, image encoding, and image compression are examined. The resulting methods are integrated into a common framework with the aid of a content-oriented conception. The required operations are developed for a concrete application: an image database at the library of the University of Frankfurt/Main (StUB; about 60,000 images). The search and query interfaces are defined in close cooperation with the StUB project “Digitized Colonial Picture Library”. This report describes the fundamentals and first results of the image encoding and object recognition algorithms developed within the scope of the project.
The prevention of credit card fraud is an important application for prediction techniques. One major obstacle to using neural network training techniques is the high diagnostic quality required: since only about one financial transaction in a thousand is fraudulent, no prediction accuracy below 99.9% is acceptable. Because of these proportions, completely new concepts had to be developed and tested on real credit card data. This paper shows how advanced data mining techniques and neural network algorithms can be combined successfully to obtain high fraud coverage together with a low false alarm rate.
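The 1:1000 class imbalance is the crux: a classifier that never flags fraud already reaches 99.9% accuracy, so accuracy alone cannot distinguish a useful model from a trivial one. A minimal illustrative sketch (not the paper's method) of why coverage and false-alarm rate must be measured separately:

```python
# Illustration of why raw accuracy is misleading at a 1:1000 fraud rate:
# a classifier that flags nothing is 99.9% "accurate" yet catches no fraud.
def confusion_counts(labels, preds):
    tp = sum(1 for y, p in zip(labels, preds) if y and p)
    fp = sum(1 for y, p in zip(labels, preds) if not y and p)
    fn = sum(1 for y, p in zip(labels, preds) if y and not p)
    tn = sum(1 for y, p in zip(labels, preds) if not y and not p)
    return tp, fp, fn, tn

labels = [1] * 1 + [0] * 999   # one fraudulent transaction in a thousand
preds = [0] * 1000             # trivial "never fraud" classifier

tp, fp, fn, tn = confusion_counts(labels, preds)
accuracy = (tp + tn) / len(labels)        # 0.999 -- looks excellent
coverage = tp / (tp + fn)                 # fraud coverage (recall): 0.0
false_alarm_rate = fp / (fp + tn)         # 0.0 -- but so is the coverage
```

This is why the combination of high fraud coverage with a low false-alarm rate, rather than accuracy, is the success criterion named in the abstract.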
The wide-area deployment of WiFi hot spots challenges IP access providers. While providers are seeking new profit models, both the profitability and the logistics of large-scale deployment of 802.11 wireless technology are still to be proven. Expenditure for hardware, locations, maintenance, connectivity, marketing, billing, and customer care must be considered. Even for large carriers with existing infrastructure, deploying a large-scale WiFi infrastructure may be risky. This paper proposes a multi-level scheme for hot spot distribution and customer acquisition that reduces the financial risk, cost of marketing, and cost of maintenance for the large-scale deployment of WiFi hot spots.
With the ubiquitous use of digital camera devices, especially in mobile phones, privacy is no longer threatened by governments and companies only. The new technology creates a new threat from ordinary people, who now have the means to take and distribute pictures of one’s face at no risk and little cost in any situation in public and private spaces. Fast distribution via web-based photo albums, online communities, and web pages exposes an individual’s private life to the public in unprecedented ways. Social and legal measures are increasingly taken to deal with this problem, but they lack effectiveness because they are hard to enforce in practice. In this paper, we discuss a supportive infrastructure targeting the distribution channel: as soon as a picture is publicly available, the exposed individual has a chance to find it and take proper action.
We generalize the concept of block reduction for lattice bases from the l2-norm to arbitrary norms, extending results of Schnorr. We give algorithms for block reduction and apply the resulting enumeration concept to solve subset sum problems. The deterministic algorithm solves all subset sum problems; for up to 66 weights it needs on average less than two hours on an HP 715/50 under HP-UX 9.05.
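The reduction from subset sum to lattice problems works by encoding the weights and target into a basis whose short vectors correspond to solutions. A hedged sketch of the standard embedding (in the style of Lagarias and Odlyzko; the report's exact basis construction and block-reduction step are not reproduced here, and `N` is an illustrative scaling parameter):

```python
def subset_sum_basis(weights, target, N=1):
    """Standard lattice embedding of a subset sum instance: a 0/1 solution
    x with sum(x_i * a_i) == target yields a short lattice vector whose
    last coordinate is zero. N scales the last column to penalize misses."""
    n = len(weights)
    basis = []
    for i, a in enumerate(weights):
        row = [0] * n + [N * a]
        row[i] = 1
        basis.append(row)
    basis.append([0] * n + [-N * target])  # last row encodes the target
    return basis

B = subset_sum_basis([3, 5, 7], 8)
# x = (1, 1, 0) solves 3 + 5 = 8; the integer combination b1 + b2 + b4
# is the short vector (1, 1, 0, 0). Lattice reduction (e.g. the block
# reduction of the report) is what finds such short vectors in practice.
v = [B[0][j] + B[1][j] + B[3][j] for j in range(len(B[0]))]
```

The abstract's enumeration concept then searches the reduced basis for exactly these short 0/1-coordinate vectors.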
We present an efficient variant of LLL-reduction of lattice bases in the sense of Lenstra, Lenstra, Lovász [LLL82]. We organize LLL-reduction in segments of size k. Local LLL-reduction of segments is done using local coordinates of dimension 2k. Strong segment LLL-reduction yields bases of the same quality as LLL-reduction, but the reduction is n times faster for lattices of dimension n. We extend segment LLL-reduction to iterated subsegments. The resulting reduction algorithm runs in O(n^3 log n) arithmetic steps for integer lattices of dimension n with basis vectors of length 2^O(n), compared to O(n^5) steps for LLL-reduction.
Since Mobile Virtual Assistants are rising in popularity, ship with most new smartphones out of the box, and little theoretical work in the field exists, a test is in order to establish the status quo of development. We manually tested six different Mobile Virtual Assistants in the categories Voice Recognition, Online Search, Phone Control, and Natural Conversation; the results show that Siri is currently the best Mobile Virtual Assistant on the market, with an average success rate of 65.8% across all four categories.
Paging is one of the prominent problems in the field of online algorithms. While in the deterministic setting there exist simple and efficient strongly competitive algorithms, in the randomized setting the tradeoff between competitiveness and memory is still not settled. Bein et al. [4] conjectured that there exist strongly competitive randomized paging algorithms using o(k) bookmarks, i.e. pages not in cache that the algorithm keeps track of. Also in [4], Equitable2, the first algorithm using O(k) bookmarks (2k, more precisely), was introduced, settling a conjecture from [7] in the affirmative.
We prove tighter bounds for Equitable2, showing that it requires less than k bookmarks, more precisely ≈ 0.62k. We then give a lower bound for Equitable2, showing that it cannot both be strongly competitive and use o(k) bookmarks. Nonetheless, we show that it can trade competitiveness for space: if its competitive ratio is allowed to be (H_k + t), then it requires k/(1 + t) bookmarks.
Our main result proves the conjecture that there exist strongly competitive paging algorithms using o(k) bookmarks. We propose an algorithm, denoted Partition2, which is a variant of the Partition algorithm by McGeoch and Sleator [13]. While classical Partition is unbounded in its space requirements, Partition2 uses Θ(k/log k) bookmarks. Furthermore, we show that this result is asymptotically tight when the forgiveness steps are deterministic.
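For context on the randomized paging setting these results live in, a hedged sketch of the classical randomized marking algorithm, the standard (2H_k)-competitive baseline; this is not Partition2 or Equitable2, and it keeps no bookmarks at all, which is precisely what the bookmark-based algorithms above improve on in terms of competitive ratio:

```python
import random

def randomized_marking(requests, k, rng=random):
    """Classical randomized marking algorithm for paging with cache size k:
    on a fault with a full cache, evict a uniformly random unmarked page;
    when every cached page is marked, a new phase begins and marks reset.
    Returns the number of page faults. Illustrative baseline sketch."""
    cache, marked, faults = set(), set(), 0
    for p in requests:
        if p not in cache:
            faults += 1
            if len(cache) == k:
                unmarked = cache - marked
                if not unmarked:          # phase ends: unmark everything
                    marked.clear()
                    unmarked = set(cache)
                cache.remove(rng.choice(sorted(unmarked)))
            cache.add(p)
        marked.add(p)
    return faults
```

A bookmark, in the terminology above, would be a page outside `cache` whose identity the algorithm still remembers; marking forgets evicted pages entirely.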
In this report, the procedure "GenDurchschnitt" introduced in [Pae02] was applied to the symbolic data of two databases of septic shock patients. In each case, generalization rules were generated that, besides a robust classification of the patients into the classes "survived" and "deceased", also enabled an interpretation of the data. A comparison with the established methods Apriori and FP-tree demonstrated the good applicability of the algorithm. The heuristics led to runtime improvements. In particular, the ability to compute the importance of variables per class led to a reduction of variables in the input space and to the identification of important items. Some example rules were given for each data set. The earliness of rules yielded different results for the two databases: in the ASK data, the rules for the class "deceased" appear earlier than those for the class "survived"; in the MEDAN clinical data it is the other way around. One explanation could be that, compared to the MEDAN clinical data, the ASK data represent a patient collective with a different, specific patient profile. Based on the similarity of the rules, a manageable number of reliable rules could be output for the user, rules that are as dissimilar to one another as possible and thus interesting to a physician in their entirety. Association rules and FP-tree rules do produce shorter rules, but these are too numerous and not sufficient (cf. [Pae02, Section 4]). In addition to the analysis of the symbolic data, the analysis of the metric MEDAN clinical data of the septic shock patients is also of interest, as is a combination of the analyses of the metric and symbolic data. Such analyses have also been carried out; their results will be presented elsewhere.
Further applications of the generalization rules are conceivable. An improvement of the theoretical foundation (cf. [Pae02]) also appears worthwhile, since only the interplay of theoretical and practical efforts leads to the goal.
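For context on the baselines named above, a minimal sketch of Apriori frequent-itemset mining (illustrative only; this is neither the code of [Pae02] nor the FP-tree variant, and the transactions and threshold are made up):

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Minimal Apriori: level-wise search for frequent itemsets, pruning
    any candidate with an infrequent subset. Returns {itemset: support}."""
    def support(c):
        return sum(1 for t in transactions if c <= t) / len(transactions)

    items = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    level = {c for c in items if support(c) >= min_support}
    k = 1
    while level:
        frequent.update({c: support(c) for c in level})
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        level = {c for c in candidates
                 if support(c) >= min_support
                 and all(frozenset(s) in frequent
                         for s in combinations(c, k - 1))}
    return frequent

# Toy run: items 'a'/'b'/'c' stand in for symbolic patient attributes.
T = [frozenset('ab'), frozenset('abc'), frozenset('bc'), frozenset('b')]
F = apriori(T, 0.5)
```

Apriori outputs every frequent itemset above the threshold; the report's point is that the resulting rules, while shorter, are too numerous for a physician, which motivates the similarity-based selection of a small, dissimilar rule set.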