Informatik
Refine
Year of publication
Document Type
- Preprint (748)
- Article (400)
- Working Paper (119)
- Doctoral Thesis (92)
- Diploma Thesis (46)
- Conference Proceeding (41)
- Book (37)
- Bachelor Thesis (35)
- diplomthesis (29)
- Report (25)
- Part of a Book (13)
- Contribution to a Periodical (10)
- Master's Thesis (6)
- Habilitation (1)
- Lecture (1)
- Review (1)
Has Fulltext
- yes (1604)
Is part of the Bibliography
- no (1604)
Keywords
Institute
- Informatik (1604)
- Frankfurt Institute for Advanced Studies (FIAS) (1002)
- Physik (982)
- Mathematik (55)
- Präsidium (41)
- Medizin (25)
- Biowissenschaften (21)
- Exzellenzcluster Makromolekulare Komplexe (8)
- Psychologie (8)
- Deutsches Institut für Internationale Pädagogische Forschung (DIPF) (5)
Poster presentation: Introduction Dopaminergic neurons in the midbrain show a variety of firing patterns, ranging from very regular firing pacemaker cells to bursty and irregular neurons. The effects of different experimental conditions (like pharmacological treatment or genetical manipulations) on these neuronal discharge patterns may be subtle. Applying a stochastic model is a quantitative approach to reveal these changes. ...
Poster presentation: Introduction The brain is a highly interconnected network of constantly interacting units. Understanding the collective behavior of these units requires a multi-dimensional approach. The results of such analyses are hard to visualize and interpret. Hence tools capable of dealing with such tasks become imperative. ....
The pathogenesis of nodular lymphocyte–predominant Hodgkin lymphoma (NLPHL) and its relationship to other lymphomas are largely unknown. This is partly because of the technical challenge of analyzing its rare neoplastic lymphocytic and histiocytic (L&H) cells, which are dispersed in an abundant nonneoplastic cellular microenvironment. We performed a genome-wide expression study of microdissected L&H lymphoma cells in comparison to normal and other malignant B cells that indicated a relationship of L&H cells to and/or that they originate from germinal center B cells at the transition to memory B cells. L&H cells show a surprisingly high similarity to the tumor cells of T cell–rich B cell lymphoma and classical Hodgkin lymphoma, a partial loss of their B cell phenotype, and deregulation of many apoptosis regulators and putative oncogenes. Importantly, L&H cells are characterized by constitutive nuclear factor {kappa}B activity and aberrant extracellular signal-regulated kinase signaling. Thus, these findings shed new light on the nature of L&H cells, reveal several novel pathogenetic mechanisms in NLPHL, and may help in differential diagnosis and lead to novel therapeutic strategies.
Background Although current molecular clock methods offer greater flexibility in modelling historical evolutionary events, calibration of the clock with dates from the fossil record is still problematic for many groups. Here we implement several new approaches in molecular dating to estimate evolutionary ages of Lacertidae, an Old World family of lizards with a poor fossil record and uncertain phylogeny. Four different models of rate variation are tested in a new program for Bayesian phylogenetic analysis called TreeTime, based on a combination of mitochondrial and nuclear gene sequences. We incorporate paleontological uncertainty into divergence estimates by expressing multiple calibration dates as a range of probabilistic distributions. We also test the reliability of our proposed calibrations by exploring effects of individual priors on posterior estimates. Results According to the most reliable model, as indicated by Bayes factor comparison, modern lacertids arose shortly after the K/T transition and entered Africa about 45 million years ago, with the majority of their African radiation occurring in the Eocene and Oligocene. Our findings indicate much earlier origins for these clades than previously reported, and we discuss our results in light of paleogeographic trends during the Cenozoic. Conclusions This study represents the first attempt to estimate evolutionary ages of a specific group of reptiles exhibiting uncertain phylogenetic relationships, molecular rate variation and a poor fossil record. Our results emphasize the sensitivity of molecular divergence dates to fossil calibrations, and support the use of combined molecular data sets and multiple, well-spaced dates from the fossil record as minimum node constraints. The bioinformatics program used here, TreeTime, is publicly available, and we recommend its use for molecular dating of taxa faced with similar challenges.
Im heutigen Zahlungsverkehr übernehmen in zunehmendem Maße Zahlungen mit Kreditkarten eine entscheidende Rolle. Entsprechend der Verbreitung dieser Art des Zahlungsverkehrs nimmt ebenfalls der Mißbrauch mit diesem bargeldlosen Zahlungsmittel zu. Um die Verluste, die bei dem Kreditkarteninstitut auf diese Weise entstehen, so weit wie möglich einzudämmen, wird versucht, Mißbrauchstransaktionen bei der Autorisierung der Zahlungsaufforderung zu erkennen. Ziel dieser Diplomarbeit ist es zu bestimmen, in wie weit es möglich ist, illegale Transaktionen aus der Menge von Autorisierungsanfragen mit Hilfe adaptiver Algorithmen aufzudecken. Dabei sollen sowohl Methoden aus dem Bereich des Data-Mining, als auch aus den Bereichen der neuronalen Netze benutzt werden. Erschwerend bei der Mißbrauchsanalyse kommt hinzu, daß die Beurteilung der einzelnen Transaktionen in Sekundenbruchteilen abgeschlossen sein muß, um die hohe Anzahl an Autorisierungsanfragen verarbeiten zu können und den Kundenservice auf Seiten des Benutzers und des Händlers auf diese Weise zu optimieren. Weiter handelt es sich bei einem Großteil der bei der Analyse zu Verfügung stehenden Datensätze um symbolische Daten, also alpha-numerisch kodierte Werte, die stellvertretend für verschiedene Eigenschaften verwendet werden. Nur wenige der Transaktionsdaten sind analoger Natur, weisen also eine Linearität auf, die es erlaubt, "Nachbarschaften" zwischen den Daten bestimmen zu können. Damit scheidet eine reine Analyse auf Basis von neuronalen Netzwerken aus. Diese Problematik führte unter anderem zu dem verfolgten Ansatz. Als Grundlage der Analyse dienen bekannte Mißbrauchstransaktionen aus einem Zeitintervall von ungefähr einem Jahr, die jedoch aufgrund der hohen Anzahl nicht komplett als solche mit den eingehenden Transaktionen verglichen werden können, da ein sequentieller Vergleich zu viel Zeit in Anspruch nähme. Im übrigen würde durch einen einfachen Vergleich nur der schon bekannte Mißbrauch erkannt werden; eine Abstraktion der Erkenntnisse aus den Mißbrauchserfahrungen ist nicht möglich. Aus diesem Grund werden diese Mißbrauchstransaktionen mit Hilfe von Methoden aus dem Bereich des Data-Mining verallgemeinert und damit auf ein Minimum, soweit es die Verläßlichkeit dieser Datensätze zuläßt, reduziert. Desweiteren schließt sich eine Analyse der zu diesem Zeitpunkt noch nicht betrachteten analogen Daten an, um die maximale, enthaltene Information aus den Transaktionsdaten zu beziehen. Dafür werden moderne Methoden aus dem Bereich der neuronalen Netzwerke, sogenannte radiale Basisfunktionsnetze, verwendet. Da eine Mißbrauchsanalyse ohne eine entsprechende Profilanalyse unvollständig wäre, wurde abschließend mit den vorhanden Mitteln auf den zugrunde liegenden Daten in Anlehnung an die bisherige Methodik eine solche Profilauswertung und zeitabhängige Analyse realisiert. Mit dem so implementierten Modell wurde versucht, auf allgemeine Art und Weise, Verhaltens- beziehungsweise Transaktionsmuster einzuordnen und mit bei der Mißbrauchsentscheidung einfließen zu lassen. Aus den vorgestellten Analyseverfahren wurden verschiedene Klassifizierungsmodelle entwickelt, die zu guten Ergebnissen auf den Simulationsdaten führen. Es kann gezeigt werden, daß die Mißbrauchserkennung durch eine kombinierte Anwendung aus symbolischer und analoger Auswertung bestmöglich durchzuführen ist.
FIFO is the most prominent queueing strategy due to its simplicity and the fact that it only works with local information. Its analysis within the adversarial queueing theory however has shown, that there are networks that are not stable under the FIFO protocol, even at arbitrarily low rate. On the other hand there are networks that are universally stable, i.e., they are stable under every greedy protocol at any rate r < 1. The question as to which networks are stable under the FIFO protocol arises naturally. We offer the first polynomial time algorithm for deciding FIFO stability and simple-path FIFO stability of a directed network, answering an open question posed in [1, 4]. It turns out, that there are networks, that are FIFO stable but not universally stable, hence FIFO is not a worst case protocol in this sense. Our characterization of FIFO stability is constructive and disproves an open characterization in [4].
The efficient management of large multimedia databases requires the development of new techniques to process, characterize, and search for multimedia objects. Especially in the case of image data, the rapidly growing amount of documents prohibits a manual description of the images’ content. Instead, the automated characterization is highly desirable to support annotation and retrieval of digital images. However, this is a very complex and still unsolved task. To contribute to a solution of this problem, we have developed a mechanism for recognizing objects in images based on the query by example paradigm. Therefore, the most salient image features of an example image representing the searched object are extracted to obtain a scale-invariant object model. The use of this model provides an efficient and robust strategy for recognizing objects in images independently of their size. Further applications of the mechanism are classical recognition tasks such as scene decomposition or object tracking in video sequences.
For the efficient management of large image databases, the automated characterization of images and the usage of that characterization for searching and ordering tasks is highly desirable. The purpose of the project SEMACODE is to combine the still unsolved problem of content-oriented characterization of images with scale-invariant object recognition and modelbased compression methods. To achieve this goal, existing techniques as well as new concepts related to pattern matching, image encoding, and image compression are examined. The resulting methods are integrated in a common framework with the aid of a content-oriented conception. For the application, an image database at the library of the university of Frankfurt/Main (StUB; about 60000 images), the required operations are developed. The search and query interfaces are defined in close cooperation with the StUB project “Digitized Colonial Picture Library”. This report describes the fundamentals and first results of the image encoding and object recognition algorithms developed within the scope of the project.
The prevention of credit card fraud is an important application for prediction techniques. One major obstacle for using neural network training techniques is the high necessary diagnostic quality: Since only one financial transaction of a thousand is invalid no prediction success less than 99.9% is acceptable. Due to these credit card transaction proportions complete new concepts had to be developed and tested on real credit card data. This paper shows how advanced data mining techniques and neural network algorithm can be combined successfully to obtain a high fraud coverage combined with a low false alarm rate.
Classically, encoding of images by only a few, important components is done by the Principal Component Analysis (PCA). Recently, a data analysis tool called Independent Component Analysis (ICA) for the separation of independent influences in signals has found strong interest in the neural network community. This approach has also been applied to images. Whereas the approach assumes continuous source channels mixed up to the same number of channels by a mixing matrix, we assume that images are composed by only a few image primitives. This means that for images we have less sources than pixels. Additionally, in order to reduce unimportant information, we aim only for the most important source patterns with the highest occurrence probabilities or biggest information called „Principal Independent Components (PIC)“. For the example of a synthetic picture composed by characters this idea gives us the most important ones. Nevertheless, for natural images where no a-priori probabilities can be computed this does not lead to an acceptable reproduction error. Combining the traditional principal component criteria of PCA with the independence property of ICA we obtain a better encoding. It turns out that this definition of PIC implements the classical demand of Shannon’s rate distortion theory.
We study the effect of randomness in the adversarial queueing model. All proofs of instability for deterministic queueing strategies exploit a finespun strategy of insertions by an adversary. If the local queueing decisions in the network are subject to randomness, it is far from obvious, that an adversary can still trick the network into instability. We show that uniform queueing is unstable even against an oblivious adversary. Consequently, randomizing the queueing decisions made to operate a network is not in itself a suitable fix for poor network performances due to packet pileups.
This paper describes a method to treat contextual equivalence in polymorphically typed lambda-calculi, and also how to transfer equivalences from the untyped versions of lambda-calculi to their typed variant, where our specific calculus has letrec, recursive types and is nondeterministic. An addition of a type label to every subexpression is all that is needed, together with some natural constraints for the consistency of the type labels and well-scopedness of expressions. One result is that an elementary but typed notion of program transformation is obtained and that untyped contextual equivalences also hold in the typed calculus as long as the expressions are well-typed. In order to have a nice interaction between reduction and typing, some reduction rules have to be accompanied with a type modification by generalizing or instantiating types.
Motivated by the question of correctness of a specific implementation of concurrent buffers in the lambda calculus with futures underlying Alice ML, we prove that concurrent buffers and handled futures can correctly encode each other. Correctness means that our encodings preserve and reflect the observations of may- and must-convergence. This also shows correctness wrt. program semantics, since the encodings are adequate translations wrt. contextual semantics. While these translations encode blocking into queuing and waiting, we also provide an adequate encoding of buffers in a calculus without handles, which is more low-level and uses busy-waiting instead of blocking. Furthermore we demonstrate that our correctness concept applies to the whole compilation process from high-level to low-level concurrent languages, by translating the calculus with buffers, handled futures and data constructors into a small core language without those constructs.
We show on an abstract level that contextual equivalence in non-deterministic program calculi defined by may- and must-convergence is maximal in the following sense. Using also all the test predicates generated by the Boolean, forall- and existential closure of may- and must-convergence does not change the contextual equivalence. The situation is different if may- and total must-convergence is used, where an expression totally must-converges if all reductions are finite and terminate with a value: There is an infinite sequence of test-predicates generated by the Boolean, forall- and existential closure of may- and total must-convergence, which also leads to an infinite sequence of different contextual equalities.
Driving can be dangerous. Humans become inattentive when performing a monotonous task like driving. Also the risk implied while multi-tasking, like using the cellular phone while driving, can break the concentration of the driver and increase the risk of accidents. Others factors like exhaustion, nervousness and excitement affect the performance of the driver and the response time. Consequently, car manufacturers have developed systems in the last decades which assist the driver under various circumstances. These systems are called driver assistance systems. Driver assistance systems are meant to support the task of driving, and the field of action varies from alerting the driver, with acoustical or optical warnings, to taking control of the car, such as keeping the vehicle in the traffic lane until the driver resumes control. For such a purpose, the vehicle is equipped with on-board sensors which allow the perception of the environment and/or the state of the vehicle. Cameras are sensors which extract useful information about the visual appearance of the environment. Additionally, a binocular system allows the extraction of 3D information. One of the main requirements for most camera-based driver assistance systems is the accurate knowledge of the motion of the vehicle. Some sources of information, like velocimeters and GPS, are of common use in vehicles today. Nevertheless, the resolution and accuracy usually achieved with these systems are not enough for many real-time applications. The computation of ego-motion from sequences of stereo images for the implementation of driving intelligent systems, like autonomous navigation or collision avoidance, constitutes the core of this thesis. This dissertation proposes a framework for the simultaneous computation of the 6 degrees of freedom of ego-motion (rotation and translation in 3D Euclidean space), the estimation of the scene structure and the detection and estimation of independently moving objects. The input is exclusively provided by a binocular system and the framework does not call for any data acquisition strategy, i.e. the stereo images are just processed as they are provided. Stereo allows one to establish correspondences between left and right images, estimating 3D points of the environment via triangulation. Likewise, feature tracking establishes correspondences between the images acquired at different time instances. When both are used together for a large number of points, the result is a set of clouds of 3D points with point-to-point correspondences between clouds. The apparent motion of the 3D points between consecutive frames is caused by a variety of reasons. The most dominant motion for most of the points in the clouds is caused by the ego-motion of the vehicle; as the vehicle moves and images are acquired, the relative position of the world points with respect to the vehicle changes. Motion is also caused by objects moving in the environment. They move independently of the vehicle motion, so the observed motion for these points is the sum of the ego-vehicle motion and the independent motion of the object. A third reason, and of paramount importance in vision applications, is caused by correspondence problems, i.e. the incorrect spatial or temporal assignment of the point-to-point correspondence. Furthermore, all the points in the clouds are actually noisy measurements of the real unknown 3D points of the environment. Solving ego-motion and scene structure from the clouds of points requires some previous analysis of the noise involved in the imaging process, and how it propagates as the data is processed. Therefore, this dissertation analyzes the noise properties of the 3D points obtained through stereo triangulation. This leads to the detection of a bias in the estimation of 3D position, which is corrected with a reformulation of the projection equation. Ego-motion is obtained by finding the rotation and translation between the two clouds of points. This problem is known as absolute orientation, and many solutions based on least squares have been proposed in the literature. This thesis reviews the available closed form solutions to the problem. The proposed framework is divided in three main blocks: 1) stereo and feature tracking computation, 2) ego-motion estimation and 3) estimation of 3D point position and 3D velocity. The first block solves the correspondence problem providing the clouds of points as output. No special implementation of this block is required in this thesis. The ego-motion block computes the motion of the cameras by finding the absolute orientation between the clouds of static points in the environment. Since the cloud of points might contain independently moving objects and outliers generated by false correspondences, the direct computation of the least squares might lead to an erroneous solution. The first contribution of this thesis is an effective rejection rule that detects outliers based on the distance between predicted and measured quantities, and reduces the effects of noisy measurement by assigning appropriate weights to the data. This method is called Smoothness Motion Constraint (SMC). The ego-motion of the camera between two frames is obtained finding the absolute orientation between consecutive clouds of weighted 3D points. The complete ego-motion since initialization is achieved concatenating the individual motion estimates. This leads to a super-linear propagation of the error, since noise is integrated. A second contribution of this dissertation is a predictor/corrector iterative method, which integrates the clouds of 3D points of multiple time instances for the computation of ego-motion. The presented method considerably reduces the accumulation of errors in the estimated ego-position of the camera. Another contribution of this dissertation is a method which recursively estimates the 3D world position of a point and its velocity; by fusing stereo, feature tracking and the estimated ego-motion in a Kalman Filter system. An improved estimation of point position is obtained this way, which is used in the subsequent system cycle resulting in an improved computation of ego-motion. The general contribution of this dissertation is a single framework for the real time computation of scene structure, independently moving objects and ego-motion for automotive applications.
Various concurrency primitives have been added to sequential programming languages, in order to turn them concurrent. Prominent examples are concurrent buffers for Haskell, channels in Concurrent ML, joins in JoCaml, and handled futures in Alice ML. Even though one might conjecture that all these primitives provide the same expressiveness, proving this equivalence is an open challenge in the area of program semantics. In this paper, we establish a first instance of this conjecture. We show that concurrent buffers can be encoded in the lambda calculus with futures underlying Alice ML. Our correctness proof results from a systematic method, based on observational semantics with respect to may and must convergence.
Exported proteases of Helicobacter pylori (H. pylori) are potentially involved in pathogen-associated disorders leading to gastric inflammation and neoplasia. By comprehensive sequence screening of the H. pylori proteome for predicted secreted proteases, we retrieved several candidate genes. We detected caseinolytic activities of several such proteases, which are released independently from the H. pylori type IV secretion system encoded by the cag pathogenicity island (cagPAI). Among these, we found the predicted serine protease HtrA (Hp1019), which was previously identified in the bacterial secretome of H. pylori. Importantly, we further found that the H. pylori genes hp1018 and hp1019 represent a single gene likely coding for an exported protein. Here, we directly verified proteolytic activity of HtrA in vitro and identified the HtrA protease in zymograms by mass spectrometry. Overexpressed and purified HtrA exhibited pronounced proteolytic activity, which is inactivated after mutation of Ser205 to alanine in the predicted active center of HtrA. These data demonstrate that H. pylori secretes HtrA as an active protease, which might represent a novel candidate target for therapeutic intervention strategies.
Erkennung kritischer Zustände von Patienten mit der Diagnose "Septischer Schock" mit einem RBF-Netz
(2000)
Es wurde gezeigt, dass der Arzt mit dem wachsenden RBF-Netz durch die Ausgabe von verlässlichen Warnungen unterstützt werden kann. Wie in der Clusteranalyse erläutert, leiden die Ergebnisse jedoch unter den wenigen Patienten und unter der ungenauen zeitlichen Erfassung der Daten. Da jeder Patient sehr individuelle Zustände annimmt, ist ein größeres Patientenkollektiv notwendig, um eine umfassende Wissensbasis zu lernen. Eine medizinische Nachbearbeitung der Wissensbasis durch die Analyse der Fälle ließe eine weitere Verbesserung des Ergebnisses erwarten. Somit könnten unbekannte Zusammenhänge durch das Lernen aus Beispielen und medizinisches Fachwissen kombiniert werden. Abstraktere Merkmale, die weniger abhängig von individuellen Zuständen sind, könnten eine Klassifikation noch weiter verbessern. Ein Ansatzpunkt ist z.B. die Abweichung der Messwerte vom gleitenden Mittelwert. Dieses Maß ist unempfindlicher gegenüber den individuellen Arbeitspunkten der Patienten und bildet auch die Basis von relativen Abhängigkeiten zwischen zwei Variablen, die in einem weiteren Schritt ebenfalls als Merkmal herangezogen wurden. Obwohl die Verwendung der relativen Abhängigkeiten zwischen zwei Variablen als Merkmal nicht deutlichere oder häufigere Warnungen hervorbringen konnte, weist doch die Clusteranalyse auf eine bessere Verteilung der Patienten hin. Einige Cluster sind besser für die Vorhersage geeignet, als dieses bei einer Clusterung auf Basis der Zustände erreicht werden kann. Unterstützt wird dieses Ergebnis auch durch den größeren Unterschied der Sicherheiten von falschen und richtigen Klassifikationen. Neben den bisher untersuchten Merkmalen scheinen auch die Variablen interessant zu sein, bei denen festgestellt wurde, dass sie sich trotz Medikamentengabe und adäquater Behandlung schwer stabilisieren lassen. Durch den behandelnden Arzt werden diese Werte üblicherweise in einem gewissen Bereich gehalten. Falls sich das Paar Medikament/physiologischer Parameter nicht mehr in einem sinnvollen Verhältnis befindet, kann dieses ein wichtiger Indikator sein. Nach dem Aufbau der grundlegenden Funktionalität der hier untersuchten Methoden ist die Suche nach geeigneten Merkmalen als Eingabe für ein neuronales Netz ein wesentlicher Bestandteil folgender Arbeiten. Abgesehen von dem generell anspruchsvollen Vorhaben aus Klinikdaten deutliche Hinweise für die Mortalität septischer-Schock-Patienten zu erhalten, liegen die wesentlichen Probleme in dem Umfang und der Messhäufigkeit der Frankfurter Vorstudie begründet, so dass eine Anwendung von Klassifikationsverfahren auf das umfassendere Patientenkollektiv der MEDAN Multicenter-Studie klarere Ergebnisse erwarten lässt. Eine weitere, für medizinische Anwendungen interessante, Analysemöglichkeit ist die Regelgenerierung, die zur Zeit in einem anderen Teilprojekt in der MEDAN-Arbeitsgruppe bearbeitet wird. Hier können im Fall metrischer Daten zusätzliche Hinweise für die Leistung eines reinen Klassifikationsverfahrens gewonnen werden mit dem Vorteil einer expliziten Regelausgabe. Zum anderen werden in diesem Teilprojekt auch Verfahren zur Regelgenerierung eingesetzt, die ordinale und nominale Variablen wie Diagnosen, Operationen, Therapien und Medikamentenangaben (binär, ohne genaue Dosis) auswerten können. Diese werden in den Multicenter-Daten vorhanden sein. Durch Kopplung der Regelgeneratoren für metrische Daten auf der einen Seite und für diskrete Variablen auf der anderen Seite, besteht durchaus die Hoffnung bessere Ergebnisse zu erzielen. Da der Regelgenerator für metrische Daten auf dem RBF-DDA (Abk. für: Dynamic Decay Adjustment)-Netz [BERTHOLD und DIAMOND, 1995] beruht, bietet es sich innerhalb des MEDAN-Projekts an, einen (bislang nicht durchgeführten) Vergleich mit dem hier verwendeten Netztyp durchzuführen. Der Vergleich ist allerdings nur von prinzipiellem Interesse und kann auf den hier betrachteten Daten kein grundsätzlich besseres Ergebnis liefern als die bislang durchgeführten Analysen; er kann aber zu einer umfangreichen Bewertung der Ergebnisse beitragen.
In the context of information theory, the term Mutual Information has first been formulated by Claude Elwood Shannon. Information theory is the consistent mathematical description of technical communication systems. To this day, it is the basis of numerous applications in modern communications engineering and yet became indispensable in this field. This work is concerned with the development of a concept for nonlinear feature selection from scalar, multivariate data on the basis of the mutual information. From the viewpoint of modelling, the successful construction of a realistic model depends highly on the quality of the employed data. In the ideal case, high quality data simply consists of the relevant features for deriving the model. In this context, it is important to possess a suitable method for measuring the degree of the, mostly nonlinear, dependencies between input- and output variables. By means of such a measure, the relevant features could be specifically selected. During the course of this work, it will become evident that the mutual information is a valuable and feasible measure for this task and hence the method of choice for practical applications. Basically and without the claim of being exhaustive, there are two possible constellations that recommend the application of feature selection. On the one hand, feature selection plays an important role, if the computability of a derived system model cannot be guaranteed, due to a multitude of available features. On the other hand, the existence of very few data points with a significant number of features also recommends the employment of feature selection. The latter constellation is closely related to the so called "Curse of Dimensionality". The actual statement behind this is the necessity to reduce the dimensionality to obtain an adequate coverage of the data space. In other word, it is important to reduce the dimensionality of the data, since the coverage of the data space exponentially decreases, for a constant number of data points, with the dimensionality of the available data. In the context of mapping between input- and output space, this goal is ideally reached by selecting only the relevant features from the available data set. The basic idea for this work has its origin in the rather practical field of automotive engineering. It was motivated by the goals of a complex research project in which the nonlinear, dynamic dependencies among a multitude of sensor signals should be identified. The final goal of such activities was to derive so called virtual sensors from identified dependencies among the installed automotive sensors. This enables the real-time computability of the required variable without the expenses of additional hardware. The prospect of doing without additional computing hardware is a strong motive force in particular in automotive engineering. In this context, the major problem was to find a feasible method to capture the linear- as well as the nonlinear dependencies. As mentioned before, the goal of this work is the development of a flexibly applicable system for nonlinear feature selection. The important point here is to guarantee the practicable computability of the developed method even for high dimensional data spaces, which are rather realistic in technical environments. The employed measure for the feature selection process is based on the sophisticated concept of mutual information. The property of the mutual information, regarding its high sensitivity and specificity to linear- and nonlinear statistical dependencies, makes it the method of choice for the development of a highly flexible, nonlinear feature selection framework. In addition to the mere selection of relevant features, the developed framework is also applicable for the nonlinear analysis of the temporal influences of the selected features. Hence, a subsequent dynamic modelling can be performed more efficiently, since the proposed feature selection algorithm additionally provides information about the temporal dependencies between input- and output variables. In contrast to feature extraction techniques, the developed feature selection algorithm in this work has another considerable advantage. In the case of cost intensive measurements, the variables with the highest information content can be selected in a prior feasibility study. Hence, the developed method can also be employed to avoid redundance in the acquired data and thus prevent for additional costs.
We investigate methods and tools for analysing translations between programming languages with respect to observational semantics. The behaviour of programs is observed in terms of may- and must-convergence in arbitrary contexts, and adequacy of translations, i.e., the reflection of program equivalence, is taken to be the fundamental correctness condition. For compositional translations we propose a notion of convergence equivalence as a means for proving adequacy. This technique avoids explicit reasoning about contexts, and is able to deal with the subtle role of typing in implementations of language extension.