Refine
Year of publication
Document Type
- Doctoral Thesis (15)
Has Fulltext
- yes (15)
Is part of the Bibliography
- no (15)
Keywords
- Computer Vision (2)
- SLAM (2)
- Adaptronik (1)
- Bilderkennung (1)
- Claude Elwood (1)
- Ego-motion Estimation (1)
- Eigenbewegungsschaetzung (1)
- Entropie (1)
- Entropie <Informationstheorie> (1)
- Gedächtnis (1)
Institute
The dissertation deals with the general problem of how the brain can establish correspondences between neural patterns stored in different cortical areas. Although an important capability in many cognitive areas like language understanding, abstract reasoning, or motor control, this thesis concentrates on invariant object recognition as application of correspondence finding. One part of the work presents a correspondence-based, neurally plausible system for face recognition. Other parts address the question of visual information routing over several stages by proposing optimal architectures for such routing ('switchyards') and deriving ontogenetic mechanisms for the growth of switchyards. Finally, the idea of multi-stage routing is united with the object recognition system introduced before, making suggestions of how the so far distinct feature-based and correspondence-based approaches to object recognition could be reconciled.
This dissertation is concerned with the task of map-based self-localization, using images of the ground recorded with a downward-facing camera. In this context, map-based (self-)localization is the task of determining the position and orientation of a query image that is to be localized. The map used for this purpose consists of a set of reference images with known positions and orientations in a common coordinate system. For localization, the considered methods determine correspondences between features of the query image and those of the reference images.
In comparison with localization approaches that use images of the surrounding environment, we expect that using images of the ground has the advantage that, unlike the surrounding, the visual appearance of the ground is often long-term stable. Also, by using active lighting of the ground, localization becomes independent of external lighting conditions.
This dissertation includes content of several published contributions, which present research on the development and testing of methods for feature-based localization of ground images. Our first contribution examines methods for the extraction of image features that have not been designed to be used on ground images. This survey shows that, with appropriate parametrization, several of these methods are well suited for the task.
Based on this insight, we develop and examine methods for various subtasks of map-based localization in the following contributions. We examine global localization, where all reference images have to be considered, as well as local localization, where an approximation of the query image position is already known, which allows for disregarding reference images with a large distance to this position.
In our second contribution, we present the first systematic comparison of state-of-the-art methods for ground texture based localization. Furthermore, we present a method, which is characterized by its usage of our novel feature matching technique. This technique is called identity matching, as it matches only those features with identical descriptors, in contrast to the state-of-the-art that also matches features with similar descriptors. We show that our method is well suited for global and local localization, as it has favorable scaling with the number of reference images considered during the localization process. In another contribution, we develop a variant of our localization method that is significantly faster to compute, as it applies a sampling approach to determine the image positions at which local features are extracted, instead of using classical feature detectors.
Two further contributions are concerned with global localization. The first one introduces a prediction model for the global localization performance, based on an evaluation of the local localization performance. This allows us to quickly evaluate any considered parameter settings of global localization methods. The second contribution introduces a learning-based method that computes compact descriptors of ground images. This descriptor can be used to retrieve the overlapping reference images of a query image from a large set of reference images with little computational effort.
The most recent contribution included in this dissertation presents a new ground image database, which was recorded with a dedicated platform using a downward-facing camera. In addition to the data, we also explain our guidelines for the construction of the platform. In comparison with existing databases, our database contains more images and presents a larger variety of ground textures. Furthermore, this database enables us to perform the first systematic evaluation of how localization performance is affected by the time interval between the point in time at which the reference images are recorded and the point in time at which the query image is recorded. We find out that for outdoor areas all ground texture based localization methods have reliability issues, if the time interval between the recording of the query and reference images is large, and also if there are different weather conditions. These findings point to remaining challenges in ground texture base localization that should be addressed in future work.
Powerful environment perception systems are a fundamental prerequisite for the successful deployment of intelligent vehicles, from advanced driver assistance systems to self-driving cars. Arguably the most essential task of such systems is the reliable detection and localization of obstacles in order to avoid collisions. Two particularly challenging scenarios in this context are represented by small, unexpected obstacles on the road ahead, and by potentially dynamic objects observed from a large distance. Both scenarios become exceedingly critical when the ego-vehicle is traveling at high speed. As a consequence, two major requirements placed on environment perception systems are the capability of (a) high-sensitivity generic object detection and (b) high-accuracy obstacle distance estimation. The present thesis addresses both requirements by proposing novel approaches based on stereo vision for spatial perception.
First, this work presents a novel method for the detection of small, generic obstacles and objects at long range directly from stereo imagery. The detection is based on sound statistical tests using local geometric criteria which are applicable to both static and moving objects. The approach is not limited to predefined sets of semantic object classes and does not rely on restrictive assumptions on the environment, such as oversimplified global ground surface models. Free-space and obstacle hypotheses are evaluated based on a statistical model of the input image data in order to avoid a loss of sensitivity through intermediate processing steps. In addition to the detection result, the algorithm simultaneously yields refined estimates of object distances, originating from an implicit optimization of the geometric obstacle hypothesis models. The proposed detection system provides multiple flexible output representations, ranging from 3D obstacle point clouds to compact mid-level obstacle segments to bounding box representations of object instances suitable for model-based tracking. The core algorithm concept lends itself to massive parallelization and can be implemented efficiently on dedicated hardware. Real-time execution is demonstrated on a test vehicle in real-world traffic. For a thorough quantitative evaluation of the detection performance, two dedicated datasets are employed, covering small and hard-to-detect obstacles in urban environments as well as distant dynamic objects in highway driving scenarios. The proposed system is shown to significantly outperform current general purpose obstacle detection approaches in both setups, providing a considerable increase in detection range while reducing the false positive rate at the same time.
Second, this work considers the high-accuracy estimation of object distances from stereo vision, particularly at long range. Several new methods for optimizing the stereo-based distance estimates of detected objects are proposed and compared to state-of-the-art concepts. A comprehensive statistical evaluation is performed on an extensive dedicated dataset, establishing reference values for the accuracy limits actually achievable in practice. Notably, the refined distance estimates implicitly provided by the proposed obstacle detection system are shown to yield highly accurate results, on par with the top-performing dedicated stereo matching algorithms considered in the analysis.
In order to investigate the role of neuronal synchronization in perceptual grouping, a new method was developed to record selectively from multiple cortical sites of known functional specificity as determined by optical imaging of intrinsic signals. To this end, a matrix of closely spaced guide tubes was developed in cooperation with a company providing the essential manufacturing technique RMPD® (Rapid Micro Product Development). The matrix was embedded into a framework of hard and software that allowed for the mapping of each guide tube onto the cortical site an electrode would be led to if inserted into that guide tube. With these developments, it was possible to determine the functional layout of the cortex by optical imaging and subsequently perform targeted recordings with multiple electrodes in parallel. The method was tested for its accuracy and found to target the electrodes with a precision of 100 µm to the desired cortical locations. Using the developed technique, neuronal activity was recorded from area 18 of anesthetized cats. For stimulation, Gabor-patches in different geometrical configurations were placed over the recorded receptive fields merging into visual objects appropriate for testing the hypothesis of feature binding by synchrony. Synchronization strength was measured by the height of the cross-correlation centre peaks. All pairwise synchronizations were summarized in a correlation index which determined the mean difference of the correlation strengths between conditions in which recording sites should or should not fire in synchrony according to the binding hypothesis. The correlation index deviated significantly from zero for several of these configurations, further supporting the hypothesis that synchronization plays an important role in the process of perceptual grouping. Furthermore, direct evidence was found for the independence of the synchronization strength from the neuronal firing rate and for neurons that change dynamically the ensemble they participate in. In parallel to the experimental approach, mechanisms of oscillatory long range synchronization were studied by network simulations. To this end, a biologically plausible model was implemented using pyramidal and basket cells with Hodgkin-Huxley like conductances. Several columns were built from these cells and intra- and inter-columnar connections were mimicked from physiological data. When activated by independent Poisson spike trains, the columns showed oscillatory activity in the gamma frequency range. Correlation analysis revealed the tendency to locally synchronize the oscillations among the columns, but a rapid phase transition occurred with increasing cortical distance. This finding suggests that the present view of the inter-columnar connectivity does not fully explain oscillatory long range synchronization and predicts that other processes such as top-down influences are necessary for long range synchronization phenomena.
In this thesis, we opened the door towards a novel estimation theory for homogeneous vectors and have taken several steps into this new and uncharted territory. Present state of the art for homogeneous estimation problems treats such vectors p 2 Pn as unit vectors embedded in Rn+1 and approximates the unit hypersphere by a tangent plane (which is a n-dimensional real space, thus having the same number of degrees of freedom as Pn). This approach allows to use known and established methods from real space (e.g. the variational approach which leads to the FNS algorithm), but it only works well for small errors and has several drawbacks: • The unit sphere is a two-sheeted covering space of the projective space. Embedding approaches cannot model this fact and therefore can cause a degradation of estimation quality. • Linearization breaks down if distributions are not highly concentrated (e.g. if data configurations approach degenerate situations). • While estimation in tangential planes is possible with little error, the characterization of uncertainties with covariance matrices is much more problematic. Covariance matrices are not suited for modelling axial uncertainties if distributions are not concentrated. Therefore, we linked approaches from directional statistics and estimation theory together. (Homogeneous) TLS estimation could be identified as central model for homogeneous estimation and links to axial statistics were established. In the first chapters, a unified estimation theory for the point data and axial data was developed. In contrast to present approaches, we identified axial data as a specific data model (and not just as directional data with symmetric probability density function); this led to the development of novel terms like axial mean vectors, axial variances and axial expectation values. Like a tunnel which is constructed from both ends simultaneously, we also drilled from the parameter estimation side towards directional/axial statistics in the second part. The presentation of parameter estimation given in this thesis deviates strongly from all known textbooks by presenting homogeneous estimation problems as a distinguished class of problems which calls for different estimation tools. Using the results from the first part, the TLS solution can be interpreted as the weighted anti-mean vector of an axial sample. This link allows to use our results from axial statistics; for instance, the certainty of the anti-mode (i.e. of the TLS solution!) can be described with a weighted Bingham distribution (see (3.91)). While present approaches are only interested in the eigenvector of the some matrix, we can now exploit the whole mean scatter matrix to describe TLS solution and its certainty. Algorithms like FNS, HEIV or renormalization were presented in a common context and linked to each other. One central result is that all iterative homogeneous estimation algorithms essentially minimize a series of evolving Rayleigh coefficients which corresponds to a series of (converging?) cost functions. Statistical optimization is only possible if we clearly identify every step as what it exactly is. For instance, the vague statement “solving Xp ... 0” means nothing but setting ˆp := arg minp pTXp pT p . We identified the most complex scenario for which closed form optimal solutions are possible (in terms of axial statistics: the type-I matrix weighted model). The IETLS approach which is developed in this thesis then solves general type-II matrix weighted problems with an iterative solution of a series of type-I matrix weighted problems. This approach also allows to built converging schemes including robust and/or constrained estimation – in contrast to other approaches which can have severe convergence problems even without such extensions if error levels are not low. Chapter 6 then is another big step forward. We presented the theoretical background of homogeneous estimation by introducing novel concepts like singular vector unbiasedness of random matrices and solved the problem of optimal estimation for correlated data. For instance, these results could be used for better estimation of local image orientation / optical flow (see section 7.2). At the end of this thesis, simulations and experiments for a few computer vision applications were presented; besides orientation estimation, especially the results for robust and constrained estimation for fundamental matrices is impressive. The novel algorithms are applicable for a lot of other applications not presented here, for instance camera calibration, factorization algorithm formulti-view structure from motion, or conic fitting. The fact that this work paved the way for a lot of further research is certainly a good sign.
At present, there is a huge lag between the artificial and the biological information processing systems in terms of their capability to learn. This lag could be certainly reduced by gaining more insight into the higher functions of the brain like learning and memory. For instance, primate visual cortex is thought to provide the long-term memory for the visual objects acquired by experience. The visual cortex handles effortlessly arbitrary complex objects by decomposing them rapidly into constituent components of much lower complexity along hierarchically organized visual pathways. How this processing architecture self-organizes into a memory domain that employs such compositional object representation by learning from experience remains to a large extent a riddle. The study presented here approaches this question by proposing a functional model of a self-organizing hierarchical memory network. The model is based on hypothetical neuronal mechanisms involved in cortical processing and adaptation. The network architecture comprises two consecutive layers of distributed, recurrently interconnected modules. Each module is identified with a localized cortical cluster of fine-scale excitatory subnetworks. A single module performs competitive unsupervised learning on the incoming afferent signals to form a suitable representation of the locally accessible input space. The network employs an operating scheme where ongoing processing is made of discrete successive fragments termed decision cycles, presumably identifiable with the fast gamma rhythms observed in the cortex. The cycles are synchronized across the distributed modules that produce highly sparse activity within each cycle by instantiating a local winner-take-all-like operation. Equipped with adaptive mechanisms of bidirectional synaptic plasticity and homeostatic activity regulation, the network is exposed to natural face images of different persons. The images are presented incrementally one per cycle to the lower network layer as a set of Gabor filter responses extracted from local facial landmarks. The images are presented without any person identity labels. In the course of unsupervised learning, the network creates simultaneously vocabularies of reusable local face appearance elements, captures relations between the elements by linking associatively those parts that encode the same face identity, develops the higher-order identity symbols for the memorized compositions and projects this information back onto the vocabularies in generative manner. This learning corresponds to the simultaneous formation of bottom-up, lateral and top-down synaptic connectivity within and between the network layers. In the mature connectivity state, the network holds thus full compositional description of the experienced faces in form of sparse memory traces that reside in the feed-forward and recurrent connectivity. Due to the generative nature of the established representation, the network is able to recreate the full compositional description of a memorized face in terms of all its constituent parts given only its higher-order identity symbol or a subset of its parts. In the test phase, the network successfully proves its ability to recognize identity and gender of the persons from alternative face views not shown before. An intriguing feature of the emerging memory network is its ability to self-generate activity spontaneously in absence of the external stimuli. In this sleep-like off-line mode, the network shows a self-sustaining replay of the memory content formed during the previous learning. Remarkably, the recognition performance is tremendously boosted after this off-line memory reprocessing. The performance boost is articulated stronger on those face views that deviate more from the original view shown during the learning. This indicates that the off-line memory reprocessing during the sleep-like state specifically improves the generalization capability of the memory network. The positive effect turns out to be surprisingly independent of synapse-specific plasticity, relying completely on the synapse-unspecific, homeostatic activity regulation across the memory network. The developed network demonstrates thus functionality not shown by any previous neuronal modeling approach. It forms and maintains a memory domain for compositional, generative object representation in unsupervised manner through experience with natural visual images, using both on- ("wake") and off-line ("sleep") learning regimes. This functionality offers a promising departure point for further studies, aiming for deeper insight into the learning mechanisms employed by the brain and their consequent implementation in the artificial adaptive systems for solving complex tasks not tractable so far.
Bayessche Methoden zur Schätzung von Stammbäumen mit Verzweigungszeitpunkten aus molekularen Daten
(2009)
Ein großes Ziel der Evolutionsbiologie ist es, die Stammesgeschichte der Arten zu rekonstruieren. Historisch verwendeten Systematiker hierfür morphologische und anatomische Merkmale. Mit dem stetigen Zuwachs an verfügbaren Sequenzdaten werden heute verstärkt Methoden entwickelt und eingesetzt, welche die Rekonstruktion auf Basis von molekularen Daten ermöglichen. Im Fokus der aktuellen Forschung steht die Anwendung und Weiterentwicklung Bayesscher Methoden. Diese Methoden besitzen große Popularität, da sie in Verbindung mit Markov-Ketten-Monte-Carlo-Verfahren eingesetzt werden können, um einen Stammbaum zu vorgegebenen Spezies zu schätzen und dessen Variabilität zu bestimmen. Im Rahmen dieser Dissertation wurde die erweiterbare Software TreeTime entwickelt. TreeTime bietet Schnittstellen für die Einbindung von molekularen Evolutions- und Ratenänderungsmodellen und stellt neu entwickelte Methoden bereit, um Stammbäume mit Verzweigungszeitpunkten zu rekonstruieren. In TreeTime werden die molekularen Daten und die zeitlichen Informationen, wie z.B. Fossilfunde, in einem Bayes-Verfahren simultan berücksichtigt, um die Zeitpunkte der Artaufspaltungen genauer zu datieren. Für die Anwendung Bayesscher Methoden in der Rekonstruktion von Stammbäumen wird ein stochastisches Modell benötigt, das die Evolution der molekularen Sequenzen entlang den Kanten eines Stammbaums beschreibt. Der Mutationsprozess der Sequenzen wird durch ein molekulares Evolutionsmodell definiert. Die Verwendung der klassischen molekularen Evolutionsmodelle impliziert die Annahme einer konstanten Evolutionsgeschwindigkeit der Sequenzen im Stammbaum. Diese Annahme wird als Hypothese der molekularen Uhr bezeichnet und bildet die Grundlage zum Schätzen der Verzweigungszeiten des Stammbaums. Der Verzweigungszeitpunkt, an dem sich zwei Spezies im Stammbaum aufspalten, spiegelt sich in der Ähnlichkeit der zugehörigen molekularen Sequenzen. Je älter dieser Verzweigungszeitpunkt ist, desto größer ist die Anzahl der unterschiedlichen Positionen in den Sequenzen. Häufig ist jedoch die Annahme der molekularen Uhr verletzt, so dass in gewissen Teilbereichen eines Stammbaums eine erhöhte Evolutionsgeschwindigkeit nachweisbar ist. Falls die Verletzung konstanter Evolutionsgeschwindigkeiten nicht ausgeschlossen werden kann, sollten schwankende Mutationsraten in der Modellierung explizit berücksichtigt werden. Hierfür wurden verschiedene Ratenänderungsmodelle vorgeschlagen. Bisher sind nur wenige dieser Ratenänderungsmodelle in Softwarepaketen verfügbar und ihre Eigenschaften sind nicht ausreichend erforscht. Das Ziel dieser Arbeit ist die Entwicklung und Bereitstellung von Bayesschen Modellen und Methoden zum Schätzen von Stammbäumen mit Verzweigungszeitpunkten. Die Methoden sollten auch bei unterschiedlichen Evolutionsgeschwindigkeiten im Stammbaum anwendbar sein. Vorgestellt wird ein neues Ratenänderungsmodell, eine neue Möglichkeit der Angabe von flexiblen Beschränkungen für die Topologie des Stammbaums sowie die Nutzung dieser Beschränkungen für die zeitliche Kalibrierung. Das neue Raten Änderungsmodell sowie die topologischen und zeitlichen Beschränkungen werden in einen modularen Softwareentwurf eingebettet. Durch den erweiterbaren Entwurf können bestehende und zukünftige molekulare Evolutionsmodelle und Ratenänderungsmodelle in die Software eingebunden und verwendet werden. Die vorgestellten Modelle und Methoden werden gemäß dem Softwareentwurf in das neu entwickelte Programm TreeTime aufgenommen und effzient implementiert. Zusätzlich werden bereits vorhandene Modelle programmiert und eingebunden, die nicht in anderen Softwarepaketen verfügbar sind. Des Weiteren wird eine neue Methode entwickelt und angewendet, um die Passgenauigkeit eines Modells für die Apriori-Verteilung auf der Menge der Baumtopologien zu beurteilen. Diese Methode wird zur Auswahl geeigneter Modelle benutzt, indem eine Auswertung der beobachteten Baumtopologien der Datenbank TreeBASE durchgeführt wird. Anschließend wird die Software TreeTime in einer Simulationsstudie eingesetzt, um die Eigenschaften der implementierten Ratenänderungsmodelle zu vergleichen. Die Software wird für die Rekonstruktion des Stammbaums zu 38 Spezies aus der Familie der Eidechsen (Lacertidae) verwendet. Da die zugehörigen molekularen Daten von der Hypothese der molekularen Uhr abweichen, werden unterschiedliche Ratenänderungsmodelle bei der Rekonstruktion verwendet und abschließend bewertet. ........
In the context of information theory, the term Mutual Information has first been formulated by Claude Elwood Shannon. Information theory is the consistent mathematical description of technical communication systems. To this day, it is the basis of numerous applications in modern communications engineering and yet became indispensable in this field. This work is concerned with the development of a concept for nonlinear feature selection from scalar, multivariate data on the basis of the mutual information. From the viewpoint of modelling, the successful construction of a realistic model depends highly on the quality of the employed data. In the ideal case, high quality data simply consists of the relevant features for deriving the model. In this context, it is important to possess a suitable method for measuring the degree of the, mostly nonlinear, dependencies between input- and output variables. By means of such a measure, the relevant features could be specifically selected. During the course of this work, it will become evident that the mutual information is a valuable and feasible measure for this task and hence the method of choice for practical applications. Basically and without the claim of being exhaustive, there are two possible constellations that recommend the application of feature selection. On the one hand, feature selection plays an important role, if the computability of a derived system model cannot be guaranteed, due to a multitude of available features. On the other hand, the existence of very few data points with a significant number of features also recommends the employment of feature selection. The latter constellation is closely related to the so called "Curse of Dimensionality". The actual statement behind this is the necessity to reduce the dimensionality to obtain an adequate coverage of the data space. In other word, it is important to reduce the dimensionality of the data, since the coverage of the data space exponentially decreases, for a constant number of data points, with the dimensionality of the available data. In the context of mapping between input- and output space, this goal is ideally reached by selecting only the relevant features from the available data set. The basic idea for this work has its origin in the rather practical field of automotive engineering. It was motivated by the goals of a complex research project in which the nonlinear, dynamic dependencies among a multitude of sensor signals should be identified. The final goal of such activities was to derive so called virtual sensors from identified dependencies among the installed automotive sensors. This enables the real-time computability of the required variable without the expenses of additional hardware. The prospect of doing without additional computing hardware is a strong motive force in particular in automotive engineering. In this context, the major problem was to find a feasible method to capture the linear- as well as the nonlinear dependencies. As mentioned before, the goal of this work is the development of a flexibly applicable system for nonlinear feature selection. The important point here is to guarantee the practicable computability of the developed method even for high dimensional data spaces, which are rather realistic in technical environments. The employed measure for the feature selection process is based on the sophisticated concept of mutual information. The property of the mutual information, regarding its high sensitivity and specificity to linear- and nonlinear statistical dependencies, makes it the method of choice for the development of a highly flexible, nonlinear feature selection framework. In addition to the mere selection of relevant features, the developed framework is also applicable for the nonlinear analysis of the temporal influences of the selected features. Hence, a subsequent dynamic modelling can be performed more efficiently, since the proposed feature selection algorithm additionally provides information about the temporal dependencies between input- and output variables. In contrast to feature extraction techniques, the developed feature selection algorithm in this work has another considerable advantage. In the case of cost intensive measurements, the variables with the highest information content can be selected in a prior feasibility study. Hence, the developed method can also be employed to avoid redundance in the acquired data and thus prevent for additional costs.
In der vorliegenden Arbeit beschäftigen wir uns mit der Frage, wie ein Regler für ein hochdimensionales physikalisch/technisches System strukturiert und optimiert werden soll. Diesbezüglich untersuchen wir einen neuen Ansatz, welcher versucht, Regel-Mechanismen des ökonomischen Marktes und Lern-Prozesse mit in den Regler einzubauen. Um eine anschauliche Vorstellung von der Wirkung des Reglers zu erhalten, wenden wir diesen auf ein einfaches physikalisches Model an, eine an ihren Enden eingespannte eindimensionale Federkette. Wir implementieren das Model auf einem Rechner und simulieren den Einfluß des Regelverfahrens auf die Bewegung der Kette. Dabei beschränken wir uns auf den Grenzfall kleiner Amplituden, um das System im Rahmen einer näherungsweise linearen Dynamik beschreiben zu können. Mit Hilfe eines schwachen destabilisierenden Zusatzpotentials erreichen wir, daß die niedrigen Eigenmoden der schwingenden Kette instabil werden und die ausgestreckte Kette eine instabile Gleichgewichtslage darstellt. Wir stellen uns die Aufgabe, diese unter Verwendung des Reglers zu stabilisieren. Anhand des Modells untersuchen wir den Einfluß verschiedener Anfangsbedingungen der Kette, den Einfluß der Markt-Regelung, den Einfluß verschiedener Kommunikationsstrukturen und den Einfluß des Lernverfahrens auf die Wirksamkeit und die Robustheit des Regelprozesses. Als wichtigstes Ergebnis erkennen wir, daß die Regelung mit dem Markt robuster im Vergleich mit der Regelung ohne Markt ist, aber im allgemeinen einen höheren Regel-Energieaufwand aufweist. Untersuchungen anhand des Lernverfahrens ergeben, daß sich das Lernen der Markt- und der Kommunikationsstruktur kombinieren läßt und dadurch die Wirksamkeit der Regelung gegen über der Verwendung von nur einem der beiden Lern-Ansätze erhöht werden kann. Unsere Ergebnisse zeigen, daß sich das Markt-Konzept vollständig auf den gegebenen technischen Regelprozeß übertragen läßt. In der Diskussion der Ergebnisse führen wir die erhöhte Robustheit und den erhöhten Energieaufwand der Markt-Regelung auf eine indirekte, nichtlineare Kopplung der Regeleinheiten zurück, die der Markt-Mechanismus in den Regelprozeß einführt. Die Nichtlinearität bewirkt, daß die von dem Regler bestimmten Regelkräfte bei kleinen Kontrollfehlern relativ größer sind als bei großen Kontrollfehlern. Daduch ist der Energieaufwand der Markt-Regelung bei kleinen Kontrollfehlern gegenüber der Regelung ohne Markt erhöht. Der Regler ist damit in der Lage, die Kette auch bei dem Ausfall einer Regeleinheit zu stabilisieren, da ausreichend große Regelkräfte durch die verbleibenden Regeleinheiten ausgeübt werden. Die Kopplung von benachbarten Massenpunkten durch Federn unterstützt die Robustheit der Regelung in dem untersuchten Ketten-Modell, da die Kopplung dazu führt, daß die Massenpunkte eine zur instabilen Gleichgewichtslage rücktreibende Kraft erfahren und dadurch in den Bereich von kleinen Kontrollfehlern und relativ hohen Regelkräften gelangen. Am Ende der Diskussion gehen wir kurz auf mögliche Anwendungen der gewonnen Ergebnisse ein. Dabei haben wir besonders technische Regelprozesse im Sinne von Smart Matter (intelligente Bauteile) im Auge.
Visual perception has increasingly grown important during the last decades in the robotics domain. Mobile robots have to localize themselves in known environments and carry out complex navigation tasks. This thesis presents an appearance-based or view-based approach to robot self-localization and robot navigation using holistic, spherical views obtained by cameras with large fields of view. For view-based methods, it is crucial to have a compressed image representation where different views can be stored and compared efficiently. Our approach relies on the spherical Fourier transform, which transforms a signal defined on the sphere to a small set of coefficients, approximating the original signal by a weighted sum of orthonormal basis functions, the so-called spherical harmonics. The truncated low order expansion of the image signal allows to compare input images efficiently, and the mathematical properties of spherical harmonics also allow for estimating rotation between two views, even in 3D. Since no geometrical measurements need to be done, modest quality of the vision system is sufficient. All experiments shown in this thesis are purely based on visual information to show the applicability of the approach. The research presented on robot self localization was focused on demonstrating the usability of the compressed spherical harmonics representation to solve the well-known kidnapped robot problem. To address this problem, the basic idea is to compare the current view to a set of images from a known environment to obtain a likelihood of robot positions. To localize the robot, one could choose the most probable position from the likelihood map; however, it is more beneficial to apply standard methods to integrate information over time while the robot moves, that is, particle or Kalman filters. The first step was to design a fast expansion method to obtain coefficient vectors directly in image space. This was achieved by back-projecting basis functions on the input image. The next steps were to develop a dissimilarity measure, an estimator for rotations between coefficient vectors, and a rotation-invariant dissimilarity measure, all of them purely based on the compact signal representation. With all these techniques at hand, generating likelihood maps is straightforward, but first experiments indicated strong dependence on illumination conditions. This is obviously a challenge for all holistic methods, in particular for a spherical harmonics approach, since local changes usually affect each single element of the coefficient vector. To cope with illumination changes, we investigated preprocessing steps leading to feature images (e.g. edge images, depth images), which bring together our holistic approach and classical feature-based methods. Furthermore, we concentrated on building a statistical model for typical changes of the coefficient vectors in presence of changes in illumination. This task is more demanding but leads to even better results. The second major topic of this thesis is appearance-based robot navigation. I present a view-based approach called Optical Rails (ORails), which leads a robot along a prerecorded track. The robot navigates in a network of known locations which are denoted as waypoints. At each waypoint, we store a compressed view representation. A visual servoing method is used to reach a current target waypoint based on the appearance and the current camera image. Navigating in a network of views is achieved by reaching a sequence of stopover locations, one after another. The main contribution of this work is a model which allows to deduce the best driving direction of the robot based purely on the coefficient vectors of the current and the target image. It is based on image registration as the classical method by Lucas-Kanade, but has been transferred to the spectral domain, which allows for great speedup. ORails also includes a waypoint selection strategy and a module for steering our nonholonomic robot. As for our self-localization algorithm, dependance on illumination changes is also problematic in ORails. Furthermore, occlusions have to be handled for ORails to work properly. I present a solution based on the optimal expansion, which is able to deal with incomplete image signals. To handle dynamic occlusions, i.e. objects appearing in an arbitrary region of the image, we use the linearity of the expansion process and cut the image into segments. These segments can be treated separately, and finally we merge the results. At this point, we can decide to disregard certain segments. Slicing the view allows for local illumination compensation, which is inherently non-robust if applied to the whole view. In conclusion, this approach allows to handle the most important criticism to holistic view-based approaches, that is, occlusions and illumination changes, and consequently improves the performance of Optical Rails.