#### Institute

- Nonlinear feature selection using the general mutual information (2008)
- In the context of information theory, the term mutual information was first formulated by Claude Elwood Shannon. Information theory is the consistent mathematical description of technical communication systems. To this day, it is the basis of numerous applications in modern communications engineering and has become indispensable in this field. This work is concerned with the development of a concept for nonlinear feature selection from scalar, multivariate data on the basis of the mutual information. From the viewpoint of modelling, the successful construction of a realistic model depends highly on the quality of the employed data. In the ideal case, high-quality data consists only of the features relevant for deriving the model. In this context, it is important to possess a suitable method for measuring the degree of the mostly nonlinear dependencies between input and output variables. By means of such a measure, the relevant features can be specifically selected. During the course of this work, it will become evident that the mutual information is a valuable and feasible measure for this task and hence the method of choice for practical applications. Without any claim of being exhaustive, there are two constellations that recommend the application of feature selection. On the one hand, feature selection plays an important role if the computability of a derived system model cannot be guaranteed due to a multitude of available features. On the other hand, the existence of very few data points with a significant number of features also recommends the employment of feature selection. The latter constellation is closely related to the so-called "curse of dimensionality". The actual statement behind this is the necessity to reduce the dimensionality to obtain an adequate coverage of the data space.
In other words, it is important to reduce the dimensionality of the data, since, for a constant number of data points, the coverage of the data space decreases exponentially with the dimensionality of the available data. In the context of mapping between input and output space, this goal is ideally reached by selecting only the relevant features from the available data set. The basic idea for this work has its origin in the rather practical field of automotive engineering. It was motivated by the goals of a complex research project in which the nonlinear, dynamic dependencies among a multitude of sensor signals were to be identified. The final goal of such activities was to derive so-called virtual sensors from identified dependencies among the installed automotive sensors. This enables the real-time computation of the required variable without the expense of additional hardware. The prospect of doing without additional computing hardware is a strong motivation, particularly in automotive engineering. In this context, the major problem was to find a feasible method to capture the linear as well as the nonlinear dependencies. As mentioned before, the goal of this work is the development of a flexibly applicable system for nonlinear feature selection. The important point here is to guarantee the practicable computability of the developed method even for high-dimensional data spaces, which are rather realistic in technical environments. The employed measure for the feature selection process is based on the concept of mutual information. Its high sensitivity and specificity to linear and nonlinear statistical dependencies make it the method of choice for the development of a highly flexible, nonlinear feature selection framework.
In addition to the mere selection of relevant features, the developed framework is also applicable to the nonlinear analysis of the temporal influences of the selected features. Hence, a subsequent dynamic modelling can be performed more efficiently, since the proposed feature selection algorithm additionally provides information about the temporal dependencies between input and output variables. In contrast to feature extraction techniques, the feature selection algorithm developed in this work has another considerable advantage. In the case of cost-intensive measurements, the variables with the highest information content can be selected in a prior feasibility study. Hence, the developed method can also be employed to avoid redundancy in the acquired data and thus avoid additional costs.
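
The idea of ranking features by their mutual information with the output can be illustrated with a small sketch. This is not the estimator developed in the thesis: it assumes a plain histogram estimate, and the names `mutual_information` and `rank_features`, the bin count, and the synthetic data are all hypothetical choices made for illustration. The example is built so that the relevant feature is almost uncorrelated with the output in the linear sense, which is the situation where a nonlinear dependency measure is needed.

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram estimate of I(X;Y) in nats for two 1-D samples (illustrative only)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                 # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)       # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)       # marginal of y, shape (1, bins)
    nz = pxy > 0                              # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def rank_features(X, y, bins=16):
    """Return feature indices sorted by decreasing estimated MI with y, plus the scores."""
    scores = np.array([mutual_information(X[:, j], y, bins) for j in range(X.shape[1])])
    return np.argsort(scores)[::-1], scores

# Synthetic data: only feature 1 drives the output, and it does so nonlinearly.
rng = np.random.default_rng(0)
n = 5000
X = rng.uniform(-1.0, 1.0, size=(n, 3))
y = X[:, 1] ** 2 + 0.05 * rng.normal(size=n)

order, scores = rank_features(X, y)
linear_corr = abs(np.corrcoef(X[:, 1], y)[0, 1])
print("ranking:", order, "scores:", scores.round(3), "|corr|:", round(linear_corr, 3))
```

A histogram estimate is biased for small samples and scales poorly to joint dependencies among many features, so it should be read as a demonstration of the ranking principle only.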

- Estimation in projective spaces and applications in computer vision (2005)
- In this thesis, we opened the door towards a novel estimation theory for homogeneous vectors and have taken several steps into this new and uncharted territory. The present state of the art for homogeneous estimation problems treats such vectors $p \in \mathbb{P}^n$ as unit vectors embedded in $\mathbb{R}^{n+1}$ and approximates the unit hypersphere by a tangent plane (which is an n-dimensional real space, thus having the same number of degrees of freedom as $\mathbb{P}^n$). This approach allows the use of known and established methods from real space (e.g. the variational approach which leads to the FNS algorithm), but it only works well for small errors and has several drawbacks: (1) The unit sphere is a two-sheeted covering space of the projective space. Embedding approaches cannot model this fact and therefore can cause a degradation of estimation quality. (2) Linearization breaks down if distributions are not highly concentrated (e.g. if data configurations approach degenerate situations). (3) While estimation in tangential planes is possible with little error, the characterization of uncertainties with covariance matrices is much more problematic. Covariance matrices are not suited for modelling axial uncertainties if distributions are not concentrated. Therefore, we linked approaches from directional statistics and estimation theory together. (Homogeneous) TLS estimation could be identified as the central model for homogeneous estimation, and links to axial statistics were established. In the first chapters, a unified estimation theory for point data and axial data was developed. In contrast to present approaches, we identified axial data as a specific data model (and not just as directional data with a symmetric probability density function); this led to the development of novel terms like axial mean vectors, axial variances and axial expectation values.
Like a tunnel which is constructed from both ends simultaneously, we also drilled from the parameter estimation side towards directional/axial statistics in the second part. The presentation of parameter estimation given in this thesis deviates strongly from all known textbooks by presenting homogeneous estimation problems as a distinguished class of problems which calls for different estimation tools. Using the results from the first part, the TLS solution can be interpreted as the weighted anti-mean vector of an axial sample. This link allows us to use our results from axial statistics; for instance, the certainty of the anti-mode (i.e. of the TLS solution!) can be described with a weighted Bingham distribution (see (3.91)). While present approaches are only interested in an eigenvector of some matrix, we can now exploit the whole mean scatter matrix to describe the TLS solution and its certainty. Algorithms like FNS, HEIV or renormalization were presented in a common context and linked to each other. One central result is that all iterative homogeneous estimation algorithms essentially minimize a series of evolving Rayleigh coefficients which corresponds to a series of (converging?) cost functions. Statistical optimization is only possible if we clearly identify every step as what it exactly is. For instance, the vague statement “solving $Xp \approx 0$” means nothing but setting $\hat{p} := \arg\min_p \frac{p^T X p}{p^T p}$. We identified the most complex scenario for which closed form optimal solutions are possible (in terms of axial statistics: the type-I matrix weighted model). The IETLS approach which is developed in this thesis then solves general type-II matrix weighted problems with an iterative solution of a series of type-I matrix weighted problems. This approach also allows converging schemes to be built that include robust and/or constrained estimation, in contrast to other approaches which can have severe convergence problems even without such extensions if error levels are not low.
Chapter 6 then is another big step forward. We presented the theoretical background of homogeneous estimation by introducing novel concepts like singular vector unbiasedness of random matrices and solved the problem of optimal estimation for correlated data. For instance, these results could be used for better estimation of local image orientation / optical flow (see section 7.2). At the end of this thesis, simulations and experiments for a few computer vision applications were presented; besides orientation estimation, especially the results for robust and constrained estimation of fundamental matrices are impressive. The novel algorithms are applicable to many other applications not presented here, for instance camera calibration, the factorization algorithm for multi-view structure from motion, or conic fitting. The fact that this work paved the way for a lot of further research is certainly a good sign.
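
The statement that a homogeneous TLS estimate is the minimizer of a Rayleigh quotient can be made concrete with a short sketch. It covers only the plain, unweighted case: the minimizer of $p^T X p / p^T p$ is the eigenvector of $X$ belonging to the smallest eigenvalue. The function name `homogeneous_tls` and the line-fitting data are assumptions for illustration; the weighted variants and the IETLS scheme from the thesis are not reproduced here.

```python
import numpy as np

def homogeneous_tls(A):
    """Unit vector p minimising (p^T X p) / (p^T p) with X = A^T A.

    The minimiser of a Rayleigh quotient of a symmetric matrix is the
    eigenvector of its smallest eigenvalue; eigh returns eigenvalues ascending.
    """
    X = A.T @ A
    eigvals, eigvecs = np.linalg.eigh(X)
    return eigvecs[:, 0], eigvals[0]

# Hypothetical example: noisy points on the line 2x - y + 1 = 0,
# written homogeneously as rows (x, y, 1) so that A p is close to 0.
rng = np.random.default_rng(1)
t = rng.uniform(-5.0, 5.0, 200)
pts = np.column_stack([t, 2.0 * t + 1.0]) + 0.01 * rng.normal(size=(200, 2))
A = np.column_stack([pts, np.ones(len(pts))])

p, lam = homogeneous_tls(A)

# p and -p describe the same line (axial data), so compare up to sign.
p_ref = np.array([2.0, -1.0, 1.0])
p_ref /= np.linalg.norm(p_ref)
alignment = abs(p @ p_ref)
print("estimate:", p.round(4), "alignment with truth:", round(alignment, 6))
```

The comparison up to sign reflects the axial nature of the estimate: the sign of the returned eigenvector is arbitrary, which is why a direction-only error measure is used.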

- Strukturierung und Optimierung ausgedehnter dynamischer Regel-Netze : eine eindimensionale mit Regler versehene Kette als Modell für das Regel-Verhalten von "Smart Matter" (1998)
- In this work we address the question of how a controller for a high-dimensional physical/technical system should be structured and optimized. To this end, we investigate a new approach that attempts to build control mechanisms of the economic market and learning processes into the controller. To obtain an intuitive picture of the controller's effect, we apply it to a simple physical model, a one-dimensional spring chain clamped at both ends. We implement the model on a computer and simulate the influence of the control scheme on the motion of the chain. We restrict ourselves to the limiting case of small amplitudes so that the system can be described by approximately linear dynamics. By means of a weak destabilizing additional potential, we make the low eigenmodes of the oscillating chain unstable, so that the stretched chain represents an unstable equilibrium position. We set ourselves the task of stabilizing it using the controller. Using the model, we investigate the influence of different initial conditions of the chain, of the market control, of different communication structures, and of the learning procedure on the effectiveness and robustness of the control process. As the most important result, we find that control with the market is more robust than control without the market, but in general requires a higher control energy expenditure. Investigations of the learning procedure show that learning of the market structure and of the communication structure can be combined, which increases the effectiveness of the control compared with using only one of the two learning approaches. Our results show that the market concept can be transferred completely to the given technical control process.
In the discussion of the results, we attribute the increased robustness and the increased energy expenditure of the market control to an indirect, nonlinear coupling of the control units, which the market mechanism introduces into the control process. The nonlinearity causes the control forces determined by the controller to be relatively larger for small control errors than for large control errors. As a result, the energy expenditure of the market control at small control errors is higher than that of the control without market. The controller is thus able to stabilize the chain even when a control unit fails, since the remaining control units exert sufficiently large control forces. The coupling of neighbouring mass points by springs supports the robustness of the control in the chain model studied, since the coupling causes the mass points to experience a restoring force towards the unstable equilibrium position and thereby to enter the regime of small control errors and relatively high control forces. At the end of the discussion, we briefly address possible applications of the results obtained, with particular attention to technical control processes in the sense of Smart Matter (intelligent components).
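
The setting of the chain model (a clamped spring chain whose low eigenmodes become unstable under a weak destabilizing potential) can be sketched in the linear regime. The sketch below is a simplified stand-in and not the controller studied in the thesis: it assumes unit masses, a destabilizing term of the form `-a * x`, and a plain local proportional feedback with gain `g` in place of the market-based scheme. All parameter values are hypothetical.

```python
import numpy as np

def chain_stiffness(n, k=1.0):
    """Stiffness matrix of n unit masses coupled by springs, both ends clamped."""
    return 2.0 * k * np.eye(n) - k * (np.eye(n, k=1) + np.eye(n, k=-1))

n, k = 20, 1.0
a = 0.1   # strength of the assumed destabilizing potential
g = 0.1   # gain of the assumed local proportional feedback

K = chain_stiffness(n, k)

# Linearised dynamics x'' = -(K - a*I) x: a mode is unstable when its
# effective stiffness (spring eigenvalue minus a) is negative.
effective = np.linalg.eigvalsh(K - a * np.eye(n))
unstable = int(np.sum(effective < 0))

# Known closed form for the clamped chain: 2k(1 - cos(j*pi/(n+1))), j = 1..n.
j = np.arange(1, n + 1)
analytic = 2.0 * k * (1.0 - np.cos(j * np.pi / (n + 1)))

# Local feedback u = -g*x adds g to every effective stiffness.
controlled = np.linalg.eigvalsh(K - a * np.eye(n) + g * np.eye(n))
print("unstable modes without control:", unstable)
print("smallest effective stiffness with control:", controlled.min().round(4))
```

Only the lowest modes are affected because the spring eigenvalues grow with the mode index, which matches the description of the low eigenmodes becoming unstable first.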

- Electrophysiological and computational studies on the mechanisms and functional impact of cortical synchronization (2005)
- In order to investigate the role of neuronal synchronization in perceptual grouping, a new method was developed to record selectively from multiple cortical sites of known functional specificity as determined by optical imaging of intrinsic signals. To this end, a matrix of closely spaced guide tubes was developed in cooperation with a company providing the essential manufacturing technique RMPD® (Rapid Micro Product Development). The matrix was embedded into a framework of hardware and software that allowed for the mapping of each guide tube onto the cortical site an electrode would be led to if inserted into that guide tube. With these developments, it was possible to determine the functional layout of the cortex by optical imaging and subsequently perform targeted recordings with multiple electrodes in parallel. The method was tested for its accuracy and found to target the electrodes with a precision of 100 µm to the desired cortical locations. Using the developed technique, neuronal activity was recorded from area 18 of anesthetized cats. For stimulation, Gabor patches in different geometrical configurations were placed over the recorded receptive fields, merging into visual objects appropriate for testing the hypothesis of feature binding by synchrony. Synchronization strength was measured by the height of the cross-correlation centre peaks. All pairwise synchronizations were summarized in a correlation index which determined the mean difference of the correlation strengths between conditions in which recording sites should or should not fire in synchrony according to the binding hypothesis. The correlation index deviated significantly from zero for several of these configurations, further supporting the hypothesis that synchronization plays an important role in the process of perceptual grouping.
Furthermore, direct evidence was found for the independence of the synchronization strength from the neuronal firing rate and for neurons that dynamically change the ensemble they participate in. In parallel to the experimental approach, mechanisms of oscillatory long-range synchronization were studied by network simulations. To this end, a biologically plausible model was implemented using pyramidal and basket cells with Hodgkin-Huxley-like conductances. Several columns were built from these cells, and intra- and inter-columnar connections were modelled on physiological data. When activated by independent Poisson spike trains, the columns showed oscillatory activity in the gamma frequency range. Correlation analysis revealed the tendency to locally synchronize the oscillations among the columns, but a rapid phase transition occurred with increasing cortical distance. This finding suggests that the present view of the inter-columnar connectivity does not fully explain oscillatory long-range synchronization and predicts that other processes such as top-down influences are necessary for long-range synchronization phenomena.
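
Measuring synchronization by the height of the cross-correlation centre peak can be illustrated on synthetic data. The sketch assumes binned binary spike trains and a normalized cross-correlogram; the function name `cross_correlogram`, the firing probabilities, and the way shared spikes are generated are hypothetical and are not taken from the recordings or the analysis pipeline of the thesis.

```python
import numpy as np

def cross_correlogram(a, b, max_lag):
    """Normalised cross-correlation of two equal-length binned spike trains."""
    a = a - a.mean()
    b = b - b.mean()
    full = np.correlate(a, b, mode="full")   # zero lag sits at index len(a) - 1
    mid = len(a) - 1
    lags = np.arange(-max_lag, max_lag + 1)
    cc = full[mid - max_lag: mid + max_lag + 1] / (len(a) * a.std() * b.std())
    return lags, cc

rng = np.random.default_rng(2)
T = 5000                                     # number of time bins

# Two units that share a common input (synchronous spikes) plus private spikes.
common = rng.random(T) < 0.05
unit_a = (common | (rng.random(T) < 0.05)).astype(float)
unit_b = (common | (rng.random(T) < 0.05)).astype(float)
# A third unit firing independently of the first two.
unit_c = (rng.random(T) < 0.10).astype(float)

lags, cc_sync = cross_correlogram(unit_a, unit_b, max_lag=50)
_, cc_indep = cross_correlogram(unit_a, unit_c, max_lag=50)

centre = len(lags) // 2                      # index of lag 0
print("centre peak, synchronous pair:", cc_sync[centre].round(3))
print("centre peak, independent pair:", cc_indep[centre].round(3))
```

A correlation index in the sense of the abstract would then compare such centre-peak heights between conditions that should and should not produce synchrony.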

- Binocular ego-motion estimation for automotive applications (2008)
- Driving can be dangerous. Humans become inattentive when performing a monotonous task like driving. Multi-tasking, like using a cellular phone while driving, can also break the concentration of the driver and increase the risk of accidents. Other factors like exhaustion, nervousness and excitement affect the performance of the driver and the response time. Consequently, car manufacturers have developed systems in the last decades which assist the driver under various circumstances. These systems are called driver assistance systems. Driver assistance systems are meant to support the task of driving, and the field of action varies from alerting the driver, with acoustical or optical warnings, to taking control of the car, such as keeping the vehicle in the traffic lane until the driver resumes control. For such a purpose, the vehicle is equipped with on-board sensors which allow the perception of the environment and/or the state of the vehicle. Cameras are sensors which extract useful information about the visual appearance of the environment. Additionally, a binocular system allows the extraction of 3D information. One of the main requirements for most camera-based driver assistance systems is the accurate knowledge of the motion of the vehicle. Some sources of information, like velocimeters and GPS, are in common use in vehicles today. Nevertheless, the resolution and accuracy usually achieved with these systems are not enough for many real-time applications. The computation of ego-motion from sequences of stereo images for the implementation of intelligent driving systems, like autonomous navigation or collision avoidance, constitutes the core of this thesis. This dissertation proposes a framework for the simultaneous computation of the 6 degrees of freedom of ego-motion (rotation and translation in 3D Euclidean space), the estimation of the scene structure and the detection and estimation of independently moving objects.
The input is exclusively provided by a binocular system and the framework does not call for any data acquisition strategy, i.e. the stereo images are just processed as they are provided. Stereo allows one to establish correspondences between left and right images, estimating 3D points of the environment via triangulation. Likewise, feature tracking establishes correspondences between the images acquired at different time instances. When both are used together for a large number of points, the result is a set of clouds of 3D points with point-to-point correspondences between clouds. The apparent motion of the 3D points between consecutive frames is caused by a variety of reasons. The most dominant motion for most of the points in the clouds is caused by the ego-motion of the vehicle; as the vehicle moves and images are acquired, the relative position of the world points with respect to the vehicle changes. Motion is also caused by objects moving in the environment. They move independently of the vehicle motion, so the observed motion for these points is the sum of the ego-vehicle motion and the independent motion of the object. A third reason, and of paramount importance in vision applications, is caused by correspondence problems, i.e. the incorrect spatial or temporal assignment of the point-to-point correspondence. Furthermore, all the points in the clouds are actually noisy measurements of the real unknown 3D points of the environment. Solving ego-motion and scene structure from the clouds of points requires some previous analysis of the noise involved in the imaging process, and how it propagates as the data is processed. Therefore, this dissertation analyzes the noise properties of the 3D points obtained through stereo triangulation. This leads to the detection of a bias in the estimation of 3D position, which is corrected with a reformulation of the projection equation. 
Ego-motion is obtained by finding the rotation and translation between the two clouds of points. This problem is known as absolute orientation, and many solutions based on least squares have been proposed in the literature. This thesis reviews the available closed form solutions to the problem. The proposed framework is divided into three main blocks: 1) stereo and feature tracking computation, 2) ego-motion estimation and 3) estimation of 3D point position and 3D velocity. The first block solves the correspondence problem, providing the clouds of points as output. No special implementation of this block is required in this thesis. The ego-motion block computes the motion of the cameras by finding the absolute orientation between the clouds of static points in the environment. Since the cloud of points might contain independently moving objects and outliers generated by false correspondences, the direct computation of the least squares might lead to an erroneous solution. The first contribution of this thesis is an effective rejection rule that detects outliers based on the distance between predicted and measured quantities, and reduces the effects of noisy measurements by assigning appropriate weights to the data. This method is called Smoothness Motion Constraint (SMC). The ego-motion of the camera between two frames is obtained by finding the absolute orientation between consecutive clouds of weighted 3D points. The complete ego-motion since initialization is achieved by concatenating the individual motion estimates. This leads to a super-linear propagation of the error, since noise is integrated. A second contribution of this dissertation is a predictor/corrector iterative method, which integrates the clouds of 3D points of multiple time instances for the computation of ego-motion. The presented method considerably reduces the accumulation of errors in the estimated ego-position of the camera.
Another contribution of this dissertation is a method which recursively estimates the 3D world position of a point and its velocity by fusing stereo, feature tracking and the estimated ego-motion in a Kalman filter system. An improved estimation of point position is obtained this way, which is used in the subsequent system cycle, resulting in an improved computation of ego-motion. The general contribution of this dissertation is a single framework for the real-time computation of scene structure, independently moving objects and ego-motion for automotive applications.
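
The absolute orientation step between two weighted clouds of 3D points can be sketched with one of the well-known closed form least-squares solutions, here the SVD-based one. This is a generic textbook construction and not the thesis's implementation: the function name `absolute_orientation` is hypothetical, and the weights in the example are simply set to zero for corrupted points, standing in for whatever weights an outlier rejection rule such as SMC would assign.

```python
import numpy as np

def absolute_orientation(P, Q, w=None):
    """Closed-form R, t minimising sum_i w_i * ||R @ P[i] + t - Q[i]||^2 (SVD method)."""
    w = np.ones(len(P)) if w is None else np.asarray(w, dtype=float)
    w = w / w.sum()
    cp, cq = w @ P, w @ Q                         # weighted centroids
    H = (P - cp).T @ (w[:, None] * (Q - cq))      # 3x3 weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Force det(R) = +1 so that a reflection is never returned.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t

def rot_z(angle):
    """Rotation matrix about the z axis (helper for the synthetic example)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(angle):
    """Rotation matrix about the x axis (helper for the synthetic example)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

rng = np.random.default_rng(3)
P = rng.normal(size=(50, 3))                      # cloud at time k
R_true = rot_z(0.1) @ rot_x(-0.05)
t_true = np.array([0.2, -0.1, 1.5])
Q = P @ R_true.T + t_true                         # cloud at time k + 1

# Corrupt five correspondences and give them zero weight.
Q[:5] += 5.0
weights = np.ones(len(P))
weights[:5] = 0.0

R_est, t_est = absolute_orientation(P, Q, weights)
print("rotation error:", np.abs(R_est - R_true).max())
print("translation error:", np.abs(t_est - t_true).max())
```

Chaining such frame-to-frame estimates gives the complete motion since initialization, which is also where the error accumulation described in the abstract comes from.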

- Information routing, correspondence finding, and object recognition in the brain (2008)
- The dissertation deals with the general problem of how the brain can establish correspondences between neural patterns stored in different cortical areas. Although an important capability in many cognitive areas like language understanding, abstract reasoning, or motor control, this thesis concentrates on invariant object recognition as application of correspondence finding. One part of the work presents a correspondence-based, neurally plausible system for face recognition. Other parts address the question of visual information routing over several stages by proposing optimal architectures for such routing ('switchyards') and deriving ontogenetic mechanisms for the growth of switchyards. Finally, the idea of multi-stage routing is united with the object recognition system introduced before, making suggestions of how the so far distinct feature-based and correspondence-based approaches to object recognition could be reconciled.

- Applications of spherical harmonics in robot vision (2011)
- Visual perception has become increasingly important in the robotics domain during the last decades. Mobile robots have to localize themselves in known environments and carry out complex navigation tasks. This thesis presents an appearance-based or view-based approach to robot self-localization and robot navigation using holistic, spherical views obtained by cameras with large fields of view. For view-based methods, it is crucial to have a compressed image representation where different views can be stored and compared efficiently. Our approach relies on the spherical Fourier transform, which transforms a signal defined on the sphere to a small set of coefficients, approximating the original signal by a weighted sum of orthonormal basis functions, the so-called spherical harmonics. The truncated low-order expansion of the image signal allows input images to be compared efficiently, and the mathematical properties of spherical harmonics also allow for estimating rotation between two views, even in 3D. Since no geometrical measurements need to be done, modest quality of the vision system is sufficient. All experiments shown in this thesis are purely based on visual information to show the applicability of the approach. The research presented on robot self-localization was focused on demonstrating the usability of the compressed spherical harmonics representation to solve the well-known kidnapped robot problem. To address this problem, the basic idea is to compare the current view to a set of images from a known environment to obtain a likelihood of robot positions. To localize the robot, one could choose the most probable position from the likelihood map; however, it is more beneficial to apply standard methods to integrate information over time while the robot moves, that is, particle or Kalman filters. The first step was to design a fast expansion method to obtain coefficient vectors directly in image space.
This was achieved by back-projecting basis functions on the input image. The next steps were to develop a dissimilarity measure, an estimator for rotations between coefficient vectors, and a rotation-invariant dissimilarity measure, all of them purely based on the compact signal representation. With all these techniques at hand, generating likelihood maps is straightforward, but first experiments indicated strong dependence on illumination conditions. This is obviously a challenge for all holistic methods, in particular for a spherical harmonics approach, since local changes usually affect each single element of the coefficient vector. To cope with illumination changes, we investigated preprocessing steps leading to feature images (e.g. edge images, depth images), which bring together our holistic approach and classical feature-based methods. Furthermore, we concentrated on building a statistical model for typical changes of the coefficient vectors in the presence of changes in illumination. This task is more demanding but leads to even better results. The second major topic of this thesis is appearance-based robot navigation. I present a view-based approach called Optical Rails (ORails), which leads a robot along a prerecorded track. The robot navigates in a network of known locations which are denoted as waypoints. At each waypoint, we store a compressed view representation. A visual servoing method is used to reach a current target waypoint based on the appearance and the current camera image. Navigating in a network of views is achieved by reaching a sequence of stopover locations, one after another. The main contribution of this work is a model which allows the best driving direction of the robot to be deduced based purely on the coefficient vectors of the current and the target image. Like the classical Lucas-Kanade method, it is based on image registration, but it has been transferred to the spectral domain, which allows for great speedup.
ORails also includes a waypoint selection strategy and a module for steering our nonholonomic robot. As for our self-localization algorithm, dependence on illumination changes is also problematic in ORails. Furthermore, occlusions have to be handled for ORails to work properly. I present a solution based on the optimal expansion, which is able to deal with incomplete image signals. To handle dynamic occlusions, i.e. objects appearing in an arbitrary region of the image, we use the linearity of the expansion process and cut the image into segments. These segments can be treated separately, and finally we merge the results. At this point, we can decide to disregard certain segments. Slicing the view allows for local illumination compensation, which is inherently non-robust if applied to the whole view. In conclusion, this approach addresses the most important criticisms of holistic view-based approaches, that is, occlusions and illumination changes, and consequently improves the performance of Optical Rails.

- Predictive monocular odometry using propagation-based tracking (2018)
- The technology of advanced driver assistance systems (ADAS) has rapidly developed in the last few decades. The current level of assistance provided by the ADAS technology significantly makes driving much safer by using the developed driver protection systems such as automatic obstacle avoidance and automatic emergency braking. With the use of ADAS, driving not only becomes safer but also easier as ADAS can take over some routine tasks from the driver, e.g. by using ADAS features of automatic lane keeping and automatic parking. With the continuous advancement of the ADAS technology, fully autonomous cars are predicted to be a reality in the near future. One of the most important tasks in autonomous driving is to accurately localize the egocar and continuously track its position. The module which performs this task, namely odometry, can be built using different kinds of sensors: camera, LIDAR, GPS, etc. This dissertation covers the topic of visual odometry using a camera. While stereo visual odometry frameworks are widely used and dominating the KITTI odometry benchmark (Geiger, Lenz and Urtasun 2012), the accuracy and performance of monocular visual odometry is much less explored. In this dissertation, a new monocular visual odometry framework is proposed, namely Predictive Monocular Odometry (PMO). PMO employs the prediction-and-correction mechanism in different steps of its implementation. PMO falls into the category of sparse methods. It detects and chooses keypoints from images and tracks them on the subsequence frames. The relative pose between two consecutive frames is first pre-estimated using the pitch-yaw-roll estimation based on the far-field view (Barnada, Conrad, Bradler, Ochs and Mester 2015) and the statistical motion prediction based on the vehicle motion model (Bradler, Wiegand and Mester 2015). 
The correction and optimization of the relative pose estimates are carried out by minimizing the photometric error of the keypoints matches using the joint epipolar tracking method (Bradler, Ochs, Fanani and Mester 2017). The monocular absolute scale is estimated by employing a new approach to ground plane estimation. The camera height over ground is assumed to be known. The scale is first estimated using the propagation-based scale estimation. Both of the sparse matching and the dense matching of the ground features between two consecutive frames are then employed to refine the scale estimates. Additionally, street masks from a convolutional neural network (CNN) are also utilized to reject non-ground objects in the region of interest. PMO also has a method to detect independently moving objects (IMO). This is important for visual odometry frameworks because the localization of the ego-car should be estimated only based on static objects. The IMO candidate masks are provided by a CNN. The case of crossing IMOs is handled by checking the epipolar consistency. The parallel-moving IMOs, which are epipolar conformant, are identified by checking the depth consistency against the depth maps from CNN. In order to evaluate the accuracy of PMO, a full simulation on the KITTI odometry dataset was performed. PMO achieved the best accuracy level among the published monocular frameworks when it was submitted to the KITTI odometry benchmark in July 2017. As of January 2018, it is still one of the leading monocular methods in the KITTI odometry benchmark. It is important to note that PMO was developed without employing random sampling consensus (RANSAC) which arguably has been long considered as one of the irreplaceable components in a visual odometry framework. In this sense, PMO introduces a new style of visual odometry framework. PMO was also developed without a multi-frame bundle adjustment step. 
This reflects the high potential of PMO when such a multi-frame optimization scheme is also taken into account.
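
The idea of recovering the monocular absolute scale from a known camera height can be illustrated with a minimal sketch. It is not the propagation-based estimator of the dissertation: it assumes that a set of ground points has already been triangulated in the camera frame up to an unknown scale, fits a least-squares plane to them, and compares the resulting camera-to-plane distance with the known mounting height. The function names `fit_plane` and `scale_from_ground_plane` are hypothetical and chosen for illustration only.

```python
import numpy as np


def fit_plane(points):
    """Fit a least-squares plane to an (N, 3) array of 3D points.

    Returns a unit normal n and an offset d such that n . X = d on the plane.
    """
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    return normal, float(normal @ centroid)


def scale_from_ground_plane(ground_points, camera_height_m):
    """Estimate the metric scale factor of an up-to-scale reconstruction.

    ground_points: (N, 3) ground points in the camera frame, unknown scale.
    camera_height_m: known camera height over ground in metres (assumed input).
    """
    _, offset = fit_plane(np.asarray(ground_points, dtype=float))
    # The camera sits at the origin, so |d| is its distance to the plane.
    estimated_height = abs(offset)
    if estimated_height < 1e-9:
        raise ValueError("ground plane passes through the camera centre")
    return camera_height_m / estimated_height
```

Multiplying the up-to-scale translation between two frames by the returned factor gives a metric translation. In practice the ground points would first be filtered, for example with a street mask, before the plane is fitted.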

- Stixmentation : from Stixels to objects (2016)
- Already today, modern driver assistance systems contribute more and more to making individual mobility in road traffic safer and more comfortable. For this purpose, modern vehicles are equipped with a multitude of sensors and actuators which perceive, interpret and react to the environment of the vehicle. In order to reach the next set of goals along this path, for example to be able to assist the driver in increasingly complex situations or to reach a higher degree of autonomy of driver assistance systems, a detailed understanding of the vehicle environment and especially of other moving traffic participants is necessary. It is known that motion information plays a key role in human object recognition [Spelke, 1990]. However, full 3D motion information is mostly not taken into account for stereo vision-based object segmentation in the literature. In this thesis, novel approaches for motion-based object segmentation of stereo image sequences are proposed, from which a generic environmental model is derived that contributes to a more precise analysis and understanding of the respective traffic scene. The aim of the environmental model is to yield a minimal scene description in terms of a few moving objects and stationary background such as houses, crash barriers or parked vehicles. A minimal scene description aggregates as much information as possible and is characterized by its stability, precision and efficiency. Instead of dense stereo and optical flow information, the proposed object segmentation builds on the so-called Stixel World, an efficient superpixel-like representation of space-time stereo data. As it turns out, this step substantially increases the stability of the segmentation and reduces the computation time by several orders of magnitude, thus enabling real-time automotive use in the first place. 
Besides the efficient, real-time capable optimization, the object segmentation has to be able to cope with significant noise, which is due to the measurement principle of the stereo camera system used. For that reason, in order to obtain an optimal solution under the given extreme conditions, the segmentation task is formulated as a Bayesian optimization problem, which allows regularizing prior knowledge and redundancies to be incorporated into the object segmentation. Object segmentation as discussed here means unsupervised segmentation, since typically the number of objects in the scene and their individual object parameters are not known in advance. This information has to be estimated from the input data as well. For inference, two approaches with their individual pros and cons are proposed, evaluated and compared. The first approach is based on dynamic programming. The key advantage of this approach is the possibility to take into account non-local priors such as shape or object size information, which is impossible or prohibitively expensive with more local, conventional graph optimization approaches such as graph cut or belief propagation. In the first instance, the dynamic programming approach is limited to one-dimensional data structures, in this case to the first Stixel row. A possible extension to capture multiple Stixel rows is discussed at the end of this thesis. Further novel contributions include a special outlier concept to handle gross stereo errors associated with so-called stereo tear-off edges. Additionally, object-object interactions are taken into account by explicitly modeling object occlusions. These extensions prove to be dramatic improvements in practice. This first approach is compared with a second approach that is based on an alternating optimization of the Stixel segmentation and of the relevant object parameters in an expectation maximization (EM) sense. 
The labeling step is performed by means of the α-expansion graph cut algorithm; the parameter estimation step is done via one-dimensional sampling and multidimensional gradient descent. By using the Stixel World and due to an efficient implementation, one step of the optimization takes only about one millisecond on a standard single CPU core. To the knowledge of the author, at the time of development there was no faster global optimization in a demonstrator car. For both approaches, various testing scenarios have been carefully selected, which allow the proposed methods to be examined thoroughly under different real-world conditions with limited ground truth at hand. As an additional innovative application, the first approach was successfully implemented in a demonstrator car that drove the so-called Bertha Benz Memorial Route from Mannheim to Pforzheim autonomously in real traffic. At the end of this thesis, the limits of the proposed systems are discussed and an outlook on possible future work is given.
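
The dynamic programming idea, segmenting a one-dimensional row into an unknown number of objects, can be illustrated with a toy sketch. It is not the Bayesian model of the thesis: it assumes one scalar measurement per Stixel (for instance a lateral velocity), uses the squared deviation from the segment mean as the data cost, and adds a fixed penalty per segment as a crude stand-in for a prior on the number of objects. The name `segment_row` and the parameter `segment_penalty` are hypothetical.

```python
def segment_row(values, segment_penalty):
    """Split a 1D sequence into contiguous segments by dynamic programming.

    Minimizes the sum over segments of (squared error around the segment
    mean + segment_penalty). Returns a list of half-open (start, end) ranges.
    """
    n = len(values)
    # Prefix sums allow the cost of any segment to be evaluated in O(1).
    s1, s2 = [0.0], [0.0]
    for v in values:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        """Squared error of values[i:j] around their mean."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    best = [0.0] + [float("inf")] * n  # best[j]: optimal cost of values[:j]
    prev = [0] * (n + 1)               # prev[j]: start of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j) + segment_penalty
            if candidate < best[j]:
                best[j] = candidate
                prev[j] = i

    # Backtrack from the end to recover the segment boundaries.
    segments = []
    j = n
    while j > 0:
        segments.append((prev[j], j))
        j = prev[j]
    return segments[::-1]
```

Because the cost of a segment is evaluated as a whole, the same recursion could also carry segment-level terms such as an object size prior, which is the kind of non-local information that is hard to express in purely pairwise graph formulations. The sketch runs in O(n²) time for a row of n Stixels.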