S. Mattoccia, "A locally global approach to stereo correspondence", IEEE Workshop on 3D Digital Imaging and Modeling (3DIM2009), October 3-4, 2009, Kyoto, Japan
This paper proposes a novel approach to the stereo correspondence problem that deals explicitly with the implicit assumptions made by cost aggregation strategies. Cost aggregation relies on the implicit assumption that disparity varies smoothly among neighboring points except at depth discontinuities, and state-of-the-art cost aggregation strategies adapt their support to the image content by classifying each pixel according to geometric and photometric constraints. This method explicitly models that behavior from a different perspective: for each point, it gathers the multiple assumptions that would locally be made by a hypothetical variable cost aggregation strategy. This framework makes it possible to derive a function that locally captures the plausibility of the geometric and photometric constraints independently enforced by the supports of neighboring points.
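To make the mechanism concrete, here is a minimal sketch of the accumulation idea, assuming a simplified plausibility model (a Gaussian weight on grayscale similarity applied to an initial disparity field); the function name, parameters, and weighting are illustrative and are not the exact formulation of [1].

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Toy LC-style accumulation (not the exact model of [1]): every point f within
// radius R of point g would include g in its support, so f casts a vote for its
// own disparity hypothesis disp(f) at g, weighted by how photometrically
// similar f and g are. The refined disparity of g is the hypothesis that
// accumulated the highest plausibility.
std::vector<int> refineLocallyConsistent(const std::vector<std::uint8_t>& gray, // W*H reference image
                                         const std::vector<int>& disp,          // W*H initial disparities
                                         int W, int H, int R, int dMax, float sigma)
{
    std::vector<int> refined(W * H, 0);
    std::vector<float> plaus(dMax + 1);
    for (int y = 0; y < H; ++y)
        for (int x = 0; x < W; ++x) {
            std::fill(plaus.begin(), plaus.end(), 0.0f);
            for (int dy = -R; dy <= R; ++dy)
                for (int dx = -R; dx <= R; ++dx) {
                    int nx = x + dx, ny = y + dy;
                    if (nx < 0 || nx >= W || ny < 0 || ny >= H) continue;
                    int d = disp[ny * W + nx];
                    if (d < 0 || d > dMax) continue;                  // skip invalid hypotheses
                    float dc = float(gray[ny * W + nx]) - float(gray[y * W + x]);
                    plaus[d] += std::exp(-(dc * dc) / (2.0f * sigma * sigma)); // photometric weight
                }
            refined[y * W + x] =
                int(std::max_element(plaus.begin(), plaus.end()) - plaus.begin());
        }
    return refined;
}
```

In this toy version each neighbor simply votes for its own initial disparity; the actual technique evaluates the plausibility of each hypothesis through the geometric and photometric constraints outlined above.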
Update: the following methods rely on the LC technique.
Paper [8] shows that enforcing the local consistency of the disparity fields provided by fast algorithms based on 1D disparity optimization significantly improves the accuracy of the resulting disparity fields. The proposed approach relies on the LC technique [1] and, according to the Middlebury evaluation site [4], yields results comparable to top-ranked approaches based on 2D disparity optimization when deploying the initial disparity hypotheses provided by two fast and representative SO- and DP-based algorithms. For our evaluation we deployed the disparity hypotheses provided by Hirschmüller's C-Semiglobal [6] and Wang et al.'s RealtimeGPU [7] algorithms. Additional experimental results concerned with the 3DPVT 2010 paper can be found at this link.
[8] S. Mattoccia, "Improving the accuracy of fast dense stereo correspondence algorithms by enforcing local consistency of disparity fields", 3D Data Processing, Visualization and Transmission (3DPVT2010), May 17-20, 2010, Paris, France [Pdf] [Supplementary_material] [Bibtex] [Additional experimental results]
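As a usage note, the LC stage plugs in after any fast engine. The sketch below composes a hypothetical fastStereo() placeholder (standing in for an SO/DP engine such as C-Semiglobal or RealtimeGPU; the name is not a real API) with the refineLocallyConsistent() sketch shown earlier; the window radius and bandwidth values are illustrative.

```cpp
#include <cstdint>
#include <vector>

// Illustrative two-stage pipeline: a fast SO/DP engine produces the initial
// disparity hypotheses, then LC refines them. fastStereo() is a placeholder
// declaration; refineLocallyConsistent() is the earlier sketch.
std::vector<int> fastStereo(const std::vector<std::uint8_t>& left,
                            const std::vector<std::uint8_t>& right,
                            int W, int H, int dMax); // provided elsewhere

std::vector<int> stereoWithLC(const std::vector<std::uint8_t>& left,
                              const std::vector<std::uint8_t>& right,
                              int W, int H, int dMax)
{
    std::vector<int> dispInit = fastStereo(left, right, W, H, dMax); // stage 1: fast hypotheses
    return refineLocallyConsistent(left, dispInit, W, H,
                                   /*R=*/19, dMax, /*sigma=*/20.0f); // stage 2: LC refinement
}
```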
Paper [9] proposes computational optimizations and simplifications that allow us to enforce a relaxed local consistency constraint (RLC). According to the Middlebury dataset [4], and deploying the framework proposed in [8], the RLC technique yields results comparable to the original LC technique in most cases, much more efficiently. The RLC technique also exploits the coarse-grained parallelism available in modern multicore architectures and, on a Core 2 Quad (Q9300) processor @ 2.49 GHz, its execution time is less than 2 seconds. Additional experimental results concerned with the ECVW 2010 paper can be found at this link.
[9] S. Mattoccia, "Fast locally consistent dense stereo on multicore", to appear in Sixth IEEE Embedded Computer Vision Workshop (ECVW2010), CVPR workshop, June 13, 2010, San Francisco, USA [Pdf] [Supplementary_material] [Bibtex] [Additional experimental results]
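The coarse-grained parallelism mentioned above could look like the following sketch, which splits the per-pixel accumulation into horizontal stripes, one per core; this is an assumption about the partitioning, not the paper's actual implementation.

```cpp
#include <algorithm>
#include <functional>
#include <thread>
#include <vector>

// Sketch: the accumulation is independent across pixels, so rows can be
// processed in parallel stripes. refineRows(r0, r1) refines rows [r0, r1)
// and is assumed to write only to its own output rows, so no locking is needed.
void refineParallel(int H, int numThreads,
                    const std::function<void(int, int)>& refineRows)
{
    std::vector<std::thread> pool;
    const int stripe = (H + numThreads - 1) / numThreads; // ceil(H / numThreads)
    for (int t = 0; t < numThreads; ++t) {
        const int r0 = t * stripe;
        const int r1 = std::min(H, r0 + stripe);
        if (r0 < r1) pool.emplace_back(refineRows, r0, r1);
    }
    for (auto& th : pool) th.join(); // wait for all stripes to finish
}
```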
Paper [10] proposes an approach that enables us to significantly improve the effectiveness of a fast dense stereo algorithm by constraining local consistency on a superpixel basis. Superpixels are obtained by means of the Mean Shift segmentation algorithm. Additional experimental results concerned with the ICPR 2010 paper can be found at this link.
[10] S. Mattoccia, "Accurate dense stereo by constraining local consistency on superpixels", 20th International Conference on Pattern Recognition (ICPR 2010), August 23-26, 2010, Istanbul, Turkey
A description of these methods is included in the presentation reported at the bottom of this page.
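As a toy illustration of the superpixel idea (not the actual method of [10]), the sketch below accumulates disparity votes per Mean Shift segment and assigns each pixel its segment's dominant disparity; the real approach accumulates plausibility within segments rather than taking a simple mode.

```cpp
#include <algorithm>
#include <vector>

// Oversimplified superpixel-constrained consistency: votes are gathered per
// segment instead of per square neighborhood, so consistency is enforced
// within regions that follow image structure (the Mean Shift segments).
std::vector<int> superpixelConsistency(const std::vector<int>& disp,   // initial disparities
                                       const std::vector<int>& labels, // segment id per pixel
                                       int numSegments, int dMax)
{
    std::vector<std::vector<int>> hist(numSegments, std::vector<int>(dMax + 1, 0));
    for (std::size_t i = 0; i < disp.size(); ++i)
        if (disp[i] >= 0 && disp[i] <= dMax)
            hist[labels[i]][disp[i]]++;                     // vote inside own superpixel
    std::vector<int> mode(numSegments);
    for (int s = 0; s < numSegments; ++s)
        mode[s] = int(std::max_element(hist[s].begin(), hist[s].end()) - hist[s].begin());
    std::vector<int> out(disp.size());
    for (std::size_t i = 0; i < disp.size(); ++i)
        out[i] = mode[labels[i]];                           // dominant disparity of the segment
    return out;
}
```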
Introduction to the Locally Consistent (LC) technique
Let's consider the following figure and a correspondence algorithm that aggregates costs on a support. Under these assumptions, the same red point is included in the supports of different neighboring points (in blue). Each time the red point is included in the support of a blue point, a (potentially different) disparity hypothesis is assumed for the same red point.
For example, in the figure the red point is included in the 5x5 supports deployed by each of the blue points (8 out of 25 cases are shown). Each time, a disparity assumption is implicitly enforced for the red point without taking into account the evidence that the disparity assumptions for the red point should be locally consistent.
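The counting argument in the figure can be written down directly. Here is a sketch, assuming square supports and row-major disparity storage, that collects the (2R+1)^2 hypotheses neighboring supports would enforce on a single point; names and layout are illustrative.

```cpp
#include <vector>

// With square supports of radius R (5x5 -> R = 2), a point g = (x, y) is
// contained in the support of every point f within Chebyshev distance R,
// i.e. (2R+1)^2 = 25 neighbors for 5x5. Each such f carries its own
// disparity hypothesis disp(f) for g.
std::vector<int> hypothesesForPoint(const std::vector<int>& disp, int W, int H,
                                    int x, int y, int R)
{
    std::vector<int> hyp;
    for (int dy = -R; dy <= R; ++dy)
        for (int dx = -R; dx <= R; ++dx) {
            int nx = x + dx, ny = y + dy;
            if (nx >= 0 && nx < W && ny >= 0 && ny < H)
                hyp.push_back(disp[ny * W + nx]); // hypothesis enforced on g by f
        }
    return hyp; // up to 25 (potentially different) hypotheses for the same point
}
```

LC then scores these competing hypotheses jointly instead of letting each support enforce its own assumption in isolation.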
The effectiveness of the proposed Locally Consistent (LC) approach is reported in two cases: deploying the disparity hypotheses provided by the (quite accurate and relatively fast) Fast Bilateral Stereo algorithm [2] (see this link for experimental results concerned with FBS [2] and [Software]), and deploying the disparity hypotheses provided by the classic (fast but inaccurate) algorithm based on fixed support windows, typically referred to as Fixed Window (FW).
The LC approach has been evaluated on the Middlebury site and the results are available here (LC is referred to as LocallyConsist). Moreover, LC has been compared to state-of-the-art approaches according to the CVPR 2008 paper [5]; experimental results are available here. Among the state-of-the-art approaches evaluated in [5], LC combined with FBS is ranked 1st, while LC combined with FW is ranked 4th (see here for details).
LC + FBS vs FBS: experimental results on the Middlebury dataset
The top row shows the disparity maps concerned with the Fast Bilateral Stereo algorithm [2], while the bottom row shows the disparity maps after the application of the Locally Consistent (LC) approach [1]. The regularization effect provided by the LC approach can be easily perceived by comparing the disparity maps. The execution time of the LC approach with unoptimized code is less than 15 seconds on a 2.49 GHz Intel processor.
LC + FW vs FW: experimental results on the Middlebury dataset
The top row shows the disparity maps concerned with the Fixed Window (FW) algorithm (i.e., dummy cost aggregation), while the bottom row shows the disparity maps after the application of the LC approach [1]. In this case the regularization provided by the LC approach is even more evident, since the disparity hypotheses provided by FW are quite inaccurate. The execution time of the LC approach with unoptimized code is less than 15 seconds on a 2.49 GHz Intel processor.