December 2012
We have completed the design of our embedded (stereo and mono) camera with highly efficient FPGA onboard processing. In stereo mode, the whole processing pipeline fits into entry-level FPGA devices without additional hardware requirements, delivering accurate and dense depth maps in real time. The imaging sensors, connected to the FPGA board through a standard interface, provide color and monochrome images at up to 60 fps.
The embedded camera has a software API for:
- Windows 32 and 64 bit
- Linux 32 and 64 bit
- Linux ARM
- Mac
- Android
Further details and videos will be available soon.
If you are interested in this project for your applications feel free to contact me:
The research activity on stereo reported below is quite outdated. For an updated overview of my research activity on stereo follow this link. Moreover, if you are interested in stereo vision you might find this seminar on "Stereo vision: algorithms and applications" interesting.
This page provides experimental results and applications concerned with the Single Matching Phase (SMP) stereo algorithm [1]. Although several approaches for computing very accurate depth maps have recently been proposed (see for example [8], [9]), most of these are not currently suitable for real-time applications (see [11] for a performance evaluation of cost aggregation strategies proposed for stereo matching). Conversely, SMP is a fast and reliable algorithm for computing dense stereo correspondence in real time. The SMP algorithm uses the uniqueness constraint as one of the main cues for detecting unreliable measurements.
In [1] we provide, on a large set of standard stereo pairs with ground truth (namely "Tsukuba", "Map", "Sawtooth", "Venus", "Barn1", "Barn2", "Bull" and "Poster", available at Scharstein and Szeliski's web site [4] and used in their paper [5]), the results of a quantitative comparison between the SMP approach and a known algorithm [3] based on bidirectional matching (BM). Bidirectional matching is also often referred to as the left-right consistency check or left-right constraint. We also provide, in [1] and [2], experimental results concerned with rectified stereo sequences acquired in our laboratory with a digital stereo camera, and preliminary results concerned with a 3D Tracking application and a 3D People Counting application.
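As a rough illustration of the left-right consistency check used by bidirectional matching, the following minimal sketch validates the disparities of one scanline. It is not the implementation evaluated in [3] and it is not the SMP algorithm: the function name, the `max_diff` tolerance and the `invalid` marker are assumptions introduced only for this example.

```python
def left_right_check(disp_left, disp_right, max_diff=1, invalid=-1):
    """Keep a left-image disparity only if the right-image map confirms it.

    disp_left[x]  : disparity d of left pixel x (its match is right pixel x - d)
    disp_right[x] : disparity of right pixel x (its match is left pixel x + d)
    Both are lists for a single rectified scanline; negative values mean "no match".
    """
    checked = []
    for x, d in enumerate(disp_left):
        xr = x - d  # column of the matched pixel in the right image
        if d < 0 or xr < 0 or xr >= len(disp_right):
            checked.append(invalid)
            continue
        back = disp_right[xr]  # disparity found when matching right-to-left
        consistent = back >= 0 and abs(back - d) <= max_diff
        checked.append(d if consistent else invalid)
    return checked


if __name__ == "__main__":
    left = [0, 3, 2, 2, 2]
    right = [2, 2, 2, 0, 0]
    print(left_right_check(left, right))
```

The check requires two matching phases (left-to-right and right-to-left), which is the cost that a single matching phase approach avoids.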
The SMP algorithm has been implemented in C exploiting the SIMD parallel capabilities (e.g. MMX and SSE technologies) available in recent Intel, AMD (and many other) microprocessors. A detailed description of the SIMD mapping of the SMP algorithm is available in [2].
A more recent (near) real-time stereo matching algorithm was proposed in [12] (experimental results here, evaluation on the Middlebury dataset here).
November 2010: The SMP algorithm has been implemented on a Texas Instruments DaVinci DSP (300 MHz CPU + 600 MHz DSP) by Anouar Manders at SenseIT. This implementation runs at 5-6 fps with 640x480 stereo pairs, 15x15 windows, a disparity range of 64 pixels and 1/8 subpixel disparity interpolation (detection of unreliable disparities is not implemented yet).
If you are interested in the SMP algorithm or in its applications feel free to contact me at:
Overview of a stereo vision system
This page provides detailed experimental results and videos concerned with 3D Tracking, 3D People Counting and 3D Change/Intrusion Detection applications (described in [10]) that rely on the SMP algorithm [1] for real-time dense depth measurements.
OpenGL-based real-time 3D visualization of the depth map provided by the SMP algorithm [1]
Experimental results with stereo pairs with ground truth: comparison between SMP and BM algorithms
This section provides experimental results obtained with SMP [1] and BM [3] on a standard set of stereo pairs (namely "Tsukuba", "Map", "Sawtooth", "Venus", "Barn1", "Barn2", "Bull" and "Poster") with available ground truth. The stereo pairs and the ground truth are available at Scharstein and Szeliski's web site [4].
Disparity values are encoded with 256 gray levels, with brighter levels
representing points closer to the camera and unmatched points
represented in white.
Click on an image to view the results obtained by the SMP and BM algorithms.
Tsukuba
Map
Venus
Sawtooth
Barn1
Barn2
Bull
Poster
Figure 1 reports the execution times obtained on a Pentium III 800 MHz running the two algorithms on 320x240, 640x480, 800x600 and 1024x768 pixel images with disparity ranges of 16, 32, 48, 64 and 80 pixels. The graph shows that with a small disparity range and small image sizes the BM algorithm is slightly faster. However, as the disparity range and/or image size increases, the SMP algorithm becomes faster, and it turns out to be significantly faster with large disparity ranges and/or image sizes.
Figure 1: Performance in terms of msec per frame for SMP and BM on a Pentium III 800 MHz processor
For example, on a Pentium III processor at 800 MHz, with 800x600 stereo pairs and a disparity range of 16, our algorithm runs at 5.56 fps while BM runs at 6.96 fps. With this image size and a disparity range of 80, our SMP algorithm is nearly twice as fast as BM (i.e. 2.89 fps for SMP and 1.51 fps for BM).
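Since Figure 1 is expressed in msec per frame while the example above is expressed in fps, the following small sketch converts between the two and computes the speed ratio. The helper names are hypothetical; the frame rates are the ones quoted above (2.89/1.51 is about 1.9, hence "nearly twice").

```python
def ms_per_frame(fps):
    """Convert a frame rate (frames per second) into milliseconds per frame."""
    return 1000.0 / fps


def speed_ratio(fps_a, fps_b):
    """How many times faster algorithm A is than algorithm B."""
    return fps_a / fps_b


if __name__ == "__main__":
    # (disparity range, SMP fps, BM fps) for 800x600 pairs on a Pentium III 800 MHz
    for d_range, smp_fps, bm_fps in [(16, 5.56, 6.96), (80, 2.89, 1.51)]:
        print(
            f"disparity range {d_range}: "
            f"SMP {ms_per_frame(smp_fps):.1f} ms, "
            f"BM {ms_per_frame(bm_fps):.1f} ms, "
            f"SMP/BM speed ratio {speed_ratio(smp_fps, bm_fps):.2f}"
        )
```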
Experimental results with real stereo sequences and applications of the SMP algorithm
This section presents experimental results obtained on stereo sequences acquired in our laboratory with a monochrome MEGA-D digital stereo head (by Videre Design) equipped with a pair of 4.8 mm lenses.
Calibration of the stereo camera: dataset and results
The MEGA-D stereo head uses an IEEE 1394 (FireWire) interface and has a fixed baseline of about 9 cm. The original stereo pairs were rectified using the intrinsic and extrinsic camera parameters estimated with the functions provided by the MATLAB Camera Calibration Toolbox available here. Image size is 640x480 and the rectified sequences were processed using a 15x15 correlation window, a disparity search range of 64 pixels and a subpixel accuracy of 1/8.
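For a rectified stereo pair, depth follows from disparity through the standard relation Z = f·B/d. The sketch below applies it with the baseline (about 9 cm) and the 1/8 subpixel step stated above. The focal length in pixels is a hypothetical value chosen only for the example (the real one comes from the calibration result), and the assumption that disparities are stored as integers in 1/8-pixel units is ours, not a documented format.

```python
def depth_from_disparity(disp_units, focal_px, baseline_m=0.09, subpixel=8):
    """Depth in metres of a point in a rectified stereo pair, Z = f * B / d.

    disp_units : disparity as an integer count of 1/subpixel pixel steps
                 (assumed storage format, not confirmed by the page)
    focal_px   : focal length in pixels (hypothetical; take it from calibration)
    baseline_m : distance between the two cameras, about 0.09 m for the MEGA-D
    Returns None for a zero or negative disparity (no valid depth).
    """
    if disp_units <= 0:
        return None
    d_pixels = disp_units / subpixel
    return focal_px * baseline_m / d_pixels


if __name__ == "__main__":
    focal_px = 800.0  # hypothetical value, for illustration only
    for units in (16, 160, 161, 511):
        z = depth_from_disparity(units, focal_px)
        print(f"disparity {units / 8:.3f} px -> depth {z:.3f} m")
```

Comparing two consecutive disparity steps (for example 160 and 161 units) shows how the 1/8 subpixel setting bounds the depth resolution at a given distance.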
- The stereo pairs used for the calibration of the stereo camera are available here
- The calibration result (estimated intrinsic and extrinsic parameters) is available here
Application I: "3D tracking"
In this section we show experimental results obtained with SMP [1] on two stereo sequences acquired in our laboratory and referred to as "Outdoor" and "Indoor". We are currently using these sequences within a research activity aimed at developing a real-time 3D People Tracking application. The tracking approach first merges the disparity maps extracted by the SMP algorithm with the information provided by a change-detection algorithm in order to build a suitable plan-view representation ([6], [7]) that enables us to track, in real time, moving objects in 3D space.
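To give an idea of what a plan-view representation is, the following minimal sketch accumulates 3D points onto a grid laid over the floor, producing an occupancy map seen from above. It is one common formulation and not necessarily the one used in [6], [7] or in our tracker. It assumes that the points are already expressed in a floor-aligned frame (x across, y height, z depth, in metres) and that only foreground points are passed in; the function name and parameters are hypothetical.

```python
import math


def plan_view_occupancy(points, x_min, x_max, z_min, z_max, cell):
    """Count 3D points falling into each cell of a top-down floor grid.

    points : iterable of (x, y, z) in metres in a floor-aligned frame (assumed)
    cell   : side of a square grid cell in metres
    Returns a list of rows (indexed by depth z), each a list of counts (indexed by x).
    """
    cols = math.ceil((x_max - x_min) / cell)
    rows = math.ceil((z_max - z_min) / cell)
    grid = [[0] * cols for _ in range(rows)]
    for x, _height, z in points:
        if x_min <= x < x_max and z_min <= z < z_max:
            # Clamp to guard against floating-point rounding at the upper border.
            c = min(int((x - x_min) / cell), cols - 1)
            r = min(int((z - z_min) / cell), rows - 1)
            grid[r][c] += 1
    return grid


if __name__ == "__main__":
    pts = [(0.1, 1.7, 0.2), (0.2, 1.0, 0.3), (1.6, 0.5, 1.1), (5.0, 1.0, 1.0)]
    for row in plan_view_occupancy(pts, 0.0, 2.0, 0.0, 2.0, 0.5):
        print(row)
```

Blobs in such a map correspond to people or objects standing on the floor, which makes them easier to separate and track than in the image plane, where they may occlude each other.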
Stereo sequence: "Lab_1"
The videos are provided in DivX format.
"Lab_1" stereo sequence: 3D tracking (video available here)
Sequence acquired with a VidereDesign stereo color camera @640x480. Rectified stereo pairs, output of the SMP algorithm and other details will be provided soon.
Stereo sequence: "Lab_2"
The videos are provided in DivX format.
"Lab_2" stereo sequence: 3D tracking (video available here)
Sequence acquired with a VidereDesign stereo color camera @640x480. Rectified stereo pairs, output of the SMP algorithm and other details will be provided soon.
"Cortile " stereo sequence
Background
Moving people
At this link you can find information concerned with the stereo sequence "Cortile". We provide: the rectified stereo sequences (320x240 and 640x480), the disparity maps (for five settings, 1, 2, 4, 8 and 16, of the subpixel parameter) computed by the SMP stereo algorithm and the parameters for obtaining the 3D depth measurements. Disparity maps are encoded as RGB images (saved with OpenCV) as described in the README file.
"Outdoor" stereo sequence
The videos are provided for best quality in zipped AVI format. The videos are also provided in DivX format (the DivX codec is available at www.divx.com).
Frame 0050 of the "Outdoor" stereo sequence
(Top Left) Original Left Image, (Top Right) Original Right Image, (Bottom Left) Rectified Left Image, (Bottom Right) Rectified Right Image.
The entire video is available in DivX format here (size 2.8 MB) and in zipped AVI format here (size 32.1 MB)
Results on frame 0050 of the "Outdoor" stereo sequence
(Top Left) Disparity map with threshold set to 0, (Top Right) Disparity map with threshold set to 1, (Bottom Left) Disparity map with threshold set to 2, (Bottom Right) Disparity map with threshold set to 3.
The entire video is available in DivX format here (size 2.8 MB) and in zipped AVI format here (size 5.12 MB)
Results on frame 0050 of the "Outdoor" stereo sequence
(Top Left) Original Left image, (Top Right) Rectified Left image, (Bottom Left) Disparity map with threshold set to 0, (Bottom Right) Disparity map with threshold set to 3.
The entire video is available in DivX format here (size 2.8 MB) and in zipped AVI format here (size 51.1 MB)
Preliminary results of the real-time 3D tracking application on frame 0219 of the "Outdoor" stereo sequence
(Top Left) Rectified Left image, (Top Right) Disparity map with threshold set to 0, (Bottom Left) Output of the change detection merged with the disparity map, (Bottom Right) Detected 3D position of the moving people/objects in the field of view of the cameras.
The entire video is available in DivX format here (size 4.4 MB) and in zipped AVI format here (size 29.6 MB)
- The original stereo sequence in ".iss" format is available here
- The left images of the original sequence in ".bmp" format are available here
- The right images of the original sequence in ".bmp" format are available here
- The rectified left images of the original sequence in ".bmp" format are available here
- The rectified right images of the original sequence in ".bmp" format are available here
- The disparity maps (size 640x480) in ".bmp" format obtained by processing the rectified stereo sequence with the SMP algorithm:
  - disparity range 64, subpixel 1/8, window size 15x15 and threshold set to 0: here
  - disparity range 64, subpixel 1/8, window size 15x15 and threshold set to 1: here
  - disparity range 64, subpixel 1/8, window size 15x15 and threshold set to 2: here
  - disparity range 64, subpixel 1/8, window size 15x15 and threshold set to 3: here
"Indoor" stereo sequence
The videos are provided for best quality in (zipped) AVI format. Some videos are also provided in DivX format (the DivX codec is available at www.divx.com).
Frame 0103 of the "Indoor" stereo sequence
(Top Left) Original Left Image, (Top Right) Original Right Image, (Bottom Left) Rectified Left Image, (Bottom Right) Rectified Right Image.
The entire video is available in DivX format here (size 2.8 MB) and in zipped AVI format here (size 46.9 MB)
Results on frame 0103 of the "Indoor" stereo sequence
(Top Left) Disparity map with threshold set to 0, (Top Right) Disparity map with threshold set to 1, (Bottom Left) Disparity map with threshold set to 2, (Bottom Right) Disparity map with threshold set to 3.
The entire video is available in DivX format here (size 12.8 MB) and in zipped AVI format here (size 15.6 MB)
Results on frame 0103 of the "Indoor" stereo sequence
(Top Left) Original Left image, (Top Right) Rectified Left image, (Bottom Left) Disparity map with threshold set to 0, (Bottom Right) Disparity map with threshold set to 3.
The entire video is available in DivX format here (size 10 MB) and in zipped AVI format here (size 46.9 MB)
Preliminary results of the real-time 3D tracking application on frame 0120 of the "Indoor" stereo sequence
(Top Left) Rectified Left image, (Top Right) Disparity map with threshold set to 0, (Bottom Left) Output of the change detection merged with the disparity map, (Bottom Right) Detected 3D position of the moving people/objects in the field of view of the cameras.
The entire video is available in DivX format here (size 8.3 MB) and in zipped AVI format here (size 27.2 MB)
- The original stereo sequence in ".iss" format is available here
- The left images (size 640x480) of the original sequence in ".bmp" format are available here
- The right images (size 640x480) of the original sequence in ".bmp" format are available here
- The rectified left images (size 640x480) of the original sequence in ".bmp" format are available here
- The rectified right images (size 640x480) of the original sequence in ".bmp" format are available here
- The disparity maps (size 640x480) in ".bmp" format obtained by processing the rectified stereo sequence with the SMP algorithm:
  - disparity range 64, subpixel 1/8, window size 15x15 and threshold set to 0: here
  - disparity range 64, subpixel 1/8, window size 15x15 and threshold set to 1: here
  - disparity range 64, subpixel 1/8, window size 15x15 and threshold set to 2: here
  - disparity range 64, subpixel 1/8, window size 15x15 and threshold set to 3: here
Application II: "3D people counting"
This section shows preliminary results of another application aimed at counting, in real time, people moving in the field of view of a stereo camera. The 3D People Counting application measures the flow of people crossing a virtual gate in 3D space. The green line on the floor, in the first shot of the "Count" stereo sequence, shows the 3D position of the virtual gate between regions A and B. The 3D People Counting application relies on the 3D depth measurements provided by the SMP algorithm [1] for tracking and counting people in real time using a plan-view representation ([6], [7]). The application counts people crossing from region A to region B (red in the plan-view map on the right) and people crossing from region B to region A (green in the plan-view map on the right). A video containing the entire sequence is available here.
Preliminary results of the real-time 3D People Counting application: (Left) Original Left image of the "Count" stereo sequence, (Right) Detected 3D position of the tracked people in the field of view of the cameras and statistics about the crossing in the two directions (A->B and B->A).
The entire video, in DivX format, is available here (size 11.7 MB)
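Counting crossings of a virtual gate can be sketched as a side-of-line test on the plan-view trajectory of each tracked person. This is only an illustration under stated assumptions, not the code of our application: the gate is treated as an infinite line through two hypothetical floor points `g0` and `g1`, region A is arbitrarily taken as the positive side, and the track is a list of (x, z) floor positions.

```python
def side_of_gate(p, g0, g1):
    """Signed side of floor point p with respect to the line g0 -> g1.

    Positive and negative values identify the two regions; zero means on the line.
    """
    return (g1[0] - g0[0]) * (p[1] - g0[1]) - (g1[1] - g0[1]) * (p[0] - g0[0])


def count_crossings(track, g0, g1):
    """Count A->B and B->A crossings of one track (region A = positive side, assumed)."""
    a_to_b = b_to_a = 0
    prev = None  # side of the last position that was not exactly on the gate
    for p in track:
        s = side_of_gate(p, g0, g1)
        if s == 0:
            continue  # on the gate itself: wait for the next position to decide
        if prev is not None and (prev > 0) != (s > 0):
            if prev > 0:
                a_to_b += 1
            else:
                b_to_a += 1
        prev = s
    return a_to_b, b_to_a


if __name__ == "__main__":
    gate = ((0.0, 0.0), (1.0, 0.0))
    walk = [(0.5, 1.0), (0.5, 0.5), (0.5, -0.5), (0.5, -1.0), (0.5, 0.2)]
    print(count_crossings(walk, *gate))
```

A practical system would also restrict the test to the gate segment and filter out jitter around the line, for example by requiring a minimum distance from the gate before a crossing is accepted.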
Application III: "3D Change/Intrusion Detection"
This section provides experimental results concerned with a robust real-time Change/Intrusion Detection approach, described in [10], which jointly exploits depth information coming from a 3D device and 2D brightness information. Information on scene changes is recovered by means of two different strategies. The former, referred to as 3D Output, mainly relies on depth information, and aims at being robust to camouflage, shadows and sudden illumination changes. The latter, referred to as 2D Output, aims at obtaining robustness with regard to sudden illumination changes as well as accuracy in the foreground segmentation. The final change masks determined by the 2D Output and the 3D Output will be referred to as C2D and C3D, respectively.
As depicted in the following figure, the proposed approach, using a stereo vision system as the 3D device, can be outlined as a 4-stage algorithm. The overall system relies on the SMP algorithm [1] for real-time dense depth measurements.
Flow diagram of the 3D Change/Intrusion Detection application.
A detailed description of the overall approach can be found in [10].
The following figure shows preliminary experimental results obtained by processing a challenging stereo sequence, referred to as "Office", with the 3D Change/Intrusion Detection application. In this indoor sequence, acquired with a rectified color stereo camera, the strong photometric distortions (clearly visible when comparing the 9 frames shown in the following figure) are induced by switching lights on and off. Moreover, it is worth observing that the same sequence is also affected by severe shadow and camouflage problems. The overall 3D Change/Intrusion Detection application, including the disparity map generation step, runs in real time on a standard personal computer.
Preliminary experimental results on 9 out of 195 frames of the Office stereo sequence:
(First column) Reference image F of the stereo pair
(Second column) Background model B2D registered according to the specification given by the histogram of the frame F
(Third column) Disparity map D computed by the SMP algorithm
(Fourth column) Change mask C2D provided by the 2D Output approach
(Fifth column) Change mask C3D provided by the 3D Output approach.
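The registration of the background model to the histogram of the current frame can be illustrated with classic histogram specification (histogram matching) based on cumulative distributions. The sketch below is this textbook formulation only, applied to flat lists of gray levels; the exact registration procedure of the application is the one described in [10] and may differ. The function name is hypothetical.

```python
def histogram_specification(source, reference, levels=256):
    """Remap `source` gray levels so that their histogram approximates `reference`'s.

    source, reference : non-empty flat lists of integer gray levels in [0, levels)
    In the setting above, `source` would be the background model and
    `reference` the current frame (our reading of the caption, an assumption).
    """
    def cdf(pixels):
        hist = [0] * levels
        for p in pixels:
            hist[p] += 1
        total = len(pixels)
        acc, out = 0, []
        for h in hist:
            acc += h
            out.append(acc / total)
        return out

    src_cdf, ref_cdf = cdf(source), cdf(reference)
    lut, j = [], 0
    for s in src_cdf:
        # Smallest reference level whose cumulative value reaches the source one;
        # j never decreases because both cumulative distributions are monotone.
        while j < levels - 1 and ref_cdf[j] < s:
            j += 1
        lut.append(j)
    return [lut[p] for p in source]


if __name__ == "__main__":
    background = [10, 10, 20, 30]      # dark model
    frame = [100, 100, 150, 200]       # same scene after the lights are switched on
    print(histogram_specification(background, frame))
```

After this remapping the background model and the frame have comparable brightness statistics, so a pixel-wise comparison is less sensitive to global illumination changes.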
- The rectified left images of the original sequence in ".bmp" format are available here
- The rectified right images of the original sequence in ".bmp" format are available here
- The disparity maps (size 320x240) in ".bmp" format obtained by processing the rectified stereo sequence with the SMP algorithm are available here.
- 2D Output in ".bmp" format available here.
- 3D Output in ".bmp" format available here.
NOTE
If you use the "Indoor", "Outdoor" or "Office" datasets or the dataset used for the calibration of the stereo head, please cite this website:
www.vision.deis.unibo.it/smatt/stereo.htm
If you use the disparity maps computed with the SMP algorithm available on this site, please cite paper [1]:
L. Di Stefano, M. Marchionni, S. Mattoccia, “A fast area-based stereo matching algorithm”, Image and Vision Computing 22(12), pp. 983-1005, October 2004