Another technique used in computer vision to recover 3-D information is binocular stereopsis. It is more useful than line labeling because it can be applied to real-world scenes. Humans recover depth information in much the same way. In fact, we
can simulate seeing a 3-D scene by sending separate left and right images to
the two eyes, like the View-Masters of our youth or the stereopticons of the first
half of the century.
Disparity
The key quantity in this technique when applied to computer vision is disparity:
the difference in position of the same scene feature between the left and right
images. The two matching image features are called a conjugate pair. Increasing
the baseline (the distance between the two cameras) increases the disparity for
a given depth and so improves depth resolution, but it yields a smaller set of
conjugate pairs, because more points visible in one image are occluded in the other.
As can be seen in the example above, one part of the block in the background is occluded by the Rubik's cube in the right half of the image pair. Thus, the computer cannot generate conjugate pairs for that edge of the block.
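The effect of the baseline on depth resolution can be made concrete with the standard relation for an idealized pair of parallel, horizontally offset pinhole cameras, where depth equals focal length times baseline divided by disparity. The text above does not state this formula, so the sketch below is an illustration under that assumption; the function names, the focal length, and the baselines are hypothetical values chosen for the example.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline):
    """Depth of a point from its horizontal disparity.

    Assumes two parallel pinhole cameras offset only horizontally.
    The result is in the same units as `baseline`.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length_px * baseline / disparity_px


def disparity_from_depth(depth, focal_length_px, baseline):
    """Inverse relation: disparity in pixels for a point at the given depth."""
    return focal_length_px * baseline / depth


if __name__ == "__main__":
    focal_length_px = 500.0        # hypothetical focal length, in pixels
    for baseline in (0.1, 0.2):    # hypothetical baselines, in metres
        near = disparity_from_depth(2.0, focal_length_px, baseline)
        far = disparity_from_depth(2.1, focal_length_px, baseline)
        # A wider baseline separates the two depths by more pixels of disparity.
        print(baseline, near - far)
```

Under these assumptions, doubling the baseline doubles the disparity difference between two nearby depths, which is why a wider baseline makes small depth changes easier to measure.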
Conjugate Pairs
The task of the computer, then, is to find conjugate pairs. One way to limit the
data is through edge detection: consider only pixels on edges as candidates for
conjugate pairs. Another method is region correlation around interesting points.
An "interesting point" in an image is defined as a point that has a high local
variance in intensity, where the local region is a square anywhere from 5 to 11
pixels on a side. Like edges, such points can indicate a change in what is being viewed.
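A minimal sketch of the "interesting point" test might look like the following. It treats an image as a list of rows of grayscale intensities and flags every pixel whose surrounding square window has a variance above a cutoff. The function names and the default threshold are assumptions made for illustration; the text above gives only the window size range.

```python
def local_variance(image, row, col, half):
    """Variance of the intensities in the square window centred on (row, col).

    The window is (2 * half + 1) pixels on a side.
    """
    values = [image[r][c]
              for r in range(row - half, row + half + 1)
              for c in range(col - half, col + half + 1)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def interesting_points(image, window=5, threshold=100.0):
    """Return (row, col) positions whose local variance exceeds `threshold`.

    `window` is the odd side length of the square region (5 to 11 in the text);
    `threshold` is an arbitrary illustrative cutoff, not a recommended value.
    """
    if window % 2 == 0:
        raise ValueError("window must have an odd side length")
    half = window // 2
    height, width = len(image), len(image[0])
    points = []
    # Skip the border so that every window lies fully inside the image.
    for row in range(half, height - half):
        for col in range(half, width - half):
            if local_variance(image, row, col, half) > threshold:
                points.append((row, col))
    return points
```

A uniform region yields no interesting points, while windows that straddle a step in intensity are flagged, which matches the idea that such points mark a change in what is being viewed.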
The Epipolar Line
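The most important constraint in binocular stereopsis is the existence of the
epipolar line: the conjugate of a point in one image must lie along a single line
in the other image. Since the offset between the left and right cameras is only
horizontal, the only possible disparity is horizontal, so the epipolar line is
simply the same image row, as shown in the image below.
This important constraint reduces the search for each match from two dimensions
to one, which allows for fast recovery of depth in computers through stereopsis.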
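To illustrate how the epipolar constraint speeds up matching, here is a rough sketch of region correlation restricted to a single image row. It scores candidate windows with the sum of squared differences (SSD), which is one common choice and not necessarily the measure the author has in mind. The function names, the window size, the disparity limit, and the convention that features shift to the left in the right image are all assumptions for this example.

```python
def ssd(left, right, row, col_left, col_right, half):
    """Sum of squared differences between two square windows on the same row."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            diff = left[row + dr][col_left + dc] - right[row + dr][col_right + dc]
            total += diff * diff
    return total


def match_along_row(left, right, row, col, half=2, max_disparity=16):
    """Estimate the disparity of left-image point (row, col).

    Only the same row of the right image is searched, so the cost grows with
    the disparity range rather than with the image area. The point must lie
    at least `half` pixels away from every image border.
    """
    best_disparity, best_cost = None, None
    for d in range(max_disparity + 1):
        col_right = col - d  # assumed convention: features shift left in the right image
        if col_right - half < 0:
            break  # the window would fall off the left edge of the right image
        cost = ssd(left, right, row, col, col_right, half)
        if best_cost is None or cost < best_cost:
            best_disparity, best_cost = d, cost
    return best_disparity
```

A point occluded in the right image has no true conjugate, so a practical matcher would also reject matches whose best cost is too high; that check is left out here to keep the sketch short.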