Computer Vision

Elements of Computer Vision

Binocular Stereopsis

Another technique used in computer vision to recover 3-D information is binocular stereopsis. It is more useful than line labeling because it can be applied to real-world scenes. Humans recover depth information in much the same way. In fact, we can simulate seeing a 3-D scene by sending separate left and right images, one to each eye, like the View-Masters of our youth or the stereopticons of the first half of the twentieth century.

Disparity
The key quantity in this technique, when applied to computer vision, is disparity: the difference in image position of the same scene point between the left and right images. The two matching image features are called a conjugate pair. Increasing the baseline (the distance between the two cameras) increases the disparity at a given depth and so improves depth resolution, but it yields a smaller set of conjugate pairs, because more points visible in one image are occluded in the other.

As can be seen in the example above, one part of the block in the background is occluded by the Rubik's cube in the right half of the image pair. Thus, the computer cannot generate conjugate pairs for that edge of the block.
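
The link between disparity, baseline, and depth can be made concrete with a small sketch. It assumes an idealized setup that the text does not spell out: two identical pinhole cameras with parallel optical axes, a focal length expressed in pixels, and a conjugate pair lying on the same scan line. Under those assumptions depth is Z = f * B / d. The function and parameter names are illustrative only.

```python
def depth_from_disparity(x_left, x_right, focal_length_px, baseline):
    """Depth of a scene point from one conjugate pair.

    Assumes parallel, identical pinhole cameras (an assumption of this
    sketch). x_left and x_right are the horizontal pixel positions of
    the same scene point in the left and right images.
    """
    disparity = x_left - x_right  # positive for points in front of the cameras
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    # Similar triangles: Z = f * B / d, in the same units as the baseline.
    return focal_length_px * baseline / disparity
```

For a fixed depth, doubling the baseline doubles the disparity, so a one-pixel matching error has a smaller effect on the computed depth. This is the sense in which a longer baseline improves depth resolution.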

Conjugate Pairs
The task of the computer, then, is to find conjugate pairs. One way to limit the amount of data to search is edge detection: only pixels on edges are considered as candidates for conjugate pairs. Another method is region correlation around interesting points. An "interesting point" in an image is defined as a point that has a high local variance in intensity, where the local region is a square window anywhere from 5 to 11 pixels on a side. Like edges, interesting points can indicate a change in what is being viewed.
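
A minimal sketch of the interesting-point test described above follows. It treats the image as a list of rows of grayscale intensities and flags every pixel whose square window has a variance above a threshold. The default threshold of 100.0 and the function names are assumptions made for illustration; the text does not give a specific value.

```python
def local_variance(image, row, col, half):
    """Intensity variance of the square window centred at (row, col).

    The window is (2 * half + 1) pixels on a side.
    """
    values = [image[r][c]
              for r in range(row - half, row + half + 1)
              for c in range(col - half, col + half + 1)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def interesting_points(image, window=5, threshold=100.0):
    """Return (row, col) positions whose local variance exceeds threshold.

    window is the side length in pixels (5 to 11 in the text);
    threshold is a hypothetical tuning value.
    """
    half = window // 2
    rows, cols = len(image), len(image[0])
    points = []
    # Skip the border so every window lies fully inside the image.
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            if local_variance(image, r, c, half) > threshold:
                points.append((r, c))
    return points
```

A uniform region produces no interesting points, while a window that straddles an intensity step has high variance and is kept as a candidate for matching.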

The Epipolar Line
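The most important constraint in binocular stereopsis is the epipolar constraint: the conjugate of a point in one image must lie on a known line, the epipolar line, in the other image. When the offset between the left and right cameras is purely horizontal, the epipolar lines are the horizontal scan lines, so the only possible disparity is horizontal, as shown in the image below.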

This important constraint allows for fast recovery of depth by computer, because the search for each conjugate pair is reduced from the whole image to a single line.
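
The effect of the epipolar constraint on the search can be sketched as follows. For a point in the left image, the candidate matches in the right image are only the positions on the same row, shifted left by a disparity between 0 and some maximum. The sum of squared differences (SSD) over a small window is used here as the match score. It is one common choice, swapped in for illustration, and is not necessarily the correlation measure the text has in mind. The window size, the maximum disparity, and all names are assumptions.

```python
def ssd(left, right, row, col_left, col_right, half):
    """Sum of squared differences between two square windows."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            diff = left[row + dr][col_left + dc] - right[row + dr][col_right + dc]
            total += diff * diff
    return total


def match_on_scan_line(left, right, row, col, half=1, max_disparity=4):
    """Return the disparity that best matches left (row, col) in the right image.

    Only positions on the same row are tried, which is the epipolar
    constraint for horizontally offset cameras. The caller must keep
    the window around (row, col) inside the left image.
    """
    best_d, best_cost = 0, None
    for d in range(max_disparity + 1):
        col_right = col - d
        if col_right - half < 0:
            break  # the window would leave the right image
        cost = ssd(left, right, row, col, col_right, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Without the constraint, every window in the right image would be a candidate. With it, the loop tries at most max_disparity + 1 positions per point.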
