Having two eyes facing forward is a great advantage. Animals with two forward-facing eyes see the same objects in each eye, but because the eyes are separated by a small distance, each eye sees the same scene with slight differences. The two photos of the glass with the pens were taken by moving the camera the same distance as there is between the average person’s eyes. Look closely and you will see slight but definite differences in the spatial arrangement of the objects. In the third photo I have photoshopped the two images together, and you can see the intolerable double vision that would result if the brain simply superimposed one image on top of the other.
Clearly the brain is able to deal with this in some way, because we don’t see two superimposed images. Instead we get a clear three-dimensional view of the scene. How does our brain do it?
Look closely at the edge of the blue pen nearest the camera in each of the two photos. Notice that the distance A between that pen and the pencil behind it in one picture is greater than the corresponding distance B in the other. When the visual cortex sees these differences, it knows from an innate algorithm that the objects are arranged in space in a specific way. The algorithm can ascribe a distance value based on the differences between the two eye images and then tell us, the owners of the eyes, how those objects are arranged in front of us. We see the world in 3D. Millions of years ago such information about the immediate world was very useful when swinging from one tree branch to another. Our ancestors could just look and tell exactly how far away the next branch was, a pretty important skill if you wanted to survive. When they came down out of the trees, this 3D vision was a great asset in making tools and in using them. I think 3D vision was probably as important as our opposable thumbs in our evolution.
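The brain’s distance-from-difference trick can be sketched in a few lines of code. This is a toy version of stereo triangulation, assuming a simple pinhole-camera model; the eye-separation and focal-length figures are illustrative assumptions, not measured values.

```python
# A minimal sketch of how the shift (disparity) between the two eye
# images maps to distance, under a pinhole-camera assumption.
# Both constants below are rough illustrative figures.

EYE_SEPARATION_MM = 65.0   # average human interpupillary distance
FOCAL_LENGTH_MM = 17.0     # rough effective focal length of the eye

def depth_from_disparity(disparity_mm):
    """Estimate distance to an object from the horizontal shift
    between its positions in the left- and right-eye images."""
    if disparity_mm <= 0:
        raise ValueError("disparity must be positive")
    return EYE_SEPARATION_MM * FOCAL_LENGTH_MM / disparity_mm

# A nearer object produces a larger shift between the eyes than a
# distant one, so larger disparity means smaller distance:
print(depth_from_disparity(2.0))  # 552.5
print(depth_from_disparity(0.5))  # 2210.0
```

The key relationship is the inverse one: the bigger the difference between the two images, the closer the object must be, which is exactly the clue the visual cortex exploits.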
We can create an impression of 3D by presenting two different images to the brain, one through each eye. We can all remember 3D viewers from when we were children: we held them up to the light, looked through them, and could see Mickey Mouse or the Grand Canyon as though we were there. By presenting the eyes with two separate images, the brain could “do the maths” and give us the impression of 3D. Those images were static, but quite recently it has become possible to present moving images, so we can now perceive a moving scene in 3D.

In 3D cinema and 3D TV the problem is how to present a different image to each eye while both are looking at a distant screen. The crudest method uses red/green glasses. All 3D footage is shot with two cameras mounted together on a single tripod and separated by the same distance as the human eyes, about 65mm, so each camera records a slightly different image. Now we put a red filter in front of one camera and a green one in front of the other. When we play the film back, we project the images from both cameras simultaneously onto the same screen and watch them through red/green goggles. The eye looking through the red goggle sees the footage from the red-filtered camera but cannot see the footage from the green-filtered camera, and vice versa. In this way the brain gets two different images with all the spatial clues it needs to build a 3D image.

Another way is for the cameraman to use polarizing filters instead of red and green ones, one vertical and the other horizontal. We then view the footage through polarized glasses, so one eye sees the footage from the vertically polarized camera through its vertically polarized lens while the other eye sees the footage from the horizontally polarized camera. And again, the brain makes a single 3D image from the two separate images it receives.
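The red/green idea above can be shown with a toy snippet: take the red channel of the left-eye image and the green channel of the right-eye image, so the goggles route each view to one eye. Pixels here are plain (r, g, b) tuples; real footage would of course go through an image library, but the principle is the same.

```python
# A toy sketch of red/green anaglyph mixing, as described in the text:
# red channel from the left-eye frame, green channel from the right-eye
# frame, combined into a single image for red/green goggles.

def make_anaglyph(left_pixels, right_pixels):
    """Combine two per-eye pixel lists into one red/green frame."""
    frame = []
    for (left_r, _, _), (_, right_g, _) in zip(left_pixels, right_pixels):
        frame.append((left_r, right_g, 0))  # red from left, green from right
    return frame

left = [(200, 120, 40), (10, 20, 30)]
right = [(50, 180, 90), (60, 70, 80)]
print(make_anaglyph(left, right))  # [(200, 180, 0), (10, 70, 0)]
```

The red goggle lens blocks the green channel and the green lens blocks the red one, so each eye recovers only the camera it is meant to see.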
So, in VR goggles each eye is shown a different image of the same scene shot from a slightly different angle. The brain then uses that data to construct a three dimensional image which we perceive as though we were really there. If you take your phone out of the goggles you will see that there are two images on the screen, each one slightly different. In the goggles your right eye sees the right image and the left sees the left image, and hey presto!
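The side-by-side layout you see on the bare phone screen can be sketched as a simple split, assuming each row of the frame holds the left-eye pixels followed by the right-eye pixels (the usual side-by-side stereo format).

```python
# A sketch of how VR goggles divide a side-by-side phone frame:
# the left half of each row goes to the left eye, the right half
# to the right eye. Frames here are just lists of rows.

def split_stereo_frame(frame):
    """Split each row of a side-by-side frame into per-eye halves."""
    half = len(frame[0]) // 2
    left_eye = [row[:half] for row in frame]
    right_eye = [row[half:] for row in frame]
    return left_eye, right_eye

frame = [
    ["L1", "L2", "R1", "R2"],
    ["L3", "L4", "R3", "R4"],
]
left, right = split_stereo_frame(frame)
print(left)   # [['L1', 'L2'], ['L3', 'L4']]
print(right)  # [['R1', 'R2'], ['R3', 'R4']]
```

The goggle lenses then simply make sure each eye can only see its own half of the screen.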
And the very latest cameras can also shoot the two images in 360 degrees, which means that as you move your head around and look up and down, a motion sensor tracks the movement and changes the view to mimic your head movement. And if you have read (the book is much better) or seen Ready Player One, you will know that we are not far off being able to place an avatar of ourselves into that virtual world, one that will interact with the avatars of other people. So we could soon live another life in the digisphere.
A similar algorithm is used by the brain to locate the origin of sounds. We all know that we can hear a sound and tell pretty accurately where it has come from. This is because the sound reaches each ear at a slightly different time: the ear nearest the sound hears it first, and then the same sound reaches the other ear. The brain uses this tiny time difference to calculate where the sound originated. That’s why an animal will cock one ear toward a sound to try to locate it. By placing one ear closer to the sound and the other further away, the time difference (as well as the difference in loudness) is maximised, so the brain’s algorithm can more easily ascribe a location to the sound’s origin. This is clearly a very important skill.
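This timing trick, known as the interaural time difference, can also be sketched numerically: the extra distance the sound travels to the far ear is the speed of sound times the delay, and simple trigonometry turns that into a bearing. The head-width and speed-of-sound figures below are rough illustrative assumptions.

```python
import math

# A rough sketch of locating a sound from the arrival-time difference
# between the two ears. Constants are illustrative, not measured.

SPEED_OF_SOUND = 343.0   # m/s in air at room temperature
EAR_SEPARATION = 0.215   # metres, a rough adult head width

def angle_from_delay(delay_s):
    """Estimate the bearing of a sound source in degrees
    (0 = straight ahead, 90 = fully to one side) from the
    time difference between the ears."""
    path_difference = SPEED_OF_SOUND * delay_s
    # Clamp for delays at or beyond the geometric maximum.
    ratio = max(-1.0, min(1.0, path_difference / EAR_SEPARATION))
    return math.degrees(math.asin(ratio))

print(round(angle_from_delay(0.0)))       # 0, sound arrives at both ears together
print(round(angle_from_delay(0.000627)))  # 90, sound fully off to one side
```

Notice how small the numbers are: the maximum possible delay here is well under a millisecond, which is the precision the brain’s algorithm works with.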