I am not a biologist or physiologist and do not know about the brain at any cellular level. My understanding of the brain comes from observing how human beings interact with real-world events, assuming that all or most of them use their brains, together with the physics and background of those events.
Based on such interactions and correlations, I have come up with some conclusions that might be useful in further understanding various displays and how they may be designed and used for various purposes.
This is the first part of a 2-part series of articles about how our brains interact with the real world, and how that makes us think that the virtual world is actually real.
In recent years, there has been a lot of work done on 2D, 3D, light field, VR, AR, and immersive display systems. Amid the advances in the technology, people often forget that the most important part of the system is the human brain itself. It is the human brain that interprets the inputs from the eyes, which act as the pixel sensors, and allows humans to perceive the “image”.
An Early Lesson in Communication
Way back in college, I was an EE major taking a class in communications from a Professor known as the “Father of Satellite Communications”. The topics were wireless, satellites, coding, etc. During the first lecture, the Professor played a sentence read by a person with a very heavy foreign accent and asked us to write the sentence down on a piece of paper. No one understood what was read. He played it over and over again.
Finally, after a few times, most of us, but not all, understood and were able to come up with the correct sentence. After that, the Professor played another sentence read by the same person and this time, he played it only once. Surprisingly, most of us understood and wrote down the correct sentence after hearing it only once.
It was my first introduction to communication. The moral of the exercise was that in communication, it is not WHAT the message is; it is what we THINK the message is. I think this can be considered an axiom, though no one has actually stated it as such. If there were such an axiom, it would have affected the entire history and development of communication systems, all of which are limited by real-world constraints such as bandwidth, noise, error rates, and signal power.
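The bandwidth/noise/power trade-off mentioned above has a classic quantitative form: Shannon's channel-capacity formula, which bounds the error-free bit rate of a noisy channel. A minimal sketch in Python (the 3 kHz / 30 dB figures are illustrative values for a voice-grade channel, not numbers from this article):

```python
import math

def shannon_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley channel capacity in bits per second:
    C = B * log2(1 + S/N), with S/N as a linear (not dB) ratio."""
    return bandwidth_hz * math.log2(1 + snr_linear)

# A 3 kHz voice channel at 30 dB signal-to-noise ratio (SNR = 1000):
capacity = shannon_capacity(3000, 1000)
print(f"{capacity:.0f} bits per second")  # roughly 30 kbit/s
```

This is why a narrow, noisy radio link can never carry broadcast-quality audio, no matter how the signal is coded.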
The Brain’s Job
This is where the brain does its job. The brain works hard to “collect data” and store it in the brain cells so that it can be used in the future. When new inputs are received, the brain immediately goes to work, combining the inputs with the stored data to come up with an interpretation. The processing requires a certain amount of brainpower, determined by how much input is received and how much data must be stored.
This is why we feel tired when the inputs are incomplete and the stored data are insufficient. We all know from experience that when we meet someone new with a strong accent, or listen to a lecture from a foreign Professor, we tire easily. The modern equivalent is self-learning in Artificial Intelligence (AI): it takes a lot of processing power, and the processor gets very hot.
I was a Hi-Fi enthusiast when I was young, and I was always puzzled by the quality of sound transmitted by wireless systems such as spacecraft, ships, and police radios. I sometimes had a hard time understanding what was said by the Apollo astronauts. Why couldn’t they sound as clear as radio broadcasts? I was ignorant and asked everyone around me that I could. I recall one of the answers was quite simple: “As long as Houston understands them.” I did not appreciate the issues with bandwidth (the old way of saying bit rate) and signal power until much later.
The brain’s work applies to the processing of visual images as well as audio information. When someone reads a paragraph or looks at an image, depending on the purpose, data from previous experiences are also stored to be used in the future.
Some Brain Exercises
One common exercise in management workshops is to count the number of F’s in a sentence such as:
“FEATURE FILMS ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY COMBINED WITH THE EXPERIENCES OF YEARS.”
If you are a typical reader like me (and me – Man. Ed.), you would have counted three, forgetting the “of”s, which is wrong. The correct answer is six. Most people forget that “of” has an f in it. This shows that the data stored in the brain, combined with the visual inputs, caused the f’s in the “of”s to be disregarded. To come up with the correct count of six, the brain has to work much harder.
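A machine, unlike the brain, counts every F mechanically and makes the six-versus-three split explicit. A quick sketch of the exercise above:

```python
sentence = ("FEATURE FILMS ARE THE RESULT OF YEARS OF SCIENTIFIC "
            "STUDY COMBINED WITH THE EXPERIENCES OF YEARS.")

total_fs = sentence.count("F")           # every F, including those in "OF"
fs_in_of = sentence.split().count("OF")  # each "OF" contributes one F

print(total_fs, total_fs - fs_in_of)     # prints "6 3"
```

The three F’s most readers do catch are in FEATURE, FILMS, and SCIENTIFIC; the three they miss are in the “OF”s.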
For security purposes, when we forget our passwords and are asked to create a new one, we often need to be confirmed as humans and not robots. We do this by looking at an image with distorted letters hidden in it. If we can read them correctly, the new password will be accepted; otherwise, it will be rejected, as we will be considered robots. The assumption is that a person, not a robot, has the common letter shapes stored in their brain and can interpret the crooked letters correctly, while a robot cannot.
What About 3D
How does the brain combine two separate 2D images from the two eyes and come up with a 3D interpretation? The human brain has a vast database of experience with the complex relationships of perspective, the focus and convergence of the eyeballs, and the various shapes of real-life objects and scenery. When the two images from the eyeballs are received, the brain matches the two 2D images against this database of experience and comes up with an interpretation in 3D.
If you can cross your eyes and look at the following images, you will see a 3D image at the center, between the two original images on each side. Crossing the eyes is not easy, and not everyone can do it, since it is not natural. The brain has to work hard to decouple the focusing of the eyeballs at the focal plane from their natural convergence, so that when the eyes are crossed, the left-eye image and the right-eye image overlap, allowing the 3D interpretation.
The eye muscles move under signals from the brain, based on matching the two images against the stored experience of 3D objects. If you cannot cross your eyes, a 3D viewer will help. It consists of two prisms that change the lines of sight, and thus the effective convergence of the eyeballs.
The modern version, Google Cardboard, uses cell phones as the image display. In this case, in addition to static images, video can be shown.
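As a rough illustration of the convergence geometry described above, the angle between the two lines of sight can be computed from the viewing distance and the separation of the eyes. This is a sketch only; the 63 mm interpupillary distance is a typical adult value assumed here for illustration, not a figure from the article:

```python
import math

def convergence_angle_deg(distance_m: float, ipd_m: float = 0.063) -> float:
    """Angle between the two eyes' lines of sight when both are
    aimed at a point straight ahead at distance_m (thin-triangle model)."""
    return math.degrees(2 * math.atan((ipd_m / 2) / distance_m))

# Nearer objects demand stronger convergence:
for d in (0.3, 1.0, 10.0):
    print(f"{d:>5.1f} m -> {convergence_angle_deg(d):.2f} degrees")
```

At reading distance the eyes converge by roughly 12 degrees, while at 10 m the angle is only a fraction of a degree; the brain pairs these muscle signals with the matched images when interpreting depth.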
To be continued.
Part 2 will continue with the interaction of the brain with various types of displays and how we (or our brains) perceive them, and in particular, immersion in practice.
If you have any questions, please contact the author at email@example.com.