We need to start thinking about how we perceive the letters and words we see on a piece of paper. Let's also ask ourselves how the eyes move across the paper to collect information and, most fascinating of all, how that information is integrated into letters and, later, words.
Jumping Photographer.
Let's start with eye behavior. Although it doesn't seem that way, the eyes move across the paper in small jumps (saccades), and at each stop they "snap" the surroundings along with the text on the paper.
The resulting "imprint" is projected onto the retina of the eye, as if onto the film in an old camera.
As expected, the resolution is highest in the center of the visual field, so the words are sharpest there. The retina has a certain number of "pixels" (cells: rods and cones), and each pixel would represent one dot on the paper. After the photo is taken, the captured print undergoes a little processing (the quality is reduced and the contrast increased) and is turned into a neural signal.
This imprint, now a neural signal, is projected to the thalamus, the first decoding station.
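To make the "reduce the quality, increase the contrast" step a little more concrete, here is a minimal sketch in Python. It is only a toy: the 4x4 snapshot, the block-averaging function `downsample` and the rescaling function `stretch_contrast` are invented for illustration and are not a model of what the retina actually computes.

```python
# Toy model of the "small processing" step: a grayscale snapshot is
# downsampled (lower quality) and contrast-stretched (higher contrast).
# All names and numbers here are illustrative assumptions, not physiology.

def downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a 2D list."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

def stretch_contrast(image):
    """Linearly rescale values so the darkest is 0.0 and the brightest 1.0."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A perfectly flat image has no contrast to stretch.
        return [[0.0 for _ in row] for row in image]
    return [[(v - lo) / (hi - lo) for v in row] for row in image]

# A dull 4x4 "photo" whose brightness values sit close together.
snapshot = [
    [0.4, 0.4, 0.6, 0.6],
    [0.4, 0.4, 0.6, 0.6],
    [0.5, 0.5, 0.4, 0.4],
    [0.5, 0.5, 0.4, 0.4],
]

# Fewer "pixels", but the differences between them are now exaggerated.
signal = stretch_contrast(downsample(snapshot, 2))
print(signal)
```

The result has a quarter of the original "pixels", yet the small brightness differences of the snapshot now span the whole range from 0 to 1, which is the trade the text describes.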
Encode then decode.
Enigma is a well-known machine used to encrypt messages in the Second World War. In short, the desired message is translated into code and sent to someone who also has an Enigma. To decipher the message, the receiver must know how to set up their own Enigma. The brain uses the same principles.
Neural encoders translate the external environment into a code and transmit it to a neural center (a so-called nucleus), where decoders that know how to "interpret" that code are waiting.
That decoder then carries the information on to the next center, where, acting as a new encoder, it looks for its own decoder, and so on.
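The encode, transmit, decode, re-encode chain can be sketched in a few lines of Python. This is a deliberately simplified picture: a plain shift cipher stands in for the neural code, the keys in `link_keys` are hypothetical, and, unlike real stations in the brain, this relay passes the message on unchanged instead of processing it along the way.

```python
# Toy relay: on every link the sender encodes the message with a key and
# the receiver decodes it with the same key, then acts as the next sender.
# The shift cipher and the link keys are hypothetical stand-ins.
import string

ALPHABET = string.ascii_lowercase

def shift(text, key):
    """Shift each lowercase letter by `key` positions; leave the rest alone."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)

def encode(message, key):
    return shift(message, key)

def decode(code, key):
    return shift(code, -key)

def relay(message, keys):
    """Pass a message through a chain of links, one key per link."""
    for key in keys:
        code = encode(message, key)   # sender side of this link
        message = decode(code, key)   # receiver side of this link
    return message

link_keys = [3, 11, 7]

# What an eavesdropper would see if they cut the first link.
code_on_first_link = encode("dots unite", link_keys[0])
print(code_on_first_link)

# What arrives at the end of the chain.
print(relay("dots unite", link_keys))
```

Whoever taps a link sees only the code; without the matching key for that particular link, the message stays unreadable, which is why a separate decoder is needed at every cut.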
In our case, let's imagine a television studio where filming is in progress. The camera converts everything it records into a code that it sends via cables to the monitors, whose circuits hold the instructions for displaying the received code. Now, let's say we want to see what is being communicated through the brain (English text, for example): we cut a connection and start analyzing the code we receive. Like Turing, we decide to build a decoder machine (Figure 1C) and attach it to each cut link. I hope Figure 2 will later explain the whole process well.
Dots, unite!
Our decoder now tries to make out, from the 2D code, the shapes we call letters. Figure 2 shows a schematic diagram of the protocol we follow on a horizontal longitudinal section of the brain. We present a paper with some text to our eyes (hm...) and at 7 locations (A-G) we insert our tireless decoder, which tries to find the text and display it on the screen.
If we insert the decoder between the eye and the thalamus (fig. 2A), we cannot tell at all whether the eye is looking at text, i.e. whether the photo it took contains any text. The decoder still sees the image of everything the eye has captured, so it cannot figure out which "cables" to connect.
If we move our decoder further along (fig. 2B), the thalamus has already "sorted" the cables coming from the eye and compressed the image, reducing the work. Now we can catch, for example, the dot on top of the letter 'i'. Moving on, the thalamus becomes the new encoder and sends a processed version of the "print" to the back of the brain (the primary visual cortex). So far we have talked about cables, roads and projections; from here on we talk about networks and fields, or better, parcels.
Neural networks of the visual cortex give meaning to photo-imprints (~perception).
In the first parcel, V1 (fig. 2C), we already start to group dots, which in V2 (fig. 2D) the decoder connects into slightly more meaningful pieces of letters: dashes and curves. We skip V3 and turn to V4 (fig. 2E), where simple combinations of those pieces give us the first letter forms (2 dashes give 'T'); as expected, further compounding yields all possible letters.
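The idea that a few strokes combine into a letter shape can be illustrated with a small lookup table. The stroke names and the table `STROKES_TO_SHAPE` are invented for this sketch and cover only a handful of capital letters; nothing here is meant to describe how V4 actually stores shapes.

```python
# Toy hierarchy: strokes found at an earlier stage are combined into letter
# shapes at a later stage. The stroke names and the table are invented for
# illustration and cover only a handful of capital letters.

STROKES_TO_SHAPE = {
    frozenset({"vertical"}): "I",
    frozenset({"vertical", "top_bar"}): "T",
    frozenset({"vertical", "bottom_bar"}): "L",
    frozenset({"vertical", "top_bar", "middle_bar"}): "F",
    frozenset({"vertical", "top_bar", "middle_bar", "bottom_bar"}): "E",
}

def combine_strokes(strokes):
    """Return the letter shape for a set of strokes, or None if unknown."""
    return STROKES_TO_SHAPE.get(frozenset(strokes))

# Two dashes give 'T', as in the text.
print(combine_strokes(["vertical", "top_bar"]))

# Adding more dashes to the same stem yields other letters.
print(combine_strokes(["vertical", "top_bar", "middle_bar", "bottom_bar"]))
```

Because the strokes are stored as sets, the order in which they are detected does not matter, only which ones are present.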
However, V8 is where we start to form meaning. So far we have only seen shapes; we did not "know" that 'E', 'e', 'E', 'E', 'e', 'e', 'E' or 'e' represent the same letter.
In V8, we assign the first meaning by classifying all perceived shapes into letters (fig. 2F).
We now know that the 'e' mark is, strangely enough, a lowercase letter 'E', regardless of where it is or how it is written (e.g. font, color, thickness, size, letter, ...).
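Classifying many different-looking marks into one letter is, in programming terms, a many-to-one mapping that throws away appearance. The sketch below is a minimal illustration of that idea; the `Glyph` record, its fields and the function `letter_identity` are hypothetical names chosen for this example.

```python
# Toy classifier: many different-looking glyphs map to one abstract letter.
# The Glyph record and its fields are hypothetical, chosen for illustration.
from dataclasses import dataclass

@dataclass(frozen=True)
class Glyph:
    shape: str     # the drawn form, e.g. "E" or "e"
    font: str
    color: str
    size_pt: int

def letter_identity(glyph):
    """Map a glyph to an abstract letter, ignoring case and appearance."""
    return glyph.shape.upper()

glyphs = [
    Glyph("E", "serif", "black", 12),
    Glyph("e", "sans", "red", 30),
    Glyph("e", "script", "blue", 8),
]

# Three marks that look nothing alike collapse into a single identity.
identities = {letter_identity(g) for g in glyphs}
print(identities)
```

Font, color and size are carried along in the record but play no part in the decision, which mirrors the claim that the letter is recognized regardless of how it is written.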
It may be interesting to note that in Figure 2 the parcels V1, V2 and V4 are colored in both hemispheres. As soon as we started making sense of shapes (V8), we stopped labeling the right hemisphere. So as not to "argue", and because spatial acuity is not its strength, the right hemisphere leaves the left hemisphere in charge of this task. Instead, the right hemisphere takes on some of the work and sends what it decodes to the left, making its job somewhat easier.