In this chapter we discuss visual perception and how we perceive objects as a whole.

We will review some of the most relevant aspects of how our perceptual system unifies information, how it separates the object from the rest of the scene (the "background"), and how the outline of the "proto-image" is built up, both from the computational point of view, with the theory of David Marr, and from the point of view of the emergent stimuli studied by Treisman and Biederman.

Finally, we will see the importance of context and how the scene remains constant despite changes in the size or lighting conditions of the environment.

Visual perception of objects

The visual perception of objects has a marked psychophysical character: starting from individual characteristics, we move on to groups of characteristics, then to the objects themselves, and finally to whole scenes.

Today most researchers accept that the visual process results from the combination of stimuli, the sensations they produce, and their integration in different areas of the brain.

The first to understand this multiple process were the members of the Gestalt school. Max Wertheimer initiated this current when studying the phenomenon of apparent movement, in which a sequence of fixed, static images presented at a suitable frequency generates the sensation of movement. The Frankfurt school, with its Gestalt psychology, called into question the structuralist movement that dominated at that time, which held that perception is built up from sensations.

Gestalt psychologists studied optical illusions and concluded that the whole is different from the sum of its parts. Studying how the small elements of objects are grouped, they proposed what are known as the laws of perceptual organization: the law of Prägnanz (good form), the law of similarity, the law of good continuation, the law of proximity, the law of common fate and the law of familiarity.

Although these organizing principles are called laws, most psychologists today regard them as heuristics, general rules that help to solve a problem, unlike a law or algorithm, which is a procedure that always leads to the solution of a problem.

The Gestalt principles are considered heuristics because their application is not always perfect as far as the information they provide is concerned. Sometimes they lead us to make mistakes when interpreting what we perceive. However, the use of heuristics is widespread in the brain: when solving problems, it is much faster to use a heuristic than to apply an algorithm, which has a much higher energy cost.

It seems that, as a result of evolution, we have become accustomed to using heuristics in the decisions we make and when solving any problem.

More recently, three new principles of perceptual organization have been proposed (Stephen Palmer, 1992 and 1999):

  • The principle of the common region.
  • The principle of element connectedness.
  • The principle of synchrony.

One of the problems in visual perception that interested Gestalt researchers was the so-called phenomenon of segregation, which we see in figure-ground cases such as the example of Rubin's vase and faces.

What makes us see, or be aware of, one thing rather than another? What determines which of the two predominates?

It seems that a series of cues determines what predominates as figure and what as background. Symmetrical areas tend to be seen as figures, as do smaller areas compared with larger ones, which would constitute the background. Meaningful elements are also more likely to be perceived as figures.

Constitution of objects

Marr's computational approach

For David Marr, the visual system acts as if it were a computer programmed to see objects. The scene in front of us enters the eyes and is projected onto the retina.

The first step of the analysis is to determine light and dark areas, as well as areas where changes in intensity occur. This produces a first sketch of the forms that make up the scene, called the "primal sketch", which includes closed areas such as circles, ellipses or squares that would define the objects.

The aim of this first step is to delimit the objects in the scene, not their details, shadows or changes in lighting.

According to Marr, the visual system performs this function by mathematically analyzing the changes in intensity in the image together with what the author calls the natural constraints of the world.
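As a rough illustration of what analyzing changes in intensity can mean, the sketch below marks the places where a one-dimensional row of intensity values goes from dark to light or back. It assumes a much simplified take on zero-crossing edge detection: it looks for sign changes in the second difference of the intensities and omits the smoothing stage a realistic model would need. The function names, the `min_contrast` parameter and the sample values are hypothetical and are not taken from Marr's work.

```python
def second_difference(profile):
    """Discrete second derivative of a 1-D list of intensity values."""
    return [profile[i - 1] - 2 * profile[i] + profile[i + 1]
            for i in range(1, len(profile) - 1)]


def intensity_edges(profile, min_contrast=1.0):
    """Return indices i such that an edge lies between profile[i] and profile[i + 1].

    An edge is reported where the second difference changes sign
    (a zero-crossing) and the jump across it is at least min_contrast.
    """
    d2 = second_difference(profile)
    edges = []
    for k in range(len(d2) - 1):
        a, b = d2[k], d2[k + 1]
        if a * b < 0 and abs(a - b) >= min_contrast:
            # d2[k] was computed around profile index k + 1
            edges.append(k + 1)
    return edges


# Hypothetical row of pixel intensities: dark, then light, then dark again.
row = [10, 10, 10, 50, 50, 50, 10, 10, 10]
print(intensity_edges(row))  # [2, 5]
```

The two reported positions are the borders of the light region, which is the kind of information a first sketch of the scene would be built from.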

The next step is to group the primitives of the sketch, in much the same way as the Gestalt psychologists proposed.

The primitives are now grouped according to criteria that presuppose top-down processing, although the author does not develop this point fully. The result is a surface representation of the objects in the scene (the 2½-D sketch), and finally a three-dimensional reconstruction of the scene (the 3-D sketch).

Theory of the integration of characteristics

The theory of the integration of characteristics, better known in English as feature integration theory (FIT) and proposed by Treisman and Gelade (1987, 1993 and 1998), holds that the perception of objects follows a sequence of stages. It begins with a first, "pre-attentional" phase, in which the system analyzes the image and determines the presence of the characteristics that form the basic units of perception, such as curvature, orientation, color, movement, etc.

In a second phase, the "stage of focused attention", the basic characteristics are combined to give rise to the perception of the object.

Once the object is identified, it is compared with the data of similar objects previously categorized and stored in memory.

The passage from the first to the second stage is the key to this theory. The visual system works by detecting emergent boundaries between areas composed of different elements and by means of a visual search procedure. A scene may contain two sets of elements placed next to each other so as to create textured fields, as in the figure. If the two areas contain different characteristics, a boundary immediately "stands out" or "emerges" between them, as in the figure, where the components have different orientations.

In the following image we can see the same situation more clearly.

The boundaries arise because one of the components has lines that cross each other and the other does not (a and b), whereas in figure c this does not happen: the textural pattern is the same and no emergent boundary is produced (Nothdurft, 1990).

The visual search process

The visual search process follows two patterns.

The first is a quasi-automatic detection, as in the case of the figure, where the "O" stands out against the background of the surrounding "V" letters and is perceived immediately, both in situation (a), with few distractors, and in (b), with more distractors. In the other figure something different happens: detection requires a greater effort of attention. Perceiving the "R" is more difficult and becomes more complicated as we go from few to many distractors, which means a greater expenditure of time and energy.

In the first of the previous examples, detection is automatic and is called "pop-out".

Pop-out is characterized by the fact that detection of the stimulus, in this case the letter "O", is independent of the number of distractors that surround it, the letters "V". In the cases where the stimulus does not emerge automatically, as in the example of the letter "R", the number of distractors does influence the time required for detection: the more distractors there are, and the more similar they are to the target stimulus, the more time is required to identify it. This can be seen in the graph, which represents a typical visual search study, where line (a) represents the case of an O among Vs and line (b) that of an R among Ps and Qs, in which the number of distractors does increase detection time.
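The contrast between the two lines of the graph can be sketched as a simple linear model of detection time. This is only an illustrative assumption: the function name, the base time and the slopes below are hypothetical values chosen to show the pattern, not data from Treisman's experiments.

```python
def search_time_ms(n_distractors, base_ms, slope_ms_per_item):
    """Detection time modelled as a straight line in the number of distractors."""
    return base_ms + slope_ms_per_item * n_distractors


# Hypothetical parameters, chosen only to reproduce the shape of the two lines.
POP_OUT = {"base_ms": 450.0, "slope_ms_per_item": 0.0}     # line (a): O among Vs
EFFORTFUL = {"base_ms": 450.0, "slope_ms_per_item": 25.0}  # line (b): R among Ps and Qs

for n in (1, 5, 15, 30):
    a = search_time_ms(n, **POP_OUT)
    b = search_time_ms(n, **EFFORTFUL)
    print(f"{n:>2} distractors: pop-out {a:.0f} ms, effortful {b:.0f} ms")
```

With a slope of zero the pop-out time stays flat however many distractors are added, while any positive slope makes the effortful search grow with the size of the display.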

Studying the stimuli that had the characteristic of being emergent, Treisman found that the most prominent were:

  • Curvature.
  • Alignment.
  • Movement.
  • Colour.
  • Brightness.
  • Direction.
  • Illumination. 

These characteristics would be detected at the beginning of the pre-attentional stage.

Treisman's experimental work showed that the emergent characteristics (lines, color, textures, movement, orientation, etc.) are perceived independently, and that it is at a later stage that they are integrated to give shape to the elements that make up the scene.

This first phase, in which the emergent characteristics are independent, is explained by the fact that each of these characteristics is processed in a different place in the brain: movement is registered in parietal areas, while faces are processed in the inferotemporal region.

The stage in which the characteristics perceived in the pre-attentional phase are combined takes place thanks to the mechanisms of focused attention, which act as the "glue" that binds the characteristics together at a specific location.

Recognition by components

Recognition by components was proposed by Irving Biederman in 1987, in line with the contributions of Marr and Treisman. The main difference is that the detected elements are volumetric, with a three-dimensional shape, and constitute the parts of an object.

Biederman called these volumetric units "geons" and defined 36 basic forms, all of which fulfil three fundamental properties:

  • View invariance: geons can be identified even if the viewing angle changes.
  • Discriminability: each geon can be told apart from the others even when the point of view varies.
  • Resistance to visual noise: a geon can be identified even if half of its structure is erased or blocked by another geon.

The following figure shows basic geons (a), which are parts of the figures in (b). Note that combining just 2 or 3 geons is enough to form figures that are already recognizable.

Visual perception and the recognition and visual identification of objects

Recognize or see an object

Recognizing an object is the experience of perceiving something as previously known.

Identifying an object means giving it a name, classifying it correctly in some categorization scheme, knowing in what context it is usually found, that is, remembering something about the object beyond simply having seen it before.

The more times we see an object, the greater the trace it leaves in memory and the more familiar the object becomes to us, that is, the easier it is to identify; we go beyond merely recognizing it.

Both processes occur through a memory and recall mechanism. When we see something, a perceptual image is generated and must be compared with other representations in memory, along with the connections that these other representations have to other information stored in memory.

Visual perception driven by data and driven by concepts

Visual perception starts from the data sent by the retina, such as color, brilliance, etc., which are grouped following the Gestalt principles until they form a whole that makes sense to the observer, who identifies that information as something concrete.

On the other hand, we can process by concepts, through a top-down mechanism based on previous experiences, emotions, prior knowledge, etc., which actively guide the search for certain patterns in the incoming stimuli. If we pass a playground where children are playing and we see a fairly small object flying through the air, the first thing our visual system will do is check whether it is a ball. If we were in a park, things would be very different: we would check whether it is a bird. It is a matter of checking whether what we see is what we expect to see.

Visual perception of large and small objects

In general, it seems that global aspects have priority over local ones, within limits.

Stimuli such as large letters made up of smaller letters reveal this fact. First the shape of the large letter is perceived, and then the small letters that make it up (Navon 1977). A global stimulus is detected faster than the local stimulus, provided that the ordering pattern is not broken: if the elements that make up the large figure are very far apart, the small elements will be perceived more quickly. For the large figure, the global stimulus, to be perceived faster, the small stimuli must be grouped sufficiently close to each other.

The absolute size of the figure is also decisive: if it is very large, it is not perceived before the small stimuli either. The explanation of this phenomenon must be sought in the mechanisms of attention.

Context and identification

Sometimes, identical stimuli are perceived as different, depending on the context in which they appear, such as in the figure where the letter B and the 13 are identical but we interpret them as a letter or a number depending on the context in which they are found.

In the same way, when an inappropriate object appears in a certain scene, it is most likely to go unnoticed unless it has very salient features, unlike the objects that are appropriate to that scene, which will be remembered more easily.

Perceptual constancy in visual perception

We have all experienced how the objects of the visual scene remain constant, or how we continue to identify them as what they are, even if the lighting, distance or orientation changes. When something approaches us and its size increases, we do not think that it is a different object from the first: there is a constancy of perception in consciousness even if our viewing conditions change.

There are three conditions of perceptual constancy:

  • Size and shape
  • Whiteness or color of the surface.
  • Location of the object in space in relation to the observer.

The constancy has two fundamental phases:

The first is registration, the unconscious process by which changes in the proximal stimuli are coded for processing.

The second is apprehension, which is conscious: we become aware of the properties of the object, the focal stimulus that tends to remain constant, and of the properties of the situation, which reflect the more changeable aspects of the environment, such as the subject's position.

Constancy of size

We must remember that the size of an object's image on the retina varies with the distance at which the object is located. However, the perception of size does not depend only on the object we observe: it also depends on the environment in which it is located, on the objects that surround it and on absolute and relative distances.
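The relationship between retinal size and distance can be made concrete with a little geometry. The sketch below assumes the standard visual-angle formula and a hypothetical observer who knows the distance exactly; the function names and the 1.8 m figure are illustrative choices, not values from the text.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Angle subtended at the eye by an object of a given size and distance."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))


def inferred_size_m(angle_deg, distance_m):
    """Physical size recovered from the visual angle when distance is known."""
    return 2 * distance_m * math.tan(math.radians(angle_deg) / 2)


height_m = 1.8  # hypothetical person
for distance_m in (2.0, 4.0, 8.0):
    angle = visual_angle_deg(height_m, distance_m)
    size = inferred_size_m(angle, distance_m)
    print(f"{distance_m:>4} m away: {angle:5.1f} deg on the retina, inferred size {size:.2f} m")
```

The angle shrinks as the person moves away, yet combining it with the distance always gives back the same height, which is one way of picturing why distance information matters so much for size constancy.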

Size constancy is aided by pictorial cues, especially those of distance and depth, and by the dynamics of the eyes, that is, the information sent by the extraocular muscles during convergence or divergence movements, together with accommodation.

In the figure we see how in (a) the three men diminish in size in accordance with the perspective, which indicates their distance from the observer, so their size must be smaller, as happens on the retina. In (b), however, the size is not adjusted: although the three men are the same size, the last one, the farthest away, seems bigger.

For proponents of direct perception, such as the Gibsonians, constancy is achieved directly: the scene and the stimuli that make it up contain all the information needed for perceptual constancy to be possible.

Cognitive factors must be added to these cues. We know what familiar objects are like, their size and general characteristics, and this familiarity facilitates their identification in the scene and makes it easier to maintain their perceptual constancy (Predemon 1993).

Visual perception and constancy of form

The constancy of form can be defined as the relative constancy of the perceived shape of an object regardless of the variations in its orientation.

For shape constancy to occur, the visual system must compensate for changes in a similar way as it does in size constancy. In fact, shape constancy and size constancy are closely related, since both depend on the perception of distance. For shape, however, distance refers to the relative distance of the different parts of the object from the observer, that is, its orientation in space or its inclination. The distance and depth cues, and the context that may indicate the degree of inclination, are essential.

Visual perception, luminosity and whiteness

The amount of light that reaches the retina from an object depends on the source that illuminates it, the external illuminance, and on the proportion of light that the object reflects, its reflectance.

A white surface will reflect almost 90% of the light that falls on it; in contrast, a black surface will absorb most of the light that arrives, reflecting only a small proportion.

The brilliance

Brilliance is the apparent intensity of the light source that illuminates a portion of the visual field, for example, a part of a room lit by the sun compared with another part of the room that is more dimly lit.

The luminosity

Another important concept is that of luminosity, which refers to the apparent reflectance of a surface, in which black objects reflect little light and white objects reflect a lot of light.

Luminosity determines the shade of the object on a scale that goes from white to black, which is what we call whiteness. The concepts of whiteness and brilliance should not be confused. A white sheet of paper will have a different brilliance depending on whether it is viewed in dim or bright light, but it will always have the same shade of white, that is, its whiteness remains constant.

Constancy of luminosity

Two explanations have been given for the constancy of luminosity. The first holds that constancy follows from the relationship between stimuli. Just as the constancy of size and shape of an object is maintained by reference to the context and the other objects in the scene, luminosity depends on a ratio between the light coming from the object and the light coming from what surrounds it. If the intensity of the light source, such as the sun, varies, the amount of light reflected will be different, but the changes will be proportional in all the components of the scene, that is, the ratios between the amounts of reflected light that reach the retina are the same as in the initial situation.
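The ratio argument can be checked with simple arithmetic. The sketch below assumes that the light reflected by a surface is the product of the illuminance and the surface's reflectance; the function names, the illuminance values and the 0.3 reflectance of the surround are hypothetical, while the 0.9 for a white surface follows the figure given earlier.

```python
def reflected_light(illuminance, reflectance):
    """Light leaving a surface: incoming light times the fraction it reflects."""
    return illuminance * reflectance


def object_to_surround_ratio(illuminance, object_reflectance, surround_reflectance):
    """Ratio between the light from an object and the light from its surround."""
    return (reflected_light(illuminance, object_reflectance)
            / reflected_light(illuminance, surround_reflectance))


WHITE_PAPER = 0.9  # reflects almost 90% of the light
GREY_TABLE = 0.3   # hypothetical surround

for illuminance in (100.0, 10_000.0):  # dim room vs. sunlight, arbitrary units
    paper = reflected_light(illuminance, WHITE_PAPER)
    ratio = object_to_surround_ratio(illuminance, WHITE_PAPER, GREY_TABLE)
    print(f"illuminance {illuminance:>8.0f}: paper reflects {paper:>7.0f}, ratio {ratio:.2f}")
```

The absolute amount of light coming from the paper changes a hundredfold, but the illuminance cancels out of the ratio, so the relationship between the paper and its surround is unchanged.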

Lateral inhibition

The physiological principle that would explain this phenomenon is lateral inhibition in the receptive field. Greater illumination means greater excitation of the excitatory (central) zone but, at the same time, greater inhibition from the peripheral zone of the receptive field. With less illumination, there is less excitation in the center of the field and less inhibition from the periphery, so the overall response remains constant.

This situation changes the moment we change the background conditions: the contrast will be different, and so the perception of whiteness and brilliance varies.
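A toy model can show both effects: a response that survives a change of illumination and one that shifts when the background changes. The sketch below is a deliberately simplified assumption, not a physiological model: each unit log-compresses its input and subtracts the average signal of its immediate neighbours. The function name, the `inhibition` weight and the light values are hypothetical.

```python
import math


def lateral_inhibition(light, inhibition=1.0):
    """Response of each unit: its own log signal minus that of its neighbours.

    `light` is a list of at least two positive values, one per receptor.
    """
    signals = [math.log(v) for v in light]
    responses = []
    for i, signal in enumerate(signals):
        neighbours = [signals[j] for j in (i - 1, i + 1) if 0 <= j < len(signals)]
        surround = sum(neighbours) / len(neighbours)
        responses.append(signal - inhibition * surround)
    return responses


# A light patch (index 2) on a grey background, in arbitrary units.
dim_room = [30.0, 30.0, 90.0, 30.0, 30.0]
sunlight = [v * 100 for v in dim_room]         # same scene, 100 times more light
dark_background = [5.0, 5.0, 90.0, 5.0, 5.0]   # same patch, darker surround

for name, scene in (("dim room", dim_room), ("sunlight", sunlight),
                    ("dark background", dark_background)):
    print(f"{name:>15}: centre response {lateral_inhibition(scene)[2]:.3f}")
```

Under these assumptions the centre response is the same in the dim room and in sunlight, because the extra excitation is matched by extra inhibition, whereas darkening only the background raises the response to the same patch.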

Visual perception, color and hue

We have all experienced how a red apple is still red even if we vary the intensity of the light that falls on it, or even the hue of this light (blue or yellow). This is color constancy.

This constancy does not depend on the image of the object on the retina in isolation; it depends on its relationship with the objects that surround it and the context in which it is located.

It is very similar to what we have just seen in whiteness and luminosity. The constancy of color would be related to the adaptation processes.

Summary

We explain visual perception in detail and how we perceive the objects that surround us. This is one of the chapters on vision, the eye and how we see.

Editor: Área Oftalmológica Avanzada