Influence of spoken text on cognitive processing of complex pictorial presentations

Workgroup: Realistic depictions Lab
Funding: Deutsche Forschungsgemeinschaft
Project description

The DFG project "Influence of spoken text on cognitive processing of complex pictorial presentations" examined how spoken audio texts, such as those found in television documentaries, museums, and exhibitions, influence the cognitive processing of artworks.

Based on models of multimedia learning and psychological findings on picture and scene perception, this DFG project examined the influence of accompanying audio texts on the perception and processing of complex pictorial content and on related knowledge acquisition processes. In four laboratory experiments, we analyzed to what extent audio texts (1) systematically reduce the interpretational openness of pictures, (2) foster the perception and processing of neglected pictorial elements, (3) perceptually and cognitively relate pictorial elements, and (4) contribute to the dual coding of the pictorial content. The stimuli were complex, detailed history paintings depicting historical events, which are typically used to communicate historical knowledge. In the four studies, we examined how auditory introductory texts, the naming of salient and non-salient pictorial elements, the naming of relations between pictorial elements, and text structure influence the allocation of attention (gaze coherence, fixation times, and number of transitions, measured by eye tracking) and knowledge acquisition (free visual recall, text-picture relation, transfer, and memory-related interindividual consistency).

The results indicated that the contents of titled pictures are better linked to existing prior knowledge structures and can thus be retrieved better than those of untitled pictures, which suggests that picture titles reduce the interpretational openness of pictures. Furthermore, the results showed that naming a picture element induces viewers to focus their attention on that element in a given scene. This naming effect in visual inspection is more pronounced for high-salience picture elements than for low-salience ones, which indicates that the visual properties of paintings, such as centrality or composition, nevertheless play a dominant role in the course of visual inspection. However, naming specific pictorial elements in the verbal explanations of personal or audio guides can help viewers counteract automatic tendencies to focus on central pictorial elements or on human faces, and instead attend to otherwise unnoticed elements of a painting, as expert viewers do. The effects of naming picture elements also extend beyond visual inspection to improved retention and, as shown here for the first time, to better identification or recall of the named picture elements in other artworks depicting the same subject. Furthermore, the interpretation of specific pictorial content can be conveyed by localizing, naming, and interpreting the individual pictorial elements in the audio text in immediate temporal succession, as this supports the linking of pictorial and audio text information in the sense of dual coding.

For practical purposes, it can be concluded that the naming of picture elements by a personal or audio guide directs museum visitors in their visual inspection of paintings and helps them to build up a mental representation of the painting, to develop an adequate understanding of it, and to transfer this understanding to their engagement with other, similar works of art.

In addition to this project, several other studies were conducted that examined not only the influence of audio text design but also the influence of visual design elements on the processing of paintings. In these studies, cinematographic techniques such as camera zooms and pans, as well as cueing techniques from multimedia research, played an important role. The results showed that visual cueing increased the retention advantage of picture elements named in the audio text over the less well-remembered unnamed elements, and that it compensated for a disadvantage of naming with respect to localization performance. Both effects were evident for explicit visual cueing in the form of red frames and for implicit visual cueing in the form of zoom-ins. In terms of retention and localization, explicit and implicit cueing were equally effective.


Glaser, M., Knoos, M., & Schwan, S. (in press). How verbal cues help to see and understand art. Psychology of Aesthetics, Creativity, and the Arts.

Glaser, M., Knoos, M., & Schwan, S. (2022). Localizing, describing, interpreting: Effects of different audio text structures on attributing meaning to digital pictures. Instructional Science, 50(5), 729-748. Open Access

Glaser, M., & Schwan, S. (2020). Combining verbal and visual cueing: Fostering learning pictorial content by coordinating verbal explanations with different types of visual cueing. Instructional Science, 48(2), 159-182.

Glaser, M., Knoos, M., & Schwan, S. (2020). The Closer, the better? Processing relations between picture elements in historical paintings. Journal of Eye Movement Research, 13(2), Article 11. Open Access