The structure of the visual world in the human mind : measuring object statistics from large sets of real-world images and their impact on behaviour

  • Our mind has the function of representing the physical and social world we are in, so that we can efficiently interact with it. This results in a constant and dynamic interaction between mind and world, which reaches a balance when representations are both accurate with respect to what the world is communicating to our organism and compatible with how our mind works. A paradigmatic case of this interaction is offered by perception, the mental function that represents contingent aspects of the world, built from what is captured by our senses. Indeed, the dominant philosophical view in cognitive science is that our perceptual states are representations of the world and not direct access to that world. These representational perceptual states therefore include the aspects of the world that they represent and that initiate perception by stimulating our sensory organs. Perceptual representations are built using information from the sensory system, i.e., bottom-up information, but are also integrated with previously acquired information, i.e., top-down information, so that perception interacts with memory through language and other mental functions. Such organization is believed to reflect a general mechanism of our mind/brain, which is to acquire and use information to make efficient predictions about the future, continuously updating older information with present information. This predictive processing works because the world is not random, but shows a regular structure from which reliable expectations can be built. One way that our minds make these predictions is by adapting to the structure of the world in an implicit, automatic and unconscious way, a process that has been called Implicit Statistical Learning (ISL). ISL is a learning process that does not require awareness and happens in an incidental and spontaneous way, through mere exposure to the statistical regularities of the world.
This is what happens when we learn a language during early childhood: ISL allows us to become implicitly sensitive to the phonological structure of speech, or to associate speech patterns with objects and events in order to learn word meaning. A specific case of ISL is the learning of spatial configurations in the visual world, which we apply to abstract arrays of items but, most importantly, also to more ecological settings such as the visual scenes we are immersed in during our everyday life. The knowledge we acquire about the structure of visual scenes has been called “Scene Grammar”, because it informs us about the presence and position of objects in a similar way to how linguistic grammar informs us about the presence and position of words. We thus implicitly acquire the semantics of scenes, learning which objects are consistent with a certain scene, as well as the syntax of scenes, learning where objects are consistently positioned within a certain scene. More recent developments have proposed that scene grammar knowledge might be organized as a hierarchical system: objects are arranged in the scene, which offers the more general context, but within a scene we can identify different spatial and functional clusters of objects, called “phrases”, that offer a second level of context; within every phrase, in turn, objects have a different status, with usually one object (the “anchor object”) strongly predicting which other objects (“local objects”) are in the phrase and where they are. However, these further aspects of the organization of objects in scenes remain poorly understood. Another problem relates to the way we measure the structure of scenes in order to compare the organization of the visual world with the organization in the mind.
Typically, to decide whether or not an object appears in a certain scene, and whether or not it appears in a certain position within that scene, researchers have based their decisions on intuition and common sense, at most validating those decisions with independent raters. But it has been shown that these decisions are often limited, and that more complex information about the arrangement of objects in scenes can be lost. A potential solution to this problem might be to use large sets of real-world images that have annotations and segmentations of objects to measure statistics about how objects are arranged in the environment. This idea exploits the growing availability of such datasets, driven by developments in computer vision algorithms, and parallels the established use of large text corpora in language research. The goals of the current investigation were to extract object statistics from these image datasets and test whether they reliably predict behavioural responses during object processing, as well as to use these statistics to investigate more complex aspects of scene grammar, such as its hierarchical organization, to see whether this organization is reflected in the organization of objects in our mind.
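
As a purely illustrative sketch, and not the procedure used in the thesis, the idea of measuring object statistics from annotated images can be pictured as counting how often objects occur in each scene category and how often pairs of objects occur together. The toy annotations, the scene and object labels, and the function names below are all hypothetical assumptions made for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical toy annotations: (scene category, object labels in one image).
# Real datasets would supply these from human annotations and segmentations.
ANNOTATIONS = [
    ("kitchen", ["sink", "stove", "pot", "faucet"]),
    ("kitchen", ["sink", "faucet", "fridge"]),
    ("bathroom", ["sink", "faucet", "toilet", "towel"]),
    ("bathroom", ["shower", "towel", "toilet"]),
]


def object_given_scene(annotations):
    """Proportion of images of each scene category that contain each object."""
    images_per_scene = Counter()
    images_with_object = defaultdict(Counter)
    for scene, objects in annotations:
        images_per_scene[scene] += 1
        for obj in set(objects):  # count each object once per image
            images_with_object[scene][obj] += 1
    return {
        scene: {obj: n / images_per_scene[scene] for obj, n in counts.items()}
        for scene, counts in images_with_object.items()
    }


def cooccurrence_counts(annotations):
    """Number of images in which each unordered pair of objects appears together."""
    pairs = Counter()
    for _, objects in annotations:
        # Sorting gives every pair a single canonical (alphabetical) key.
        for a, b in combinations(sorted(set(objects)), 2):
            pairs[(a, b)] += 1
    return pairs


if __name__ == "__main__":
    stats = object_given_scene(ANNOTATIONS)
    print("P(sink | kitchen):", stats["kitchen"]["sink"])
    print("faucet + sink images:", cooccurrence_counts(ANNOTATIONS)[("faucet", "sink")])
```

In this sketch, the per-scene proportions loosely correspond to the "semantic" question of which objects belong in a scene, while pair counts hint at which objects tend to cluster together; positional ("syntactic") statistics would additionally need the segmentation coordinates, which are omitted here.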

Metadata
Author:Jacopo Turini
URN:urn:nbn:de:hebis:30:3-795884
DOI:https://doi.org/10.21248/gups.79588
Publisher:Johann Wolfgang Goethe-Universität
Place of publication:Frankfurt
Referee:Melissa Lê-Hoa Võ, Yee Lee Shing
Document Type:Doctoral Thesis
Language:English
Date of Publication (online):2023/12/13
Date of first Publication:2023/11/16
Publishing Institution:Universitätsbibliothek Johann Christian Senckenberg
Granting Institution:Johann Wolfgang Goethe-Universität
Date of final exam:2023/04/19
Release Date:2023/12/13
Tag:cognitive psychology; memory; vision
Page Number:276
Note:
Cumulative dissertation - contains the submitted manuscript versions (Author Accepted Manuscripts) of the following articles:

Gregorova, Klara; Turini, Jacopo; Gagl, Benjamin; Le-Hoa Vo, Melissa (2022): Access to meaning from visual input: Object and word frequency effects in categorization behavior. OSF home: https://osf.io/d3j9h/files/osfstorage. Later published in: Journal of Experimental Psychology: General 2023, Vol. 152 (10), pp. 2861-2881, eISSN 1939-2222. DOI 10.1037/xge0001342

Turini, Jacopo; Le-Hoa Vo, Melissa (2022): Hierarchical organization of objects in scenes is reflected in mental representations of objects. Later published in: Scientific Reports 2022, 12, article number 20068 (2022), ISSN 2045-2322. DOI 10.1038/s41598-022-24505-x

Turini, Jacopo; Le-Hoa Vo, Melissa (2022): Scene hierarchy structures mental object representations while flexibly adapting to varying task demands. Later published in: Journal of Vision 2022, Vol. 22(14), Art. 3467, eISSN 1534-7362. DOI 10.1167/jov.22.14.3467
HeBIS-PPN:51403176X
Institutes:Psychology and Sports Sciences
Dewey Decimal Classification:1 Philosophy and psychology / 15 Psychology / 150 Psychology
Collections:University publications
Licence (German):Creative Commons - CC BY - Attribution 4.0 International