Viewpoint effects on object recognition interact with object-scene consistency effects. While recognition of objects seen from “accidental” viewpoints (e.g., a cup from below) is typically impeded compared to processing of objects seen from canonical viewpoints (e.g., the string-side of a guitar), this effect is reduced by meaningful scene context information. In the present study we investigated whether these findings, established using photographic images, generalise to 3D models of objects. Using 3D models further allowed us to probe a broad range of viewpoints and empirically establish accidental and canonical viewpoints. In Experiment 1, we presented 3D models of objects from six different viewpoints (0°, 60°, 120°, 180°, 240°, 300°) in colour (1a) and in grayscale (1b) in a sequential matching task. Viewpoint had a significant effect on accuracy and response times. Based on performance in Experiments 1a and 1b, we determined canonical (0° rotation) and non-canonical (120° rotation) viewpoints for the stimuli. In Experiment 2, participants again performed a sequential matching task, but now the objects were paired with scene backgrounds that were either consistent (e.g., a cup in the kitchen) or inconsistent (e.g., a guitar in the bathroom) with the object. Viewpoint interacted significantly with scene consistency in that object recognition was less affected by viewpoint when consistent scene information was provided than when inconsistent information was provided. Our results show that viewpoint-dependence and scene context effects generalise to depth-rotated 3D objects. This supports the important role that object-scene processing plays in object constancy.
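A viewpoint effect on response times like the one reported above would typically be tested with a linear mixed-effects model with random intercepts for participants. The sketch below, in Python with statsmodels, illustrates the general approach; the data file and column names (rt, viewpoint, correct, subject) are hypothetical placeholders, not the authors' actual pipeline.

```python
# Minimal sketch of a viewpoint analysis with a linear mixed-effects
# model (random intercept per participant). The file and column names
# are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

trials = pd.read_csv("matching_task_trials.csv")  # hypothetical data file

# Response-time analyses are usually restricted to correct trials;
# rotation angle (0, 60, ..., 300 deg) is treated as a categorical factor.
correct = trials[trials["correct"] == 1]
model = smf.mixedlm("rt ~ C(viewpoint)", data=correct, groups="subject").fit()
print(model.summary())
```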
Objects that are semantically related to the visual scene context are typically better recognized than unrelated objects. While context effects on object recognition are well studied, the question of which particular visual information in an object’s surroundings modulates its semantic processing remains unresolved. Typically, one would expect contextual influences to arise from high-level, semantic components of a scene, but what if even low-level features could modulate object processing? Here, we generated seemingly meaningless textures of real-world scenes, which preserved similar summary statistics but discarded spatial layout information. In Experiment 1, participants categorized such textures better than colour controls that lacked higher-order scene statistics, while original scenes yielded the highest performance. In Experiment 2, participants recognized briefly presented consistent objects on scenes significantly better than inconsistent objects, whereas on textures, consistent objects were recognized only slightly more accurately. In Experiment 3, we recorded event-related potentials and observed a pronounced mid-central negativity in the N300/N400 time windows for inconsistent relative to consistent objects on scenes. Critically, inconsistent objects on textures also triggered N300/N400 effects with a comparable time course, though less pronounced. Our results suggest that a scene’s low-level features contribute to the effective processing of objects in complex real-world environments.
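To make the idea of “summary statistics without spatial layout” concrete, here is a simple phase-scrambling sketch in Python. The study’s textures were presumably produced by a dedicated texture-synthesis algorithm; phase scrambling is a simpler stand-in that preserves one low-level statistic (the amplitude spectrum) while destroying spatial layout.

```python
# Illustrative sketch only: phase scrambling keeps an image's amplitude
# spectrum (a low-level summary statistic) but destroys its spatial
# layout. The study's actual textures were presumably generated with a
# richer texture-synthesis procedure; this is a simpler stand-in.
import numpy as np

def phase_scramble(img: np.ndarray, seed: int = 0) -> np.ndarray:
    """Replace the Fourier phase of a 2D grayscale image with noise phase."""
    rng = np.random.default_rng(seed)
    # Using the phase of real-valued noise keeps conjugate symmetry,
    # so the inverse transform is real up to numerical error.
    noise_phase = np.angle(np.fft.fft2(rng.standard_normal(img.shape)))
    amplitude = np.abs(np.fft.fft2(img))
    return np.real(np.fft.ifft2(amplitude * np.exp(1j * noise_phase)))
```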
While scene context is known to facilitate object recognition, little is known about which contextual “ingredients” are at the heart of this phenomenon. Here, we address the question of whether the materials that frequently occur in scenes (e.g., tiles in a bathroom) associated with specific objects (e.g., a perfume) are relevant for the processing of that object. To this end, we presented photographs of consistent and inconsistent objects (e.g., perfume vs. pinecone) superimposed on scenes (e.g., a bathroom) and close-ups of materials (e.g., tiles). In Experiment 1, consistent objects on scenes were named more accurately than inconsistent ones, while there was only a marginal consistency effect for objects on materials. We also found no consistency effect for scrambled materials, which served as a color control condition. In Experiment 2, we recorded event-related potentials and found N300/N400 responses—markers of semantic violations—for objects on inconsistent relative to consistent scenes. Critically, objects on materials triggered N300/N400 responses of similar magnitudes. Our findings show that contextual materials indeed affect object processing—even in the absence of spatial scene structure and object content—suggesting that material is one of the contextual “ingredients” driving scene context effects.
Visual search in natural scenes is a complex task that relies on peripheral vision to detect potential targets and on central vision to verify them. This division of roles between the visual fields has been established primarily through on-screen experiments. We conducted a gaze-contingent experiment in virtual reality to test how the presumed roles of central and peripheral vision translate to more natural settings. Using everyday scenes in virtual reality allowed us to study visual attention with a fairly ecological protocol that cannot be implemented in the real world. Central or peripheral vision was masked during visual search, with target objects selected according to scene semantic rules. Analyzing the resulting search behavior, we found that target objects that were not spatially constrained to a probable location within the scene impacted search measures negatively. Our results diverge from on-screen studies in that search performance was only slightly affected by the loss of central vision. In particular, a central mask did not impact verification times when the target was grammatically constrained to an anchor object. Our findings demonstrate that the role of central vision (up to 6 degrees of eccentricity) in identifying objects in natural scenes seems to be minor, while the role of peripheral preprocessing of targets in immersive real-world searches may have been underestimated by on-screen experiments.
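The core of such a gaze-contingent protocol is a per-frame eccentricity test against the current gaze direction. The sketch below illustrates that logic in plain Python; in the actual experiment this would run inside the VR engine, and all names here are hypothetical.

```python
# Sketch of the per-frame logic of a gaze-contingent mask. All names
# are hypothetical; in practice this runs inside the VR engine.
import numpy as np

MASK_RADIUS_DEG = 6.0  # central-vision radius reported in the study

def eccentricity_deg(gaze_dir: np.ndarray, point_dir: np.ndarray) -> float:
    """Angle in degrees between the gaze ray and the direction to a point."""
    cos_angle = np.dot(gaze_dir, point_dir) / (
        np.linalg.norm(gaze_dir) * np.linalg.norm(point_dir)
    )
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def visible(gaze_dir, point_dir, condition: str) -> bool:
    """Central mask hides below 6 deg; peripheral mask hides at or above."""
    ecc = eccentricity_deg(np.asarray(gaze_dir, float), np.asarray(point_dir, float))
    if condition == "central_mask":
        return ecc >= MASK_RADIUS_DEG
    if condition == "peripheral_mask":
        return ecc < MASK_RADIUS_DEG
    return True  # unmasked control
```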
Finding a bottle of milk in the bathroom would probably be quite surprising to most of us. Such a surprised reaction is driven by our strong expectations, learned through experience, that a bottle of milk belongs in the kitchen. Our environment is not randomly organized but governed by regularities that allow us to predict which objects can be found in which types of scene. These scene semantics are thought to play an important role in the recognition of objects. But at what point in development are these semantic predictions established firmly enough that scene-object inconsistencies lead to semantic processing difficulties? Here we investigated how toddlers perceive their environments, and what expectations govern their attention and perception. To this end, we used a purely visual paradigm in an ERP experiment and presented 24-month-olds with familiar scenes in which either a semantically consistent or an inconsistent object appeared. The scene-inconsistency effect has previously been studied in adults by means of the N400, a neural marker responding to semantic inconsistencies across many types of stimuli. Our results show that semantic object-scene inconsistencies indeed elicited an enhanced N400 over the left anterior brain region between 750 and 1150 ms post stimulus onset. This modulation of the N400 marker provides a first indication that, by the age of two, toddlers have already established scene semantics that allow them to detect a purely visual, semantic object-scene inconsistency. Our data suggest the presence of specific semantic knowledge about which objects occur in a certain scene category.
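The reported effect (an enhanced N400 between 750 and 1150 ms over left anterior sites) could be quantified as a mean-amplitude difference in that window. A sketch with MNE-Python follows; the epochs file, condition labels, and channel picks are hypothetical placeholders, not the study's actual pipeline.

```python
# Sketch: mean ERP amplitude in the reported 750-1150 ms window over
# left anterior channels, per condition. File, condition names, and
# channel picks are hypothetical placeholders.
import mne

epochs = mne.read_epochs("toddler_scenes-epo.fif")
left_anterior = ["F3", "F7", "FC5"]

amps = {}
for cond in ("consistent", "inconsistent"):
    evoked = epochs[cond].average().pick(left_anterior)
    data = evoked.crop(tmin=0.750, tmax=1.150).data  # channels x times, volts
    amps[cond] = data.mean() * 1e6                   # microvolts

print(f"N400 effect: {amps['inconsistent'] - amps['consistent']:.2f} uV")
```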
Objects that are congruent with a scene are recognised more efficiently than objects that are incongruent. Further, semantic integration of incongruent objects elicits a stronger N300/N400 EEG component. Yet, the time course and mechanisms by which contextual information supports access to semantic object information remain unclear. We used computational modelling and EEG to test how context influences semantic object processing. Using representational similarity analysis, we established that EEG patterns dissociated between objects in congruent and incongruent scenes from around 300 ms. By modelling semantic processing of objects using independently normed properties, we confirmed that the onset of semantic processing is similar for congruent and incongruent objects (∼150 ms). Critically, after ∼275 ms we found a difference in the duration of semantic integration, which lasted longer for incongruent than for congruent objects. These results constrain our understanding of how contextual information supports access to semantic object information.
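Representational similarity analysis of EEG is typically computed time point by time point: a neural representational dissimilarity matrix (RDM) built from the channel patterns is correlated with a model RDM derived from the normed semantic properties. A generic sketch follows; array shapes and variable names are assumptions, not the authors' code.

```python
# Generic sketch of time-resolved RSA: correlate a neural RDM (from EEG
# channel patterns at each time point) with a semantic model RDM built
# from normed object properties. Shapes and names are assumptions.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

def rsa_timecourse(eeg: np.ndarray, model_rdm: np.ndarray) -> np.ndarray:
    """eeg: items x channels x times; model_rdm: condensed item-pair distances."""
    n_times = eeg.shape[2]
    rho = np.empty(n_times)
    for t in range(n_times):
        # Pairwise (1 - correlation) distances between item patterns
        neural_rdm = pdist(eeg[:, :, t], metric="correlation")
        rho[t] = spearmanr(neural_rdm, model_rdm)[0]
    return rho  # one model-brain correlation per time point
```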
Scene grammar shapes the way we interact with objects, strengthens memories, and speeds search (2017)
Predictions of environmental rules (here referred to as "scene grammar") can come in different forms: seeing a toilet in a living room would violate semantic predictions, while finding a toilet brush next to the toothpaste would violate syntactic predictions. The existence of such predictions has usually been investigated by showing observers images containing grammatical violations. Conversely, the generative process of creating an environment according to one’s scene grammar, and its effects on behavior and memory, has received little attention. In a virtual reality paradigm, we instructed participants to arrange objects either according to their scene grammar or against it. Subsequently, participants’ memory for the arrangements was probed using a surprise recall (Exp1) or a repeated search (Exp2) task. Participants’ construction behavior showed strategic use of larger, static objects to anchor the locations of smaller objects, which are generally the goals of everyday actions. Further analysis of this scene construction data revealed possible commonalities between the rules governing word usage in language and object usage in naturalistic environments. Taken together, we revealed some of the building blocks of scene grammar necessary for efficient behavior, which differentially influence how we interact with objects and what we remember about scenes.
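One way to probe “commonalities with word usage” is a rank-frequency analysis of how often each object type is placed, analogous to Zipf’s law for words. Whether the original analysis was Zipfian is an assumption here; the sketch below only illustrates the comparison, with made-up counts.

```python
# Sketch of a Zipf-style rank-frequency analysis of object usage.
# That the original analysis was Zipfian is an assumption; the counts
# below are made up for illustration.
import numpy as np

object_counts = {"plate": 92, "cup": 80, "towel": 41, "toothbrush": 12}

freqs = np.sort(np.array(list(object_counts.values())))[::-1]
ranks = np.arange(1, len(freqs) + 1)

# Slope of log-frequency vs. log-rank; a slope near -1 would be Zipf-like.
slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
print(f"rank-frequency slope: {slope:.2f}")
```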
Repeated search studies are a hallmark in the investigation of the interplay between memory and attention. Because results are typically averaged across repetitions, the substantial decrease in response times between the first and second search through the same environment is rarely discussed. This search initiation effect is often the most dramatic decrease in search times in a series of sequential searches, yet the nature of this initial lack of search efficiency has thus far remained unexplored. We tested the hypothesis that the activation of spatial priors produces this search efficiency profile. Before searching repeatedly through scenes in VR, participants either (1) previewed the scene, (2) saw an interrupted preview, or (3) started searching immediately. The search initiation effect was present in the last condition but in neither of the preview conditions. Eye movement metrics revealed that the locus of this effect lies in search guidance rather than in search initiation or decision time, and that it goes beyond effects of object learning or incidental memory. Our study suggests that, upon visual processing of an environment, a process of activating spatial priors to enable orientation is initiated; this process takes a toll on search time at first, but once completed it can be used to guide subsequent searches.
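The decomposition underlying such eye-movement analyses (search initiation until the first saccade, guidance/scanning until the target is first fixated, verification/decision until the response) can be expressed directly over fixation timestamps. A sketch with hypothetical trial fields:

```python
# Sketch: splitting total search time into the three epochs named in
# the study -- initiation (until the first saccade), scanning/guidance
# (until the target is first fixated), and verification (until the
# response). Trial fields are hypothetical placeholders.
def decompose_search(trial: dict) -> dict:
    return {
        "initiation": trial["first_saccade_onset"] - trial["trial_start"],
        "scanning": trial["first_target_fixation"] - trial["first_saccade_onset"],
        "verification": trial["response_time"] - trial["first_target_fixation"],
    }

trial = {  # times in seconds from trial onset
    "trial_start": 0.0,
    "first_saccade_onset": 0.25,
    "first_target_fixation": 1.40,
    "response_time": 1.95,
}
print(decompose_search(trial))
```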