The lateralization of neuronal processing underpinning hearing, speech, language, and music is widely studied, vigorously debated, and still not understood in a satisfactory manner. One set of hypotheses focuses on the temporal structure of perceptual experience and links auditory cortex asymmetries to underlying differences in neural populations with differential temporal sensitivity (e.g., ideas advanced by Zatorre et al. (2002) and Poeppel (2003)). The Asymmetric Sampling in Time (AST) theory (Poeppel, 2003) builds on cytoarchitectonic differences between the auditory cortices and predicts that modulation frequencies within roughly the range of the syllable rate are more accurately tracked by the right hemisphere. To date, this conjecture is reasonably well supported: although there is some heterogeneity in the reported findings, the predicted asymmetrical entrainment has been observed in various experimental protocols. Here, we show that under specific processing demands, the rightward dominance disappears. We propose an enriched and modified version of the asymmetric sampling hypothesis in the context of speech. Recent work (Rimmele et al., 2018b) proposes two different mechanisms underlying the auditory tracking of the speech envelope: one derived from the intrinsic oscillatory properties of auditory regions; the other induced by top-down signals from non-auditory regions of the brain. We propose that under non-speech listening conditions, the intrinsic auditory mechanism dominates and thus, in line with AST, entrainment is rightward lateralized, as is widely observed. However, (i) depending on individual structural/functional brain differences, and/or (ii) under specific speech listening conditions, the relative weight of the top-down mechanism can increase. In this scenario, the typically observed auditory sampling asymmetry (and its rightward dominance) diminishes or vanishes.
Natural sounds convey perceptually relevant information over multiple timescales, and the necessary extraction of multi-timescale information requires the auditory system to work over distinct ranges. The simplest hypothesis suggests that temporal modulations are encoded in an equivalent manner within a reasonable intermediate range. We show that the human auditory system selectively and preferentially tracks acoustic dynamics concurrently at 2 timescales corresponding to the neurophysiological theta (4–7 Hz) and gamma (31–45 Hz) band ranges but, contrary to expectation, not at the timescale corresponding to alpha (8–12 Hz), which has also been found to be related to auditory perception. Listeners heard synthetic acoustic stimuli with temporally modulated structures at 3 timescales (approximately 190-, 100-, and 30-ms modulation periods) and identified the stimuli while undergoing magnetoencephalography recording. There was strong intertrial phase coherence in the theta band for stimuli of all modulation rates and in the gamma band for stimuli with corresponding modulation rates. The alpha band did not respond in a similar manner. Classification analyses also revealed that oscillatory phase reliably tracked temporal dynamics but not equivalently across rates. Finally, mutual information analyses quantifying the relation between phase and cochlear-scaled correlations also showed preferential processing in 2 distinct regimes, with the alpha range again yielding different patterns. The results support the hypothesis that the human auditory system employs (at least) a 2-timescale processing mode, in which lower and higher perceptual sampling scales are segregated by an intermediate temporal regime in the alpha band that likely reflects different underlying computations.
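As a rough illustration of the intertrial phase coherence measure referred to above, the sketch below computes ITPC for a single MEG sensor; the sampling rate, trial count, filter order, and band edges are illustrative assumptions rather than the study's actual pipeline.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def intertrial_phase_coherence(trials, fs, band):
    """Inter-trial phase coherence (ITPC) over time for one sensor.

    trials : array of shape (n_trials, n_samples), epoched MEG data
    fs     : sampling rate in Hz
    band   : (low, high) frequency band edges in Hz
    """
    # Zero-phase band-pass filter each trial in the band of interest.
    sos = butter(2, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, trials, axis=-1)
    # Instantaneous phase from the analytic signal.
    phase = np.angle(hilbert(filtered, axis=-1))
    # ITPC: length of the phase mean resultant vector across trials (0 to 1).
    return np.abs(np.mean(np.exp(1j * phase), axis=0))

# Illustrative use with simulated epochs: 100 trials, 2 s at 500 Hz.
fs = 500
trials = np.random.randn(100, 2 * fs)
itpc_theta = intertrial_phase_coherence(trials, fs, (4, 7))    # theta band
itpc_gamma = intertrial_phase_coherence(trials, fs, (31, 45))  # gamma band
```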
Speech perception is mediated by both left and right auditory cortices but with differential sensitivity to specific acoustic information contained in the speech signal. A detailed description of this functional asymmetry is missing, and the underlying models are widely debated. We analyzed cortical responses from 96 epilepsy patients with electrode implantation in left or right primary, secondary, and/or association auditory cortex (AAC). We presented short acoustic transients to noninvasively estimate the dynamical properties of multiple functional regions along the auditory cortical hierarchy. We show remarkably similar bimodal spectral response profiles in left and right primary and secondary regions, with evoked activity composed of dynamics in the theta (around 4–8 Hz) and beta–gamma (around 15–40 Hz) ranges. Beyond these first cortical levels of auditory processing, a hemispheric asymmetry emerged, with delta and beta band (3/15 Hz) responsivity prevailing in the right hemisphere and theta and gamma band (6/40 Hz) activity prevailing in the left. This asymmetry is also present during syllable presentation, but the evoked responses in AAC are more heterogeneous, with the co-occurrence of alpha (around 10 Hz) and gamma (>25 Hz) activity bilaterally. These intracranial data provide a more fine-grained and nuanced characterization of cortical auditory processing in the 2 hemispheres, shedding light on the neural dynamics that potentially shape auditory and speech processing at different levels of the cortical hierarchy.
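A minimal sketch of how a bimodal spectral response profile could be estimated from transient-evoked activity at one intracranial contact; the data shapes and band edges below are assumptions for illustration only, not the authors' analysis code.

```python
import numpy as np
from scipy.signal import periodogram

def evoked_spectral_profile(epochs, fs):
    """Spectrum of the trial-averaged response to an acoustic transient.

    epochs : array of shape (n_trials, n_samples) for one intracranial contact
    fs     : sampling rate in Hz
    """
    evoked = epochs.mean(axis=0)               # average over trials
    freqs, power = periodogram(evoked, fs=fs)  # power spectrum of the evoked response
    return freqs, power

def band_power(freqs, power, lo, hi):
    """Mean power within a band, e.g. theta (4-8 Hz) or beta-gamma (15-40 Hz)."""
    mask = (freqs >= lo) & (freqs <= hi)
    return power[mask].mean()
```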
A body of research demonstrates convincingly a role for synchronization of auditory cortex to rhythmic structure in sounds including speech and music. Some studies hypothesize that an oscillator in auditory cortex could underlie important temporal processes such as segmentation and prediction. An important critique of these findings raises the plausible concern that what is measured is perhaps not an oscillator but is instead a sequence of evoked responses. The two distinct mechanisms could look very similar in the case of rhythmic input, but an oscillator might better provide the computational roles mentioned above (i.e., segmentation and prediction). We advance an approach to adjudicate between the two models: analyzing the phase lag between stimulus and neural signal across different stimulation rates. We ran numerical simulations of evoked and oscillatory computational models, showing that in the evoked case, phase lag is heavily rate-dependent, while the oscillatory model displays marked phase concentration across stimulation rates. Next, we compared these model predictions with magnetoencephalography data recorded while participants listened to music of varying note rates. Our results show that the phase concentration of the experimental data is more in line with the oscillatory model than with the evoked model. This finding supports an auditory cortical signal that (i) contains components of both bottom-up evoked responses and internal oscillatory synchronization whose strengths are weighted by their appropriateness for particular stimulus types and (ii) cannot be explained by evoked responses alone.
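The logic of the phase-lag diagnostic can be sketched as follows: a purely evoked response has a roughly fixed time lag, which maps onto a phase lag that changes with stimulation rate, whereas an entrained oscillator tends to keep a similar phase lag across rates. The code below is a simplified illustration of that contrast; the Fourier-bin estimate of phase lag and the variable names are assumptions, not the authors' exact analysis.

```python
import numpy as np

def stimulus_brain_phase_lag(stim_env, neural, fs, rate):
    """Phase lag (radians) between the stimulus envelope and the neural signal
    at the stimulation rate, estimated from the Fourier bin nearest that rate.
    Both inputs are assumed to be equally long 1-D arrays sampled at fs."""
    freqs = np.fft.rfftfreq(len(stim_env), d=1 / fs)
    k = np.argmin(np.abs(freqs - rate))           # bin closest to the note rate
    lag = np.angle(np.fft.rfft(neural)[k]) - np.angle(np.fft.rfft(stim_env)[k])
    return np.angle(np.exp(1j * lag))             # wrap to (-pi, pi]

def phase_concentration(lags):
    """Resultant vector length of the phase lags obtained at different rates.
    Values near 1 (concentrated lags) favor the oscillator account; a purely
    evoked response, with its fixed time lag, predicts rate-dependent phase
    lags and therefore lower concentration."""
    return np.abs(np.mean(np.exp(1j * np.asarray(lags))))
```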
Natural sounds contain information on multiple timescales, so the auditory system must analyze and integrate acoustic information on those different scales to extract behaviorally relevant information. However, this multi-scale process in the auditory system is not widely investigated in the literature, and existing models of temporal integration are mainly built upon detection or recognition tasks on a single timescale. Here we use a paradigm requiring processing on relatively ‘local’ and ‘global’ scales and provide evidence suggesting that the auditory system extracts fine-detail acoustic information using short temporal windows and uses long temporal windows to abstract global acoustic patterns. Behavioral task performance that requires processing fine-detail information does not improve with longer stimulus length, contrary to the predictions of previous temporal integration models such as the multiple-looks and the spectro-temporal excitation pattern models. Moreover, the perceptual construction of putatively ‘unitary’ auditory events requires more than hundreds of milliseconds. These findings support the hypothesis of dual-scale processing, likely implemented in the auditory cortex.
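For context on why the null effect of stimulus length is informative, the snippet below shows the standard multiple-looks prediction (Viemeister & Wakefield, 1991), under which sensitivity should grow with the number of independent 'looks' and hence with stimulus duration; the numbers are purely illustrative.

```python
import numpy as np

def multiple_looks_dprime(d_single, n_looks):
    """Multiple-looks prediction: combining n independent, equally informative
    'looks' yields d' = d_single * sqrt(n), so sensitivity should improve as
    the stimulus, and with it the number of looks, gets longer."""
    return d_single * np.sqrt(n_looks)

# Doubling the number of looks predicts a ~41% gain in d'.
print(multiple_looks_dprime(1.0, np.array([1, 2, 4, 8])))
```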
In natural environments, background noise can degrade the integrity of acoustic signals, posing a problem for animals that rely on their vocalizations for communication and navigation. A simple behavioral strategy to combat acoustic interference would be to restrict call emissions to periods of low-amplitude or no noise. Using audio playback and computational tools for the automated detection of over 2.5 million vocalizations from groups of freely vocalizing bats, we show that bats (Carollia perspicillata) can dynamically adapt the timing of their calls to avoid acoustic jamming in both predictably and unpredictably patterned noise. This study demonstrates that bats spontaneously seek out temporal windows of opportunity for vocalizing in acoustically crowded environments, providing a mechanism for efficient echolocation and communication in cluttered acoustic landscapes.
One Sentence Summary: Bats avoid acoustic interference by rapidly adjusting the timing of vocalizations to the temporal pattern of varying noise.
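The automated detection of vocalizations mentioned above could, in its simplest form, resemble the band-limited amplitude-threshold sketch below; the frequency band, threshold, and filter settings are illustrative assumptions and not the detection pipeline actually used in the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def detect_calls(audio, fs, band=(10_000, 90_000), thresh_db=-40.0):
    """Crude amplitude-threshold call detector; returns (onset, offset) times in s.

    audio     : 1-D microphone recording (fs must exceed twice the upper band edge)
    band      : assumed call frequency band in Hz
    thresh_db : detection threshold relative to the peak band-limited envelope
    """
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    env = np.abs(sosfiltfilt(sos, audio))              # band-limited envelope
    env_db = 20 * np.log10(env / env.max() + 1e-12)    # normalize: peak = 0 dB
    above = np.r_[False, env_db > thresh_db, False]    # pad so edges pair up
    edges = np.flatnonzero(np.diff(above.astype(int)))
    return [(on / fs, off / fs) for on, off in zip(edges[0::2], edges[1::2])]
```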
The ability to extract regularities from the environment is arguably an adaptive characteristic of intelligent systems. In the context of speech, statistical learning is thought to be an important mechanism for language acquisition. By considering individual differences in speech auditory-motor synchronization, an independent component analysis of fMRI data revealed that the neural substrates of statistical word form learning are not fully shared across individuals. While a network of auditory and superior pre/motor regions is universally activated in the process of learning, a fronto-parietal network is instead additionally and selectively engaged by some individuals, boosting their performance. Furthermore, interfering with the use of this network via articulatory suppression (producing irrelevant speech during learning) normalizes performance across the entire sample. Our work provides novel insights into language-related statistical learning and reconciles previous contrasting findings, while highlighting the need to factor in fundamental individual differences for a precise characterization of cognitive phenomena.
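As a schematic of the independent component analysis step, the sketch below decomposes a (time × voxel) fMRI data matrix with scikit-learn's FastICA; the matrix dimensions, the number of components, and the random stand-in data are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import FastICA

# Stand-in for a preprocessed BOLD data matrix (time points x voxels);
# dimensions and component count are illustrative assumptions.
n_timepoints, n_voxels, n_components = 300, 5000, 20
data = np.random.randn(n_timepoints, n_voxels)

ica = FastICA(n_components=n_components, random_state=0)
time_courses = ica.fit_transform(data)   # (time points x components)
spatial_maps = ica.mixing_.T             # (components x voxels)
# Component time courses can then be related to task structure, and their
# engagement compared across participants differing in auditory-motor
# synchronization, as described above.
```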
When speech is too fast, the tracking of the acoustic signal along the auditory pathway deteriorates, leading to suboptimal speech segmentation and decoding of speech information. Thus, speech comprehension is limited by the temporal constraints of the auditory system. Here we ask whether individual differences in auditory-motor coupling strength in part shape these temporal constraints. In two behavioral experiments, we characterize individual differences in the comprehension of naturalistic speech as a function of the individual synchronization between the auditory and motor systems and the preferred frequencies of the systems. As expected, speech comprehension declined at higher speech rates. Importantly, however, both higher auditory-motor synchronization and higher spontaneous speech motor production rates were predictive of better speech-comprehension performance. Furthermore, performance increased with higher working memory capacity (Digit Span) and higher linguistic, model-based sentence predictability, particularly so at higher speech rates and for individuals with high auditory-motor synchronization. These findings support the notion of an individual preferred auditory–motor regime that allows for optimal speech processing. The data provide evidence for a model that assigns a central role to motor-system-dependent individual flexibility in continuous speech comprehension.
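Auditory-motor synchronization of the kind invoked here is commonly quantified as a phase-locking value between the envelopes of produced and perceived speech; the sketch below illustrates that computation, with the band edges around the syllable rate chosen as assumptions rather than taken from the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def envelope_plv(produced_env, perceived_env, fs, band=(3.5, 5.5)):
    """Phase-locking value between produced- and perceived-speech envelopes,
    band-passed around the syllable rate (band edges here are assumptions).
    Returns a value between 0 (no locking) and 1 (perfect locking)."""
    sos = butter(2, band, btype="bandpass", fs=fs, output="sos")
    ph_prod = np.angle(hilbert(sosfiltfilt(sos, produced_env)))
    ph_perc = np.angle(hilbert(sosfiltfilt(sos, perceived_env)))
    return np.abs(np.mean(np.exp(1j * (ph_prod - ph_perc))))
```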
Music, like language, is characterized by hierarchically organized structure that unfolds over time. Music listening therefore requires not only the tracking of notes and beats but also internally constructing high-level musical structures or phrases and anticipating incoming contents. Unlike for language, mechanistic evidence for online musical segmentation and prediction at a structural level is sparse. We recorded neurophysiological data from participants listening to music in its original forms as well as in manipulated versions with locally or globally reversed harmonic structures. We discovered a low-frequency neural component that modulated the neural rhythms of beat tracking and reliably parsed musical phrases. We next identified phrasal phase precession, suggesting that listeners established structural predictions from ongoing listening experience to track phrasal boundaries. The data point to brain mechanisms that listeners use to segment continuous music at the phrasal level and to predict abstract structural features of music.
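Phase precession of the sort reported here is often tested with a circular-linear correlation between phase and elapsed time; the function below implements Mardia's circular-linear correlation as a generic illustration, not the authors' specific pipeline.

```python
import numpy as np

def circular_linear_corr(phase, x):
    """Mardia's circular-linear correlation between a phase variable (radians)
    and a linear variable x (e.g., time elapsed within a musical phrase).
    Values near 1 indicate that phase shifts systematically with x, which is
    the signature of phase precession."""
    rxs = np.corrcoef(x, np.sin(phase))[0, 1]
    rxc = np.corrcoef(x, np.cos(phase))[0, 1]
    rcs = np.corrcoef(np.sin(phase), np.cos(phase))[0, 1]
    return np.sqrt((rxc**2 + rxs**2 - 2 * rxc * rxs * rcs) / (1 - rcs**2))
```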