Read by QxMD

Auditory scene classification

Angela Josupeit, Volker Hohmann
This study introduces a model for solving three different auditory tasks in a multi-talker setting: target localization, target identification, and word recognition. The model was used to simulate psychoacoustic data from a call-sign-based listening test involving multiple spatially separated talkers [Brungart and Simpson (2007). Percept. Psychophys. 69(1), 79-91]. The main characteristics of the model are (i) the extraction of salient auditory features ("glimpses") from the multi-talker signal and (ii) the use of a classification method that finds the best target hypothesis by comparing feature templates from clean target signals to the glimpses derived from the multi-talker mixture...
July 2017: Journal of the Acoustical Society of America
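The glimpse-and-template idea in the abstract above can be sketched as a nearest-template classifier (a minimal illustration only; the function name, feature shapes, and distance metric are assumptions, not the authors' implementation):

```python
import numpy as np

def classify_glimpses(glimpse_features, templates):
    """Pick the target hypothesis whose clean-signal feature template lies
    closest (Euclidean distance) to the features glimpsed from the mixture."""
    return min(templates,
               key=lambda label: np.linalg.norm(glimpse_features - templates[label]))

# Toy feature vectors: the glimpses from the mixture resemble talker_A's template
templates = {
    "talker_A": np.array([1.0, 0.0, 0.5]),
    "talker_B": np.array([0.0, 1.0, 0.5]),
}
glimpses = np.array([0.9, 0.1, 0.4])
best = classify_glimpses(glimpses, templates)  # "talker_A"
```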
Marzieh Haghighi, Mohammad Moghadamfalahi, Murat Akcakaya, Barbara G Shinn-Cunningham, Deniz Erdogmus
Recent findings indicate that brain interfaces have the potential to enable attention-guided auditory scene analysis and manipulation in applications such as hearing aids and augmented/virtual environments. Specifically, noninvasively acquired electroencephalography (EEG) signals have been demonstrated to carry some evidence regarding which of multiple synchronous speech waveforms the subject attends to. In this paper, we demonstrate that: 1) using data- and model-driven cross-correlation features yields competitive binary auditory attention classification results with at most 20 s of EEG from 16 channels, or even from a single well-positioned channel; 2) a model calibrated using equal-energy speech waveforms competing for attention can perform well at estimating attention in closed-loop unbalanced-energy speech waveform situations, where the speech amplitudes are modulated by the estimated attention posterior probability distribution; 3) such a model performs even better if it is corrected (linearly, in this instance) based on the dependence of the EEG evidence on the speech weights in the mixture; and 4) calibrating a model on population EEG can yield acceptable performance for new individuals/users; therefore, EEG-based auditory attention classifiers may generalize across individuals, leading to reduced or eliminated calibration time and effort...
November 2017: IEEE Transactions on Neural Systems and Rehabilitation Engineering
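Point 1) above — classifying the attended stream from cross-correlation features — can be sketched roughly as follows (a toy version under stated assumptions: a single EEG channel, a simple lag search, and synthetic envelopes; this is not the paper's pipeline):

```python
import numpy as np

def peak_xcorr(eeg, envelope, max_lag=50):
    """Maximum absolute normalized cross-correlation over lags 0..max_lag,
    delaying the EEG relative to the stimulus (neural responses lag sound)."""
    eeg = (eeg - eeg.mean()) / (eeg.std() + 1e-12)
    env = (envelope - envelope.mean()) / (envelope.std() + 1e-12)
    n = len(eeg)
    best = 0.0
    for lag in range(max_lag + 1):
        r = np.dot(eeg[lag:], env[:n - lag]) / (n - lag)
        best = max(best, abs(r))
    return best

def classify_attention(eeg, env_a, env_b):
    """Return 'A' if stream A's envelope correlates more strongly with the EEG."""
    return 'A' if peak_xcorr(eeg, env_a) >= peak_xcorr(eeg, env_b) else 'B'

# Toy check: synthesize "EEG" as a delayed copy of envelope A plus noise
rng = np.random.default_rng(0)
env_a = rng.standard_normal(1000)
env_b = rng.standard_normal(1000)
eeg = np.roll(env_a, 10) + 0.5 * rng.standard_normal(1000)
decision = classify_attention(eeg, env_a, env_b)  # 'A'
```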
Adam Bednar, Francis M Boland, Edmund C Lalor
The human ability to localize sound is essential for monitoring our environment and helps us to analyse complex auditory scenes. Although the acoustic cues mediating sound localization have been established, it remains unknown how these cues are represented in human cortex. In particular, it is still a point of contention whether binaural and monaural cues are processed by the same or distinct cortical networks. In this study, participants listened to a sequence of auditory stimuli from different spatial locations while we recorded their neural activity using electroencephalography (EEG)...
January 19, 2017: European Journal of Neuroscience
Radoslaw Martin Cichy, Santani Teng
In natural environments, visual and auditory stimulation elicit responses across a large set of brain regions in a fraction of a second, yielding representations of the multimodal scene and its properties. The rapid and complex neural dynamics underlying visual and auditory information processing pose major challenges to human cognitive neuroscience. Brain signals measured non-invasively are inherently noisy, the format of neural representations is unknown, and transformations between representations are complex and often nonlinear...
February 19, 2017: Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences
Dana Barniv, Israel Nelken
When human subjects hear a sequence of two alternating pure tones, they often perceive it in one of two ways: as one integrated sequence (a single "stream" consisting of the two tones), or as two segregated sequences, one sequence of low tones perceived separately from another sequence of high tones (two "streams"). Perception of this stimulus is thus bistable. Moreover, subjects report on-going switching between the two percepts: unless the frequency separation is large, initial perception tends to be of integration, followed by toggling between integration and segregation phases...
2015: PloS One
Brian J Malone, Brian H Scott, Malcolm N Semple
The temporal coherence of amplitude fluctuations is a critical cue for segmentation of complex auditory scenes. The auditory system must accurately demarcate the onsets and offsets of acoustic signals. We explored how and how well the timing of onsets and offsets of gated tones are encoded by auditory cortical neurons in awake rhesus macaques. Temporal features of this representation were isolated by presenting otherwise identical pure tones of differing durations. Cortical response patterns were diverse, including selective encoding of onset and offset transients, tonic firing, and sustained suppression...
April 1, 2015: Journal of Neurophysiology
Inyong Choi, Siddharth Rajaram, Lenny A Varghese, Barbara G Shinn-Cunningham
Selective auditory attention is essential for human listeners to be able to communicate in multi-source environments. Selective attention is known to modulate the neural representation of the auditory scene, boosting the representation of a target sound relative to the background, but the strength of this modulation, and the mechanisms contributing to it, are not well understood. Here, listeners performed a behavioral experiment demanding sustained, focused spatial auditory attention while we measured cortical responses using electroencephalography (EEG)...
2013: Frontiers in Human Neuroscience
Kun Han, DeLiang Wang
A key problem in computational auditory scene analysis (CASA) is monaural speech segregation, which has proven to be very challenging. For monaural mixtures, one can only utilize the intrinsic properties of speech or interference to segregate target speech from background noise. The ideal binary mask (IBM) has been proposed as a main goal of sound segregation in CASA and has led to substantial improvements in human speech intelligibility in noise. This study proposes a classification approach to estimate the IBM and employs support vector machines to classify time-frequency units as either target- or interference-dominant...
November 2012: Journal of the Acoustical Society of America
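The ideal binary mask named in the abstract above labels each time-frequency unit 1 where the target dominates the interference and 0 elsewhere. A minimal NumPy sketch (the function name and the 0 dB local criterion are conventional choices, not taken from this paper):

```python
import numpy as np

def ideal_binary_mask(target_tf, noise_tf, lc_db=0.0):
    """Compute the IBM from clean target and interference T-F energy grids.

    target_tf, noise_tf : 2-D arrays of per-unit energies (freq x time)
    lc_db : local SNR criterion in dB (0 dB is a common choice)
    """
    eps = 1e-12  # guard against division by zero in silent units
    local_snr_db = 10.0 * np.log10((target_tf + eps) / (noise_tf + eps))
    return (local_snr_db > lc_db).astype(np.uint8)

# Toy 2x3 grid: the target dominates three units against unit-energy noise
target = np.array([[4.0, 0.1, 2.0],
                   [0.2, 3.0, 0.1]])
noise = np.ones((2, 3))
mask = ideal_binary_mask(target, noise)  # [[1, 0, 1], [0, 1, 0]]
```

Estimating this mask from the mixture alone (here it is computed from the clean signals, hence "ideal") is what the paper's SVM classifier is trained to do.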
Jackson C Liang, Anthony D Wagner, Alison R Preston
Current theories of medial temporal lobe (MTL) function focus on event content as an important organizational principle that differentiates MTL subregions. Perirhinal and parahippocampal cortices may play content-specific roles in memory, whereas hippocampal processing is alternately hypothesized to be content specific or content general. Despite anatomical evidence for content-specific MTL pathways, empirical data for content-based MTL subregional dissociations are mixed. Here, we combined functional magnetic resonance imaging with multiple statistical approaches to characterize MTL subregional responses to different classes of novel event content (faces, scenes, spoken words, sounds, visual words)...
January 2013: Cerebral Cortex
B S Kasper, E M Kasper, E Pauli, H Stefan
In partial epilepsy, a localized hypersynchronous neuronal discharge evolving into a partial seizure affecting a particular cortical region or cerebral subsystem can give rise to subjective symptoms, which are perceived by the affected person only, that is, ictal hallucinations, illusions, or delusions. When forming the beginning of a symptom sequence leading to impairment of consciousness and/or a classic generalized seizure, these phenomena are referred to as an epileptic aura, but they also occur in isolation...
May 2010: Epilepsy & Behavior: E&B
Guoning Hu, DeLiang Wang
Monaural speech segregation has proven to be extremely challenging. While efforts in computational auditory scene analysis have led to considerable progress in voiced speech segregation, little attention has been given to unvoiced speech, which lacks harmonic structure and has weaker energy, making it more susceptible to interference. This study proposes a new approach to the problem of segregating unvoiced speech from nonspeech interference. The study first addresses the question of how much speech is unvoiced...
August 2008: Journal of the Acoustical Society of America
James A Simmons, Nicola Neretti, Nathan Intrator, Richard A Altes, Michael J Ferragamo, Mark I Sanderson
Big brown bats (Eptesicus fuscus) emit wideband, frequency-modulated biosonar sounds and perceive the distance to objects from the delay of echoes. Bats remember delays and patterns of delay from one broadcast to the next, and they may rely on delays to perceive target scenes. While emitting a series of broadcasts, they can detect very small changes in delay based on their estimates of delay for successive echoes, which are derived from an auditory time/frequency representation of frequency-modulated sounds...
March 9, 2004: Proceedings of the National Academy of Sciences of the United States of America
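The delay-to-distance relation underlying the echo ranging described above is simple round-trip acoustics (a minimal sketch; 343 m/s assumes sound speed in air at roughly 20 °C):

```python
def echo_range(delay_s, c=343.0):
    """Target distance from echo delay: the sound travels out and back,
    so range = c * delay / 2."""
    return c * delay_s / 2.0

distance = echo_range(0.005)  # a 5 ms echo delay ~ 0.86 m
```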
