Monday, August 26th, 09:30-10:30, Amphithéâtre
"My adventures with speech"
I intend to mention some techniques I have been involved in during the past 40 years. I will not dwell too much on the details of the techniques, which are documented in various publications. Rather, I will try to talk about things which we researchers may say in private but seldom write about: personal intuitions and beliefs, excitements, frustrations, surprises, and interesting encounters on the road, while struggling to understand and emulate one of the most significant achievements of the human race, the ability to communicate by speech.
Monday, August 26th, 14:30-15:30, Amphithéâtre
Department of Speech-Language-Hearing Sciences University of Minnesota, Minneapolis, USA
On the interaction of social and linguistic factors in phonetic variation in typical and atypical speakers
The speech signal is remarkably rich. As discussed by Munson, Edwards, and Beckman (2012), a single production of the word cat can index not only the regular semantic features of Felis catus, but also the word’s position in the utterance’s larger prosodic structure, the speaker’s stance toward the topic being discussed, the speaker’s intentions for how the word should be interpreted relative to the ongoing discourse, and aspects of the speaker’s social identity (such as their gender and sexuality) and emotional state. Humans and automatic speech processing systems must be able to unpack these different messages from this complex signal. In this talk, I discuss how different types of information interact in speech production and perception. I give special attention to contrasting typical speakers and listeners with atypical populations, i.e., populations other than adult native speakers with no history of speech, language, or hearing impairments. Together, the results I present are a ‘call to action’ for the INTERSPEECH community to consider a broader set of sources of variability when modeling spoken language production and comprehension.
Tuesday, August 27th, 09:00-10:00, Amphithéâtre
University of Geneva, Switzerland
Are cortical oscillations a useful ingredient of speech perception?
Neuronal oscillations are ubiquitous in the brain and may contribute to cognition in a number of ways, for example by segregating information and organizing spike timing. Recent data show that delta, theta, and gamma oscillations are specifically engaged by the multi-timescale, quasi-rhythmic properties of speech and can track its dynamics. I will present theoretical and experimental data suggesting that auditory cortical oscillatory neural behaviour plays a foundational role in speech and language processing by ‘packaging’ incoming information into units of the appropriate temporal granularity, and enabling their readout by higher order brain areas.
Wednesday, August 28th, 09:00-10:00, Amphithéâtre
INRIA - Sophia Antipolis, France
Verbal communication through brain computer interfaces
Brain Computer Interfaces (BCI) provide a way of communicating directly from brain activity, bypassing muscular control. Research on BCI is concerned with designing reliable interaction protocols, and embedding them in systems that are both automatic and adaptive. Many types of brain activity are considered for BCI: some that are related to actual activity, such as imagined motion or speech, and others that are not, such as evoked potentials or slow cortical potentials. This brain activity is measured through a variety of modalities, either invasive or non-invasive. In this talk I report some recent advances in a BCI communication system called the P300 speller, which is a virtual brain-operated keyboard. This system relies on electroencephalographic activity time-locked to the flashing of the desired letters. It requires calibration of the system, but very little training from the user. Clinical tests are being conducted on a target population of patients suffering from Amyotrophic Lateral Sclerosis, in order to confirm the usability of the P300 speller for reliable communication.