Special Sessions

Special sessions at Interspeech are intended to stimulate work on particular topics that colleagues have identified as deserving a specific focus during the conference.

The following special sessions are scheduled at Interspeech:

» Speech science in end-user applications
» Intelligibility-enhancing speech modifications
» Spoofing and countermeasures for automatic speaker verification
» Articulatory data acquisition and processing
» Child computer interaction
» Computational Paralinguistics Challenge

Detailed descriptions of the Special Sessions:

Speech science in end-user applications

   Felix Burkhardt Felix.Burkhardt@telekom.de Deutsche Telekom Laboratories
   Juergen Schroeter jsh@research.att.com AT&T
   Björn Schuller schuller@tum.de Technische Universität München

To strike a balance between science and technology applications, this special session focuses on applications of speech technologies. Junior researchers in particular will appreciate getting an idea of how speech technology is used in industrial products designed with the end user in mind. We felt that a special session would be a good place to bring the academic and industrial worlds closer together and to exchange experiences in the overlapping areas of science and technology. We encourage contributions in all related fields, e.g. recognition, synthesis, semantics, classification and analytics, but with a focus on real-world applications and the problems detected in user studies or extracted from real-use log files. A poster session will allow for good individual exchange. The session will begin with an introduction by the organizers, followed by a short presentation to introduce the posters on display. A brief panel discussion at the end will wrap up the session, highlighting common ground.

» More details

Intelligibility-enhancing speech modifications

   Martin Cooke m.cooke@ikerbasque.org Ikerbasque
   Yannis Stylianou yannis@csd.uoc.gr Toshiba Research Laboratory
   Catherine Mayo catherin@inf.ed.ac.uk Centre for Speech Technology Research, University of Edinburgh

Natural (live and recorded) and synthetic speech are increasingly deployed in applications involving speech technology, many of which need to function under non-ideal listening conditions. To ensure correct message reception, existing systems are forced to rely on excessive output level or repetition. An alternative approach is to manipulate the speech or the message generation process to achieve adequate intelligibility levels while ideally reducing output intensity. A number of algorithms for intelligibility-enhancing speech modification have been proposed in recent years, with claims of improvements equivalent to reducing output levels by up to 5 dB. The purpose of the Special Session is to compare the effectiveness of algorithms whose goal is to increase the intelligibility of natural and synthetic speech in known noise conditions. If you are a researcher working on speech modifications that boost intelligibility, we welcome your submission.

Spoofing and countermeasures for automatic speaker verification

   Nicholas Evans evans@eurecom.fr EURECOM
   Tomi Kinnunen tomi.kinnunen@uef.fi University of Eastern Finland
   Junichi Yamagishi jyamagis@inf.ed.ac.uk University of Edinburgh
   Sebastien Marcel marcel@idiap.ch Idiap Research Institute

It is widely acknowledged that most biometric systems are vulnerable to imposture or spoofing attacks. While vulnerabilities and countermeasures for other biometric modalities have been widely studied, automatic speaker verification systems remain vulnerable. This special session aims to promote the study of spoofing and countermeasures for the speech modality. We invite submissions with an emphasis on new countermeasures in addition to papers with a focus on previously unconsidered vulnerabilities, new databases, evaluation protocols and metrics for the assessment of automatic speaker verification in the face of spoofed samples. In particular, we aim to stimulate new interest from colleagues working in related fields, e.g. voice conversion and speech synthesis, whose participation is sought for the design of future evaluations.

Articulatory data acquisition and processing

   Slim Ouni Slim.Ouni@loria.fr LORIA
   Korin Richmond korin@cstr.ed.ac.uk CSTR, University of Edinburgh
   Asterios Toutios toutios@sipi.usc.edu Signal and Image Processing Institute, University of Southern California

In recent years, the techniques available for acquiring articulatory data, such as  electromagnetic articulography, magnetic resonance imaging, ultrasound tongue imaging, electropalatography, electroglottography, video recording, optical motion capture, air flow and pressure measurements, have matured steadily, driving great advances in speech production research and other related fields. Nevertheless, using such methods tends to involve a large duplication of effort, with each research group developing their own data analysis and processing tools, while there is little exchange of knowledge regarding the practical details of data acquisition and processing methods. It would be enormously beneficial to the scientific community to actively encourage the identification of best practices and to establish guidelines regarding acquisition protocols. The aim of this special session is to meet this need, focusing on the technical aspects of articulatory data acquisition.

» More details

Child computer interaction

   Kay Berkling Berkling@dhbw-karlsruhe.de Duale Hochschule Karlsruhe
   Shrikanth Narayanan narayanan.shri@gmail.com University of Southern California
   Keelan Evanini KEvanini@ETS.ORG ETS
   Johan Schalkwyk johan.schalkwyk@gmail.com Google
   Arthur Kantor arthur_kantor@cz.ibm.com IBM
   Takayuki Arai arai@hoffman.cc.sophia.ac.jp Sophia University
   Stefan Steidl steidl@informatik.uni-erlangen.de University of Erlangen

This special session aims to bring together researchers and practitioners from universities and industry working on all aspects of multimodal child-computer interaction, with a particular emphasis on, but not limited to, interactive spoken language interfaces. Examples of targeted domains where speech technology applications involving child-computer interaction are becoming increasingly important include healthcare and education, especially with the spread of mobile devices into the lives of children. Of special interest for Interspeech 2013, in light of the humanistic viewpoint, is the issue of global accessibility to these technologies. One challenge for the next two decades will be to employ affordable mobile technology and remove barriers caused by health issues or remoteness in order to grant accessibility to children around the globe. We look forward to receiving a large variety of submissions addressing issues related to child-computer interaction from the areas of automatic speech recognition, linguistics, multimedia, robotics, human-computer interaction and related disciplines.

» More details

Computational Paralinguistics Challenge

   Björn Schuller schuller@tum.de Technische Universität München
   Stefan Steidl stefan.steidl@fau.de Friedrich-Alexander-University
   Anton Batliner Anton.Batliner@lrz.uni-muenchen.de Technische Universität München
   Alessandro Vinciarelli alessandro.vinciarelli@glasgow.ac.uk University of Glasgow
   Klaus Scherer Klaus.Scherer@unige.ch Swiss Center for Affective Sciences
   Fabien Ringeval fabien.ringeval@unifr.ch University of Fribourg
   Mohamed Chetouani mohamed.chetouani@upmc.fr Université Pierre et Marie Curie

After four consecutive Challenges at INTERSPEECH, there are still many highly relevant paralinguistic phenomena that have not yet been covered. In the last instalments, we focused on single speakers. With a new task, the Conflict Sub-Challenge, we now broaden the scope to analysing discussions among multiple speakers. A further novelty is introduced by the Social Signals Sub-Challenge: for the first time, non-linguistic events (laughter and fillers) have to be classified and localised. In the Emotion Sub-Challenge we are literally “going back to the roots”. However, we intentionally use acted material for the first time, to fuel the ongoing discussion on the differences between naturalistic and acted material, and we hope to highlight those differences. Finally, this year the Autism Sub-Challenge picks up on Autism Spectrum Condition in children’s speech. Apart from intelligent and socially competent future agents and robots, the main applications are found in the medical domain and in surveillance.

This special session is dedicated to participants in the Computational Paralinguistics Challenge, a special event of Interspeech 2013.

» More details