Special sessions at Interspeech are intended to stimulate particular topics, identified by colleagues as deserving a specific focus during the conference.
The following special sessions are scheduled at Interspeech:
» Speech science in end-user applications
» Intelligibility-enhancing speech modifications
» Spoofing and countermeasures for automatic speaker verification
» Articulatory data acquisition and processing
» Child computer interaction
» Computational Paralinguistics Challenge
Detailed descriptions of the Special Sessions:
Speech science in end-user applications

To strike a balance between science and technology applications, this special session focuses on applications of speech technologies. Junior researchers in particular will appreciate getting an idea of how speech technology is used in industrial products designed with the end user in mind. We felt that a special session would be a good place to bring the academic and industrial worlds closer together and to exchange experiences in the overlapping areas of science and technology. We encourage contributions in all related fields, e.g. recognition, synthesis, semantics, classification, analytics, etc., but with a focus on real-world applications and the problems detected in user studies or extracted from real-use log files. A poster session will allow for good individual exchange. The session will begin with an introduction by the organizers, followed by a short presentation to introduce the posters on display. A brief panel discussion at the end will wrap up the session, highlighting common ground.
» More details
Intelligibility-enhancing speech modifications

Martin Cooke email@example.com Ikerbasque
Yannis Stylianou firstname.lastname@example.org Toshiba Research Laboratory
Catherine Mayo email@example.com Centre for Speech Technology Research, University of Edinburgh
Natural (live and recorded) and synthetic speech are deployed increasingly in applications involving speech technology, many of which need to function under non-ideal listening conditions. In order to ensure correct message reception, existing systems are forced to rely upon excessive output level or repetition. An alternative approach is to manipulate speech or the message generation process to achieve adequate intelligibility levels while ideally reducing output intensity. A number of algorithms for intelligibility-enhancing speech modification have been proposed in recent years, with claims of improvements equivalent to reducing output levels by up to 5 dB. The purpose of the Special Session is to compare the effectiveness of algorithms whose goal is to increase the intelligibility of natural and synthetic speech in known noise conditions. If you are a researcher working on speech modifications that boost intelligibility, we welcome your submission.
Spoofing and countermeasures for automatic speaker verification

Nicholas Evans firstname.lastname@example.org EURECOM
Tomi Kinnunen email@example.com University of Eastern Finland
Junichi Yamagishi firstname.lastname@example.org University of Edinburgh
Sebastien Marcel email@example.com Idiap Research Institute
It is widely acknowledged that most biometric systems are vulnerable to imposture or spoofing attacks. While vulnerabilities and countermeasures for other biometric modalities have been widely studied, automatic speaker verification systems remain comparatively unprotected. This special session aims to promote the study of spoofing and countermeasures for the speech modality. We invite submissions with an emphasis on new countermeasures, in addition to papers with a focus on previously unconsidered vulnerabilities, new databases, evaluation protocols and metrics for the assessment of automatic speaker verification in the face of spoofed samples. In particular, we aim to stimulate new interest from colleagues working in related fields, e.g. voice conversion and speech synthesis, whose participation is sought for the design of future evaluations.
Articulatory data acquisition and processing

Slim Ouni Slim.Ouni@loria.fr LORIA
Korin Richmond firstname.lastname@example.org CSTR, University of Edinburgh
Asterios Toutios email@example.com Signal and Image Processing Institute, University of Southern California
In recent years, the techniques available for acquiring articulatory data, such as electromagnetic articulography, magnetic resonance imaging, ultrasound tongue imaging, electropalatography, electroglottography, video recording, optical motion capture, air flow and pressure measurements, have matured steadily, driving great advances in speech production research and other related fields. Nevertheless, using such methods tends to involve a large duplication of effort, with each research group developing their own data analysis and processing tools, while there is little exchange of knowledge regarding the practical details of data acquisition and processing methods. It would be enormously beneficial to the scientific community to actively encourage the identification of best practices and to establish guidelines regarding acquisition protocols. The aim of this special session is to meet this need, focusing on the technical aspects of articulatory data acquisition.
» More details
Child computer interaction

Kay Berkling Berkling@dhbw-karlsruhe.de Duale Hochschule Karlsruhe
Shrikanth Narayanan firstname.lastname@example.org University of Southern California
Keelan Evanini KEvanini@ETS.ORG ETS
Johan Schalkwyk email@example.com Google
Arthur Kantor firstname.lastname@example.org IBM
Takayuki Arai email@example.com Sophia University
Stefan Steidl firstname.lastname@example.org University of Erlangen
This special session aims to bring together researchers and practitioners from universities and industry working in all aspects of multimodal child-computer interaction, with a particular emphasis on, but not limited to, interactive spoken language interfaces. Examples of targeted domains where speech technology applications involving child-computer interaction are becoming increasingly important include healthcare and education, especially with the spread of mobile devices into the lives of children. Of special interest for Interspeech 2013, in light of the humanistic viewpoint, will be the issue of global accessibility to these technologies. One challenge for the next two decades will be to employ affordable mobile technology and remove barriers caused by health issues or remoteness, in order to grant accessibility to children around the globe. We look forward to receiving a large variety of submissions addressing issues related to child-computer interaction from the areas of automatic speech recognition, linguistics, multimedia, robotics, human-computer interaction and related disciplines.
» More details
Computational Paralinguistics Challenge

Björn Schuller email@example.com Technische Universität München
Stefan Steidl firstname.lastname@example.org Friedrich-Alexander-University
Anton Batliner Anton.Batliner@lrz.uni-muenchen.de Technische Universität München
Alessandro Vinciarelli email@example.com University of Glasgow
Klaus Scherer Klaus.Scherer@unige.ch Swiss Center for Affective Sciences
Fabien Ringeval firstname.lastname@example.org University of Fribourg
Mohamed Chetouani email@example.com Université Pierre et Marie Curie
After four consecutive Challenges at INTERSPEECH, there still exists a multiplicity of highly relevant paralinguistic phenomena that have not yet been covered. In previous instalments, we focused on single speakers. With a new task in the Conflict Sub-Challenge, we now broaden the scope to analysing discussions among multiple speakers. A further novelty is introduced by the Social Signals Sub-Challenge: for the first time, non-linguistic events, namely laughter and fillers, have to be classified and localised. In the Emotion Sub-Challenge we are literally “going back to the roots”. However, by intention, we use acted material for the first time, to fuel the ongoing discussion on the differences between naturalistic and acted material, and we hope to highlight those differences. Finally, this year’s Autism Sub-Challenge picks up on Autism Spectrum Condition in children’s speech. Apart from intelligent and socially competent future agents and robots, the main applications are found in the medical domain and in surveillance.
This special session is dedicated to participants in the Computational Paralinguistics Challenge, a special event of Interspeech 2013.
» More details