RBC
The Restaurant Booking Corpus
Related Paper: Siegert, Ingo; Nietzold, Jannik; Heinemann, Ralph; Wendemuth, Andreas: The Restaurant Booking Corpus - content-identical comparative human-human and human-computer simulated telephone conversations. In: Elektronische Sprachsignalverarbeitung 2019. Dresden: TUDpress, pp. 126-133 (Studientexte zur Sprachkommunikation; 93)
Brief Information
The Restaurant Booking Corpus (RBC) is a dataset in which participants perform the same task with either a human being or a technical system as the conversational partner. RBC explicitly assigns the same role to the human conversational partner as to the technical system. To support this role model, certain influencing factors are eliminated: the effect of a visible counterpart, the speech content, and the dialog domain. Additionally, the statements given by the human conversational partner and the technical system are identical.
The requirements for an identical speaker role are:
- Similar speech commands that feel “natural”
- Similar speech answers given by human and system
- No visible counterpart, neither as a GUI, an avatar, nor a “body”
This is implemented through:
- Limited task of restaurant booking with some additional constraints
- Scripted answers by human agent and technical system
- Conversations over simulated telephone
Collected Data
- Audio recordings
  - from the participant
  - from the human agent / technical system
- Questionnaires
  - Socio-demographic information (before the experiment)
  - Experiences with technical systems in general (before the experiment)
  - Perception of ALEXA and the interlocutor (after the experiment)
Recording Setup
The recordings took place at the Institute of Information Technology and Communications. They were conducted in a living-room-like setting. The aim of this setting was to let the participant settle into a natural communication atmosphere (in contrast to the distraction of laboratory surroundings).
The recordings were conducted using two high-quality neckband microphones (Sennheiser HSP 2-EW-3) to capture the voices of the participant and the interlocutor, as well as one high-quality shotgun microphone (Sennheiser ME 66) to capture the overall acoustic scene and especially the output of the voice assistant. The recordings were stored uncompressed in WAV format with a 44.1 kHz sample rate and 16-bit resolution.
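When working with the audio files, it can be useful to confirm that a given recording matches the documented format (44.1 kHz, 16 bit). The following is a minimal sketch using Python's standard `wave` module. The function names and the demo file are assumptions made for illustration and are not part of the corpus distribution; the channel count is only reported, since it is not stated above.

```python
import os
import tempfile
import wave

# Format documented for the corpus recordings.
EXPECTED_RATE_HZ = 44_100
EXPECTED_BITS = 16


def describe_wav(path):
    """Read basic format information from an uncompressed WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / rate,
        }


def matches_documented_format(info):
    """Check sample rate and resolution against the documented values."""
    return (
        info["sample_rate_hz"] == EXPECTED_RATE_HZ
        and info["bits_per_sample"] == EXPECTED_BITS
    )


def write_silent_wav(path, seconds=1.0, rate=EXPECTED_RATE_HZ):
    """Hypothetical helper: write a silent mono file so the sketch can run
    without access to the corpus."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(EXPECTED_BITS // 8)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silent_wav(demo_path)
    info = describe_wav(demo_path)
    print(info, matches_documented_format(info))
```

To check an actual corpus file, pass its path to `describe_wav` instead of the generated demo file.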
Content of the Corpus
Participants | 30 German-speaking students |
Distribution of Sex | 10 men, 20 women |
Distribution of Age | Mean: 24 years, SD: 3.45 years, Min: 18 years, Max: 31 years |
Total amount of data | 5 hours, 37 minutes |
Mean duration per dialog | 196 seconds |
Annotation | utterances, type of utterances, transcripts, context |
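The figures in the table can be related to each other with a rough back-of-the-envelope calculation. The sketch below assumes that the total amount of data consists only of dialog audio; the table does not state this, nor does it give the number of dialogs, so the result is an estimate and not a documented property of the corpus.

```python
# Values taken from the table above.
total_seconds = 5 * 3600 + 37 * 60   # 5 hours, 37 minutes
mean_dialog_seconds = 196
participants = 30

# Assumption: the total duration is made up entirely of dialogs.
implied_dialogs = total_seconds / mean_dialog_seconds
implied_dialogs_per_participant = implied_dialogs / participants

print(f"total duration: {total_seconds} s")
print(f"implied number of dialogs: {implied_dialogs:.1f}")
print(f"implied dialogs per participant: {implied_dialogs_per_participant:.1f}")
```

Under this assumption the table implies roughly 100 dialogs in total, or between three and four per participant on average. Because the per-participant figure is not a whole number, the assumption is probably only approximately true.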
Related Publications
Siegert, Ingo; Weißkirchen, Norman; Krüger, Julia; Akhtiamov, Oleg; Wendemuth, Andreas: Admitting the addressee detection faultiness of voice assistants to improve the activation performance using a continuous learning framework. In: Cognitive Systems Research, Elsevier BV, 2021
Akhtiamov, Oleg; Siegert, Ingo; Karpov, Alexey; Minker, Wolfgang: Using complexity-identical human- and machine-directed utterances to investigate addressee detection for spoken dialogue systems. In: Sensors. Basel: MDPI, volume 20 (2020), issue 9, article 2740
Baumann, Timo; Siegert, Ingo: Prosodic addressee-detection - ensuring privacy in always-on spoken dialog systems. In: Mensch und Computer 2020 - Tagungsband. New York, New York: The Association for Computing Machinery, Inc., 2020, pp. 195-198
Akhtiamov, Oleg; Siegert, Ingo; Karpov, Alexey; Minker, Wolfgang: Cross-corpus data augmentation for acoustic addressee detection. In: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Stroudsburg, PA, USA: Association for Computational Linguistics (ACL), 2019, pp. 274-283 [Conference: 20th Annual Meeting of the Special Interest Group on Discourse and Dialogue, SIGDIAL 2019, Stockholm, Sweden, 11-13 September 2019]