Turn-Taking Management for Multimodal Dialogue Systems
Natural dialogue involves the management of many communicative resources in a complex activity. Participants in a conversation transmit information, agree and disagree with each other, monitor the communicative status of their messages, make decisions about non-linguistic actions and, among other things, deal with social conventions about who is to talk and when. Conversations are joint actions in which participants individually perform coordinated activities (Clark, 1996). Clark's metaphor of dialogue as dancing a waltz is quite illustrative: each individual action is coordinated, as in a choreography, with the actions of the other person. How and when to talk is learned from childhood; face-to-face conversation is known to be the matrix of language acquisition. It is common to hear parents telling children when they are expected to talk (for instance, they prompt a response when a question, an order or a 'hello' has been expressed, and they show when one is expected to say 'thank you' or 'you are welcome').
Over the last twenty years, various efforts have been made to develop annotation models that reflect the diverse ways in which participants' behaviors contribute to conversation (Ahrenberg, et al., 1995; HCRC, Carletta, et al., 1996; DAMSL, Allen & Core, 1997). These tagging schemes are based on dialogue act annotation. They differ in how they define the minimal units of analysis (utterances, segments or functional segments), in the set of functions each establishes (speech acts, dialogue acts, communicative functions, intentions), and in the dimensions along which those functions contribute to the conversation (understanding, task, semantic content). Despite their differences, they all assign communicative values such as offers, affirms, acknowledgements or accepts to minimal units of verbal and nonverbal behavior, and they correlate the functions on the basis of adjacency relations such as question-answer. In these schemes, turn-taking management is a secondary order that emerges from the relations between dialogue acts or communicative functions, and it is defined only implicitly by those relations.
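As a rough illustration of how turn-taking can fall out of adjacency relations rather than being annotated directly, the following minimal Python sketch models an annotated unit and derives who is expected to speak next. It is not drawn from any of the cited schemes: the `DialogueAct` fields, the `ADJACENCY` table, its labels, and the `expected_response` helper are all hypothetical names chosen for this example, and the sketch assumes a two-party conversation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical adjacency relations: first pair part -> expected second pair part.
# The labels are illustrative only and do not reproduce any cited tag set.
ADJACENCY = {
    "question": "answer",
    "offer": "accept",
    "greeting": "greeting",
}


@dataclass
class DialogueAct:
    speaker: str    # participant who produced the unit
    unit: str       # minimal unit of analysis (utterance, segment, ...)
    function: str   # communicative function assigned to the unit
    dimension: str  # dimension in which the function contributes


def expected_response(
    act: DialogueAct, participants: List[str]
) -> Optional[Tuple[str, str]]:
    """Return (next_speaker, expected_function) if `act` opens an adjacency pair.

    Returns None when the act does not open a pair, or when the conversation
    is not two-party (this sketch does not choose among several addressees).
    """
    second_part = ADJACENCY.get(act.function)
    if second_part is None:
        return None
    others = [p for p in participants if p != act.speaker]
    if len(others) != 1:
        return None
    return others[0], second_part


# Usage: a question from A implicitly hands the turn to B.
question = DialogueAct("A", "Where is the station?", "question", "task")
print(expected_response(question, ["A", "B"]))
```

Nothing in the sketch marks turns explicitly; the expected next speaker is read off the relation between functions, which is the sense in which turn-taking is only implicitly defined in such schemes.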
The ISO 24617-2 standard for dialogue annotation was published two years ago (Bunt, et al., 2012). This standard is concerned with the exhaustive annotation of multimodal corpora, and it shows that the practice of assigning communicative values to stretches of dialogue has matured into a complex set of semantic, syntactic and pragmatic dimensions in the study of dialogue.
“ISO standard 24617-2 has been developed in recent years in view of the need for an application-independent dialogue act annotation scheme that is both empirically and theoretically well founded, that can adequately deal with typed, spoken, and multimodal dialogue, and that can be effectively used both by human annotators and by automatic annotation...