Topics: Speech recognition, Pragmatics, Multimodal interaction Pages: 15 (5035 words) Published: May 5, 2013
On the use of expectations for detecting and repairing human-machine miscommunication

Morena Danieli

CSELT Centro Studi E Laboratori Telecomunicazioni S.p.A.
Via G. Reiss Romoli 274, I-10148 Torino, Italy
E-Mail: Morena.Danieli@cselt.stet.it

In this paper I describe how miscommunication problems are dealt with in the spoken language system DIALOGOS. The dialogue module of the system exploits dialogic expectations in a twofold way: to model what the user's future utterances might be about (predictions), and to account for how the user's next utterance may be related to previous ones in the ongoing interaction (pragmatic-based expectations). The analysis starts from the hypothesis that the occurrence of miscommunication is concomitant with two pragmatic phenomena: the deviation of the user from the expected behaviour and the generation of a conversational implicature. A preliminary evaluation of a large number of interactions between subjects and DIALOGOS shows that the system performance is enhanced by the use of both predictions and pragmatic-based expectations.


The problem

During the last few years it has become clear that the success of spoken language systems is greatly improved by the contextual reasoning of dialogue modules. This tenet has spread through both the speech and the dialogue communities. Dialogue systems devoted to spoken language applications are able to detect partial communication breakdowns caused by other system modules, and this increases the robustness of spoken human-machine interaction.

During oral interactions with computers, communication problems often arise after errors in the recognition phase. Sometimes these errors cannot be resolved by the semantic module: the utterances containing them are interpreted by the semantic analyser, but with an information content different from the speaker's intended meaning. Detecting such miscommunications and repairing them by initiating appropriate repair subdialogues is essential for the interaction to be successful.

Most of the research in this area has been devoted to providing the recognition and understanding modules with information generated on the basis of the dialogue context. This information predicts what the user's next utterance will probably be about; throughout this paper I will refer to it as predictions. Sometimes predictions are passed down to the acoustic recognition level in order to reduce the huge number of lexical choices; sometimes they are used to help decide among multiple semantic interpretations. More often they are used during the contextual interpretation phase to accept or reject parser output.

Although predictions have proved useful, they capture only one side of the miscommunication problem. They are a means of reducing recognition errors, and their use avoids one of the potential sources of miscommunication. However, during spoken human-computer interactions, the detection of miscommunications may be beyond the capabilities of the dialogue system, even though it uses predictions; the user, on the contrary, is usually able to detect speech errors. For example, in travel inquiry applications, words belonging to the same class, such as proper names of places, may be highly confusable. When the dialogue prediction says that the user's next utterance is likely to be about a departure place, this does not exclude the possibility that the recognizer substitutes a phonetically similar name for the one actually uttered. Only the user is able to detect errors of this kind. In this situation the dialogue system should identify the user's detection of miscommunication and provide appropriate repairs.

In this paper I will argue that the dialogue module's ability to detect user-initiated repairs is improved if the system is able to capture the pragmatic phenomena that accompany the user's detection of miscommunication. The paper offers an analysis of the pragmatic phenomena that occur when users...
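
The limitation discussed in this section, namely that a prediction constrains the class of the next utterance but cannot rule out a substitution inside that class, can be illustrated with a minimal sketch. Everything in it is a hypothetical illustration rather than the DIALOGOS implementation: the slot name `departure_city`, the small city vocabulary, the `ParserOutput` type, and the function `accept_with_prediction` are all assumptions, as is the choice of Torino and Trento as a confusable pair.

```python
from dataclasses import dataclass

# Hypothetical vocabulary for the class "place name" (illustration only).
CITY_NAMES = {"Torino", "Trento", "Milano", "Merano"}


@dataclass
class ParserOutput:
    slot: str   # semantic class assigned by the parser (assumed naming)
    value: str  # recognized filler for that slot


def accept_with_prediction(predicted_slot: str, output: ParserOutput) -> bool:
    """Accept parser output only if it fills the predicted slot
    with a value belonging to that slot's class."""
    return output.slot == predicted_slot and output.value in CITY_NAMES


# The user says one city, the recognizer returns another one of the same class.
uttered = "Trento"
recognized = ParserOutput(slot="departure_city", value="Torino")

# The prediction check passes, because the substitution stays within the class.
accepted = accept_with_prediction("departure_city", recognized)

# Only the user knows that the accepted value differs from what was said.
misrecognized = recognized.value != uttered

# An out-of-class interpretation, by contrast, is rejected by the prediction.
rejected = accept_with_prediction(
    "departure_city", ParserOutput(slot="arrival_time", value="Torino")
)
```

In this sketch `accepted` and `misrecognized` are both true at once: the prediction filters out interpretations of the wrong class but is blind to the within-class error, which is why the system must additionally recognize the user's own signal that something went wrong.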
