On Multi-Response Task-Oriented Dialog Systems

In this article, I discuss papers on multi-action dialog policies in task-oriented dialog systems. Task-oriented dialog systems are conversational systems that help a user accomplish a specific goal, such as booking flight tickets. Voice assistants such as Alexa and Siri are familiar examples.


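These systems typically have four components: natural language understanding, dialog state tracking, dialog policy, and natural language generation. The dialog policy component determines the system action based on the current user utterance and the dialog state.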


Usually, each user utterance is answered with a single system response. But often, several responses make sense: for instance, “find me a restaurant” can be answered with “Sure! What cuisine are you looking for?” or “Any particular areas you would prefer?”. Recent research suggests that a one-to-many mapping from dialog state to system actions should be used to produce more diverse yet still plausible responses. This is because human conversation is diverse, and there need not be only one correct path to task completion [1].
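To make the idea concrete, here is a minimal Python sketch of the difference between a one-to-one and a one-to-many mapping from dialog state to system action. It is only a toy illustration: the state tuple, the action names (`request_cuisine`, `request_area`) and the response templates are hypothetical names I made up for this example, not taken from any paper or library.

```python
from typing import Dict, List, Tuple

# A dialog state is reduced here to (domain, user intent). Hypothetical format.
State = Tuple[str, str]

# One-to-one policy: each state maps to exactly one system action.
ONE_TO_ONE: Dict[State, str] = {
    ("restaurant", "find"): "request_cuisine",
}

# One-to-many policy: each state maps to every action that would be plausible.
ONE_TO_MANY: Dict[State, List[str]] = {
    ("restaurant", "find"): ["request_cuisine", "request_area"],
}

# Hypothetical response templates, one per system action.
TEMPLATES: Dict[str, str] = {
    "request_cuisine": "Sure! What cuisine are you looking for?",
    "request_area": "Any particular areas you would prefer?",
}


def respond_single(state: State) -> str:
    """Return the only response a one-to-one policy can produce."""
    return TEMPLATES[ONE_TO_ONE[state]]


def respond_all(state: State) -> List[str]:
    """Return one response per plausible action for the given state."""
    return [TEMPLATES[action] for action in ONE_TO_MANY.get(state, [])]


state = ("restaurant", "find")
single = respond_single(state)
responses = respond_all(state)
for response in responses:
    print(response)
```

In a real system the policy would be learned rather than written as a lookup table, but the shape of the output is the point here: a list of valid actions per state instead of a single one.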

What are some popular methods or strong baselines commonly used?

Task-oriented dialog systems that produce multiple plausible responses have received little attention in the research community. Most work on this topic focuses on general dialog systems; I found only four papers specific to task-oriented dialog. These four works are described below.

  1. This paper by Zhang et al. proposes two components: a data augmentation framework and a multi-decoder network. The framework helps the model learn a dialog policy that can generate diverse responses, and it adds the multiple valid actions to the dataset through oversampling. The multi-decoder network is an end-to-end dialog system with three decoders: one for the belief span, one for the system action, and one for the system response. The system action decoder uses the augmentation framework to generate multiple actions, which are then passed to the system response decoder.
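The augmentation idea can be sketched roughly as follows. This is my own simplified reading, not the authors' implementation: it assumes the dataset is a list of (dialog state, system action) pairs, groups turns that share a state, and attaches every other action seen for that state as an alternative. The state strings, action names and the `augment` helper are all hypothetical, and the sketch does not model the oversampling step or the three decoders.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# One training turn: (dialog_state, system_action). Hypothetical format.
Turn = Tuple[str, str]


def build_state_action_map(turns: List[Turn]) -> Dict[str, Set[str]]:
    """Collect every system action observed for each dialog state."""
    mapping: Dict[str, Set[str]] = defaultdict(set)
    for state, action in turns:
        mapping[state].add(action)
    return dict(mapping)


def augment(turns: List[Turn]) -> List[dict]:
    """Attach to each turn the other actions seen under the same state."""
    mapping = build_state_action_map(turns)
    augmented = []
    for state, action in turns:
        # Sorted so the output is deterministic.
        alternatives = sorted(mapping[state] - {action})
        augmented.append(
            {"state": state, "action": action, "alternatives": alternatives}
        )
    return augmented


# Toy dataset: two turns share a state but were answered differently.
turns = [
    ("restaurant|no_constraints", "request_cuisine"),
    ("restaurant|no_constraints", "request_area"),
    ("hotel|area=north", "inform_choice"),
]

augmented = augment(turns)
for example in augmented:
    print(example)
```

Each restaurant turn now carries the other valid action as an alternative, while the hotel turn, whose state appears only once, gets none.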

Building multi-action, multi-response dialog systems is a challenging problem, often compounded by the lack of suitable annotated data and evaluation metrics. Even so, this direction has strong potential for the dialog community and deserves further research.

Part-time graduate student at University of Washington | Software Engineer at Paytm, India | I try not to sweat it. Meanwhile, I write on NLP research!