A Data-driven Method for Adaptive Referring Expression Generation in Automated Dialogue Systems: Maximising Expected Utility

Srinivasan Janarthanam & Oliver Lemon
School of Informatics, University of Edinburgh
www.classic-project.org

Introduction

Adaptive generation of referring expressions (REs) in dialogue systems helps dialogue partners establish common ground (Isaacs & Clark 1987). Although adapting to users is beneficial, adapting to an unknown user is difficult, and hand-coding adaptive referring expression generation (REG) policies is laborious. We present a data-driven framework that uses Reinforcement Learning to automatically learn an adaptive REG (NLG) policy for spoken dialogue systems (fig. 1). The learned policy tries to maximise the expected utility of the RE choices made by the system.

User Simulation

Step 1. $P(CR_{u,t} \mid RE_{s,t}, DK_{u,RE}, H)$
Step 2a. $P(A_{u,t} \mid A_{s,t}, CR_{u,t})$
Step 2b. $P(EA_{u,t} \mid A_{s,t}, CR_{u,t})$

$RE_{s,t}$ – Referring expression used in the system’s utterance at turn t.
$CR_{u,t}$ – Clarification Request by the user u at turn t.
$DK_{u,RE}$ – Domain Knowledge of the user u on referring expression RE.
$H$ – History of clarifications already given.
$A_{s,t}$ – System action at turn t.
$A_{u,t}$ – User dialogue action at turn t.
$EA_{u,t}$ – User’s environment action at turn t.
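The two-step model above can be illustrated with a minimal Python sketch. Everything numeric here is an assumption made for illustration: the poster's probabilities came from Wizard-of-Oz data and are not reproduced, and the function names `p_clarification_request` and `simulate_user_turn` are hypothetical. Steps 2a and 2b are collapsed into a deterministic choice for brevity, whereas the model above treats them as probability distributions.

```python
import random

# Hypothetical probabilities, for illustration only (not the poster's values).
P_CR_UNKNOWN_JARGON = 0.9  # assumed: the user does not know the technical term
P_CR_KNOWN_JARGON = 0.05   # assumed: the user knows it, or it was already clarified


def p_clarification_request(re_type, user_knows_term, already_clarified):
    """Step 1: P(CR | RE, DK, H) for one referring expression."""
    if re_type != "jargon":
        return 0.0  # assumed: non-jargon expressions never trigger a CR
    if user_knows_term or already_clarified:
        return P_CR_KNOWN_JARGON
    return P_CR_UNKNOWN_JARGON


def simulate_user_turn(re_type, term, domain_knowledge, history, rng):
    """Sample a clarification request, then pick the user's actions (steps 2a, 2b)."""
    p_cr = p_clarification_request(re_type, term in domain_knowledge, term in history)
    if rng.random() < p_cr:
        # Assumed: the system answers the request, so the term enters the history H.
        history.add(term)
        return {"cr": True, "user_action": "request_clarification", "env_action": "none"}
    return {"cr": False, "user_action": "acknowledge", "env_action": "perform_instruction"}


if __name__ == "__main__":
    rng = random.Random(0)
    history = set()
    novice_knowledge = set()  # a novice knows no technical terms in this toy
    print(simulate_user_turn("jargon", "broadband filter", novice_knowledge, history, rng))
```

Passing the random number generator in as `rng` keeps the simulation reproducible, which is convenient when the same simulated users are replayed across training runs.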

Fig 1. Adaptive dialogue system

NLG module

Fig 2. Reinforcement Learning Setup

▪ Translates dialogue acts into system utterances.
▪ Identifies the RE to be used in the utterances to refer to the domain objects, based on the REG policy:
  • Jargon – Use technical terms as referring expressions: “Connect one end of the broadband cable to the broadband filter.”
  • Descriptive – Use descriptive referring expressions: “Connect one end of the thin cable with grey ends to the small white box.”
  • Tutorial – Use both to teach technical terms: “Connect one end of the broadband cable to the broadband filter. The broadband cable is the thin cable with grey ends. The broadband filter is the small white box.”
▪ The user model (UMs,u), on which the choice of REs is based, is dynamically updated with information on the user’s domain knowledge.

User simulation model probabilities were populated from data collected in Wizard-of-Oz experiments with real users. See (Janarthanam and Lemon, ENLG 2009) for more information.

Reinforcement Learning

A basic Reinforcement Learning (RL) setup consists of a learning agent, its environment and a reward model (Sutton and Barto, 1998). The learning agent explores by taking different possible actions in different states and exploits the actions for which the environmental rewards are high. RL has been successfully used for learning dialogue management policies (Levin et al., 1997).

▪ In our model, the learning agent is the NLG module of the dialogue system, whose objective is to learn an adaptive REG policy.
▪ The environment consists of a user simulation which interacts with the dialogue system (fig. 2).
▪ The NLG module explores by choosing different expressions during learning.
▪ The user simulation rewards the system when the system chooses the appropriate referring expressions.
▪ The NLG module reinforces the choices that get more reward from the user and avoids the choices that get less reward.

© Srini Janarthanam and Oliver Lemon

Training
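The explore-and-reinforce behaviour of the reinforcement learning setup described above can be sketched with a small tabular, epsilon-greedy learner. This is a toy under stated assumptions, not the authors' training algorithm: the reward values in `toy_reward` are invented, the learner is told directly whether the user is an expert or a novice (the real system has to infer this through its user model), and each episode is a single RE choice rather than a full dialogue.

```python
import random
from collections import defaultdict

ACTIONS = ("jargon", "descriptive", "tutorial")


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def update(q, state, action, reward, alpha):
    """Move the value estimate for (state, action) towards the observed reward."""
    q[(state, action)] += alpha * (reward - q[(state, action)])


def toy_reward(user_is_expert, action):
    """Hypothetical stand-in for the reward the user simulation would provide."""
    if action == "jargon":
        return 1.0 if user_is_expert else -1.0  # assumed: novices ask for clarification
    if action == "descriptive":
        return 0.5  # assumed: understood by everyone, but longer
    return 0.0  # tutorial: assumed neutral in this toy


def train(episodes=2000, epsilon=0.1, alpha=0.1, seed=0):
    """Run single-step episodes against a mixed population of experts and novices."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        user_is_expert = rng.random() < 0.5
        state = "expert" if user_is_expert else "novice"
        action = choose_action(q, state, epsilon, rng)
        update(q, state, action, toy_reward(user_is_expert, action), alpha)
    return q


if __name__ == "__main__":
    q = train()
    for state in ("expert", "novice"):
        print(state, max(ACTIONS, key=lambda a: q[(state, a)]))
```

With these invented rewards the learner ends up preferring jargon for experts and descriptive expressions for novices, which mirrors the adaptive behaviour the REG policy is meant to learn.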

Fig 3. Training graph with mixed user population – experts and novices.

Task Completion Reward (TCR) = 1000
Cost of CR (CCR) = -n(CR) * 30
Final Reward = TCR + CCR
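As a worked illustration of the reward function above, the following sketch computes the final reward for a dialogue, reading n(CR) as the number of clarification requests the user made. The function and constant names are hypothetical, and the sketch assumes the task is always completed, since the poster does not say what happens otherwise.

```python
TASK_COMPLETION_REWARD = 1000  # TCR
COST_PER_CR = 30               # cost charged for each clarification request


def final_reward(num_clarification_requests):
    """Final Reward = TCR + CCR, with CCR = -n(CR) * 30."""
    ccr = -num_clarification_requests * COST_PER_CR
    return TASK_COMPLETION_REWARD + ccr


if __name__ == "__main__":
    print(final_reward(0))  # 1000
    print(final_reward(2))  # 940
```

Under this function, every clarification request the policy avoids is worth 30 reward points, so a policy that triggers fewer requests on average earns a higher average reward.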

Evaluation

Policy        Avg. Reward   Avg. CRs
LP            944.09*       1.8
Random        919.31*       2.69
Desc only     948.19        1.75
Jargon only   885.35*       3.82

The current learned policy (LP) achieves a higher average reward, with fewer clarification requests, than the Random and Jargon-only policies. It currently performs on a par with the Desc-only policy. Future work will explore much longer training runs and different learning parameters to produce better policies.

PRE-CogSci 2009, Amsterdam
