Reinforcement Learning for Adaptive Dialogue Systems

PART IV: Tools and Future Directions

Oliver Lemon

Verena Rieser

School of Informatics, University of Edinburgh

For updated course materials see: http://sites.google.com/site/olemon/eacl09

EACL tutorial, March 2009

Outline

Tools

Research directions and open problems

Summary

Tools for data collection

- Wizard-of-Oz experiments [Fraser and Gilbert, 1991].
- WAMI toolkit (Web-Accessible Multimodal Interfaces): http://wami.csail.mit.edu/
- DUDE rapid prototyping (Dialogue and Understanding Development Environment) [Lemon and Liu, 2006].

PARADISE: linear regression for reward modelling

- Any toolkit that supports statistical data analysis such as linear regression or curve fitting, e.g.
  - SPSS
  - Matlab
  - R
  - GNUplot
  - etc.
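The regression step behind this kind of reward modelling can be sketched in a few lines. The example below assumes a PARADISE-style setup in which user satisfaction ratings are regressed on a normalised task-success measure and normalised dialogue cost measures; the data is synthetic and the names (`fit_reward_model`, `dialogue_length`) are hypothetical, chosen only for illustration.

```python
import numpy as np

def zscore(x):
    """Normalise a measure to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def fit_reward_model(user_satisfaction, task_success, costs):
    """Least-squares fit of user satisfaction on normalised predictors.

    costs: dict mapping a cost name to one value per dialogue.
    Returns (intercept, {predictor name: weight}).
    """
    names = ["task_success"] + list(costs)
    columns = [zscore(task_success)] + [zscore(costs[n]) for n in costs]
    X = np.column_stack([np.ones(len(user_satisfaction))] + columns)
    y = np.asarray(user_satisfaction, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], dict(zip(names, coef[1:]))

# Synthetic data (made up for this sketch, not taken from any corpus).
rng = np.random.default_rng(0)
n = 200
task_success = rng.integers(0, 2, size=n)       # 1 = task completed
dialogue_length = rng.integers(4, 30, size=n)   # number of turns
satisfaction = (3.0
                + 1.0 * zscore(task_success)
                - 0.5 * zscore(dialogue_length)
                + rng.normal(0.0, 0.1, size=n))

intercept, weights = fit_reward_model(
    satisfaction, task_success, {"dialogue_length": dialogue_length})
print(intercept, weights)
```

The fitted weights can then serve as a reward function for RL: a positive weight on task success and a negative weight on each cost measure.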

RL packages/code I

Matlab toolboxes:

- Perseus for Matlab: randomized point-based approximate value iteration algorithm for Partially Observable Markov Decision Processes (POMDPs). http://staff.science.uva.nl/~mtjspaan/software/approx/
- MDP Toolbox for Matlab (INRA): http://www.inra.fr/internet/Departements/MIA/T//MDPtoolbox/
- C, Lisp, and Matlab code from [Sutton and Barto, 1998]: http://www.cs.ualberta.ca/~sutton/book/code/code.html, http://waxworksmath.com/Authors/N_Z/Sutton/sutton.html

RL packages/code II

Java, C++:

- Reinforcement Learning Toolbox 2.0, C++ toolbox (Uni Graz): http://www.igi.tugraz.at/ril-toolbox/general/overview.html
- Java code (Q-learning and SARSA): http://www.cse.unsw.edu.au/~cs9417ml/RL1/sourcecode.html
- PIQLE platform in Java: http://sourceforge.net/projects/piqle/
- ...
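For readers who have not used such packages, the two algorithms named above (Q-learning and SARSA) can be sketched on a toy slot-filling dialogue. This is a minimal illustration, not code from any of the listed toolkits: the environment, its rewards (-1 per turn, +20 for closing with all slots filled, -10 otherwise), and the 0.8 slot-filling probability are all assumptions made up for the example.

```python
import random

N_SLOTS = 2
ACTIONS = ("ask", "close")

def step(state, action, rng):
    """Toy environment. state = number of filled slots.

    Returns (next_state, reward, done).
    """
    if action == "close":
        bonus = 20.0 if state == N_SLOTS else -10.0
        return state, bonus - 1.0, True
    # 'ask' fills the next slot with probability 0.8 (simulated noise).
    if state < N_SLOTS and rng.random() < 0.8:
        state += 1
    return state, -1.0, False

def choose(q, state, epsilon, rng):
    """Epsilon-greedy action selection."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

def train(algorithm="q_learning", episodes=5000, alpha=0.1,
          gamma=0.95, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        action = choose(q, state, epsilon, rng)
        for _ in range(30):  # cap on dialogue length
            next_state, reward, done = step(state, action, rng)
            next_action = choose(q, next_state, epsilon, rng)
            if done:
                target = reward
            elif algorithm == "sarsa":
                # On-policy: bootstrap from the action taken next.
                target = reward + gamma * q[(next_state, next_action)]
            else:
                # Off-policy: bootstrap from the greedy action.
                target = reward + gamma * max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            if done:
                break
            state, action = next_state, next_action
    return q

def greedy_policy(q):
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_SLOTS + 1)}

print(greedy_policy(train("q_learning")))
print(greedy_policy(train("sarsa")))
```

In this toy setting both learners should end up asking until both slots are filled and only then closing the dialogue.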

Example: REALL toolkit, Edinburgh-Stanford-Link

Thanks to Daniel Shapiro and Carl Tollander.

Dialogue toolkits I

Finite state:

- CSLU toolkit, AT&T FSM library
- VoiceXML, BeVocal Cafe (Nuance): http://cafe.bevocal.com/

Information state update (ISU):

- DIPPER: www.ltg.ed.ac.uk/dipper/
- TRINDIKIT: www.ling.gu.se/projekt/trindi/trindikit/

...also see [McTear, 2004].
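To make the information state update idea concrete, here is a minimal sketch of an information state with precondition/effect update rules. It does not use the DIPPER or TRINDIKIT APIs; the class `InfoState`, the rule names, and the slot names are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class InfoState:
    slots: dict = field(default_factory=dict)    # task slots filled so far
    agenda: list = field(default_factory=list)   # system moves waiting to be realised

def slot_missing(slot):
    """Precondition: the slot is unfilled and no question about it is pending."""
    return lambda s: slot not in s.slots and ("ask", slot) not in s.agenda

def push(move):
    """Effect: put a system move on the agenda."""
    return lambda s: s.agenda.append(move)

def all_filled(s):
    """Precondition: every slot is filled and no confirmation is pending."""
    return ({"destination", "date"} <= set(s.slots)
            and ("confirm", None) not in s.agenda)

# Each update rule is (name, precondition, effect).
RULES = [
    ("ask_destination", slot_missing("destination"), push(("ask", "destination"))),
    ("ask_date", slot_missing("date"), push(("ask", "date"))),
    ("confirm_booking", all_filled, push(("confirm", None))),
]

def update(state):
    """Fire every rule whose precondition holds; return the names of fired rules."""
    fired = []
    for name, precondition, effect in RULES:
        if precondition(state):
            effect(state)
            fired.append(name)
    return fired

state = InfoState(slots={"destination": "Edinburgh"})
fired = update(state)
print(fired, state.agenda)
```

A finite-state system would instead hard-code the order of questions as transitions between named states; the rule-based form lets the next move depend on whatever the information state currently contains.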

Overview: Research directions and open issues

- Tractability and dimensionality reduction methods, e.g. Summary POMDPs [Williams and Young, 2005]
- Available corpora, e.g. [Rieser and Lemon, 2008b]
- What makes a good user simulation? E.g. [Ai and Litman, 2008]
- Is RL suitable for commercial dialogue strategy development? [Paek and Pieraccini, 2008]
- RL for Natural Language Generation [Lemon, 2008, Janarthanam and Lemon, 2008, Rieser and Lemon, 2009]
- See [Lemon and Pietquin, 2007] for further discussion.
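The dimensionality-reduction item in the list above can be illustrated with a small sketch. It is only loosely inspired by the general idea of acting in a compressed "summary" space and is not the method of [Williams and Young, 2005]: the full belief over user goals is reduced to its top hypothesis and that hypothesis's probability, an action is chosen in that small space, and the action is mapped back to the full space. The thresholds are hand-set placeholders; in an RL setting the summary-space policy would be learned.

```python
def summarise(belief):
    """Map a full belief over user goals to (top goal, its probability)."""
    top_goal = max(belief, key=belief.get)
    return top_goal, belief[top_goal]

def summary_policy(top_prob, confirm_at=0.5, accept_at=0.9):
    """Choose an action from the summary feature alone (placeholder thresholds)."""
    if top_prob >= accept_at:
        return "accept"
    if top_prob >= confirm_at:
        return "confirm"
    return "ask"

def select_action(belief):
    """Pick a summary action and map it back to a full action."""
    top_goal, top_prob = summarise(belief)
    action = summary_policy(top_prob)
    # 'ask' is goal-independent; the other actions refer to the top hypothesis.
    return (action, top_goal) if action != "ask" else (action, None)

belief = {"edinburgh": 0.7, "london": 0.2, "paris": 0.1}
print(select_action(belief))
```

The policy here depends on a single number regardless of how many user goals the full belief ranges over, which is the source of the tractability gain.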

Announcement

Interspeech 2009, Brighton, special session on “Machine Learning for Adaptivity in Spoken Dialogue Systems”: http://www.interspeech2009.org/conference/specialsessions.php

Summary: PART I+II (Oliver Lemon)

PART I: Introduction
1. Introduction to Dialogue Strategy Development
2. Introduction to Reinforcement Learning
3. RL-based dialogue system development

PART II: Literature review on RL for DM

Summary: PART III+IV (Verena Rieser)

PART III: Simulation-based Dialogue Strategy Optimisation
1. Simulation-based RL
2. Data collection and corpus requirements
3. Simulated environments for dialogue optimisation
   - State-action space
   - Noise model
   - User simulation
   - Data-driven reward modelling
4. Policy training and evaluation

PART IV: Tools and Future Directions

Summary: Take home messages

- DM is a hard problem for human designers.
- DM has a large decision space, long-term effects of actions, a stochastic/non-deterministic environment, ...
- RL is better than manually setting thresholds or hand-coding, e.g. [Rieser and Lemon, 2008a].
- Simulation-based training and testing allow effective “system-in-the-loop” development.
- RL promises advances in adaptive and robust HCI.
- Principled mathematical framework for dialogue management.

Ai, H. and Litman, D. (2008). Assessing dialog system user simulation evaluation measures using human judges. In Proc. of the 21st International Conference on Computational Linguistics and 46th Annual Meeting of the Association for Computational Linguistics (ACL/HLT).

Fraser, N. M. and Gilbert, G. N. (1991). Simulating speech systems. Computer Speech and Language, 5:81–99.

Janarthanam, S. and Lemon, O. (2008). User simulations for online adaptation and knowledge-alignment in troubleshooting dialogue systems. In Proc. of the 12th SEMdial Workshop on the Semantics and Pragmatics of Dialogues.

Lemon, O. (2008). Adaptive natural language generation in dialogue using Reinforcement Learning. In Proc. of the 12th SEMdial Workshop on the Semantics and Pragmatics of Dialogues.

Lemon, O. and Liu, X. (2006). DUDE: a dialogue and understanding development environment, mapping business process models to Information State Update dialogue systems. In Proc. of the Conference of the European Chapter of the ACL (EACL).

Lemon, O. and Pietquin, O. (2007). Machine Learning for spoken dialogue systems. In Proc. of the International Conference of Spoken Language Processing (Interspeech/ICSLP).

McTear, M. F. (2004). Towards the Conversational User Interface. Springer Verlag.

Paek, T. and Pieraccini, R. (2008). Automating spoken dialogue management design using machine learning: An industry perspective. Speech Communication, Special Issue on Evaluating New Methods and Models for Advanced Speech-Based Interactive Systems, 50(8-9):716–729.

Rieser, V. and Lemon, O. (2008a). Does this list contain what you were searching for? Learning adaptive dialogue strategies for Interactive Question Answering. J. Natural Language Engineering, 15(1).

Rieser, V. and Lemon, O. (2008b). Learning effective multimodal dialogue strategies from Wizard-of-Oz data: Bootstrapping and evaluation. In Proc. of the 21st International Conference on Computational Linguistics and 46th Annual Meeting of the Association for Computational Linguistics (ACL/HLT).

Rieser, V. and Lemon, O. (2009). Natural Language Generation as Planning Under Uncertainty for Spoken Dialogue Systems. In Proc. of the Conference of the European Chapter of the ACL (EACL).

Sutton, R. and Barto, A. (1998). Reinforcement Learning. MIT Press.

Williams, J. and Young, S. (2005). Scaling up POMDPs for dialogue management: The “Summary POMDP” method. In Proc. of the IEEE workshop on Automatic Speech Recognition and Understanding (ASRU).
