Integrating Learning into a BDI Agent for Environments with Changing Dynamics

Dhirendra Singh (1), Sebastian Sardina (1), Lin Padgham (1), Geoff James (2)

(1) RMIT University, Melbourne, Australia
(2) CSIRO Energy Technology, Sydney, Australia

Summary

This paper extends our earlier work integrating learning to improve plan selection in the popular Belief, Desire, Intentions (BDI) agent paradigm. Here we address the problem that learning in deployed agents must be continuous rather than a one-off process. Our main contribution is a novel confidence measure which allows the agent to adjust its reliance on the learning dynamically, facilitating in principle infinitely many (re)learning phases.

We demonstrate the benefits of the approach in an example battery controller for energy management. A building with local generation and loads is to restrict power consumption to a set range, using a modular battery system that can be charged or discharged as needed.

BDI Learning Framework

Our learning framework augments plans' context conditions with decision trees, allowing plan applicability to be learnt from experience. Using a probabilistic plan selection function, the agent balances exploration and exploitation of plans.

Record outcomes for chosen plans to train decision trees.

Probabilistically select plans based on ongoing learning.

Acting and learning are interleaved in an online manner, i.e., current learning influences ongoing choices that impact subsequent learning.
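The interleaving of recording outcomes and probabilistic selection can be illustrated with a minimal Python sketch. It is not the framework's implementation: the decision tree is swapped for a simple smoothed success count per (plan, world state), and the names OutcomeLog, select_plan and run, as well as the toy environment in which only plan Pe succeeds, are assumptions made for illustration.

```python
import random
from collections import defaultdict


class OutcomeLog:
    """Stand-in for a per-plan decision tree: success counts per world state."""

    def __init__(self):
        # (plan, state) -> [successes, trials]
        self.counts = defaultdict(lambda: [0, 0])

    def record(self, plan, state, success):
        entry = self.counts[(plan, state)]
        entry[0] += int(success)
        entry[1] += 1

    def p_success(self, plan, state):
        successes, trials = self.counts[(plan, state)]
        # Laplace smoothing: an untried plan is estimated at 0.5.
        return (successes + 1) / (trials + 2)


def select_plan(log, plans, state, rng):
    """Choose a plan with probability proportional to its estimated success."""
    weights = [log.p_success(p, state) for p in plans]
    return rng.choices(plans, weights=weights, k=1)[0]


def run(episodes, seed=0):
    """Interleave acting and learning in a toy world (hypothetical)."""
    rng = random.Random(seed)
    log = OutcomeLog()
    plans = ["Pc", "Pe"]
    state = "s0"
    for _ in range(episodes):
        plan = select_plan(log, plans, state, rng)
        success = plan == "Pe"  # assumed environment: only Pe works
        log.record(plan, state, success)  # this outcome shapes the next choice
    return log
```

Because every estimate stays strictly between 0 and 1, no plan is ever ruled out completely, so a plan that starts working again can still be rediscovered.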

BDI Architecture

A plan is a rule e : ψ ← δ; program δ is a strategy for goal e when context condition ψ holds. Plans may perform primitive actions or post subgoals that are handled in a hierarchical manner.

Confidence in Learning

We build confidence from the observed performance of a plan using two measures: a stability-based measure of how well-informed the recent decisions were, and a world-based measure of how well we know the worlds we are witnessing. The plan selection weight, which dictates exploration, is then calculated from the predicted likelihood of success and the dynamic confidence measure.
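One way such a weight could be computed is sketched below in Python. The text here does not give the formulas, so both are assumptions for illustration: the two measures are blended by a convex combination with a hypothetical mixing parameter alpha, and the weight pulls the predicted likelihood of success towards 0.5 as confidence falls.

```python
def confidence(stability, world, alpha=0.5):
    """Blend the stability-based and world-based measures (assumed form).

    All three arguments are expected to lie in [0, 1]; alpha is a
    hypothetical mixing parameter, not taken from the text.
    """
    for value in (stability, world, alpha):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must lie in [0, 1]")
    return alpha * stability + (1.0 - alpha) * world


def selection_weight(p_success, conf):
    """Shrink the predicted success towards 0.5 when confidence is low.

    Assumed form: with conf = 0 every plan gets weight 0.5 (pure
    exploration); with conf = 1 the prediction is used as is.
    """
    return 0.5 + conf * (p_success - 0.5)
```

Under these assumptions, a drop in confidence flattens the weights across plans, which is what promotes renewed exploration when the environment changes.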


Modular Battery Controller

A programmed solution is not ideal since battery performance is susceptible to change over time. We design a learning BDI controller that works to initial specification but also adapts to ongoing changes in the battery system.

Scenario 1: Recovery from deterioration in module capacities at 5k episodes.

[Plot: average success (y-axis, 0.75 to 1) over episodes (x-axis, 0 to 35k).]

Scenario 2: Recovery from individual module failures during [0, 20k], [20k, 40k] episodes.

[Plot: average success (y-axis) over episodes (x-axis, 0 to 40k).]

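[Figure: BDI architecture. Labels: events, Pending Events, Beliefs, BDI engine, Plan library, Intention Stacks, actions; components are marked as dynamic or static.]

[Figure: goal-plan hierarchy with plan P, goals G1 and G2, and plans Pa, Pb, Pc, Pd, Pe, with outcomes marked √ or ×.]

Traditionally, BDI agents have no learning ability, and cannot adjust to changes that cause previously successful approaches to fail.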
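The plan rule e : ψ ← δ and the hierarchical handling of subgoals can be sketched in Python as follows. This is a minimal illustration without learning, not a real BDI engine: the Plan class, the "!" prefix for posting a subgoal, the action names and the arrangement of plans under goals in demo_library are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Beliefs = Dict[str, object]


@dataclass
class Plan:
    name: str
    event: str                          # e: the goal or event handled
    context: Callable[[Beliefs], bool]  # psi: context condition on beliefs
    body: List[str]                     # delta: actions, or "!goal" subgoals


def applicable(plans, event, beliefs):
    """Plans whose event matches and whose context condition holds."""
    return [p for p in plans if p.event == event and p.context(beliefs)]


def execute(plans, event, beliefs, trace):
    """Resolve an event with the first applicable plan (fixed choice)."""
    options = applicable(plans, event, beliefs)
    if not options:
        return False
    for step in options[0].body:
        if step.startswith("!"):
            # Post a subgoal and handle it recursively.
            if not execute(plans, step[1:], beliefs, trace):
                return False
        else:
            trace.append(step)  # primitive action
    return True


def demo_library():
    """Hypothetical library: P posts G1 then G2."""
    return [
        Plan("P", "G", lambda b: True, ["!G1", "!G2"]),
        Plan("Pa", "G1", lambda b: bool(b.get("charged")), ["act_a"]),
        Plan("Pb", "G1", lambda b: True, ["act_b"]),
        Plan("Pc", "G2", lambda b: True, ["act_c"]),
    ]
```

The fixed choice of the first applicable plan is the point where a learning agent would instead select among the applicable options using what it has learnt.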

Example: Say plan Pc no longer works for resolving goal G2 after execution 15, and plan Pe does instead. As plan Pc starts to fail, the perceived confidence (y-axis) drops, promoting new exploration and (re)learning.

[Plot: perceived confidence (y-axis, 0 to 1) over executions (x-axis, 0 to 24).]

Scenario 3: Recovery from complete system failure during [0, 5k] episodes.

[Plot: average success (y-axis, 0 to 1) over episodes (x-axis, 0 to 20k).]

The above experiments plot average success in configuring the battery correctly (y-axis) over the number of episodes (x-axis) for various changes in the environment dynamics.
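What "configuring the battery correctly" might mean can be sketched in Python under stated assumptions. The model below is hypothetical and not taken from the text: each module has a rate in kW and a setting of +1 (charge, adding to consumption), -1 (discharge, offsetting consumption) or 0 (idle), and an episode succeeds when the building's net consumption falls inside the set range.

```python
def net_consumption(demand_kw, module_rates_kw, settings):
    """Building demand plus the battery contribution (assumed model).

    settings holds one value per module: +1 charge, -1 discharge, 0 idle.
    """
    if len(module_rates_kw) != len(settings):
        raise ValueError("one setting is required per module")
    battery_kw = sum(rate * s for rate, s in zip(module_rates_kw, settings))
    return demand_kw + battery_kw


def is_success(demand_kw, module_rates_kw, settings, low_kw, high_kw):
    """True when net consumption lies within the set range."""
    return low_kw <= net_consumption(demand_kw, module_rates_kw, settings) <= high_kw


def average_success(outcomes):
    """Fraction of successful episodes, as plotted on the y-axis."""
    return sum(outcomes) / len(outcomes)
```

In this model, a module whose capacity has deteriorated or failed would contribute less than its nominal rate, so a configuration that used to succeed may start to fail, which is the kind of change the scenarios introduce.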

D. Singh, S. Sardina, L. Padgham, G. James, Integrating Learning into a BDI Agent for Environments with Changing Dynamics. In Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), Barcelona, Spain, 2011.
