Decision Theoretic Behavior Composition

Nitin Yadav and Sebastian Sardina
RMIT University, Melbourne, Australia

In Proceedings of Autonomous Agents and Multi-Agent Systems (AAMAS), Taipei, Taiwan, 2011.

The Behavior Composition Problem

[Figure: garden environment, showing the Controller and the Target Garden Bot.]

Controller Evaluation

Value of controller: measures the degree of the target's expected realizability.

Reward gained on ...:
• successful action delegation: probability of the action request × reward for the action request.
• one target step ($R_i$): sum of the points gained for each legal target action request.
• $k$ steps of the whole system: $R_1 + R_2 + \dots + R_k$.
• infinite runs of the whole system: $R_1 + \alpha R_2 + \alpha^2 R_3 + \dots$, using discount factor $0 \le \alpha < 1$.

Result

Every exact solution is an optimal controller.
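The reward definitions above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are hypothetical, each action request is represented as a `(probability, reward)` pair, and the infinite discounted sum is approximated by truncating it to the step rewards supplied.

```python
from typing import Iterable, Sequence, Tuple


def step_reward(delegated: Iterable[Tuple[float, float]]) -> float:
    """R_i: sum of probability * reward over the legal target action
    requests that are successfully delegated in one target step."""
    return sum(prob * reward for prob, reward in delegated)


def finite_value(step_rewards: Sequence[float]) -> float:
    """Value over k steps of the whole system: R_1 + R_2 + ... + R_k."""
    return sum(step_rewards)


def discounted_value(step_rewards: Sequence[float], alpha: float) -> float:
    """Truncated discounted value: R_1 + alpha*R_2 + alpha^2*R_3 + ..."""
    if not 0 <= alpha < 1:
        raise ValueError("discount factor must satisfy 0 <= alpha < 1")
    return sum(alpha ** i * r for i, r in enumerate(step_rewards))


# Assumed example: two requests with probabilities 0.3 and 0.7, reward 1 each.
both_delegated = step_reward([(0.3, 1.0), (0.7, 1.0)])
only_first = step_reward([(0.3, 1.0)])
three_steps = finite_value([1.0, 1.0, 1.0])
three_steps_discounted = discounted_value([1.0, 1.0, 1.0], alpha=0.5)
```

With these assumed numbers, a controller that can delegate both requests earns a higher step reward than one that can only delegate the first, which is the sense in which the value measures expected realizability.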

Task: The controller realizes a virtual target behavior $T$ by coordinating the available behaviors $B_1, \dots, B_n$ in environment $E$.

[Figure: available behaviors: Multi Bot, Plucker Bot, Cleaner Bot.]

Motivations & Objectives

Classical behavior composition approaches:
• Operate on strict uncertainty.
• Deal only with exact solutions.
• Lack an “optimality” notion for problems without an exact solution.

Our contribution

Handle (unsolvable) problems with non-exact solutions by:
1. quantifying the sources of uncertainty:
   • non-determinism in the environment;
   • non-determinism in the behaviors;
   • action requests in the target;
2. defining optimality notions based on target “expected realizability”;
3. reducing the composition problem to an MDP.

Solution via reduction to MDP

[Figure: Encoded MDP (partial). State labels: ⟨a0, b0, c0, e0, t0, clean⟩, ⟨a0, b0, c0, e0, t1, pluck⟩, ⟨a0, b0, c0, e0, t1, water⟩, ⟨a1, b0, c0, e1, t1, pluck⟩, ⟨a1, b0, c0, e1, t1, water⟩. Edge labels: cleaner: 0.8 × 0.1 × 0.3; cleaner: 0.8 × 0.1 × 0.7; cleaner: 0.2 × 0.9 × 0.3; cleaner: 0.2 × 0.9 × 0.7.]

MDP encoding $M_{S,T} = \langle Q, A, p, r \rangle$:
• $Q$ is the finite set of states, each encoding the state of the system, the state of the target, and the next requested action;
• $A = \{1, \dots, n, u\}$ is the set of available behavior indexes;
• $p(q, i, q')$ is the stochastic transition relation encoding the possible next system state and action requested;
• $r(q, i)$ is the reward allocated on correct delegation.
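One way to hold the tuple ⟨Q, A, p, r⟩ in code is sketched below. The class name `EncodedMDP`, the `validate` helper, and all probabilities and rewards in the toy instance are hypothetical. The poster does not explain the extra index `u`, so it is carried here as an opaque action label.

```python
from dataclasses import dataclass
from typing import Dict, Hashable, List, Tuple

# A state bundles (system state, target state, next requested action).
State = Tuple[Tuple[str, ...], str, str]
Action = Hashable  # a behavior index 1..n, or the extra index "u"


@dataclass
class EncodedMDP:
    states: List[State]                                # Q
    actions: List[Action]                              # A
    p: Dict[Tuple[State, Action], Dict[State, float]]  # p(q, i, q')
    r: Dict[Tuple[State, Action], float]               # r(q, i)

    def validate(self) -> None:
        """Check that every p(q, i, .) is a distribution over known states."""
        known = set(self.states)
        for (q, i), dist in self.p.items():
            if q not in known or i not in self.actions:
                raise ValueError(f"unknown state or action in {(q, i)}")
            if any(q2 not in known for q2 in dist):
                raise ValueError(f"unknown successor state for {(q, i)}")
            if abs(sum(dist.values()) - 1.0) > 1e-9:
                raise ValueError(f"p({q}, {i}, .) does not sum to 1")


# Toy instance; the numbers are made up for illustration only.
q0: State = (("a0", "b0", "c0", "e0"), "t0", "clean")
q1: State = (("a1", "b0", "c0", "e1"), "t1", "pluck")
mdp = EncodedMDP(
    states=[q0, q1],
    actions=[1, 2, 3, "u"],
    p={(q0, 1): {q0: 0.2, q1: 0.8}, (q1, 2): {q1: 1.0}},
    r={(q0, 1): 1.0, (q1, 2): 1.0},
)
mdp.validate()
```

Storing `p` as a map from (state, action) to a successor distribution keeps the encoding sparse: pairs with no entry are simply delegations that are not possible in that state.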

Decision Theoretic Composition Problem

• Stochastic transition evolutions in the available behaviors and the environment.
• Reward for each action request.
• Stochastic model for target action requests.

[Figure: Environment (states e0 to e3), Cleaner Bot (states a0, a1), and Target Bot (states t0 to t3). Transitions in the Environment and Cleaner Bot diagrams carry labels such as pluck: 0.25, pluck: 0.75, clean: 0.8, clean: 0.2, clean: 0.9, clean: 0.1, water: 1, empty: 1. Target Bot transitions are labelled ⟨empty, 1⟩, ⟨clean, 1⟩, ⟨pluck: 0.3, 1⟩, ⟨water: 0.7, 1⟩.]

Results

• Optimal policy for $M_{S,T}$ ≡ optimal controller for $T$ in $S$.
• The existence of an exact controller can be checked by calculating the optimal policy for a horizon equal to $|Q| + 1$.
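A finite-horizon optimal policy such as the one mentioned above can be computed by backward induction. The sketch below is a generic textbook version, not the authors' implementation: the function name and the toy MDP are hypothetical, and the criterion that decides exactness from the resulting policy is not stated here, so it is not reproduced.

```python
from typing import Dict, Hashable, List, Optional, Tuple

State = Hashable
Action = Hashable


def finite_horizon_policy(
    states: List[State],
    actions: List[Action],
    p: Dict[Tuple[State, Action], Dict[State, float]],
    r: Dict[Tuple[State, Action], float],
    horizon: int,
) -> Tuple[Dict[State, float], List[Dict[State, Optional[Action]]]]:
    """Backward induction; returns optimal values and one decision rule per step."""
    value = {q: 0.0 for q in states}  # zero steps to go
    rules: List[Dict[State, Optional[Action]]] = []
    for _ in range(horizon):
        new_value: Dict[State, float] = {}
        rule: Dict[State, Optional[Action]] = {}
        for q in states:
            best_a, best_v = None, float("-inf")
            for a in actions:
                if (q, a) not in p:  # action not applicable in q
                    continue
                v = r.get((q, a), 0.0) + sum(
                    prob * value[q2] for q2, prob in p[(q, a)].items()
                )
                if v > best_v:
                    best_a, best_v = a, v
            new_value[q] = best_v if best_a is not None else 0.0
            rule[q] = best_a
        value = new_value
        rules.append(rule)
    rules.reverse()  # rules[0] is the decision rule for the first step
    return value, rules


# Toy MDP with made-up numbers, solved for horizon |Q| + 1.
states = ["ok", "dead"]
actions = [1, 2]
p = {
    ("ok", 1): {"ok": 0.9, "dead": 0.1},
    ("ok", 2): {"ok": 1.0},
    ("dead", 1): {"dead": 1.0},
    ("dead", 2): {"dead": 1.0},
}
r = {("ok", 1): 1.0, ("ok", 2): 0.5}
value, policy = finite_horizon_policy(states, actions, p, r, len(states) + 1)
```

In this toy instance the riskier action 1 is still preferred in state "ok" at the first step, because its higher immediate reward outweighs the 0.1 chance of reaching the zero-reward state.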

Future work

• Apply machine learning:
  – Reinforcement learning: model of system unknown;
  – Evolutionary computation: build controller incrementally.
• Include extended constraints, e.g., action empty must be feasible after action pluck has been executed.
• Include preferences, e.g., behavior Plucker-bot uses less energy than the Multi-bot.
