Agent Programming via Planning Programs

Giuseppe De Giacomo, Fabio Patrizi
Dipartimento di Informatica e Sistemistica
Sapienza Università di Roma, Rome, Italy
{degiacomo,patrizi}@dis.uniroma1.it

Sebastian Sardina
School of Computer Science and IT
RMIT University, Melbourne, Australia
[email protected]

May 12, 2009
Motivations & Objectives

Automated Planning: declarative ("goals-to-be"); flexible and "complete"; think in terms of goals; a one-shot problem; computationally infeasible.

Agent-oriented Programming: procedural ("goals-to-do"); could miss behavior solutions; requires specific, detailed solutions; long-term behavior ("act as you go"); reduced search space.

What do we look for? A way to build agents by:
• building on declarative goals;
• accounting for achievement and maintenance goal types;
• leveraging the temporal relations among goals;
• accounting for contingencies in the domain.
Agent Planning Programs

• Finite-state programs (including conditionals and loops).
• Nondeterministic transitions.
• Possibly non-terminating.
• Atomic instructions: requests to "achieve goal φ while maintaining goal ψ".
• Meant to run in a dynamic domain: the environment.

[Figure: a planning program with states t0, t1, t2; each transition is labelled with a goal, and the agent chooses which goal to pursue:
G1: achieve MyLoc(dept) while maintaining ¬Fuel(empty)
G2: achieve MyLoc(home) ∧ CarLoc(home) while maintaining ¬Fuel(empty)
G3: achieve MyLoc(pub)
G4: achieve MyLoc(pub) while maintaining ¬Fuel(empty)
G5: achieve MyLoc(home) while maintaining ¬Driving]
Environment for Planning Programs

Planning programs are executed in an environment, which is a nondeterministic planning domain.

State Propositions:
• CarLoc, MyLoc: {home, parking, dept, pub}
• Fuel: {empty, low, high}
• Strike: {true, false}

Operators:
goByCar(x) with x ∈ {home, parking, dept, pub}
  prec: MyLoc = CarLoc ∧ Fuel ≠ empty
  post: MyLoc = x; CarLoc = x;
        (when (Fuel high) (oneof (Fuel high) (Fuel low)));
        (when (Fuel low) (oneof (Fuel low) (Fuel empty)))

(The oneof clauses make goByCar a nondeterministic action!)
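The goByCar(x) operator above can be rendered as a successor function. This is a minimal Python sketch, not the paper's formalism: states are encoded as dicts over the slide's propositions, and the Strike proposition is omitted for brevity.

```python
# Illustrative encoding of the slide's goByCar(x) operator.
# A state is a dict over the propositions MyLoc, CarLoc, Fuel
# (Strike is omitted for brevity).

LOCS = {"home", "parking", "dept", "pub"}

def go_by_car(state, x):
    """Return the set of possible successor states of goByCar(x),
    or the empty set when the precondition fails."""
    assert x in LOCS
    # prec: MyLoc = CarLoc and Fuel != empty
    if state["MyLoc"] != state["CarLoc"] or state["Fuel"] == "empty":
        return set()
    # post: MyLoc = x; CarLoc = x; fuel may nondeterministically drop
    # one level (this is what the oneof clauses express).
    fuel_outcomes = {"high": ("high", "low"), "low": ("low", "empty")}[state["Fuel"]]
    return {frozenset({("MyLoc", x), ("CarLoc", x), ("Fuel", f)})
            for f in fuel_outcomes}
```

For instance, from a state where agent and car are at home with high fuel, goByCar(dept) yields two outcomes: fuel stays high or drops to low.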
A Realization of a Planning Program

[Figure: the realization as a finite-state controller whose states pair the agent's location with the goal being served: (home, G1(dept)), (dept, G2), (dept, G1(pub)), (home, G4), (pub, G5). Transitions are labelled with domain actions: walk(dept), walk(parking), walk(pub), walk(home), goByCar(parking), goByCar(home). Note: the agent shouldn't drive to the pub!]

G1(x): achieve MyLoc(x) while maintaining ¬Fuel(empty)
G2: achieve MyLoc(home) ∧ CarLoc(home) while maintaining ¬Fuel(empty)
G4: achieve MyLoc(pub)
G5: achieve MyLoc(home) while maintaining ¬Driving
Definition of a Planning Program Solution

Informally. (t, s) ∈ PLAN means that we can satisfy all of the agent's potential requests from its state t when the dynamic domain starts in state s.

Formally. A binary relation PLAN is a plan-based simulation relation iff (t, s) ∈ PLAN implies that, for every possible request t --"achieve φ while maintaining ψ"--> t′, there exist actions a1, a2, . . . , an such that:

• s --a1--> s1 --a2--> · · · --a(n-1)--> s(n-1) --an--> sn   (plan is executable)
• si |= ψ, for si = s, s1, . . . , s(n-1)   (maintenance goal is satisfied)
• sn |= φ   (achievement goal is satisfied)
• (t′, sn) ∈ PLAN   (simulation holds in the resulting state)

This is a coinductive definition (gfp) with calls to an inductive definition (lfp).

Planning Program Realization. A planning program T is realizable in dynamic domain D iff there is a plan-based simulation between the initial states of T and D.
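The gfp/lfp structure of the definition can be made concrete. Below is a toy Python sketch under simplifying assumptions: the state spaces are explicit and finite, and the domain's successors are collapsed into a single relation, so nondeterministic action outcomes are not treated adversarially as the full construction requires. All names are illustrative.

```python
from itertools import product

def exists_plan(s, phi, psi, t2, dom_succ, holds, plan):
    """Inner least fixpoint: search for a finite action sequence from s
    that keeps psi true at every state it passes through and ends in a
    state s_n with s_n |= phi and (t2, s_n) in the current relation."""
    frontier, seen = [s], {s}
    while frontier:
        cur = frontier.pop()
        if holds(cur, phi) and (t2, cur) in plan:
            return True            # cur can serve as the final state s_n
        if not holds(cur, psi):
            continue               # maintenance violated: cannot pass through
        for nxt in dom_succ(cur):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

def plan_based_simulation(prog_trans, dom_succ, dom_states, prog_states, holds):
    """Outer greatest fixpoint: start from all pairs and repeatedly drop
    any (t, s) for which some request t --(phi, psi)--> t2 has no plan."""
    plan = set(product(prog_states, dom_states))
    changed = True
    while changed:
        changed = False
        for (t, s) in list(plan):
            for (phi, psi, t2) in prog_trans.get(t, []):
                if not exists_plan(s, phi, psi, t2, dom_succ, holds, plan):
                    plan.discard((t, s))
                    changed = True
                    break
    return plan
```

For example, a one-state program that repeatedly requests reaching the last state of a three-state chain (with trivial maintenance) is realizable from every domain state, so the computed relation contains all pairs.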
Reduction to LTL Synthesis   [LTL synthesis: Pnueli & Rosner, 1989]

Propositional variables:
• PE: environment variables
• PS: system variables

Game:
• Environment: chooses from 2^PE
• System: chooses from 2^PS

Infinite play:
• pe0, pe1, pe2, . . .
• ps0, ps1, ps2, . . .

Infinite behavior: pe0 ∪ ps0, pe1 ∪ ps1, pe2 ∪ ps2, . . .
Specification: an LTL formula over PE ∪ PS
Win: behavior |= spec
Strategy: a function f : (2^PE)* → 2^PS
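The strategy type f : (2^PE)* → 2^PS can be illustrated directly. This small Python sketch (names and example spec are illustrative, not from the paper) feeds a finite prefix of environment choices to a strategy and builds the corresponding prefix of the behavior.

```python
def play(strategy, env_moves):
    """Build a finite prefix of a play: at each step the environment
    reveals a subset of PE, then the system answers with a subset of PS
    computed from the entire environment history; the behavior is the
    sequence of unions pe_i | ps_i."""
    history, behavior = [], []
    for pe in env_moves:
        history.append(pe)
        ps = strategy(tuple(history))    # f : (2^PE)* -> 2^PS
        behavior.append(pe | ps)
    return behavior

# A toy strategy over PE = {req}, PS = {grant} that grants exactly when
# the latest environment move contained a request; every resulting
# behavior satisfies the specification "always (req -> grant)".
grant_on_request = lambda hist: {"grant"} if "req" in hist[-1] else set()
```

For instance, `play(grant_on_request, [{"req"}, set(), {"req"}])` yields the behavior prefix `[{"req", "grant"}, set(), {"req", "grant"}]`.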
Encoding in LTL

LTL formula Φ to be realized/synthesized:

    Init ∧ □(Trans_D ∧ Trans_T) −→ □Fulfill ∧ □◇Last   (□◇Last: finish plans infinitely often)

1. Dynamic domain (Trans_D), e.g., the dynamics of walking:
   MyLoc(x) ∧ Close(x, y) ∧ walk(y) −→ ○MyLoc(y)
   CarLoc(x) ∧ walk(y) −→ ○CarLoc(x)

2. Planning program (Trans_T):
   t ∧ "achieve φ while maintaining ψ" ∧ ¬Last −→ ○(t ∧ "achieve φ while maintaining ψ")   (target request propagation)
   t ∧ "achieve φ while maintaining ψ" ∧ Last −→ ○t′   (target advance)

3. Fulfillment of goals (Fulfill):
   "achieve φ while maintaining ψ" ∧ Last −→ φ   (achievement goal is satisfied)
   "achieve φ while maintaining ψ" −→ ψ   (maintenance goal is respected)
GR(1) Formulas   [Piterman, Pnueli & Sa'ar, 2006]

• LTL realizability is 2EXPTIME-complete for general LTL formulas!
  (Note that satisfiability and validity for LTL are PSPACE-complete.)
• Several interesting LTL patterns have been studied.
• "Generalized Reactivity (1)" (GR(1)) formulas: ϕ_ass → ψ_req, of a special syntactic shape:
  ϕ_ass = Init ∧ □(Trans_D ∧ Trans_T);
  ψ_req = □Fulfill ∧ □◇Last.
  Variables to control: {a | a is a domain action} ∪ {Last}
• Good news:
  • Synthesis can be reduced to µ-calculus model checking of a game structure!
  • Can exploit symbolic model-checking techniques (OBDDs)!
  • Realizability is polynomial in the size of the formula and of the game structure.
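The fixpoint flavor of this kind of synthesis can be illustrated on an explicit game graph. The toy Python sketch below (not the authors' algorithm) solves a turn-based Büchi game: it computes the states from which the system can force the target set to be visited infinitely often, via the classical νZ.µY. fixpoint. This recurrence computation is the core of discharging a GR(1) requirement of the form □◇Last.

```python
def buchi_win(states, succ, sys_turn, target):
    """States from which the system can force visiting `target` infinitely
    often in a turn-based game: nu Z. mu Y. Cpre(Y) | (target & Cpre(Z))."""
    def cpre(S):
        # Controllable predecessors: at system nodes some successor must
        # lie in S; at environment nodes every successor must lie in S.
        return {q for q in states
                if (any(n in S for n in succ[q]) if sys_turn[q]
                    else bool(succ[q]) and all(n in S for n in succ[q]))}
    Z = set(states)
    while True:
        Y = set()
        while True:                 # inner least fixpoint
            newY = cpre(Y) | (target & cpre(Z))
            if newY == Y:
                break
            Y = newY
        if Y == Z:                  # outer greatest fixpoint converged
            return Z
        Z = Y
```

In a four-state example where the system can cycle through the target but the environment can trap the play in one node, exactly the non-trappable states are winning.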
Results

• Agent planning programs: programming with declarative goals.
• Can be reduced to LTL synthesis of Generalized Reactivity GR(1) formulas.
• Can be solved by model checking game structures.
• Polynomial in the number of states of the planning domain.
• EXPTIME in the size of the representation, as for model checking.
• Can be practically implemented in model-checking-based LTL synthesis tools such as TLV.
Agent Planning Programs with Predefined Components

Idea. Actions can only be carried out by using the available actuators (e.g., arm, robot, video camera, web browser, etc.).

[Figure: an actuator as a transition system with states s0, s1, s2 and actions auth, open, close, lock, unlock, logout.]

How. By compiling away the actuators into the dynamic domain. See paper!
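The "compiling away" step can be sketched as a product construction. This hypothetical Python sketch (the actual construction is in the paper) pairs a domain state with an actuator state; an action is executable in the compiled domain only when the actuator's current state has a transition labelled with it. The actuator and domain in the example are illustrative.

```python
def compile_actuator(dom_succ, act_trans):
    """Product of the dynamic domain with a deterministic actuator
    transition system, given as a dict (actuator_state, action) -> next
    actuator state. Returns a successor function over compiled states
    of the form (domain_state, actuator_state)."""
    def succ(state, action):
        d, q = state
        if (q, action) not in act_trans:
            return set()                   # actuator cannot do this now
        q2 = act_trans[(q, action)]
        return {(d2, q2) for d2 in dom_succ(d, action)}
    return succ

# Toy actuator loosely inspired by the slide's figure (the transitions
# are illustrative): authenticate before operating, and optionally log out.
ACT = {("s0", "auth"): "s1", ("s1", "open"): "s1", ("s1", "logout"): "s0"}
```

With a trivial domain that just counts steps, `open` is blocked before `auth` and allowed after it, exactly because the actuator state is threaded through the compiled domain.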
Conclusions

• Solving planning programs is "planning for routines".
• Can be done by LTL GR(1) synthesis.
• Same complexity as conditional planning under nondeterminism.

Future work:
• Environments/actuators with partial observability.
• Integrate planning with control knowledge (e.g., HTN planning).