Agent Programming via Planning Programs

Giuseppe De Giacomo, Fabio Patrizi
Dipartimento di Informatica e Sistemistica
Sapienza Università di Roma, Rome, Italy
{degiacomo,patrizi}@dis.uniroma1.it

Sebastian Sardina
School of Computer Science and IT
RMIT University, Melbourne, Australia
[email protected]

May 12, 2009
Motivations & Objectives

Automated Planning: declarative ("goals-to-be"); flexible and "complete"; think in terms of goals; a one-shot problem; computationally infeasible.

Agent-oriented Programming: procedural ("goals-to-do"); could miss behavior solutions; requires specific, detailed solutions; long-term behavior ("act as you go"); reduced search space.

What do we look for? A way to build agents by:
• building on declarative goals;
• accounting for achievement and maintenance goal types;
• leveraging the temporal relations among goals;
• accounting for contingencies in the domain.
Agent Planning Programs

• Finite-state programs (including conditionals and loops).
• Nondeterministic transitions.
• Possibly non-terminating.
• Atomic instructions: requests to "achieve goal φ while maintaining goal ψ".
• Meant to run in a dynamic domain: the environment.

[Figure: a planning program with states t0, t1, t2; each transition is labelled with a goal, and the agent chooses which goal to pursue:
G1: achieve MyLoc(dept) while maintaining ¬Fuel(empty)
G2: achieve MyLoc(home) ∧ CarLoc(home) while maintaining ¬Fuel(empty)
G3: achieve MyLoc(pub)
G4: achieve MyLoc(pub) while maintaining ¬Fuel(empty)
G5: achieve MyLoc(home) while maintaining ¬Driving]
Environment for Planning Programs

Planning programs are executed in an environment, which is a nondeterministic planning domain.

State Propositions:
• CarLoc, MyLoc: {home, parking, dept, pub}
• Fuel: {empty, low, high}
• Strike: {true, false}

Operators:
goByCar(x) with x ∈ {home, parking, dept, pub}
  prec: MyLoc = CarLoc ∧ Fuel ≠ empty
  post: MyLoc = x; CarLoc = x;
        (when (Fuel high) (oneof (Fuel high) (Fuel low)));
        (when (Fuel low) (oneof (Fuel low) (Fuel empty)))

(The oneof clauses make goByCar a nondeterministic action!)
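The goByCar(x) operator above can be rendered as a successor function. This is a minimal Python sketch, not the paper's formalism: states are encoded as dicts over the slide's propositions, and the Strike proposition is omitted for brevity.

```python
# Illustrative encoding of the slide's goByCar(x) operator.
# A state is a dict over the propositions MyLoc, CarLoc, Fuel
# (Strike is omitted for brevity).

LOCS = {"home", "parking", "dept", "pub"}

def go_by_car(state, x):
    """Return the set of possible successor states of goByCar(x),
    or the empty set when the precondition fails."""
    assert x in LOCS
    # prec: MyLoc = CarLoc and Fuel != empty
    if state["MyLoc"] != state["CarLoc"] or state["Fuel"] == "empty":
        return set()
    # post: MyLoc = x; CarLoc = x; fuel may nondeterministically drop
    # one level (this is what the oneof clauses express).
    fuel_outcomes = {"high": ("high", "low"), "low": ("low", "empty")}[state["Fuel"]]
    return {frozenset({("MyLoc", x), ("CarLoc", x), ("Fuel", f)})
            for f in fuel_outcomes}
```

For instance, from a state where agent and car are at home with high fuel, goByCar(dept) yields two outcomes: fuel stays high or drops to low.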
A Realization of a Planning Program

[Figure: the realization as a finite-state controller whose states pair the agent's location with the goal being served: (home, G1(dept)), (dept, G2), (dept, G1(pub)), (home, G4), (pub, G5). Transitions are labelled with domain actions: walk(dept), walk(parking), walk(pub), walk(home), goByCar(parking), goByCar(home). Note: the agent shouldn't drive to the pub!]

G1(x): achieve MyLoc(x) while maintaining ¬Fuel(empty)
G2: achieve MyLoc(home) ∧ CarLoc(home) while maintaining ¬Fuel(empty)
G4: achieve MyLoc(pub)
G5: achieve MyLoc(home) while maintaining ¬Driving
Definition of a Planning Program Solution

Informally. (t, s) ∈ PLAN means that we can satisfy all of the agent's potential requests from its state t when the dynamic domain starts in state s.

Formally. A binary relation PLAN is a plan-based simulation relation iff (t, s) ∈ PLAN implies that, for every possible request t --"achieve φ while maintaining ψ"--> t′, there exist actions a1, a2, . . . , an such that:

• s --a1--> s1 --a2--> · · · --a(n-1)--> s(n-1) --an--> sn   (plan is executable)
• si |= ψ, for si = s, s1, . . . , s(n-1)   (maintenance goal is satisfied)
• sn |= φ   (achievement goal is satisfied)
• (t′, sn) ∈ PLAN   (simulation holds in the resulting state)

This is a coinductive definition (gfp) with calls to an inductive definition (lfp).

Planning Program Realization. A planning program T is realizable in dynamic domain D iff there is a plan-based simulation between the initial states of T and D.
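The gfp/lfp structure of the definition can be made concrete. Below is a toy Python sketch under simplifying assumptions: the state spaces are explicit and finite, and the domain's successors are collapsed into a single relation, so nondeterministic action outcomes are not treated adversarially as the full construction requires. All names are illustrative.

```python
from itertools import product

def exists_plan(s, phi, psi, t2, dom_succ, holds, plan):
    """Inner least fixpoint: search for a finite action sequence from s
    that keeps psi true at every state it passes through and ends in a
    state s_n with s_n |= phi and (t2, s_n) in the current relation."""
    frontier, seen = [s], {s}
    while frontier:
        cur = frontier.pop()
        if holds(cur, phi) and (t2, cur) in plan:
            return True            # cur can serve as the final state s_n
        if not holds(cur, psi):
            continue               # maintenance violated: cannot pass through
        for nxt in dom_succ(cur):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

def plan_based_simulation(prog_trans, dom_succ, dom_states, prog_states, holds):
    """Outer greatest fixpoint: start from all pairs and repeatedly drop
    any (t, s) for which some request t --(phi, psi)--> t2 has no plan."""
    plan = set(product(prog_states, dom_states))
    changed = True
    while changed:
        changed = False
        for (t, s) in list(plan):
            for (phi, psi, t2) in prog_trans.get(t, []):
                if not exists_plan(s, phi, psi, t2, dom_succ, holds, plan):
                    plan.discard((t, s))
                    changed = True
                    break
    return plan
```

For example, a one-state program that repeatedly requests reaching the last state of a three-state chain (with trivial maintenance) is realizable from every domain state, so the computed relation contains all pairs.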
Reduction to LTL Synthesis   [LTL synthesis: Pnueli & Rosner, 1989]

Propositional variables:
• PE: environment variables
• PS: system variables

Game:
• Environment: chooses from 2^PE
• System: chooses from 2^PS

Infinite play:
• pe0, pe1, pe2, . . .
• ps0, ps1, ps2, . . .

Infinite behavior: pe0 ∪ ps0, pe1 ∪ ps1, pe2 ∪ ps2, . . .
Specification: an LTL formula over PE ∪ PS
Win: behavior |= spec
Strategy: a function f : (2^PE)* → 2^PS
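The strategy type f : (2^PE)* → 2^PS can be illustrated directly. This small Python sketch (names and example spec are illustrative, not from the paper) feeds a finite prefix of environment choices to a strategy and builds the corresponding prefix of the behavior.

```python
def play(strategy, env_moves):
    """Build a finite prefix of a play: at each step the environment
    reveals a subset of PE, then the system answers with a subset of PS
    computed from the entire environment history; the behavior is the
    sequence of unions pe_i | ps_i."""
    history, behavior = [], []
    for pe in env_moves:
        history.append(pe)
        ps = strategy(tuple(history))    # f : (2^PE)* -> 2^PS
        behavior.append(pe | ps)
    return behavior

# A toy strategy over PE = {req}, PS = {grant} that grants exactly when
# the latest environment move contained a request; every resulting
# behavior satisfies the specification "always (req -> grant)".
grant_on_request = lambda hist: {"grant"} if "req" in hist[-1] else set()
```

For instance, `play(grant_on_request, [{"req"}, set(), {"req"}])` yields the behavior prefix `[{"req", "grant"}, set(), {"req", "grant"}]`.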
Encoding in LTL

LTL formula Φ to be realized/synthesized:

    Init ∧ □(Trans_D ∧ Trans_T) −→ □Fulfill ∧ □◇Last   (□◇Last: finish plans infinitely often)

1. Dynamic domain (Trans_D), e.g., the dynamics of walking:
   MyLoc(x) ∧ Close(x, y) ∧ walk(y) −→ ○MyLoc(y)
   CarLoc(x) ∧ walk(y) −→ ○CarLoc(x)

2. Planning program (Trans_T):
   t ∧ "achieve φ while maintaining ψ" ∧ ¬Last −→ ○(t ∧ "achieve φ while maintaining ψ")   (target request propagation)
   t ∧ "achieve φ while maintaining ψ" ∧ Last −→ ○t′   (target advance)

3. Fulfillment of goals (Fulfill):
   "achieve φ while maintaining ψ" ∧ Last −→ φ   (achievement goal is satisfied)
   "achieve φ while maintaining ψ" −→ ψ   (maintenance goal is respected)
GR(1) Formulas   [Piterman, Pnueli & Sa'ar, 2006]

• LTL realizability is 2EXPTIME-complete for general LTL formulas!
  (Note that satisfiability and validity for LTL are PSPACE-complete.)
• Several interesting LTL patterns have been studied.
• "Generalized Reactivity (1)" (GR(1)) formulas: ϕ_ass → ψ_req, of a special syntactic shape:
  ϕ_ass = Init ∧ □(Trans_D ∧ Trans_T);
  ψ_req = □Fulfill ∧ □◇Last.
  Variables to control: {a | a is a domain action} ∪ {Last}
• Good news:
  • Synthesis can be reduced to µ-calculus model checking of a game structure!
  • Can exploit symbolic model-checking techniques (OBDDs)!
  • Realizability is polynomial in the size of the formula and of the game structure.
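The fixpoint flavor of this kind of synthesis can be illustrated on an explicit game graph. The toy Python sketch below (not the authors' algorithm) solves a turn-based Büchi game: it computes the states from which the system can force the target set to be visited infinitely often, via the classical νZ.µY. fixpoint. This recurrence computation is the core of discharging a GR(1) requirement of the form □◇Last.

```python
def buchi_win(states, succ, sys_turn, target):
    """States from which the system can force visiting `target` infinitely
    often in a turn-based game: nu Z. mu Y. Cpre(Y) | (target & Cpre(Z))."""
    def cpre(S):
        # Controllable predecessors: at system nodes some successor must
        # lie in S; at environment nodes every successor must lie in S.
        return {q for q in states
                if (any(n in S for n in succ[q]) if sys_turn[q]
                    else bool(succ[q]) and all(n in S for n in succ[q]))}
    Z = set(states)
    while True:
        Y = set()
        while True:                 # inner least fixpoint
            newY = cpre(Y) | (target & cpre(Z))
            if newY == Y:
                break
            Y = newY
        if Y == Z:                  # outer greatest fixpoint converged
            return Z
        Z = Y
```

In a four-state example where the system can cycle through the target but the environment can trap the play in one node, exactly the non-trappable states are winning.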
Results

• Agent planning programs: programming with declarative goals.
• Can be reduced to LTL synthesis of Generalized Reactivity GR(1) formulas.
• Can be solved by model checking game structures.
• Polynomial in the number of states of the planning domain.
• EXPTIME in the size of the representation, as for model checking.
• Can be practically implemented in model-checking-based LTL synthesis tools such as TLV.
Agent Planning Programs with Predefined Components

Idea. Actions can only be carried out by using the available actuators (e.g., arm, robot, video camera, web browser, etc.).

[Figure: an actuator as a transition system with states s0, s1, s2 and actions auth, open, close, lock, unlock, logout.]

How. By compiling away the actuators into the dynamic domain. See paper!
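The "compiling away" step can be sketched as a product construction. This hypothetical Python sketch (the actual construction is in the paper) pairs a domain state with an actuator state; an action is executable in the compiled domain only when the actuator's current state has a transition labelled with it. The actuator and domain in the example are illustrative.

```python
def compile_actuator(dom_succ, act_trans):
    """Product of the dynamic domain with a deterministic actuator
    transition system, given as a dict (actuator_state, action) -> next
    actuator state. Returns a successor function over compiled states
    of the form (domain_state, actuator_state)."""
    def succ(state, action):
        d, q = state
        if (q, action) not in act_trans:
            return set()                   # actuator cannot do this now
        q2 = act_trans[(q, action)]
        return {(d2, q2) for d2 in dom_succ(d, action)}
    return succ

# Toy actuator loosely inspired by the slide's figure (the transitions
# are illustrative): authenticate before operating, and optionally log out.
ACT = {("s0", "auth"): "s1", ("s1", "open"): "s1", ("s1", "logout"): "s0"}
```

With a trivial domain that just counts steps, `open` is blocked before `auth` and allowed after it, exactly because the actuator state is threaded through the compiled domain.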
Conclusions

• Solving planning programs is "planning for routines".
• Can be done by LTL GR(1) synthesis.
• Same complexity as conditional planning under nondeterminism.

Future work:
• Environments/actuators with partial observability.
• Integrate planning with control knowledge (e.g., HTN planning).