Active Behavior Recognition in Beyond Visual Range Air Combat

Ron Alford (1), Hayley Borck (2), Justin Karneeb (3), David W. Aha (4)

1 ASEE/NRL Postdoctoral Fellow
2 Knexus Research Corporation
3 Knexus Research Corporation
4 U.S. Naval Research Laboratory

May 31st, 2015 Alford, Borck, Karneeb and Aha

Active Behavior Recognition in BVR Combat

May 31st, 2015

1 / 17

Beyond Visual Range Air Combat

  100km
  $300 million planes
  $10 million sensor package
  $1 million missiles

Adding UAV wingmen to the mix

The Promise:
  More platforms per pilot
  Better strategies
  Reduced pilot risk
  Retain (most) human judgment

The Caveat:
  Pilot is already cognitively burdened
  UAV needs to respond (or act) intelligently

Source: Dassault

Some obstacles to intelligent behavior

Each obstacle, followed by the simplification used in its place:
  Partial-observability: Full-observability
  Continuous action space: Discrete action space
  Multi-agent (non-zero-sum): Single-agent with fixed (unknown) opponent

Behavior Recognition (Assumptions)

Assumptions:
  Finite set of predictive agent models
    Used in training recognizer
    Used to predict future states
  Agents use fixed policies
    React to history of observations
    Neither rational nor optimal

Behavior Recognition (generic):
  Inputs:
    Agent models
    History of observations
  Output:
    A probability distribution over the models.

[Figure panels: Aggressive, Safety-Aggressive, Passive, Oblivious]

Acting depends on behavior recognition

Almost all actions in air combat depend on (or are relative to) other agents.

[Figure panels: Safety-Aggressive vs. Aggressive; Safety-Aggressive vs. Oblivious; Safety-Aggressive vs. Safety-Aggressive]

Behavior recognition depends on acting

Our actions determine what we observe.

[Figure panels: Fly 90 (Safety-Aggressive, Aggressive); Fly 0 (Aggressive, Oblivious)]

Case-based Behavior Recognition

The rough algorithm:

During training:
  Run a number of randomized trials
  Project states to a feature space
  Record short histories of features and their associated models as cases

During recognition:
  Retrieve cases with similar histories
  Treat the relative frequency of agent models as a probability distribution

[Figure labels: φ, θ, d]
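The training and recognition steps above can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the class name `CaseBase`, the Euclidean distance over flattened feature histories, and the `k`-nearest retrieval are all assumptions, since the slide does not say how similarity or retrieval is defined.

```python
from collections import Counter
import math


class CaseBase:
    """Hypothetical case base of (feature history, agent model) pairs."""

    def __init__(self, models):
        self.models = list(models)
        self.cases = []  # list of (history, model) tuples

    def add_case(self, history, model):
        # history: sequence of equal-length feature vectors from one short window
        self.cases.append((tuple(map(tuple, history)), model))

    @staticmethod
    def distance(h1, h2):
        # Assumed similarity measure: Euclidean distance over flattened histories.
        flat1 = [x for step in h1 for x in step]
        flat2 = [x for step in h2 for x in step]
        return math.dist(flat1, flat2)

    def recognize(self, history, k=5):
        # Retrieve the k most similar cases, then use the relative frequency
        # of their agent models as a probability distribution.
        if not self.cases:
            return {m: 1.0 / len(self.models) for m in self.models}
        query = tuple(map(tuple, history))
        nearest = sorted(self.cases, key=lambda c: self.distance(c[0], query))[:k]
        counts = Counter(model for _, model in nearest)
        total = sum(counts.values())
        return {m: counts[m] / total for m in self.models}


# Toy usage with made-up two-dimensional features.
cb = CaseBase(["A", "SA", "Ob"])
cb.add_case([(0.0, 0.0)], "A")
cb.add_case([(0.1, 0.0)], "A")
cb.add_case([(5.0, 5.0)], "Ob")
print(cb.recognize([(0.0, 0.1)], k=3))
```

With an empty case base the sketch falls back to a uniform distribution, which matches the 0.33 / 0.33 / 0.33 starting belief shown on later slides.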

How acting influences Case-based Behavior Recognition

Legend: A = Aggressive, SA = Safety-Aggressive, Ob = Oblivious

Belief before acting:
  A 0.33   SA 0.33   Ob 0.33

Belief after acting (two possible outcomes shown):
  A 0.48   SA 0.48   Ob 0.04
  A 0.03   SA 0.03   Ob 0.94

How acting influences Case-based Behavior Recognition

Legend: A = Aggressive, SA = Safety-Aggressive, Ob = Oblivious

Belief before acting:
  A 0.33   SA 0.33   Ob 0.33

Belief after acting (two possible outcomes shown):
  A 0.03   SA 0.94   Ob 0.03
  A 0.48   SA 0.04   Ob 0.48

How acting influences Case-based Behavior Recognition

Acting and Behavior Recognition:
  Head-on flight disambiguates Safety-Aggressive
  Perpendicular flight disambiguates Oblivious
  Need both to make a confident prediction

Similar to a POMDP:
  Move with uncertainty about other agents’ behaviors
  Our actions give evidence about that behavior
  POMDPs are hard
    Approximate as an MDP
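A small worked example shows why a single heading is not enough. It reuses the belief vectors from the two preceding slides. Which heading produced which pair is inferred from the bullets above rather than stated on the slides, and the helper `top_two_gap` is a hypothetical name introduced for illustration.

```python
# Beliefs are (A, SA, Ob) triples copied from the preceding slides.
# The mapping from heading to outcome pair is an inference, not a stated fact.
OUTCOMES = {
    "perpendicular": [(0.48, 0.48, 0.04), (0.03, 0.03, 0.94)],
    "head-on": [(0.03, 0.94, 0.03), (0.48, 0.04, 0.48)],
}


def top_two_gap(belief):
    """Gap between the most likely and second most likely model."""
    first, second = sorted(belief, reverse=True)[:2]
    return first - second


for heading, posteriors in OUTCOMES.items():
    gaps = [round(top_two_gap(b), 2) for b in posteriors]
    print(heading, gaps)
```

For each heading, one of the two outcomes leaves a zero gap between the top two models, so a second, different heading is needed to break the tie.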

Planning Domain

Plan over histories of observations
  Observations divided into 60-second epochs

Actions:
  Four discrete actions: Fly 0°, Fly 60°, Fly 90°, Fly 180°
  Four possible outcomes (agent models)
    Probability dependent on behavior recognizer and current history
  Use flight simulator (AFSIM) applying action to a history

Purpose:
  Maximize a utility function over finite horizon
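The planning domain above can be sketched as a small finite-horizon search over histories. This is a sketch under stated assumptions, not the planner used in the talk. It uses exhaustive expectimax rather than sample-based search, and it evaluates utility only at the end of the horizon. The callables `toy_recognize` and `toy_simulate` are hypothetical stand-ins for the behavior recognizer and for AFSIM, and the toy uses three agent models instead of four.

```python
ACTIONS = (0, 60, 90, 180)  # headings in degrees, as listed on the slide
MODELS = ("A", "SA", "Ob")  # toy subset of agent models, for illustration only


def plan(history, horizon, recognize, simulate, utility):
    """Return (expected utility, best first action) by exhaustive expectimax.

    recognize(history) -> dict mapping agent model to probability
    simulate(history, action, model) -> history extended by one epoch
    utility(belief) -> float, evaluated when the horizon is exhausted
    """
    belief = recognize(history)
    if horizon == 0:
        return utility(belief), None
    best_value, best_action = float("-inf"), None
    for action in ACTIONS:
        value = 0.0
        # Each agent model is one possible outcome, weighted by the current belief.
        for model, prob in belief.items():
            if prob == 0.0:
                continue
            next_history = simulate(history, action, model)
            value += prob * plan(next_history, horizon - 1,
                                 recognize, simulate, utility)[0]
        if value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


# Toy stand-ins (hypothetical; not AFSIM and not the case-based recognizer).
def toy_simulate(history, action, model):
    return history + ((action, model),)


def toy_recognize(history):
    # Assumed toy rule: heading 0 separates SA, heading 90 separates Ob.
    candidates = set(MODELS)
    for action, model in history:
        if action == 0:
            candidates &= {"SA"} if model == "SA" else {"A", "Ob"}
        elif action == 90:
            candidates &= {"Ob"} if model == "Ob" else {"A", "SA"}
    return {m: (1.0 / len(candidates) if m in candidates else 0.0) for m in MODELS}


def top_confidence(belief):
    return max(belief.values())


value, action = plan((), 2, toy_recognize, toy_simulate, top_confidence)
print(value, action)
```

Exhaustive search is practical here only because the toy simulator is instantaneous; with a real flight simulator each call to `simulate` is a costly roll-out.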

Sample-based planning (PROST)

[Figure: search tree. Nodes are belief vectors over (A, SA, Ob); edges are actions (Fly 0, Fly 90) followed by outcomes labeled with an agent model and its probability.
 Node beliefs shown: (0.33, 0.33, 0.33); (0.48, 0.04, 0.48); (0.02, 0.02, 0.96).
 Outcome labels shown: A,0.33; Ob,0.33; A,0.48; Ob,0.48.]

Meta-goal reasoning

We require a utility function!

Possible mission success functions:
  Number of “kills”
  Air space denied
  Reconnaissance
  Diversion

Road blocks:
  Roll-outs (simulation) are slow
  Success functions are often discontinuous

Instead: Average confidence in most likely model
  Confidence is generally smooth
  Emphasizes the role of planning in resolving recognition ambiguity
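The substitute utility can be written down in a few lines. This sketch assumes the average is taken over the beliefs produced at successive epochs of one roll-out, which the slide does not specify. The function name `average_confidence` is hypothetical, and the sample beliefs are the illustrative values from the planning slide.

```python
def average_confidence(beliefs):
    """Mean probability of the most likely model across a sequence of beliefs.

    beliefs: non-empty list of dicts mapping agent model to probability.
    Averaging over epochs is an assumption; the slide only names the quantity.
    """
    if not beliefs:
        raise ValueError("need at least one belief")
    return sum(max(b.values()) for b in beliefs) / len(beliefs)


# Illustrative beliefs taken from the earlier search-tree slide.
rollout = [
    {"A": 0.33, "SA": 0.33, "Ob": 0.33},
    {"A": 0.48, "SA": 0.04, "Ob": 0.48},
    {"A": 0.02, "SA": 0.02, "Ob": 0.96},
]
print(round(average_confidence(rollout), 2))
```

Unlike a kill count, this value changes gradually as evidence accumulates, which is the smoothness property the slide points to.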

Experimental Setup

Four different observer behaviors running the behavior recognizer:
  Safety-Aggressive
  Passive
  Random
  Active behavior recognition planner

Evaluation metric: Confidence in correct behavior over time.

[Figure panels: Safety-Aggressive, Passive, Random]
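The evaluation metric can be sketched as a trace of the probability assigned to the true model, plus a helper that reports when that trace first exceeds a threshold. Both function names are hypothetical, the sampling interval is a parameter because the slides do not state how often the recognizer was queried, and the sample beliefs are the illustrative values from the planning slide rather than experimental data.

```python
def correct_confidence_trace(beliefs, true_model):
    """Probability assigned to the true agent model at each time step."""
    return [b[true_model] for b in beliefs]


def first_time_above(trace, threshold, step_seconds):
    """Simulation time (seconds) at which the trace first exceeds threshold.

    Returns None if the threshold is never exceeded.
    """
    for i, p in enumerate(trace):
        if p > threshold:
            return i * step_seconds
    return None


# Illustrative beliefs only (from the search-tree slide), not measured results.
beliefs = [
    {"A": 0.33, "SA": 0.33, "Ob": 0.33},
    {"A": 0.48, "SA": 0.04, "Ob": 0.48},
    {"A": 0.02, "SA": 0.02, "Ob": 0.96},
]
trace = correct_confidence_trace(beliefs, "Ob")
print(trace, first_time_above(trace, 0.9, step_seconds=60))
```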

Recognition Results

[Figure: Behavior Probability Over Time - All Behaviors. x-axis: Time in Simulation (Seconds), 40 to 480; y-axis: Behavior Probability, 0.4 to 1. Series: Active Planner, Random Baseline, Passive Baseline, Safety Aggression Baseline]

  Both Safety-Aggressive and Passive fail to disambiguate between two behaviors
  Random eventually distinguishes between all behaviors
  Planning gets good (>90%) recognition scores faster

Conclusion / Future Work

Behavior Recognition and Acting:
  Probabilistic recognition pairs well with probabilistic planning
  Need faster roll-outs to pursue mission success
  Discrete states, actions, and policies

  Game theoretic play (regret minimization)
  When do we need a separate behavior recognition component?