Control of Human-Machine Interaction for Wide Area Search Munitions in the Presence of Target Uncertainty
Pia E. K. Berg-Yuen (a), Siddhartha S. Mehta (b), Eduardo L. Pasiliao (a), Robert A. Murphey (a)
(a) Munitions Directorate, Air Force Research Laboratory, Eglin AFB, FL-32542, USA
(b) Department of Mechanical and Aerospace Engineering, University of Florida, Shalimar, FL-32579, USA
Introduction
The defense community envisions enhanced capabilities through a greater degree of munition autonomy and the ability to strike multiple targets simultaneously from a single air platform in dynamic environments. One such complex system, the Wide Area Search Munition (WASM), combines the features of an unmanned aerial vehicle with those of traditional munition systems. Bainbridge, in “Ironies of Automation,” suggests that as the complexity of a control system increases, the human operator becomes more crucial, while at the same time such complex systems can place a high demand on the operator’s cognitive load. Automation reliability has also been shown to have a widespread effect on operator attitudes (e.g., trust) and behavior. Unreliable automation combined with high cognitive load has been observed to degrade overall system performance significantly. Optimal system performance therefore requires that the strengths and limitations of both the human operator and the automation be taken into account when designing a control system.

[Figure: Pilot interacting with the Wide Area Search Munitions experimental station at Wright-Patterson AFB, Ohio.]
Controller Development
The control objective is to assist the operator in decision making so as to improve the performance of the closed-loop HMI system by minimizing the total mission cost and the number of interactions (mission length). The objective is achieved in two stages: 1) developing an optimal controller based on this performance criterion, and 2) tracking operator actions along the optimal control sequence obtained in stage 1 at each discrete event time $k$. A receding horizon control problem is formulated to determine an optimal control sequence over a finite horizon of length $N$ for the $i$th process in state $s_i$, with the terminal state $s_i^t$ determined using SPRT-based target estimation. A new optimal control sequence is computed from each evolved state $s_i^+$ to $s_i^t$, thereby compensating for exogenous disturbances such as operator errors and high cognitive load. A cost functional $J_i$, comprising the cost of action $c_i(s_i^+, s_i, a_i)$ and the number of interactions for each process, is defined as

$$J_i \triangleq \alpha \lambda_i + \sum_{k=1}^{N} \beta\, c_i\big(s_i^+(k), s_i(k), a_i(k)\big)$$

subject to

$$s_i(1) = s_i, \quad s_i(N+1) = s_i^t, \quad s_i^+(k),\, s_i(k) \in S, \quad a_i(k) \in A.$$
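As a rough illustration of the receding-horizon step, the sketch below enumerates every action sequence of length $N$ on a toy deterministic process and returns the first action of the sequence that minimizes $J_i$. The transition table, the costs, the two-letter action set, and the reading of $\lambda_i$ as the number of steps taken before the terminal state is reached are all assumptions made for illustration; none of them come from the experiments. Sequences that do not end in the terminal state are discarded to respect the terminal constraint.

```python
from itertools import product

# Hypothetical toy process: states 0..3, terminal state 3.
# TRANSITIONS[state][action] -> (next_state, cost of action)
TRANSITIONS = {
    0: {"a": (1, 2.0), "b": (2, 6.0)},
    1: {"a": (2, 2.0), "b": (3, 6.0)},
    2: {"a": (3, 1.0), "b": (3, 1.0)},
    3: {"a": (3, 0.0), "b": (3, 0.0)},  # absorbing terminal state
}


def sequence_cost(state, actions, terminal, alpha, beta):
    """Return J = alpha*lambda + sum(beta*cost), or None if terminal is missed."""
    total, interactions = 0.0, 0
    for action in actions:
        if state == terminal:
            break  # terminal reached: no further interaction is counted
        state, cost = TRANSITIONS[state][action]
        total += beta * cost
        interactions += 1
    return alpha * interactions + total if state == terminal else None


def receding_horizon_action(state, terminal, horizon, alpha=1.0, beta=1.0):
    """Return (minimal cost, first action) over the finite horizon, or None."""
    best = None
    for actions in product("ab", repeat=horizon):
        cost = sequence_cost(state, actions, terminal, alpha, beta)
        if cost is not None and (best is None or cost < best[0]):
            best = (cost, actions[0])
    return best
```

With these toy numbers, raising `alpha` from 1 to 5 changes the first action from `"a"` to `"b"`, because the cheaper three-step route is penalized more heavily for its extra interaction. In a receding-horizon loop the function would be called again from each evolved state.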
Target Uncertainty
The classified target type is uncertain, which causes false alarms and missed targets and elevates the risk of collateral damage. Target uncertainty results from the limited efficiency of the pattern classification and image processing methods that WASMs use to classify a target from captured images. This uncertainty poses a challenge in determining an optimal action policy for a given Markov decision process (MDP), since both the cost of action and the terminal state are functions of the target type. We propose a novel sequential probability ratio test (SPRT) that adaptively estimates the target type from the actions taken by the operator with respect to the initial classification provided by the automated system. The estimates are obtained by sequential hypothesis testing in which the null and alternative hypotheses are functions of the classification type I and type II errors. The proposed method thereby exploits superior human cognitive abilities to detect and rectify target uncertainty caused by the machine.
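For readers unfamiliar with sequential testing, the following is a minimal sketch of the classical Wald SPRT on binary observations, not the novel variant proposed here. It assumes each operator action is reduced to a 0/1 observation (1 meaning the operator disagrees with the automated classification), and that the disagreement probabilities `p0` and `p1` under the two hypotheses are known; these names and the encoding are hypothetical.

```python
import math


def sprt(observations, p0, p1, type1=0.05, type2=0.05):
    """Classical Wald SPRT on binary observations.

    H0: P(observation == 1) = p0; H1: P(observation == 1) = p1.
    type1 and type2 are the tolerated type I and type II error rates.
    Returns (decision, number of observations used).
    """
    upper = math.log((1 - type2) / type1)  # accept H1 at or above this
    lower = math.log(type2 / (1 - type1))  # accept H0 at or below this
    llr = 0.0  # cumulative log-likelihood ratio
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", len(observations)
```

The test stops as soon as the cumulative log-likelihood ratio leaves the band between the two thresholds, so the number of operator actions needed adapts to how informative they are.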
Objective & Method
The objective of the presented work is to develop a control architecture for human-machine interaction (HMI) systems that takes into account not only real-time assessment of the operator's cognitive workload but also the uncertainties associated with automation reliability. Specifically, we present a discrete-time Markovian process model and develop an event-based switching controller with adaptive target estimation to compensate for automation errors.
In the cost functional $J_i$, $\lambda_i$ is the number of interactions required to reach $s_i^t$, and $\alpha, \beta \in \mathbb{R}$ are positive weights. The optimal action at time $k$ is the first step of the control sequence that minimizes the cost functional, i.e., $u^*(k) = a_i^*(1)$. The controller $a_i^*(1)$ provides an action that minimizes $J_i$ at each time $k$; however, the operator action $a_i(k)$ may not be optimal. We therefore propose a tracking controller that keeps the operator action sequence along the desired optimal control sequence. The tracking error is defined as $e(k) \triangleq a_i(k) - a_i^*(1)$; when $e(k) \neq 0$, the operator is notified with a corrective action, which the operator has the authority to accept or reject. Under high cognitive workload, i.e., when the rate of change of the cognitive state indicator satisfies $\dot{\mu}(k) \geq \phi$, control authority is switched to the machine and the optimal control $u^*(k)$ is applied until the operator's cognitive load decreases such that $\dot{\mu}(k) < \phi$.
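The switching logic described above can be summarized in a few lines. This is only a sketch under stated assumptions: actions are encoded as numbers, `csi_rate` stands in for $\dot{\mu}(k)$, `threshold` for $\phi$, and `accept_correction` is a hypothetical flag recording whether the operator accepted the suggested corrective action.

```python
def hmi_control(operator_action, optimal_action, csi_rate, threshold,
                accept_correction):
    """Return (applied_action, authority) for one discrete event time k."""
    if csi_rate >= threshold:
        # High cognitive load: control authority switches to the machine.
        return optimal_action, "machine"
    error = operator_action - optimal_action  # tracking error e(k)
    if error != 0 and accept_correction:
        # The operator was notified and accepted the corrective action.
        return optimal_action, "operator"
    # The action was already optimal, or the operator rejected the correction.
    return operator_action, "operator"
```

Note that the operator keeps the final say whenever the load is below the threshold; the machine overrides only while the load condition holds.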
Validation & Conclusions
Simulation results based on 36 experiments involving 12 different subjects are presented to verify the performance of the proposed controller. The psychophysiological data collected during each trial consisted of ECG, EEG (F7, Fz, Pz, T5, and O2), and EOG channels. A statistical model of operator behavior is obtained for the different target types, and the cognitive state indicator (CSI) is modeled as a complex Gaussian process using the available empirical data. The simulated environment consisted of a group of 4 WASMs, modeled using constant-altitude unicycle dynamics, and 6 stationary targets (3 high-interest, 2 low-interest, and 1 false unit), with the false unit misclassified as a high-interest target to represent unreliable automation. The average and standard deviation of the total mission cost over 1000 Monte Carlo trials are shown in the table below for 4 simulation scenarios: the proposed method, and the cases in which the controller, the operator, or the CSI measure was excluded from the simulation.
Cost comparison for simulations using 1000 Monte Carlo trials
[Figure: histograms of frequency versus cost of interaction. Legend entries: optimal HMI cost, actual HMI cost, actual HMI cost w/o controller; actual HMI cost with human in the loop, optimal HMI cost w/o human in the loop; low, high, and unmeasured cognitive load.]
[Figure: Cognitive Stress Index versus time [s], with high cognitive load marked.]

Cognitive State Indicator (CSI)
Real-time workload assessment is crucial for providing the operator with adaptive aiding when warranted by task demands. Cannon et al. [1] introduced an approach for detecting temporal changes in operator cognitive state by monitoring trends in the cognitive state indicator (CSI). The CSI is a time-varying parameter obtained in real time by projecting multidimensional EEG and EOG signals onto a continuous range such that the Kullback-Leibler (KL) distance between the distributions of the signals is maximized. The CSI, denoted by $\mu(t) \in [0, 1]$, indicates the degree to which the operator is engaged. High cognitive load is identified when the rate of change of the CSI reaches or exceeds an a priori determined threshold $\phi \in \mathbb{R}$, i.e., $\dot{\mu}(t) \geq \phi$.
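To make the threshold test concrete, the sketch below flags high cognitive load from a uniformly sampled CSI trace. Estimating $\dot{\mu}$ with a backward difference over a fixed sampling interval `dt` is an assumption made here for illustration; the text does not specify how the rate is estimated, and a practical implementation would likely smooth the signal first.

```python
def high_load_flags(csi, dt, phi):
    """Flag samples whose backward-difference CSI rate is at least phi."""
    if not csi:
        return []
    flags = [False]  # no rate estimate is available for the first sample
    for prev, curr in zip(csi, csi[1:]):
        rate = (curr - prev) / dt  # finite-difference estimate of the CSI rate
        flags.append(rate >= phi)
    return flags
```

For example, a trace of `[0.2, 0.25, 0.5, 0.5]` sampled once per second with `phi = 0.2` is flagged only at the third sample, where the CSI jumps by 0.25 in one step.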
[Figure: Target uncertainty detection using SPRT. Panels plot $\sum_{i=0}^{n}(x_i)$ against observations (x).]

Summary
In this report, we describe progress in developing a control architecture for human-in-the-loop wide area search munitions that reduces operator errors in the presence of unreliable automation and operator cognitive limitations. An optimal input tracking controller with adaptive compensation of automation uncertainty and real-time workload assessment is developed to improve system performance. Extensive simulations based on experimental data from 12 subjects demonstrate the effectiveness of the presented controller.

System Model
The presented system consists of a set of WASMs (4, 8, or 16) traveling autonomously along known desired trajectories. Autonomous operation of the WASMs involves sensing the environment, classifying each perceived target as a high-interest, low-interest, or false unit, and communicating the target information to a remotely situated operator. In response to the received information, the operator performs a set of actions that can be broadly categorized as: 1) target labeling: marking a target as strike, no strike, or misidentification based on the operator's perception of the target and the classification accuracy; 2) target attack: requesting and assigning a weapon (WASM) to each target and authorizing the attack; and 3) battle damage assessment: verifying the effect of the attack on a target and post-strike reporting.

The model analyzed for the presented problem is a parallel Markov decision process (MDP), in which a new process is initialized for each target identified by the group of WASMs. At each state $s_i \in S \subset \mathbb{R}$ there is a finite set of actions $a_i \in A \subset \mathbb{R}$ to choose from; taking action $a_i$ in $s_i$ causes the transition $s_i \to s_i^+$ at the cost of action $c_i(s_i^+, s_i, a_i)$, where $i = 1, 2, \ldots, n$ indexes the MDPs (targets). Since the presented process is fully observable, it can be modeled as an MDP.
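One possible way to keep track of the parallel MDPs in the system model, with one process created per identified target, is sketched below. The class name, the integer state encoding, and the registry dictionary are hypothetical choices for illustration, not part of the presented model.

```python
from dataclasses import dataclass, field


@dataclass
class TargetProcess:
    """One MDP instance, created when a WASM reports a target."""
    target_id: int
    classification: str  # "high-interest", "low-interest", or "false unit"
    state: int = 0       # hypothetical integer encoding of s_i
    total_cost: float = 0.0
    history: list = field(default_factory=list)

    def step(self, action, next_state, cost):
        """Apply a_i: record the transition s_i -> s_i^+ and add its cost."""
        self.history.append((self.state, action, next_state))
        self.state = next_state
        self.total_cost += cost


processes = {}  # registry holding one process per identified target


def on_target_identified(target_id, classification):
    """Initialize a new process for a newly identified target."""
    processes[target_id] = TargetProcess(target_id, classification)
    return processes[target_id]
```

Keeping the processes independent mirrors the parallel structure: each target accumulates its own cost and history, and the total mission cost is the sum over the registry.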
[1] J. A. Cannon, P. A. Krokhmal, R. V. Lenth, and R. Murphey, “An algorithm for online detection of temporal changes in operator cognitive state using real-time psychophysiological data”, Biomed. Signal Proc. and Contr., vol. 5(3), pp. 229–236, 2010.
Scenario  Contr.  Oper.  CSI  Average cost     Cost std. dev.
1         ✓       ✓      ✓    32.78            2.25
2         ✗       ✓      ✓    42.97 (31% ↑)    2.34
3         ✓       ✗      ✓    37.06 (13% ↑)    3.25
4         ✓       ✓      ✗    34.22 (4.5% ↑)   3.35
Observations. Scenarios 1 and 2: without the proposed controller, the mission cost increases by 31%. Scenarios 1 and 3: unreliable automation incurs additional cost; however, system performance can be improved by using human supervision to estimate the uncertainties. Scenarios 1 and 4: the presented CSI-based switching controller avoids performance degradation under high cognitive load.
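The relative cost increases quoted above can be checked directly from the reported averages. The short computation below uses only the four average costs from the table; the scenario labels are shorthand introduced here.

```python
baseline = 32.78  # average cost with controller, operator, and CSI all present
scenarios = {"no controller": 42.97, "no operator": 37.06, "no CSI": 34.22}

# Percentage increase of each scenario's average cost over the baseline.
increase = {
    name: 100 * (cost - baseline) / baseline
    for name, cost in scenarios.items()
}

for name, pct in increase.items():
    print(f"{name}: {pct:.1f}% increase")
```

From the rounded table entries the first two increases come out at about 31% and 13%, matching the quoted values; the third comes out near 4.4%, slightly below the quoted 4.5%, which is presumably a rounding effect in the reported averages.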