I. INTRODUCTION

Wireless Sensor Networks (WSNs) naturally encompass a broad range of applications consisting of multiple sinks and a large number of static and mobile sensor nodes. Sensors are energy constrained by batteries, while sinks have effectively infinite energy. The communication paradigm between sensors and sinks is one-to-many; that is, a sensor can transmit data to a sink via multiple paths. In this paper, we consider such a WSN and investigate the problem of energy-aware multi-path selection in it. The multi-path delivery paradigm is considered because of its significant potential in advanced applications and its reduction of bottleneck issues [3]. Energy-aware path selection is considered because a sensor is energy constrained and wants to prolong its lifetime, which leads to possible selfishness, as the sensor may decline to forward packets for neighboring nodes [5]. For this reason, methods of encouraging cooperation among sensor nodes are needed. A method for cooperation stimulation should have a strong theoretical base, but the majority of previous work focuses on protocol design and experimental assessment, resulting in inadequate theoretical grounding. Some research has applied a game-theoretic approach to model interactive decision situations in wireless networks, since such a problem is fundamentally a conflict situation in which the game players are competitive (non-cooperative) with each other and use different levels of information sharing and different interaction formats. For example, the authors in [6] consider a game in which players choose their strategies simultaneously, i.e., a static game [9]. However, this kind of interaction format does not properly represent routing, as there is a sequential delay between the sending of a packet by a source node (source) and the forwarding of the packet

by an intermediate node (forwarder), and the energy of the sensors may change within that delay. Thus, the routing problem is a dynamic game [9]. The work in [7], [8] describes games under the assumption that the players share complete information [9]; namely, each player's payoff is common knowledge among all the players. Clearly, the WSN we consider in this paper inherits various sources of uncertainty, e.g., sensor mobility and energy changes. As a result, contemporary game-theoretic approaches based on static games and complete information are inappropriate for our problem. We propose a new model that uses dynamic Bayesian games with incomplete information [9]. We formulate the source / forwarder problem in a dynamic Bayesian game context and derive the equilibrium strategies of the game. This game is played by every node participating in packet delivery, thereby helping each node decide on delivery paths, i.e., next hops, toward a sink. Eventually, the source node delivers traffic to the sink via an energy-aware path. Factors such as the energy, location (related to mobility), and cooperation of the sensors are taken into account in this work. In addition, each sensor is unaware of the energy condition of its neighboring sensors, so a belief update system based on Bayesian game theory is proposed to improve the efficiency of path selection and to minimize the need for instantaneous updates about local sensors' energy. Our goal is to enable energy-efficient data delivery among sensor nodes. We test the feasibility of the proposed technique via numerical analysis and simulations. Our results show that the proposed method efficiently uses the belief update / estimation to help the sensor nodes select delivery paths, thereby improving the network lifetime by about 28% compared with techniques such as Flood and AODV [10]. The rest of the paper is organized as follows.
In Sections II, III, and IV, we discuss the related work on game-theoretic approaches in wireless networks, the dynamic Bayesian game model for path selection, and the simulations and results, respectively. Section V offers conclusions and discusses future work.

II. RELATED WORK

Game theory is the study of the interaction of autonomous agents, so it is frequently used to model the strategies of wireless nodes [8]. As already mentioned, selfishness can cause suboptimal performance, so cooperation methods, such as incentive mechanisms, have been proposed to steer nodes toward constructive behavior. Cooperation work can be classified into reputation-based mechanisms and credit-based mechanisms
[1]. The work in [5], [8] falls into the reputation-based category, where a node earns a positive reputation through cooperation; otherwise, it is regarded as a rogue node and isolated from the network. Specifically, the packet forwarding strategy of [5] uses the tit-for-tat approach in a repeated game to give nodes an incentive to cooperate: a node is served by its peers based on its past actions, and consequently behaves in a socially beneficial manner. The node participation strategy of [8] investigates users' willingness to join resource-sharing peer-to-peer networks given the cost of node participation. That work extends the concept of fear among nodes of being punished and proposes a grim-trigger approach for a repeated game: a node shares as long as all other nodes share, and stops sharing if any of the others deviated in the previous round. Its success depends on the number of nodes in the network. Related reputation-based schemes rely on observation of neighboring nodes to select next actions; if the observed information is second-hand, it is subject to false accusations. In addition, the assumptions of [5], [8], e.g., congestion-free communication and complete information, do not hold in our work. Next, we illustrate the credit-based approach: a node is credited for cooperating with other nodes toward a common goal and is debited for requesting services from other nodes. The intrusion detection strategy of [6] and our work belong to this category and form Bayesian game models. However, as mentioned earlier, [6] plays a simultaneous-move game, unlike our work, and addresses different problems. Nurmi [2] proposes a Bayesian game model for routing of wireless nodes. That work is close to ours, with differences in the objectives of the game model, the network assumptions, and the lack of a systematic demonstration.

III. BAYESIAN GAME MODEL FOR PATH SELECTION

A. Bayesian Game Model

We assume the following system model in this paper. Consider the WSN described in Section I. Sensor nodes and sinks are equipped with a location determination technique, e.g., GPS or triangulation. Sensors can measure their own energy levels with reasonable precision, but they are not aware of the energy states of other sensors. All sensors operate on a common carrier frequency and transmit at the same power, which covers a limited footprint. Communication between a sensor and a sink is based on multi-hop relaying, so packet delivery between a sensor and a sink is a combination of multiple source / forwarder (one-hop) games. We model this game as a repeated dynamic Bayesian game with no discount factor with respect to the payoffs of the sensors; that is, the payoff functions remain the same in every stage of the game. Stages are represented by a discrete model of time, where time is divided into periods, i.e., game stages tk, k = 0, 1, 2, ... Cost parameters and payoff functions are common knowledge among the game players. A parameter remains unchanged during a single game stage but can be altered as the game evolves. We model a two-player Bayesian game. One player is a sensor node with information for sinks, denoted by i (source). The other player is a one-hop neighbor of the source i that decides whether to forward i's packets or to discard them, denoted by j (forwarder). A source or a forwarder is said to be rational if it maximizes delivery of its own packets while minimizing unnecessary energy consumption. Since a sensor does not know the energy states of its neighbors, players i and j have private information about their types T, i.e., their energy states. In particular, i is interested in j's type (either energy sufficient, denoted by Tj = 1, or energy constrained, denoted by Tj = 0), because j plays the role of i's forwarder and its energy is unknown to i. In contrast, j knows i's type, since i always initiates its action first, giving j a signal about i's energy. Fig. 1 illustrates the extensive form of the Bayesian game. N represents an entity (nature) deciding j's type. Source i, with the belief that forwarder j's energy level is sufficient, has two pure strategies: Send H packets (i.e., select the path between i and j) or discard the packets and remain in Idle mode. If source i believes that forwarder j is energy constrained, it limits its pure strategy to remaining in Sleep mode to save energy. Meanwhile, forwarder j has two pure strategies after receiving i's action: j either Forwards i's packets to its next-hop node or discards the packets and remains in Idle mode (Not Forward). The strategy of discarding the packets and remaining in idle mode is adopted by comparing a sensor's remaining energy level to a predefined threshold and choosing this strategy if the remaining energy level is smaller than the threshold. This policy helps the sensor balance its energy resources. In addition, it is more compatible with latency-sensitive data, and monitoring an opponent in idle mode helps update the sensor's belief (see Section III-C).
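The single-stage structure just described can be summarized in code (a minimal sketch; the type encoding Tj ∈ {0, 1}, the strategy names, and the 0.5 belief threshold follow the text, while the class layout itself is only illustrative):

```python
from dataclasses import dataclass

ENERGY_CONSTRAINED, ENERGY_SUFFICIENT = 0, 1  # types T_j

SOURCE_STRATEGIES = ("send_H_packets", "idle", "sleep")
FORWARDER_STRATEGIES = ("forward", "not_forward")

@dataclass
class Stage:
    """One stage t_k of the repeated source/forwarder game."""
    k: int                 # stage index
    belief: float          # i's prior belief that T_j = 1 at this stage
    i_action: str = ""     # chosen by source i first
    j_action: str = ""     # chosen by forwarder j after observing i

    def pure_options_for_i(self):
        # i restricts itself to Sleep when it believes j is energy constrained
        if self.belief < 0.5:      # belief threshold taken from Section IV
            return ("sleep",)
        return ("send_H_packets", "idle")

stage0 = Stage(k=0, belief=0.5)   # uniform prior at t_0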

Figure 1. Extensive form of the Bayesian game (payoff pairs shown as (i's payoff, j's payoff))

To deliver a packet to the sinks, sensors need to make mutual contributions to packet forwarding. Cooperation between the sensors cannot be taken for granted, since each sensor is under the control of its own authority, thereby requiring cooperation stimulation in the game. We define that a sensor earns a reward R if it forwards a packet for a neighboring sensor, where R > 0. Meanwhile, the neighboring sensor pays a price −R for the service. R is considered a virtual credit of the game, a gain, and −R indicates a loss of credit for the beneficiary. The gain earned by forwarding a packet is equivalent to the loss caused by requesting a service, thereby stimulating cooperation among sensors. This reward-and-loss scheme is a logical concept that steers the nodes to behave rationally; it is not implemented as token-based admission control, which is orthogonal to the packet forwarding of this paper. In the payoff functions, H represents the number of packets transmitted by i, α represents the successful transmission rate from i to j, which reflects channel contention
around i, β represents the false packet detection rate of j, H ∈ [1, ∞), and α, β ∈ [0, 1]. The costs of transmitting one packet, forwarding one packet, and remaining in idle mode for the duration of one game stage are denoted by CTX, CF, and Ci, respectively, where CF > CTX > Ci > 0. It is reasonably assumed that R is larger than any of the energy cost parameters, justifying a player's transmission and forwarding strategies. K represents the location influence on the energy cost functions and is equal to the ratio θ/θth. θ represents the angle formed by a source node, its next-hop neighbor, and a sink; it is calculated from the geographical coordinates of the involved sensor nodes. Consider the example shown in Fig. 2. The angle θ1 between Edge (S, D) and Edge (S, F1) is derived by the law of cosines as

θ1 = arccos( (|S,D|² + |S,F1|² − |F1,D|²) / (2 × |S,D| × |S,F1|) ),

where |S,D|, |S,F1|, and |F1,D| denote the distances between the corresponding nodes. If the value of θ1 is small, the value of K is reduced, indicating a lower energy cost metric; that is, F1 is close to D with respect to the angle. This is because a small angle implies a shorter distance, or fewer hop counts, between F1 and D, resulting in a low energy cost for packet delivery to the destination, given that the transmission range of the sensors is limited. Otherwise, F1 is in a less attractive position to perform packet forwarding for S. The idea is to encourage a source node to select a next-hop neighbor with fewer hop counts (shorter distance) toward a sink, so that energy and delay are minimized. θth, a tunable parameter, controls the process of path selection, i.e., next-hop selection. The higher the value of θth, the more delivery paths can be selected by a source node S, owing to the reduction of K. By carefully manipulating θth, our path selection method effectively balances energy usage among different sensor nodes. Its details are further explained in Section III-D.
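As a concrete illustration of the angle computation (a minimal sketch; the coordinates are hypothetical, and θth = π/8 is the initial value used in the simulation setup of Section IV):

```python
import math

def angle_theta(s, f, d):
    """Angle at source s between edge (s, d) and edge (s, f), by the law of cosines."""
    sd = math.dist(s, d)   # |S,D|
    sf = math.dist(s, f)   # |S,F1|
    fd = math.dist(f, d)   # |F1,D|
    cos_t = (sd**2 + sf**2 - fd**2) / (2 * sd * sf)
    return math.acos(max(-1.0, min(1.0, cos_t)))  # clamp against rounding error

def location_factor(s, f, d, theta_th):
    """K = theta / theta_th: the location influence on the energy cost."""
    return angle_theta(s, f, d) / theta_th

theta_th = math.pi / 8            # tunable threshold (initial value from Section IV)
S, D = (0.0, 0.0), (10.0, 0.0)    # hypothetical source and sink coordinates
F1 = (5.0, 1.0)                   # forwarder slightly off the S-D line
F2 = (5.0, 0.0)                   # forwarder exactly on the S-D line

K1 = location_factor(S, F1, D, theta_th)
K2 = location_factor(S, F2, D, theta_th)
# K2 = 0 (zero angle), and K1 > K2: the off-line forwarder costs more
```

A forwarder on the straight line toward the sink yields θ = 0 and hence K = 0, the cheapest location metric, matching the intuition in the text.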

Figure 2. Illustration of path selection based on the angle θ. (S = source at (x1, y1); D = sink at (x2, y2); F1, F2 = forwarders; θ1 = angle between edge (S, D) and edge (S, F1); θth = tunable angle threshold; |S,D| = sqrt((x1 − x2)² + (y1 − y2)²).)

Next, the payoffs of the strategy combinations of the Bayesian game are illustrated. Different combinations of strategies selected by i and j result in different payoffs, which depend on the expected gain or loss of selecting a strategy and the energy cost of executing it. With the strategy combination (Send H packets, Forward), the payoffs of i and j are ((1−2α)HR − KHCTX) and ((2α−1)HR − KHCF), respectively. The expected gain or loss is based on the number of i's transmitted packets H and the value of α. For j, the expected gain of playing Forward is HRα − HR(1−α) = (2α−1)HR, where (1−α) is the rate of failed transmission. In contrast, the expected gain of i playing Send H packets is the negative of j's gain, i.e., the price that i pays for j's service, (1−2α)HR. For both strategies, the energy cost is a function of K, H, and a fixed parameter CF or CTX. CF or CTX represents the energy cost of the first-hop delivery of a strategy, while K approximately models the energy cost beyond the first hop.
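The payoff pairs described in this and the following paragraph can be collected into a small function (a minimal sketch; the default parameter values are those used later in Section III-E, and the reading of i's payoff for (Send H packets, Not Forward) as HR − KHCTX follows the text's "inverse of j's loss plus the energy cost"):

```python
def payoffs(i_action, j_action, H=1, R=1.0, alpha=0.9, beta=0.01,
            K=0.1, C_tx=1.0, C_f=2.0, C_i=0.02):
    """Return (i's payoff, j's payoff) for one stage, per the combinations in the text."""
    if i_action == "send":
        if j_action == "forward":
            return ((1 - 2 * alpha) * H * R - K * H * C_tx,
                    (2 * alpha - 1) * H * R - K * H * C_f)
        else:  # j idles instead of forwarding and punishes itself
            return (H * R - K * H * C_tx, -H * R - C_i)
    elif i_action == "idle":
        # j forwarding against an idle i risks a false-packet loss -beta*H*R
        return (-C_i, -beta * H * R - C_i) if j_action == "forward" else (-C_i, -C_i)
    else:  # i sleeps
        return (0.0, -beta * H * R - C_i) if j_action == "forward" else (0.0, -C_i)

# With a high success rate alpha, forwarding pays off for j when i sends:
_, pj_fwd = payoffs("send", "forward")      # j: (2*0.9 - 1)*1 - 0.1*2 = 0.6
_, pj_not = payoffs("send", "not_forward")  # j: -1 - 0.02 = -1.02
```

This makes the cooperation incentive explicit: for an energy-sufficient j facing a sending i, Forward strictly dominates Not Forward under these defaults.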

Another possible strategy combination of i and j is (Send H packets, Not Forward), in which j's payoff is −HR − Ci, because j punishes itself by not contributing to i's packet forwarding while consuming energy in idle mode. i's payoff, on the other hand, is the inverse of j's loss plus the energy cost of transmitting H packets. For the remaining strategy combinations, with i in idle mode or in sleep mode, i's payoff is −Ci and 0, respectively. j's payoff is −Ci if it plays the Not Forward strategy, since it consumes energy without transmitting a packet (idle mode), and j has an estimated loss of −βHR due to false packet detection if it plays the Forward strategy while i is in idle or sleep mode.

B. Bayesian Nash Equilibrium

Rational players always try to maximize their payoffs. Analyzing the equilibrium strategies of a game model is desirable since it provides optimal agreements between the players; in other words, the actions of the players are such that no player wants to deviate from his predicted strategy [9]. In our game, the source i and the forwarder j select strategies to maximize their own payoffs, given the incomplete information about the opponent's energy, in a stage-wise manner. Therefore, we derive the Bayesian Nash Equilibrium (BNE) of the game in a single stage to understand mutually acceptable game strategies. In this analysis, a prior belief B0 that j has sufficient energy is assigned; this probability is prior common knowledge. To decide whether there is a pure-strategy BNE in the game, we first examine the pure strategy combination (i plays Idle mode, j plays Not Forward). In this case, the Not Forward strategy is a dominated choice compared to the Forward strategy. However, if j chooses the Not Forward strategy, the best choice for i is the Send H packets strategy. Therefore, the strategy combination (i plays Idle mode, j plays Not Forward) is not a pure-strategy BNE.
Assume that i believes that j has sufficient energy, i.e., B0 is high; then i adopts the Send H packets strategy (otherwise, i chooses the Sleep mode strategy). In this case, the most reasonable response for j is to play the Forward strategy. However, given j's choice, i's original strategy Send H packets becomes suboptimal, and i may switch to playing Idle mode. Thus, the strategy combination (i plays Send H packets if j has sufficient energy but plays Sleep mode if j is energy constrained; j plays Forward; high B0) is not a pure-strategy BNE either. Next, follow the same scenario with the adjustment that i believes that j is energy constrained, i.e., B0 is low. The best option for j is then to play Not Forward. Such a strategy combination (i plays Sleep mode if j is energy constrained but plays Send H packets if j has sufficient energy; j plays Not Forward; low B0) is a pure-strategy BNE. From the BNE analysis above, we learn that only one pure-strategy BNE exists, when B0 is low. When B0 is high, we can use the mixed strategy approach to analyze the BNE, where P represents i's probability of playing Send H packets and F represents j's probability of playing Forward. The expected payoffs of player j (EPj) are:

EPj(Forward) = B0 × P × ((2α − 1)HR − KHCF) + B0 × (1 − P) × (−βHR − Ci) + (1 − B0) × (−βHR − Ci)   (1)

EPj(not Forward) = B0 × P × (−HR − Ci) + B0 × (1 − P) × (−Ci) + (1 − B0) × (−Ci)   (2)
Assume that the expected payoffs of j's strategies are equal (i.e., Eq. (1) = Eq. (2)). i's equilibrium probability of playing Send H packets is

P* = βHR / (B0 × (2αHR − KHCF + βHR + Ci)).

Likewise, j's equilibrium probability of playing Forward is

F* = (PHR − PKHCTX + Ci − PCi) / (2αHRP).

With the B0 and the equilibrium strategies above, we have the mixed-strategy BNE of the game: if i believes j is energy sufficient, i transmits packets with probability P*; otherwise, i remains in sleep mode. For j, it forwards i's packets with probability F*. In the case where i believes that j is energy constrained, a pure strategy exists in which i plays Send H packets or Sleep mode according to whether j is energy sufficient or energy constrained, respectively, while j plays the pure strategy Not Forward. The finding above derives the equilibrium condition of the game, which is used to facilitate the design of the path selection of sensor nodes. One concern with the above analysis is the lack of a belief update system, because the configuration above is a single-stage game environment. In a multi-stage Bayesian game, a player's belief is updated based on the game's evolution at the end of each stage, resulting in a posterior belief. This belief is assigned as the prior belief at the subsequent stage.

C. Multi-stage Dynamic Bayesian Game

In this section we extend the Bayesian game above to a multi-stage dynamic Bayesian game with a belief update system. We apply the same Bayesian game model to every game stage, with no discount on the strategy payoffs. A player's action at stage tk is denoted by a(tk). The action history profile of j with respect to his opponent i at stage tk is denoted by h_j^i(tk), a vector of j's actions at stages t0, t1, ..., tk−1. In the above single-stage game, where i's prior belief is given at a stage, call it t0, its equilibrium strategy is determined. Suppose the game continues and moves beyond stage t0. In the next stage t1, a new prior belief is needed for deciding i's next strategy. This new prior belief is determined by i's observation of j's actions and the action history profiles of i and j. For j, since i's type is clear, j's strategy depends on the action history profiles of i and j.
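As a sanity check on the single-stage derivation (a minimal sketch using the default parameter values from the analytical validation in Section III-E), the indifference condition Eq. (1) = Eq. (2) can be verified numerically at P = P*:

```python
# Default parameters from the analytical validation (Section III-E)
alpha, beta, K = 0.9, 0.01, 0.1
R, H, B0 = 1.0, 1.0, 0.5
C_f, C_tx, C_i = 2.0, 1.0, 0.02

# i's equilibrium probability of playing Send H packets
P_star = (beta * H * R) / (B0 * (2 * alpha * H * R - K * H * C_f + beta * H * R + C_i))

def ep_forward(P):
    """Eq. (1): j's expected payoff for Forward."""
    return (B0 * P * ((2 * alpha - 1) * H * R - K * H * C_f)
            + B0 * (1 - P) * (-beta * H * R - C_i)
            + (1 - B0) * (-beta * H * R - C_i))

def ep_not_forward(P):
    """Eq. (2): j's expected payoff for Not Forward."""
    return (B0 * P * (-H * R - C_i)
            + B0 * (1 - P) * (-C_i)
            + (1 - B0) * (-C_i))

# At P = P*, j is indifferent between Forward and Not Forward
diff = ep_forward(P_star) - ep_not_forward(P_star)
```

Under these defaults P* ≈ 0.012, a small but positive transmission probability, and the two expected payoffs coincide, confirming the closed form for P*.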
Next, we explain how i updates his belief about j's type following Bayes' rule. Eq. (3) shows the belief i has about the energy state of j at stage tk+1: a probability distribution conditioned on j's action at stage tk and the action history profile of j with respect to i at stage tk. Bi(tk+1) is thus i's posterior belief at stage tk, i.e., i's prior belief at stage tk+1. P(aj(tk) | Tj, h_j^i(tk)) is the probability that j's action is observed at stage tk, given j's type and his action history profile. Clearly, this kind of belief update relies on i's observation, so it may include errors, given the error-prone wireless medium. To characterize possible observation errors, we introduce the parameters X and Y, where X is the rate of correctly observing an opponent's action and Y is the rate of false positive observation. With X and Y, we compute the probability of an observed action at a stage (see Eqs. (4) and (5)); these are part of the belief update used in Eq. (3). Using the inequality payoff formula in Eq. (6) with Bi(.), i's action with optimal payoff is derived as ai*; ai^ is an alternative

action of i. Similarly, the optimal action of j is derived by Eq. (7), where j's belief is omitted, since i's type is public. Following Eqs. (6) and (7), we derive a pair of actions that constitutes the BNE of a single-stage game. Furthermore, by proving that our model satisfies subgame perfection, the four Bayesian conditions, and the P condition [9], our game owns a Perfect Bayesian Equilibrium (PBE) [4], which is associated with the above BNE analysis.

Bi(tk+1) = P(Tj | aj(tk), h_j^i(tk)) = [Bi(Tj | h_j^i(tk)) × P(aj(tk) | Tj, h_j^i(tk))] / [Σ_{Tj*} Bi(Tj* | h_j^i(tk)) × P(aj(tk) | Tj*, h_j^i(tk))]   (3)

P(aj(tk) = Forward | Tj = 1, h_j^i(tk)) = XF + Y(1 − F)   (4)

P(aj(tk) = not Forward | Tj = 1, h_j^i(tk)) = (1 − X)F + (1 − Y)(1 − F)   (5)

Payoffi((ai*, aj) | Bi(.), h_j^i) ≥ Payoffi((ai^, aj) | Bi(.), h_j^i)   (6)

Payoffj((ai, aj*) | h_j^i) ≥ Payoffj((ai, aj^) | h_j^i)   (7)
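A minimal sketch of the Bayes update in Eqs. (3)-(5) (assumptions: an energy-constrained j (Tj = 0) plays Not Forward per the pure BNE, so it is observed to Forward only through a false positive at rate Y; the forwarding probability F and the X, Y values are illustrative):

```python
def observe_prob(action, t_j, F, X, Y):
    """P(a_j | T_j, history): probability of observing `action`, per Eqs. (4)-(5).
    X = rate of correct observation, Y = false-positive rate."""
    if t_j == 1:                     # energy sufficient: forwards with probability F
        p_fwd = X * F + Y * (1 - F)  # Eq. (4)
    else:                            # energy constrained: plays Not Forward (assumption)
        p_fwd = Y
    return p_fwd if action == "forward" else 1 - p_fwd

def belief_update(B, action, F, X=0.85, Y=0.1):
    """Eq. (3): posterior belief that j is energy sufficient (T_j = 1)."""
    num = B * observe_prob(action, 1, F, X, Y)
    den = num + (1 - B) * observe_prob(action, 0, F, X, Y)
    return num / den

B0 = 0.5
B_after_fwd = belief_update(B0, "forward", F=0.5)      # belief rises
B_after_not = belief_update(B0, "not_forward", F=0.5)  # belief falls
```

An observed Forward raises the belief (since XF + Y(1 − F) > Y for X > Y), while an observed Not Forward lowers it, which is exactly the behavior the validation in Section III-E reports.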

D. Design and Implementation of Path Selection

This section describes the integration of our game model with the new path selection method. Its foundation is the equilibrium strategies of the game, together with heuristic assistance from the location and energy information of the sensors. Consider a source node i and its one-hop neighbor j in the game. Algorithm 1 describes how i selects j as a valid next hop, j ∈ Ni, where Ni denotes the one-hop neighborhood of i. First, i learns the information of the sinks and forms Ni. Next, i determines its prior belief at the present stage using either the uniform probability distribution or the belief update, i.e., Eq. (3). The uniform probability distribution is adopted because of the known discretization method for types; other probability distributions, e.g., the Beta distribution or a weighted distribution, can be used under special assumptions. Finally, i applies the BNE analysis, which includes the angle and energy evaluations, to decide on a strategy. Note that the angle calculation is part of the belief update process, while the energy comparison partially determines the sensor's strategy. If i's final strategy is Send H packets, it sends a forward request to j and waits for j's response, namely, j's forward requests to its next-hop neighbors toward a sink. Otherwise, i stays in Idle mode or Sleep mode. Algorithm 2 describes j's decision after learning i's strategy. j runs the BNE analysis and its energy evaluation to decide on a strategy. Assume that j accepts i's forward request and decides to forward the packets. j then initiates new games between itself and all of its neighbors except i in order to find next hops closer to the sink. When j sends a forward request to a neighbor in the new game, j not only tries to deliver i's packets but also notifies i about its forwarding effort, owing to the broadcast nature of wireless communication. That helps i update its belief about j. This process is repeated until the request reaches the sink.
It is possible that j does not forward i's packets after receiving i's forward request. One obvious reason is j's low energy status; the other is a lack of next-hop neighbors for j. To avoid an incorrect belief update on j's energy, a control message is devised to handle the latter scenario.

Algorithm 1: Node i decides whether node j is a member of its next hops
// Part 1: Neighbor discovery and sink selection for node i
while receive a query Q do
    LS1 ← Location of Sink 1 (S1) from Q
    IS1 ← Interest of S1 from Q
end while
while receive a reply of j (ACKj) after i sends a join request do
    Lj ← Location of node j from ACKj
    Ni := Ni ∪ {j}
end while
// Part 2: Initialization of i's belief update (Bi)
if stage = t0 then
    Construct Bi by the uniform probability distribution
else  // derive the posterior and use it as a new prior in the next stage
    Derive K with LS1, Lj, Li, θth
    Calculate Bi by Eq. (3)
end if
// Part 3: Deriving an action for i
Decide the optimal action ai* of i using Bi and Eq. (6)
// END

Algorithm 2: Node j decides whether to forward node i's packets
Decide the optimal action aj* of j by Eq. (7)
// END

A forward request message from a source node i is passed along by a number of intermediate sensors, whose information, such as IDs and distances to i, is attached to the message. A sink node that receives i's forward request sends an ACK message to i via the disjoint paths derived from the information in i's forward request. This acknowledges the potential paths between the sink and i. i then selects one of the disjoint paths to deliver traffic, and changes to an alternative path if the original path is disconnected by an inferior sensor condition (low energy or poor location) along the path. Note that the selection of the path between i and the sink follows the FCFS policy in this paper; it can be extended to QoS-based path selection in the future. If i runs out of path choices (next hops) due to limitations of energy or location (angle), we address this by exploring all of i's next hops with the tunable θth. By increasing θth, paths that were excluded at earlier stages for the angle reason can be evaluated again for packet forwarding. When no path is available even with the high θth, the predefined energy threshold Eth is reduced to allow lower-energy nodes to participate in the delivery. This feature provides best-effort path selection while avoiding the inefficient paths selected at early stages. Note that we configure a single traffic path in this paper to save energy; for other applications, e.g., reliability-sensitive ones, multi-path delivery is an option. Issues such as failed transmissions and mobility are addressed using acknowledgement, timer, and location update methods. A mobile sensor sends a location update that includes a declaration of existence and geographical data to new neighbors, initiating new games, if the mobile sensor starts a forward request or is asked to join another sensor's one-hop neighborhood. Such a response is reactive. The original path broken by the mobile sensor is reported by an upstream neighbor of that sensor, which asks the source node to select another path for the traffic. If such a message does not make the source node act, the source node selects a new path after a timer expires. This computation involves a limited number of local sensors, inducing small overhead for the new path selection.

E. Analytical Validation

In the next examples, we demonstrate that the proposed game model efficiently predicts a node's belief. We evaluate the response of i's belief update under different actions of j, K, X, and Y. Assume the default parameters α = 0.9, β = 0.01, K = 0.1, R = 1, H = 1, CF : CTX : Ci = 2 : 1 : 0.02, X = 0.85, and Y = 0.1 in the game assessment. B0 is set to 0.5 for all four scenarios. In Fig. 3, we show that i's belief about j's energy condition is affected by recently observed actions of j in previous stages, where 1 and 0 in Fig. 3 represent the Forward strategy and the Not Forward strategy, respectively. With a forward action in a previous stage, i's belief increases, since j was able to perform the last forward action, indicating that j has sufficient energy; otherwise, i's belief decreases. In particular, the belief update of a sensor node with a small K, e.g., K = 0.1, clearly demonstrates this trend. We also find that the change rate of i's belief based on a forward action is faster than that based on a not-forward action. This can be translated into quicker convergence of the path setup when a forwarder becomes ready. Next, we investigate the mobility effect on i's belief. As a sensor node moves and changes its θ, or as θth is tuned, the value of K of the sensor is altered, affecting its belief update. In Fig. 3, the belief update with a large K and different early actions of j results in almost no change or negative changes (below zero) in i's belief (see K = 5 and K = 6, respectively). This is caused by an inapt location / angle of j: it increases the weight of the energy cost in the payoff functions, thereby outweighing the incentive of j's action. As K decreases, indicating a lower energy cost, i's belief promptly changes in accord with j's previous actions. The lower the value of K, the faster the belief is updated by a node. As mentioned earlier, observation errors affect the belief update. We here verify the effect of X and Y on i's belief. From Fig. 3, comparing X = 0.65 with the K = 0.1 curve (i.e., the same K but X = 0.65 vs. 0.85), we note that the higher X is, the faster i's belief changes with new observations. Comparing Y = 0.2 with the K = 0.1 curve (i.e., the same K but Y = 0.1 vs. 0.2), the lower Y is, the faster i's belief converges to 1 and the slower it diverges. From these assessments, we find that the belief update system is swift and accurate in updating a node's belief, and it behaves reasonably under observation errors and the location effect.

Figure 3. Validation of the dynamic Bayesian game model (prior belief of i about j's energy over 10 game stages; actions of j in the previous stages = {1110000111}; curves for K = 0.1, 2, 4, 5, 6, X = 0.65, and Y = 0.2)
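The qualitative trend in Fig. 3 can be reproduced with a short simulation (a minimal sketch; it fixes j's forwarding probability at F = 0.5 and assumes an energy-constrained j forwards only through a false-positive observation, so the exact values are illustrative rather than the paper's):

```python
def belief_step(B, observed_forward, F=0.5, X=0.85, Y=0.1):
    """One Bayes update of i's belief that j is energy sufficient (Eqs. (3)-(5))."""
    p_fwd_1 = X * F + Y * (1 - F)  # P(observe Forward | T_j = 1), Eq. (4)
    p_fwd_0 = Y                    # assumption: constrained j plays Not Forward
    if observed_forward:
        num, alt = B * p_fwd_1, (1 - B) * p_fwd_0
    else:
        num, alt = B * (1 - p_fwd_1), (1 - B) * (1 - p_fwd_0)
    return num / (num + alt)

actions = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]  # j's actions as in Fig. 3
beliefs = [0.5]                            # B0 = 0.5
for a in actions:
    beliefs.append(belief_step(beliefs[-1], observed_forward=bool(a)))
# Belief rises during the initial Forward run and falls during the Not Forward run
```

The trajectory rises toward 1 over the first three Forward observations, decays during the four Not Forward observations, and recovers afterwards, matching the shape of the K = 0.1 curve in Fig. 3.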

IV. PERFORMANCE EVALUATION

In this section, the performance of the proposed path selection protocol, the Shortest distance path (a single path based on the Ad hoc On-Demand Distance Vector (AODV) protocol) [10], and the Flood (multiple paths with no redundant delivery) is evaluated in ns-2 to compare their network lifetimes. We select IEEE 802.15.4 as the MAC protocol, with a data rate of 250 Kbps. The sensor range is 15 meters. The packet size is 70 bytes. A random dense network is tested in the simulations: it includes 50 nodes in a 70 x 70 m2 area with one source and
one sink. 10% of the nodes follow the Random Waypoint model with a velocity of 2 m/s. The average two-hop neighborhood size in this topology is 12 nodes. This reflects a densely populated network with hidden terminals and multi-hop data delivery. We adopt a periodic traffic model with a generation rate of 0.24 packets/second. For the radio channel propagation model, a two-ray path loss model is chosen. We apply an energy cost model in which the transmit power is 17 mW, the receive power is 35 mW, and the idle power is 0.71 mW. The initial energy of every sensor is 0.5 J. Eth is 0.25 J. The initial θth is π/8. An opponent's energy is believed to be sufficient if the belief is ≥ 0.5. We evaluate the network lifetime through two metrics: (1) The 1st node die: the time when the first node depletes all of its energy. (2) No alternative path: the time when no path is available between a source and a sink due to a lack of energy or paths. Metric (2) better represents such an evaluation, since a network can still survive with a few energy-depleted nodes. Fig. 4 compares the metrics of the comparative techniques. The Flood shows the shortest network lifetime on both metrics, since its data delivery does not consider the sensors' energy and location and generates more traffic load, resulting in higher energy consumption and waste due to data collisions and retransmissions. Given that the sensors with Flood can deliver traffic over multiple paths, the effect of the first node dying can be alleviated, and its network lifetime is extended by 25%. The Shortest distance path uses a hop count to derive a single delivery path; if the path is disconnected, the source node re-establishes a new path based on distance. It therefore outperforms the Flood because of less traffic and contention in the network, but its lifetime is shorter than that of the proposed method.
The proposed path selection enhances the network lifetime by choosing delivery paths based on a sensor's angle and energy: paths with qualified energy and angle are used first, and paths with inferior conditions afterwards. This prioritization ensures fair use of delivery paths, balancing the traffic load across the network and improving the network lifetime, under the no alternative path metric, by up to 28% compared with the other two techniques. Under the 1st node die metric, the proposed path selection significantly outperforms Flood and slightly surpasses the Shortest distance path, a gain attributable to our energy-aware features and the topology. Fig. 5 shows the remaining energy distribution of the 20 sensor nodes surrounding the sink at the time no alternative path occurs. The proposed path selection yields the lowest remaining energy in each node and the smallest standard deviation (SD) across the energy samples, because the network with the proposed technique operates the longest and distributes traffic load and energy cost among different sensor nodes. When a sensor node is closer to the sink or in a congested zone, like node 1 or node 11, its energy depletes at a faster rate. Overall, these results indicate that the proposed protocol is more energy-efficient.
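The prioritized next-hop choice described above can be sketched as follows. This is our illustration, not the authors' implementation; the `Neighbor` structure and the within-class tie-breaking rule (higher belief, then smaller angle) are assumptions, while the thresholds (belief ≥ 0.5, θth = π/8) come from the simulation setup.

```python
import math
from dataclasses import dataclass

THETA_TH = math.pi / 8   # initial angle threshold (rad)
BELIEF_TH = 0.5          # energy believed sufficient if belief >= 0.5

@dataclass
class Neighbor:
    node_id: int
    belief: float   # belief that this neighbor's energy is sufficient
    angle: float    # deviation from the direct line toward the sink (rad)

def next_hop(neighbors):
    """Pick a forwarder: candidates whose belief and angle both pass
    their thresholds ("qualified") are preferred; only when none
    qualifies are inferior candidates considered."""
    def qualified(n):
        return n.belief >= BELIEF_TH and n.angle <= THETA_TH
    ranked = sorted(neighbors,
                    key=lambda n: (not qualified(n), -n.belief, n.angle))
    return ranked[0] if ranked else None
```

For example, a qualified neighbor is chosen over an unqualified one even if the latter lies closer to the source-sink line; this is what spreads traffic away from low-energy nodes and balances the load.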

V. CONCLUSION AND FUTURE WORK

In this paper, a novel path selection method for WSNs has been proposed. It combines a dynamic Bayesian game approach with energy-aware features to derive energy-balanced path strategies for sensors. We analytically show that the proposed game model efficiently estimates a sensor's belief in dynamic and continuous game plays, enabling the sensor to find a suitable delivery path. The simulation study shows that the proposed path selection considerably increases the network lifetime, by up to 28% compared with the Shortest distance path and Flood. Overall, the technique is distributed, scalable and theoretically justified, and its message overhead is localized. On the theoretical side, we plan to extend the current game model with other energy classifications and distributions and to analyze behavior strategies and belief updates. On the practical side, we plan to explore our path selection scheme under QoS requirements, e.g., energy, latency and reliability, and to verify its performance against parameters such as mobility speed, moving pattern, overhead, the number of sources and the network topology.

Figure 4. Network lifetime

Figure 5. Remaining energy distribution

REFERENCES

[1] V. Srivastava, J. Neel, A. B. MacKenzie, R. Menon, L. A. DaSilva, J. E. Hicks, J. H. Reed and R. P. Gilles, "Using Game Theory to Analyze Wireless Ad Hoc Networks," IEEE Communications Surveys and Tutorials, vol. 7, no. 4, 2005.
[2] P. Nurmi, "Modelling Routing in Wireless Ad hoc Networks with Dynamic Bayesian Games," Proc. of the IEEE International Conference on Sensor and Ad hoc Communications and Networks (SECON), 2004.
[3] I. F. Akyildiz and I. H. Kasimoglu, "Wireless Sensor and Actor Networks: Research Challenges," Ad Hoc Networks Journal, vol. 2, no. 4, pp. 351-367, October 2004.
[4] C.-K. Lin, "Perfect Bayesian Equilibrium Analysis for Energy-Aware Path Selection in WSNs," Technical Report, Q2S Centre, Norwegian University of Science and Technology, 2009.
[5] M. Felegyhazi, J.-P. Hubaux and L. Buttyan, "Nash Equilibria of Packet Forwarding Strategies in Wireless Ad Hoc Networks," IEEE Trans. on Mobile Computing, vol. 5, no. 5, pp. 463-476, May 2006.
[6] Y. Liu, C. Comaniciu and H. Man, "A Bayesian Game Approach for Intrusion Detection in Wireless Ad Hoc Networks," Proc. of the International GameNets Workshop, October 2006.
[7] A. Economides and J. Silvester, "Multi-objective Routing in Integrated Services Networks: a Game Theory Approach," Proc. of the IEEE INFOCOM Networking Conference, 1990.
[8] A. MacKenzie and L. DaSilva, Game Theory for Wireless Engineers. Morgan & Claypool Publishers, 2006.
[9] D. Fudenberg and J. Tirole, Game Theory. Cambridge, MA: MIT Press, 1992.
[10] C. E. Perkins, E. M. Belding-Royer and S. Das, "Ad hoc On-Demand Distance Vector (AODV) Routing," IETF RFC 3561, July 2003.
