ABSTRACT
Researchers have proposed many wireless MAC protocols, such as [20], [8], [25], [24], [6], and [17], which exploit frequency-agile radios and multiple available channels to increase network throughput. These protocols usually require only one radio per node. By carefully coordinating the frequency hopping of different nodes, different node pairs can use multiple channels simultaneously. In [17], Mo et al. classified these protocols into four generalized categories and compared their performance through both analysis and simulation. They found that the Parallel Rendezvous family of protocols has the best overall performance because it removes the bottleneck of a single control channel. These protocols show good promise for multi-hop networks, which suffer from self-interference and in which traditional single-channel MAC protocols often fail to provide satisfactory throughput. However, we are not aware of any implemented Parallel Rendezvous multi-channel MAC protocol. We argue that one major reason is that existing proposals such as McMAC[17] and SSCH[6] have not thoroughly considered a practical aspect of the design essential for a working implementation, namely synchronization. Through an exploration including an implementation exercise on hardware, we show that synchronization for multi-channel MAC protocols is a non-trivial problem. We designed and implemented a synchronization mechanism specifically for this purpose and show that it effectively tackles the problem of synchronizing one-hop neighbor pairs, thereby paving the way for efficient multi-channel MAC protocols.

Categories and Subject Descriptors
C.2.1 [Computer-Communication Networks]: Network Architecture and Design—Wireless communication; C.2.2 [Computer-Communication Networks]: Network Protocols; C.2.5 [Computer-Communication Networks]: Local and Wide-Area Networks—Access schemes

∗This work was supported by the National Science Foundation (NSF) under Grant CNS-0435478.

General Terms
Algorithms, Performance, Design, Experimentation.

Keywords
Time Synchronization, Protocol, Wireless, Multi-channel MAC, Medium Access Control, Recursive Least Squares (RLS), Clock Drift.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MobiCom'06, September 23–26, 2006, Los Angeles, California, USA. Copyright 2006 ACM 1-59593-286-0/06/0009 ...$5.00.

1. INTRODUCTION
In a typical ad hoc wireless network, all nodes within the network use the same frequency (or channel) to ensure maximum connectivity. Commodity wireless network interfaces operate at speeds ranging from 11Mbps to 54Mbps per channel. However, in a multi-hop wireless network, the achievable throughput is much lower than the speed of the radio because the indicated speed does not take into account the medium sharing effects of a multi-hop wireless network. First, nodes cannot transmit and receive at the same time. Second, when a node transmits, its neighbors cannot receive from another node due to interference. As the node density increases, the throughput share of each node decreases.

However, some of this contention is unnecessary because modern radios often have the ability to operate over a range of frequencies (i.e., different channels). In an ad hoc network where not all neighbors want to communicate with the same node, simultaneous transmissions over several channels are possible. By allowing different sender and receiver node pairs to transmit on different channels simultaneously, the potential increase in network throughput is tremendous.

Many MAC protocols have been proposed to exploit multiple available channels. They differ in the number of transceivers used and in the method by which neighbors agree on a channel to use. In [17], Mo et al. classified existing multi-channel MAC protocols into four general categories and compared their performance through analysis and simulation under various conditions. The performance of the Parallel Rendezvous approach is found to be superior to the other three in most traffic scenarios while using only one transceiver. However, this approach also has higher protocol complexity, and it requires one-hop neighbor pairs to synchronize. To the best of our knowledge, there has been no implementation of a multi-channel MAC protocol using the Parallel Rendezvous approach.
In fact, we are not aware of any implementation of any non-TDMA multi-channel MAC protocol using only one transceiver per node. We argue that synchronization is a major issue that has not been completely solved in prior Parallel Rendezvous multi-channel MAC protocols such as SSCH[6] and McMAC[17], thereby hindering their deployment. By identifying and solving the problems related to synchronization, we show that multi-channel MAC protocols using one transceiver per node are indeed practical.

In Sec.2, we first examine the practical synchronization challenges and explain why they have not been resolved satisfactorily in prior work. Synchronization problems arise due to imperfections of hardware. Therefore, to ground our discussion on a concrete platform, we explain our choice of hardware for our experiments in Sec.3. Note, however, that our solution is not specific to this particular hardware platform. In Sec.4, we describe a simplified version of McMAC, a protocol of the Parallel Rendezvous multi-channel MAC family. This protocol serves as a software framework into which we incorporate our synchronization mechanisms. Next, in Sec.5, we study the sources of synchronization errors using our experimental hardware platform as a concrete example. Then, we propose a solution to the synchronization problem in Sec.6. Sec.7 shows the experimental evaluation of the synchronization mechanism for a simplified multi-channel MAC protocol using only one transceiver per node. Finally, we summarize the lessons learnt through the implementation exercise in Sec.8.

We hope to contribute in the following ways:
1. Analyze the characteristics of clock drift on a typical hardware platform for MAC implementation.
2. Illustrate practical difficulties in the implementation of synchronization protocols that are often absent from analytical or simulation-based studies.
3. Propose a simple and robust synchronization scheme, based on Recursive Least Squares, that is implementable even on low-cost wireless devices.
4. Demonstrate that multi-channel MAC protocols are feasible even though they require synchronization of one-hop neighbors.

2. RELATED WORK

2.1 Multi-Channel MAC Protocols
Multi-channel MAC protocols can be divided into those using a single transceiver (radio) per node and those using multiple. Another way of categorizing them, as suggested in [17], is by the mechanism sender-receiver pairs use to agree on a data transfer channel. In this way, multi-channel MAC protocols can be classified into four categories:
• Dedicated Control Channel: Examples are DCA (Dynamic Channel Allocation) [27], DCA-PC (Dynamic Channel Allocation with Power Control) [28], and DPC (Dynamic Private Channel) [13].
• Split-Phase: Both MMAC [20] and MAP (Multichannel Access Protocol) [8] are examples.
• Common Hopping: Examples include CHMA (Channel Hopping Multiple Access) [25] and CHAT (Channel Hopping multiple Access with packet Trains) [24].
• Parallel Rendezvous: Examples of this approach include SSCH (Slotted Seeded Channel Hopping) [6] and McMAC [17].
The distinguishing feature of the Parallel Rendezvous family is that different sender and receiver pairs can meet simultaneously on different channels to start data exchange. Within the Parallel Rendezvous category, two protocols have been proposed in the research literature: SSCH and McMAC. In both protocols, each node has one radio, and different nodes can have different channel hopping sequences. Although they differ in the way the hopping sequences are used, both protocols require synchronization. McMAC assumes that all nodes are able to transform their local clocks to those of their one-hop neighbors, but they do not have to hop at the same boundaries; however, no details are given on how to achieve this. SSCH requires all nodes in the network to hop at the same boundaries. SSCH suggests using a synchronization protocol such as [10] to achieve global synchronization, without going into further details. However, the authors also mention that the retransmission strategy in SSCH (to achieve the equivalent of a broadcast) may not be compatible with the use of broadcast needed in a synchronization protocol such as [10]. Overall, we feel that the challenge of synchronization in the context of multi-channel MAC protocols has been overlooked. Fundamentally, as nodes operate in parallel on different channels, traditional synchronization protocols based on periodic broadcast do not work well.

2.2 Time Sync
Time synchronization is a long-standing problem in distributed systems. Two recent surveys of synchronization algorithms can be found in [2] and [19], which focus on wired and wireless networks respectively. In [2], Anceaume and Puaut define a taxonomy for clock synchronization algorithms along many dimensions. One dimension classifies synchronization algorithms as either external or internal. External algorithms synchronize every clock to an external real-time clock, while internal algorithms only try to make all clocks in the same network show the same value. For multi-channel MAC protocols, we study synchronization in a different context: nodes only need to track the hopping of their one-hop neighbors in order to communicate. Therefore, there is no need for every node in a network to agree on a common clock. By exploiting this fact, we can completely avoid convergence issues by synchronizing neighbor pairs independently. In pair-wise synchronization, nodes with faulty clocks affect only themselves and have no effect on the communication between healthy nodes. In fact, nodes do not even need to correct their local clocks to show the same value as their neighbors'. A node only needs to correctly predict the current clock values of its one-hop neighbors in order to predict their hopping sequences. In this paper, the term synchronization refers to the estimation of a neighbor's clock based on one's local clock, rather than the correction of local clocks. Consequently, nodes can tolerate frequent topology changes with ease. In this paper, we assume that the hopping sequence of a node depends only on its local clock value and its MAC address (i.e., its seed). Consequently, whenever the local clock wraps around, the hopping sequence also wraps around. Hence, one need not explicitly consider the case where the offset between a node's clock and a neighbor's drifts apart and eventually causes an overflow.
(However, the hopping sequence can wrap around much more frequently than the local clock, as is the case in our implementation, to reduce memory and computation requirements.)

Anceaume and Puaut [2] also classify synchronization algorithms as Deterministic, Probabilistic, or Statistical. Deterministic algorithms assume upper bounds on message delays, while the other two do not. Statistical algorithms further assume knowledge of the message delay distribution, such as its mean and variance. Our algorithm does not assume any deterministic or probabilistic bound on message delays. However, since synchronization is between one-hop neighbors, we do assume the link speed is known, so the one-way transmission delay can be compensated for. To mitigate the effects of timestamping errors, one can use simple averaging to combine multiple timestamp samples, as in [11].
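The pair-wise prediction idea described above, where a node never corrects its own clock but only maps its local clock to an estimate of a neighbor's, can be sketched as follows. This is our own minimal illustration (function and variable names are not from the paper); unsigned wrap-around handles the 32-bit clock overflow mentioned earlier.

```python
# Hypothetical sketch: map a local 32-bit clock reading to an estimate of a
# neighbor's clock. The skew/offset values would come from the
# synchronization algorithm; the names here are our own.
MASK = 0xFFFFFFFF  # 32-bit clock wraps around naturally


def predict_neighbor_clock(local_ticks, skew, offset_ticks):
    # skew: estimated neighbor ticks per local tick (very close to 1.0)
    # offset_ticks: estimated offset between the two clocks, in ticks
    return (round(local_ticks * skew) + offset_ticks) & MASK
```

Because the hopping sequence is a function of the clock value, predicting the neighbor's clock is enough to predict its current channel.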

More generally, Romer et al. [19] summarize three techniques that have been proposed to combine multiple timestamp samples to increase synchronization accuracy: linear regression [10], [15], PI controllers [16], and convex hulls [7]. We also use a variant of regression to mitigate timestamping errors. Our approach differs from existing regression-based approaches such as [10] and [15] in that we apply Recursive Least Squares (RLS) to the problem instead of using a fixed number of timestamps for linear regression. This approach has the benefits of using a constant amount of memory (regardless of the history forgetting factor) and of allowing updates to be computed recursively as new timestamps arrive. It also adapts to changes in the relative clock drift between neighbors. In addition, we consider many practical aspects of implementation, such as avoiding overflows and increasing computational efficiency. More information on RLS can be found in [12].

In general, pair-wise synchronization between one-hop neighbors in the context of multi-channel MAC protocols is a considerably simpler problem than global synchronization in the classical context, for the following reasons:
• If the clock of a node becomes faulty, it only affects itself (either as a sender or as a receiver). Communication among other nodes is unaffected.
• Queueing and medium access delays occur at the MAC layer of the sender. Since synchronization is part of the MAC, this delay is easily accessible to the synchronization algorithm.
• The propagation delay is bounded by the longest distance between two possible neighbors. In wireless systems, this distance is usually quite limited.
• The radio link speed is usually known, so the message transmission delay can be compensated for accurately.
However, time synchronization for the purpose of multi-channel MAC also poses a number of new challenges:
• Broadcast is very unreliable because, due to random hopping, only a small subset of devices is on the same channel as the sender at any time. As a result, it is very hard to guarantee that each neighbor receives periodic broadcast beacons. Algorithms relying on broadcasts cannot be used without modification.
• Synchronization algorithms not based on broadcasts usually require round-trip unicast communication between every neighbor pair. Such pair-wise communication incurs a high message overhead: the synchronization traffic scales quadratically with the number of nodes in an area. Eventually, synchronization traffic can affect data traffic when the network is congested or when the sender or receiver is very busy. Worse, neighbors may lose synchronization during such periods, leaving them unable to re-synchronize.
• Even watch-grade crystal clocks can drift tens of microseconds every second. Therefore, it is necessary to estimate and compensate for the drift between nodes. We will see in Sec.5 that this estimation and compensation process has to be continuous.
• Some radios, such as all 802.11 interfaces, provide hardware support for timestamping with an accuracy of a few microseconds, while others (e.g., IEEE 802.15.4 [22]) do not. When only software timestamping is available, the accuracy of timestamps cannot be easily guaranteed. As a result, synchronization algorithms must both detect and tolerate outliers in the timestamp samples.
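As a tiny illustration of the outlier tolerance just mentioned, a synchronization algorithm could screen incoming samples against its current prediction. The threshold below is our own illustrative choice, sized to the software timestamping error magnitude reported later in Sec.5; the function name is hypothetical.

```python
# Illustrative outlier screen (our sketch, not the paper's mechanism):
# discard a timestamp sample whose prediction residual is implausibly large.
# Sec.5 reports software timestamping errors of up to ~150 clock ticks.
def is_outlier(predicted_ticks, observed_ticks, max_residual_ticks=150):
    return abs(observed_ticks - predicted_ticks) > max_residual_ticks
```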

3. PLATFORM

3.1 Selection of a Platform
We investigated several potential hardware platforms, as summarized in Tab.1. First, there is a choice of controllers. Some platforms such as CaLinx[9] use an FPGA as the controller, while others, such as the motes [18], use micro-controllers (MCUs). The CaLinx platform is designed specifically for a digital logic design class at UC Berkeley. The Telos mote platform, also from UC Berkeley, is designed for low-power wireless sensor networking. In addition to the different controller options, we also had a choice of radio technologies, ranging from pure software radio [1] to ASIC 802.11 [14][26] and ASIC 802.15.4 [3] radios. For our purposes, we require programmability in all aspects of the MAC layer, but we do not care about the physical layer. Therefore, ASIC radios are a better choice than software radios because they are simpler to program. FPGA-based platforms offer higher performance than micro-controller-based ones, but they also require us to program at the Hardware Description Language level, which prolongs development time. FPGA-based boards are also more expensive. Finally, for ASIC radios, we had a choice between 802.11[21]- and 802.15.4[22]-based radios. IEEE 802.15.4 radios are designed for wireless sensor networks and other low data rate applications; for high-throughput systems, 802.11-based radios are therefore more suitable. However, to achieve high throughput, 802.11 chipsets tend to implement many 802.11-specific MAC features in hardware. As a result, the level of control at the MAC level is reduced, which is unacceptable because we require complete control of packet transmission times and packet formats. Tab.1 summarizes the pros and cons of the different platforms.

We decided to use a micro-controller-based platform with an ASIC radio. We found several such platforms with different features. In the end, we chose the Berkeley Telos Mote [23] platform running TinyOS.
The advantages of this platform are the availability of extensive free software and the relatively low cost of the hardware, which allowed us to buy tens of nodes. The downside of this platform is that the micro-controller is slow compared with the radio speed, thereby artificially limiting the throughput of a node.

3.2 Platform Information
Below, we summarize the features of the platform from the specifications and from micro-benchmarks we conducted to measure the performance of the hardware:
• Micro-controller: Telos uses a 16-bit micro-controller running at 4MHz with 10KB of RAM, and an ASIC DSSS radio conforming to the IEEE 802.15.4 standard rated at 250Kbps. Our MAC protocol was designed from scratch and is not related to the ZigBee standard. Each Telos mote has a USB connection for downloading programs and communicating with a PC. Finally, it has a 32768Hz crystal oscillator clock, yielding a resolution of 30.5µs.
• Channels: The IEEE 802.15.4 standard defines 16 channels in the 2.4GHz range. However, due to energy spill-over and imperfect filtering, one cannot use all 16 channels simultaneously without interference. Our experiments found that at most every other channel can be used simultaneously without interference. Hence, we use only up to 8 channels in all of our experiments.
• Channel Switching Time: The Chipcon CC2420 radio takes 300µs to switch from one frequency to another, as measured in our experiments. During that time, packets cannot be sent or received. More recent radios have dramatically reduced the time it takes to switch channels. For example, the newer Chipcon CC2500 2.4GHz radio [4] takes only 90µs to switch channels if the radio has been calibrated at startup.
• Frame Duration: On the Telos mote running TinyOS, we measured that it takes about 150µs after a packet is assembled in memory before it can be sent on air. Most of this time is spent sending the packet from the micro-controller to the radio chip over a serial bus bit by bit. On our platform, each TinyOS packet includes a physical length field, a MAC header, and a MAC payload. We use the MAC payload for our multi-channel MAC header, synchronization, and experiment data collection purposes. Beacon packets contain an 8-byte payload for a total size of 18 bytes, and probe/data packets contain a 28-byte payload (the default TinyOS maximum on our platform) for a total size of 38 bytes. At 250Kbps, beacon packets last 576µs and probe/data packets last 1216µs, excluding preambles.

                          FPGA             FPGA +       Micro-controller   Commercial 802.11
                          Software Radio   ASIC Radio   + ASIC Radio       Ref. Design
Example Platform          Polarizone SDR   CaLinx[9]    Telos[18],         Atheros[14],
                          Design Bench[1]               SmartStudio[5]     Maxim[26]
Programming Abstraction   HDL              HDL          C or Assembly      C or Assembly
PHY/MAC Programmability   PHY + MAC        MAC          MAC                Parts of MAC
MAC Source Availability   Little           Little       Some               Some
Development Time          Long             Long         Short              Short
Unit Cost (USD)           $1000's          $100's       ~$100              ~$20
Throughput                High             High         Lower              High

Table 1: Summary of comparison of different hardware platforms.
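The on-air durations quoted above follow directly from the packet sizes and the 250Kbps link rate; the check below reproduces the arithmetic (variable names are ours).

```python
# Sanity check of the on-air durations quoted above (excluding preambles).
RATE_BPS = 250_000                       # IEEE 802.15.4 link rate

beacon_us = 18 * 8 / RATE_BPS * 1e6      # 18-byte beacon packet
data_us = 38 * 8 / RATE_BPS * 1e6        # 38-byte probe/data packet
```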

4. PROTOCOL DESCRIPTION

In this section, we describe the details of the protocol we have implemented. The protocol is a simplified version of McMAC, but it retains all of McMAC's salient features to ensure that we do not miss any important implementation issues. The description of the protocol is divided into four parts: channel hopping, discovery, synchronization, and rendezvous. The protocol parameters have been chosen to suit our hardware platform.

4.1 Random Channel Hopping
Time is divided into Small Slots of 1/32768 seconds (i.e., 30.5µs) each; the duration of a slot equals one tick of the 32768Hz clock. The clock value is 32 bits long. Each Big Slot consists of 128 Small Slots and therefore lasts 3906.3µs. A Big Slot is divided into four windows: Channel Switching Time (12 Small Slots / 366µs), Guard Time (32 Small Slots / 977µs), Contention Window (32 Small Slots / 977µs), and Data Window (52 Small Slots / 1586µs). The artificially large guard time window is added to facilitate experiments that estimate the difference between the target transmission time and the actual transmission time. An idle node hops at Big Slot boundaries in a pseudo-random fashion over all available channels. The hopping boundaries of different devices do not coincide in general. Knowing the current clock value and the pseudo-random number generator seed of a node allows one to predict its future hopping sequence. The 32-bit local clock and the seed comprise the hopping signature of a node. Having different hopping boundaries allows different ad hoc networks to merge or split easily without requiring devices to re-align their time slots. Each device chooses its own seed randomly once at startup, and this seed never changes. In our implementation, we simply use the unique 16-bit MAC address as the seed. Since the processing power of our hardware is limited, we use only the first 128 hops from the beginning of the sequence and repeat after 128 hops. This way, a node can pre-compute the entire hopping sequence of a neighbor after learning the neighbor's seed.
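The seed-based hopping computation described above can be sketched as follows. The 8-channel, 128-hop, and 128-small-slots-per-big-slot parameters come from the text; the generator itself is an arbitrary illustrative choice (the paper does not specify which PRNG it uses), and the function names are ours.

```python
# Sketch of seed-based random channel hopping (PRNG choice is illustrative).
def hop_sequence(seed, length=128, num_channels=8):
    """Pre-compute the repeating hop sequence for a node's 16-bit seed."""
    state = seed
    seq = []
    for _ in range(length):
        # Simple 32-bit LCG stand-in for the (unspecified) generator.
        state = (1103515245 * state + 12345) & 0xFFFFFFFF
        seq.append(state % num_channels)
    return seq


def current_channel(seq, local_clock_ticks, small_slots_per_big=128):
    """Channel for the Big Slot containing the given clock value."""
    big_slot = local_clock_ticks // small_slots_per_big
    return seq[big_slot % len(seq)]  # sequence repeats after 128 hops
```

Because the sequence depends only on the seed and the clock value, a node that knows a neighbor's hopping signature can compute the neighbor's channel for any future Big Slot.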

4.2 Discovery
Discovery is the process by which devices learn about their current one-hop neighbors and their hopping signatures. Beacon packets contain the hopping signature of the sender: its 16-bit seed (the same as its MAC address) and a 32-bit local time stamp taken when the packet is sent. Upon receiving a beacon, the receiver can therefore immediately predict the future hopping sequence of the sender. After a device is turned on, it immediately starts broadcasting beacon packets periodically, with a random delay, while hopping on its home sequence to announce its presence. A neighbor is considered inactive if no beacon has been heard from it for over 1 minute. There are ways to optimize the discovery time both deterministically and probabilistically; however, due to space constraints, we do not explore other discovery mechanisms in this paper.

4.3 Synchronization
Synchronization refers to the process by which nodes estimate and compensate for the differences in their clock speeds and offsets. Nodes rely on the time stamps in beacons, not only for initial node discovery, but also for synchronization. Each incoming beacon contains the time stamp of the sender. On receipt, the receiver time stamps the packet using its local clock. Pairs of such corresponding sender-side and receiver-side time stamps are fed into a synchronization algorithm that estimates the current clock skew between the sender's and the receiver's clocks. We discuss the details of the synchronization algorithm in Sec.6.

4.4 Rendezvous
Rendezvous is the process in which a sender and a receiver agree on a channel for data transfer. In our protocol, the sender estimates the local time of the receiver at its target transmission time based on the clock skew information supplied by the synchronization algorithm. The local time of the receiver in turn tells the sender which channel the receiver will be on in the Big Slot containing that local time. To send a packet to a particular receiver, the sender first calculates the beginning of the receiver's next Big Slot. Each Big Slot contains a contention window of 32 Small Slots. Next, the sender picks a Small Slot within the contention window uniformly at random. The sender then immediately deviates from its home sequence by switching to the receiver's home channel in the receiver's next Big Slot. The sender waits until the desired time to transmit and listens for a carrier before transmission. If a carrier is sensed, it aborts the transmission and returns to the home channel of its current Big Slot.

5. UNDERSTANDING CLOCK ERRORS
In this section, we examine the clock drift behavior of our platform by collecting sample clock error data from the motes. In general, clock errors can be broken down into three components: (i) long-term average deviation of the clock speed from an ideal clock, (ii) short-term variation of the clock speed from the long-term average due to variable ambient temperature, and (iii) errors in the time stamping process.

5.1 Time Stamping Errors
To minimize the third kind of error, time stamping should be done in hardware. This feature is available in all 802.11-compliant radios, for example. On our platform, however, only software timestamping is available. We reuse the time-stamping techniques of Maroti et al. [15], who use the same TinyOS platform as ours. Since the time stamp is generated by software, we occasionally observe large time stamping errors of up to about 150 clock ticks (i.e., 4580µs). We measure time stamping errors by having multiple nodes time stamp the same packet and comparing the results against each other. The frequency of such errors increases as nodes become busier. If our algorithm is robust enough for systems with software-generated time stamps, we expect it to work well for all other systems as well.

5.2 Long Term Drift
We carried out an initial experiment to quantify the first kind of error on the Telos platform using 18 nodes all operating on the same channel. A beacon node sends out a packet with its local time stamp every 10 seconds. The packet is then received by all other 17 nodes. Each receiver time stamps the packet and forwards both the receive and send time stamps to an attached PC for logging. The resolution of the time stamp is 1/32768 seconds, or 30.5µs. Fig.1 shows the relative clock drift of each node with respect to the average of the 17 clocks. The x-axis shows the reference time in hours; the average of the 17 clocks is used as the reference time since we do not have an ideal reference clock. The y-axis shows the difference, in seconds, between the clock reported by a particular node and the average time stamp reported by the 17 nodes. Due to the definition of the reference clock, about half of the clocks are faster than the reference. The maximum drift between the fastest and the slowest clocks is about 0.6 seconds over a period of 8 hours, which amounts to about a 21 parts-per-million (ppm) maximum drift.

Figure 1: Relative clock drift of 17 nodes over 8 hours.
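The drift figure quoted above follows from a one-line computation; the check below reproduces it (variable names are ours).

```python
# Arithmetic check: 0.6 s of relative drift over 8 hours is roughly 21 ppm.
drift_s = 0.6
period_s = 8 * 3600
drift_ppm = drift_s / period_s * 1e6  # parts per million
```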

5.3 Short Term Drift Variation
The 17 curves in Fig.1 are virtually straight, but not exactly so. Therefore, each node still has to continuously estimate the drift of its neighbors' clocks using its own imperfect local clock. To illustrate the clock drift variation over time, we arbitrarily choose a pair among the 17 nodes and plot the receive time stamps reported by node j against those reported by node i. This is almost a straight line with a slope very close to 1. Then we fit a straight line through these points using a least-squares error (LSE) method, as illustrated in Fig.2. Finally, we subtract the straight line from the curve to see the estimation error. Fig.3 shows the residual estimation error for an arbitrarily chosen but typical pair (2,17). We summarize our findings from this experiment:
1. There is some fine noise in the estimation error. Since the clock resolution is 1 tick (or equivalently 30.5µs), estimation errors below one tick are caused by quantization. We cannot hope to correct such errors.
2. In general, most clocks have fairly good short-term stability, as the errors are clearly correlated over time. This is good news because it means the drift rate of these clocks varies slowly (at least when they experience about the same temperature). We can correct a large fraction of such estimation error by adaptively estimating the drift rate using recent history.
3. Among all the pairs, the difference in clock speeds varies by less than 1 ppm from the long-term relative drift rate, which is 1 to 2 orders of magnitude smaller than the long-term drift shown earlier in Fig.1. Even so, one cannot ignore this short-term variation because errors can build up over time.
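The straight-line fit and residual computation described above can be sketched as follows. This is our own pure-Python illustration with synthetic data (a 20 ppm skew plus a constant offset); names and values are not from the paper.

```python
# Sketch of the LSE fitting procedure: regress node j's receive time stamps
# on node i's for the same beacons, then inspect the residuals.
def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b0 = (sy - b1 * sx) / n
    return b0, b1


# Synthetic stamps: node j's clock runs 20 ppm fast with a 0.5 s offset.
xs = [0.0, 10.0, 20.0, 30.0, 40.0]          # node i's stamps (seconds)
ys = [x * (1 + 20e-6) + 0.5 for x in xs]     # node j's stamps

b0, b1 = fit_line(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
```

With real traces, the residuals are what Fig.3 plots; here the data is exactly linear, so they vanish up to floating-point error.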


6. TIME SYNCHRONIZATION
In the previous section, we illustrated the clock drift behavior between neighbors. This understanding allows us to investigate different clock synchronization algorithms in this section. We first collect send and receive time stamp pairs and then use them to compare the performance of different clock synchronization algorithms through emulation on a PC. Finally, we fine-tune the algorithm for an efficient and robust implementation on our platform and validate the results through actual experiments. This approach greatly simplifies programming because we have the full resources of a PC rather than an embedded micro-controller while exploring different algorithms.

Figure 2: For each of the beacon packets received by both i and j, we plot the receive time stamps reported by j against those reported by i. Next, we fit a straight line through them. The difference between the straight line and the actual curve is the estimation error.

6.1 Time Stamp Trace Collection
In this section, we describe a set of experiments designed to capture time stamp traces for emulating different synchronization algorithms. We set up 16 nodes within range of one another, each sending out beacon packets every 10 seconds. Each beacon packet contains the following information:
1. SrcID (2 bytes): Node ID of the sender.
2. SrcSeqNum (2 bytes): Sequence number of the beacon packet from this sender.
3. SrcClock (4 bytes): 32-bit local time stamp of the sender when the packet is sent.
Upon reception, the receiver sends a log packet containing the received packet and the above information over the USB port to an attached PC. Therefore, when a node beacons, it triggers a total of 15 log packets, one from each of its neighbors. The packets contain sufficient information to emulate any synchronization algorithm based on nodes periodically broadcasting beacon packets. We can also detect packet losses in our experiment.

Figure 3: Residual error of estimation obtained by regressing time stamps of node 2 on time stamps reported by node 17.

6.2 MMSE

A standard approach to clock synchronization is an iterative Minimum Mean Squared Error (MMSE) estimator over a recent history of a fixed number of time stamps, as in [10] [15]. Let y_i denote the sender's time stamps contained in the beacon packets and x_i the receiver's time stamps, where the index i is the sequence number of a beacon packet. We model the relationship between the two as:

y_i = s_xy (x_i − x_o) + ε_i

where ε_i is the error due to time stamping (on both the sender and the receiver sides), s_xy is the ratio of the speed of y's clock to that of x's, and x_o is the value of x's clock when y's clock starts at 0. It is reasonable to try the MMSE estimator using a variable number of recent send/receive time stamp pairs. The estimation process works as follows. The senders send out beacon packets containing the senders' local time stamps (the y_i's). The receivers time stamp each beacon packet (the x_i's). Based on the recent history of (x_i, y_i) pairs, the receiver predicts the sender's time stamp in the incoming packet. This estimate is called ŷ_i, and the estimation error is y_i − ŷ_i. The standard error of estimation is defined as the square root of the mean of (y_i − ŷ_i)². Since there are 16 nodes, there are 16×15 = 240 sender/receiver pairs. A synchronization algorithm being tested is run between every pair, yielding 240 standard error figures. We use the median and the maximum of these 240 numbers as the figures of merit for different synchronization algorithms. Some devices such as the motes have very limited memory and processing power; thus, it might not be feasible to remember a large number of previous sample points for regression. However, for

estimators of the form: yˆi = bo + b1 xi . We define the objective of the regression to be minimizing the residual sum of squares:

Median and Maximum Standard Errors in Prediction among All Node−Pairs

0.8

0.75

RSS =

0.7

n X

γ n−i (yi − yˆi )2 =

0.6

By setting

γ n−i (yi − bo − b1 xi )2

i=1

i=1

0.65

n X

∂RSS bo

and

∂RSS b1

to 0, we have:

0.55

βˆo

0.5

n X

γ n−i + βˆ1

i=1

0.45

0.4

βˆo

0.35

0

10

20

30 40 50 60 70 Number of Previous Sample Points Used in Prediction

80

90

γ n−i xi + βˆ1

In the previous section, we assumed that the nodes can remember as many sample points as necessary, which is not true for mass produced low-cost wireless devices. For example, our motes have 10KB of memory to be shared by all applications and the operating system. Each time stamp pair takes 8 bytes. Assuming a beacon interval of 10 seconds, achieving the optimal estimation error requires keeping 30 points or 240 bytes per neighbor. On a different hardware platform, perhaps even more history needs to be remembered. This severely limits the number of neighbors possible. We seek an algorithm that keeps a smaller constant amount of memory per neighbor and one that allows an incremental update when a new time stamp pair is observed. In this section, we investigate a recursive estimation algorithm that gives earlier points geometrically lesser weight depending on age instead of equal weights. Using the same notation as before, xi and yi represent the receiver’s and sender’s time stamps. yˆi denotes the receiver’s estimate of the sender’s time stamp. We restrict ourselves to linear

n X

γ n−i yi

(1)

γ n−i xi yi

(2)

i=1

γ n−i x2i =

n X i=1

i=1

n X

γ n−i ; Sx,n =

i=1

6.3 Recursive Estimation

n X

Solving (1) and (2) directly would yield the desired βˆo and βˆ1 , but it would require remembering all xi ’s and yi ’s as before. Instead, to allow recursive computation of βo and β1 , we first define the followings: S1,n =

now, we ignore the implementation constraints and assume that devices are not resource constrained. We want to know if MMSE performs well enough for our application, given that clock drift rates are not constant, and that time stamping errors are not necessarily Gaussian. In this section, we use the most recent n pairs of local (receiver’s) and remote (sender’s) time stamps to feed into a standard MMSE estimator. By varying n, we attempt to find the optimal number of points that should be used to achieve the highest accuracy. If the estimator uses too few points, the error in the time stamp samples would dominate and reduce accuracy. However, if too many points are used, the estimator cannot quickly track shortterm variation of clock speed. Fig.4 shows the maximum (upper curve) and the median (lower curve) estimation errors as the number of data points used in regression varies from 5 to 100. As expected, the errors drop at first but eventually rise again. The optimum occurs around 30 data points, which corresponds to 300 seconds in real-time since the beacons are 10 seconds apart unless there is packet loss. Packet loss ratios between pairs range from 0% to 4% with most being below 1%. Therefore, the optimal synchronization algorithm based on MMSE should use about 5 minutes of history. Overall, MMSE works well with the right number of data points.

γ n−i xi =

i=1

i=1

100

Figure 4: Maximum and Median Standard Error of Estimation among All Node Pairs vs. History Length. Data points are 10 seconds apart.

n X

n X

n X

γ n−i xi ; Sy,n =

i=1

Sxy,n =

n X

γ n−i xi yi ; Sx2 ,n =

i=1

n X

γ n−i yi

i=1 n X

γ n−i x2i

i=1

The immediate consequence of the definition is that S1,n , Sx,n , Sy,n , Sx2 ,n , and Sxy,n can be calculated recursively. That is, Sx,n = Sx,n−1 γ + xn S1,n = γS1,n−1 + 1 Sy,n = γSy,n−1 + yn Sx2 ,n = γSx2 ,n−1 + x2n Sxy,n = γSxy,n−1 + xn yn . Therefore, we can incrementally update S1,n , Sx,n , Sy,n ,Sx2 ,n , and Sxy,n without having to keep any time stamps after they have been used to update these 5 variables. The amount of storage needed per neighbor is reduced from the original 30 time stamp pairs (i.e., 60 numbers) ndown to 5 numbers. Furthermore, as n increases, 1 approaches 1−γ which is a constant requiring no S1,n = 1−γ 1−γ updates. The normal equations (1) and (2) can be rewritten as: βˆo S1,n + βˆ1 Sx,n = Sy,n

(3)

βˆo Sx,n + βˆ1 Sx2 ,n = Sxy,n

(4)

Solving (3) and (3), we have: Sx,n Sxy,n − Sx2 ,n Sy,n βˆo = 2 Sx,n − Sx2 ,n S1,n

(5)

Sx,n Sy,n − S1,n Sxy,n βˆ1 = 2 Sx,n − Sx2 ,n S1,n

(6)

Using iterative MMSE, one design parameter is the number of points to remember for the estimation. In recursive estimation, the

Similarly,

1

Sz 2 ,n = Sx2 ,n − 2kSx,n + k2 S1,n 0.9

Std. Error of Estimation (ticks)

0.8

Sw,n = Sy,n − cS1,n 0.7

Szw,n = Sxy,n − kSy,n − cSx,n + kcS1,n

0.6

To further simplify, we choose k = Sx,n /S1,n and c = Sy,n /S1,n . Therefore,

0.5

0.4

0.5

Sz,n = 0 0.55

0.6

0.65

0.7

0.75 Gamma

0.8

0.85

0.9

0.95

1

Sz 2 ,n = Sx2 ,n − Figure 5: Maximum and Median Standard Error of Estimation among All Node Pairs vs. γ. A larger γ gives more weight to earlier points.

forgetting factor γ is important. Fig.5 shows the effect of using different values of γ in the estimation accuracy. The x-axis shows the value of γ used, while the y-axis shows the median and maximum standard error of estimation among all pairs. In general, the performance of recursive estimation is not as good as iterative MMSE. However, in practice, both algorithms produce an estimate with a standard error that is less than 1 clock tick. Any error below 1 clock tick is completely overwhelmed by the quantization error. Therefore, there is no practical difference in terms of performance. Also notice that as γ increases the estimation error first reduces slowly, reaches an optimum around 0.925, and then increases very quickly. Therefore, it is safer to err on the side of a smaller γ. It is much more detrimental to forget too slowly than to forget too quickly.

6.4 Recursive Estimation without Overflow The solution presented in equations (5) and (6) are not suitable for direct implementation because of overflows. As n increases, Sx,n , Sy,n , Sx2 ,n , and Sxy,n all grow without bound and will evenP n−i tually overflow. Take Sx,n = n xi as an example. First i=1 γ we need to address the wrapping-around of xi . Second, as time goes on, the local clock of the receiver xi increases. The sum Sx,n is at least as big as xi because the weight γ n−i is equal to 1 for the most recent sample. Eventually, the variable containing Sx,n overflows. An important observation is that only the relative values of xi and yi are important. Adding a constant offset to the clocks of the senders and receivers does not change the essence of the regression. The motivation for adding an offset to xi is such that by carefully choosing the offset, we can make the most recent samples of xi to be close to 0, and earlier samples take on large negative values. In the sum, earlier sample points have a geometrically decreasingP weight that quickly approaches zero. Therefore the sum n−i Sx,n = n xi becomes bounded. i=1 γ To illustrate, suppose we add an offset of −k and −c to xi and yi respectively. This amounts to a change of variables from xi and yi to zi = xi − k and wi = yi − c. We want to compute Sz,n , Sz 2 ,n , Sw,n , and Szw,n using Sx,n , Sy,n , Sx2 ,n , and Sxy,n without redoing the recursive computation from scratch. n X i=1

γ n−i zi =

n X i=1

γ n−i (xi − k) = Sx,n − kS1,n

2 Sx,n S1,n

Sw,n = 0 Szw,n = Sxy,n −

Sz,n =

(7)

Sx,n Sy,n S1,n

(8) (9) (10)

After the change of variables, we can calculate zˆi and get yˆi by adding c back to it. The result is exactly the same as if we had never changed the variables. Furthermore, if we change our variables twice, first by an amount of (k1 , c1 ), then again by (k2 , c2 ), the result is the same as changing the variables by an amount of (k1 + k2 , c1 + c2 ). The beauty of this is that if we change variable after each iteration by making Sw,n and Sz,n both zero and then renaming z as x and w as y, we can avoid overflowing Sx,n , Sy,n , Sx2 ,n , and Sxy,n completely. Algorithm.1 shows the complete recursive algorithm for updating the sums and calculating yˆi .
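The change of variables in (7)–(10), applied after every update and followed by the renaming step, is what keeps all running sums bounded. A Python sketch of the resulting loop (essentially the recursive algorithm above); with noise-free linear inputs the estimator recovers the exact slope:

```python
class RecursiveEstimator:
    """Sketch of the recursive forgetting-factor regression with a change
    of variables after every update, so the running sums stay bounded."""

    def __init__(self, gamma=0.9):
        self.gamma = gamma
        self.n = 0
        self.k = self.c = 0.0              # accumulated offsets (weighted means)
        self.S1 = self.Sx2 = self.Sxy = 0.0
        self.beta1 = 0.0

    def predict(self, x):
        """Estimate the sender's time stamp for a packet received at local
        time x; meaningful once at least two pairs have been absorbed."""
        return self.beta1 * (x - self.k) + self.c

    def update(self, x, y):
        self.n += 1
        if self.n == 1:                    # initialization from first pair
            self.k, self.c, self.S1 = x, y, 1.0
            return
        g = self.gamma
        xp, yp = x - self.k, y - self.c    # shifted coordinates
        self.S1 = g * self.S1 + 1.0
        Sx, Sy = xp, yp                    # previous Sx, Sy are zero by design
        self.Sx2 = g * self.Sx2 + xp * xp
        self.Sxy = g * self.Sxy + xp * yp
        # Change of variables (7)-(10): shift by the new weighted means so
        # Sx and Sy return to zero, keeping every sum bounded.
        Sz2 = self.Sx2 - Sx * Sx / self.S1
        Szw = self.Sxy - Sx * Sy / self.S1
        self.k += Sx / self.S1
        self.c += Sy / self.S1
        if Sz2 > 0:
            self.beta1 = Szw / Sz2
        self.Sx2, self.Sxy = Sz2, Szw      # renaming step
```

Feeding exact pairs from y = 2x + 5 drives beta1 to 2 and makes predict() exact, since the slope and the shifted offsets are invariant under the change of variables.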

6.5 Computational Efficiency

So far we have ignored the problem of computational efficiency. Since the time stamps are 32 bits long, we have carried out all arithmetic in 64-bit double-precision floating point numbers to prevent loss of precision. However, many low-cost platforms, including our Telos motes, do not support floating point numbers in hardware. Worse, the GNU C compiler we use supports only single-precision floating point emulation for our motes. If we used only single-precision floating point numbers to implement Algorithm 1, the estimation error would become unacceptably large. To overcome this, we rely on several important observations. First, y_i increases at 32768 ticks/second, but y_i − x_i grows at the relative drift rate between two neighbors' clocks, which was measured to be no more than 1 tick per second in Sec. 5.2. Instead of regressing the sender's time stamp y_i on the receiver's time stamp x_i, we can regress y_i − x_i on x_i. Algorithm 1 can be readily reused by feeding it (y_i − x_i, x_i) instead of (y_i, x_i). As a result, the estimated sender's time stamp is now ŷ_i + x_i instead of ŷ_i. By using y_i − x_i, the magnitudes of S_y, S_w, S_xy, and S_zw are smaller by a factor of 2^14. Unfortunately, the optimization just mentioned does not reduce the magnitude of S_x². In order for all calculations to fit in single-precision floating point numbers without much loss of precision, we need to reduce x_i. Note that the synchronization algorithm regresses y − x on x, but the magnitudes of these two numbers are vastly different, because they increase at vastly different rates. The quantity y − x represents the drift between the clock of the sender and that of the receiver. Therefore, its magnitude grows at the drift rate between the two nodes. Under normal operating conditions, we measured the drift to be no more than 30ppm. Since each tick is 30.5µs, the magnitude of y − x increases by no more than 1 every second. In contrast, x is the receiver's clock, which increases by 32768 every second. Therefore, x is a much larger quantity than y − x, with many more bits of precision. However, since y − x has only a few bits of precision, having many extra bits of precision in x does not improve the accuracy of the prediction. We exploit this fact and ignore the 12 least significant bits of x in our regression calculation with no significant loss in prediction accuracy. In addition, the most significant bits turn out to be unimportant as well. As an example, suppose that neighbors are timed out after 2 minutes. In 2 minutes, only the lower-order 21 bits may change. Therefore, one can safely ignore the 11 higher-order bits of the 32-bit time stamp without any confusion due to wrap-around. Combining these two observations about x, we can throw away the upper 11 bits and the lower 12 bits of the 32-bit time stamps. Therefore, the value of x can now fit in 9 bits instead of 32. As a result, we have reduced the magnitude of S_x² by a factor of 2^((32−9)×2) = 2^46 without significant increase in estimation error. Using a combination of the above-mentioned techniques, we are able to carry out all the calculations specified in Algorithm 1 using only integers and single-precision floating point numbers.

Algorithm 1: Recursive estimation algorithm without overflow.

    /* Initialization */
    γ = 0.9                      /* Tunable parameter */
    READ (x1, y1)                /* From first beacon packet */
    i = 1                        /* Number of beacons received so far */
    k = x1
    c = y1
    S1 = 1
    Sx = Sy = Sx2 = Sxy = 0
    while (True)
        READ (xi, yi)            /* From incoming beacon */
        i = i + 1
        x' = xi − k
        y' = yi − c
        if (i ≥ 3)               /* Predict after 2 points */
            ŷi = β̂1 x' + c      /* Estimated send time */
            εi = yi − ŷi         /* Estimation error */
        endif
        /* Update sums */
        S1 = γ S1 + 1
        Sx = x'
        Sy = y'
        Sx2 = γ Sx2 + x'²
        Sxy = γ Sxy + x' y'
        /* Change of variables */
        Sz2 = Sx2 − Sx²/S1
        Szw = Sxy − Sx Sy/S1
        k = k + Sx/S1
        c = c + Sy/S1
        β̂1 = Szw/Sz2
        /* Renaming variables */
        Sx2 = Sz2
        Sxy = Szw
    end while

7. EXPERIMENTAL EVALUATION

7.1 Overview

In this section, we present the experimental results of our protocol. We first evaluate the performance of the time synchronization mechanism on its own. We then evaluate the ability of a node to accurately track the hopping pattern of a neighbor. The two major differences between the experimental results presented in this section and the emulation presented in the previous one are: (i) nodes now hop according to their own random hopping sequences rather than all staying on the same channel, and (ii) all algorithms run on the motes in real time rather than on a PC. Due to hopping, a node cannot hear a beacon from a neighbor unless it is on the same channel at the time of the beacon; beacons are therefore much less frequent. In addition, the inter-beacon arrival times are random. These two effects place greater demand on the robustness of the synchronization algorithm. In these experiments, only beacon packets are used for synchronization. In an actual network, nodes will also send data packets, each of which can piggy-back a 4-byte sender time stamp for synchronization. We expect the synchronization accuracy in an actual network carrying traffic to be better than in the experiments presented here, because a continuous stream of data packets guarantees that neighbors who frequently communicate are well synchronized.

7.2 Time Synchronization Experiment

In this experiment, there are 16 nodes in the network. All nodes are in range of one another. The nodes hop over M channels according to their own random hopping sequences. After powering on, each node begins broadcasting beacons once every 1 or 3 seconds with a random delay. Each beacon packet contains the 16-bit address of the sender and the time stamp txTime when the packet is sent. The sender senses the medium right before sending the beacon. If the medium is busy, the beacon is simply skipped. When a node receives the first beacon from a neighbor, it remembers this sender's address and time stamp, which are used to predict the sender's channel in the future. The receiver also records the local time rxTime when the packet arrives. Subsequently, when the node receives another beacon from this neighbor, it uses the send and receive time stamp pairs to (i) update the clock offset and drift estimation of this neighbor by running the synchronization algorithm and (ii) gauge the accuracy of synchronization between them. To gauge the accuracy of time synchronization, the receiver uses rxTime to estimate the sender's txTime. Each node collects statistics on the difference between the estimated sender time estiTxTime and the actual txTime and reports them at the end of the experiment to a PC for processing. On our platform, since the time stamps are generated by software, they occasionally have very large errors. These outliers, if accepted, severely reduce the accuracy of the synchronization algorithm. Since our earlier experiments showed that the crystal clocks on our platform do not drift more than 30ppm under our operating conditions, the time synchronization algorithm heuristically rejects (but counts) any beacon that indicates a clock drift of over 30ppm since the last accepted beacon from this sender. This simple heuristic successfully removes the vast majority of outliers encountered.

7.3 Time Synchronization Results

Table 2 shows the results of the synchronization experiment with 16 nodes. In general, nodes are able to track their neighbors' clocks to within 6 ticks (i.e., 181.5µs) 99.9% of the time when each node beacons once every second. As expected, the accuracy of synchronization goes down as the number of channels increases or the beacon interval increases. When the number of channels is M, each time a node beacons, each of its neighbors hears it with a probability of 1/M (ignoring secondary effects such as packet corruption). Therefore, as M increases, the average interval between receiving beacons from any neighbor increases. Each node predicts the sender's clock value contained in a beacon packet when it is received. When beacon arrivals are farther apart, the average time since the last beacon is longer, so the prediction is based on older information, resulting in more error. In addition, an increase in the interval between beacons has two other subtle effects. First, for a fixed forgetting factor γ, each received time stamp has a longer-lasting effect since updates are less frequent. Second, the probability of accepting an outlier increases. Consider the first effect. Suppose the optimal forgetting factor is γ̂ when a node receives a beacon from its neighbor every v seconds on average. When the average interval increases from v to v′, using the same forgetting factor γ̂ is no longer optimal: forgetting becomes too slow. For a comparable real time before a time stamp is forgotten, the forgetting factor must be reduced to γ̂^(v′/v) to speed up forgetting. Therefore, in our experiments nodes adapt the forgetting factor to M and v using the formula γ = (0.9)^(vM/4). The second effect is much harder to compensate for. At the core of the problem is that on our platform, software, rather than hardware, generates the time stamps. Software time stamping is prone to large errors. Under our operating conditions, neighbor clocks do not drift more than 30ppm, or equivalently 1 tick per second. If updates are seconds apart, two clocks can drift by no more than several ticks.
Therefore, one can easily identify packets with time stamp errors because their errors tend to be 50 ticks or more. The problem is that when updates are farther apart, it becomes intrinsically harder to distinguish packets with erroneous time stamps from packets whose time stamps differ widely due to a fast-varying clock. As a result, when the beacon interval is 3s, the percentage of packets rejected as outliers is smaller. One of the fields in Tab. 2 is called "% beacons (init)". This field shows the percentage of beacon packets that arrive at a neighbor either for the first time or after a timeout. By default, a neighbor is timed out if it has not been heard from for about 1 minute. Since the beacon arrival times are random, there is a chance that a neighbor has not been heard before the timeout. When beacons are sent 3 seconds apart and there are 8 channels, the average interval between beacon arrivals from a particular neighbor is 24 seconds. Due to the pseudo-random hopping sequences, the probability that a node has not heard a neighbor before a timeout is high enough that 16.58% of beacon packets arrive from a neighbor that has been timed out. This clearly shows that the beacon period should be reduced. Even in this case, however, the root-mean-squared synchronization error is less than 3 clock ticks.
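The 30ppm rejection rule can be sketched as a check on consecutive accepted time stamp pairs. The function name and the two-tick quantization slack are assumptions for illustration; the source only states the ppm bound:

```python
TICKS_PER_SECOND = 32768   # 30.5us ticks
MAX_DRIFT_PPM = 30         # measured upper bound on relative clock drift

def accept_beacon(last_rx, last_tx, rx, tx, max_ppm=MAX_DRIFT_PPM):
    """Return True if the new (rx, tx) time stamp pair implies a drift of
    at most max_ppm relative to the last accepted pair."""
    elapsed = rx - last_rx               # receiver ticks since last accept
    if elapsed <= 0:
        return False
    drift = (tx - last_tx) - elapsed     # sender ticks minus receiver ticks
    # Allow +/-2 ticks of slack for time stamp quantization (assumed margin).
    return abs(drift) <= elapsed * max_ppm / 1e6 + 2
```

A beacon with a 50-tick time stamp error over a one-second interval fails the test, since 32768 × 30e-6 is about 1 tick of legitimate drift, far below 50 even with the slack.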

Table 2: Summary of Time Synchronization Experiments

    No. of Nodes               16      16      16      16
    No. of Channels             4       4       8       8
    Beacon Interval [s]         1       3       1       3
    Forgetting Factor γ       0.9   (0.9)³  (0.9)²  (0.9)⁶
    # beacons sent (total)  28450    9512   28442    9509
    # beacons received      90227   30062   44025   14871
    % beacons accepted      99.31   97.03   98.54   83.16
    % beacons rejected       0.42    0.24    0.36    0.27
    % beacons (init)         0.27    2.73    1.10   16.58
    RMS Sync Error [ticks]  1.262   1.346   1.171   2.624
    % w/ ≤ 3 tick Error     98.67   99.04   99.64   93.13
    % w/ ≤ 6 tick Error     99.99   99.64   99.90   96.85

7.4 Channel Tracking Experiments

In general, the ability of a sender to accurately track the current channel of its receiver depends on the quality of the time synchronization. However, this relationship is not straightforward. For example, if the sender is late, the packet might still arrive before the receiver hops again; the packet is then received properly even though the synchronization is not perfect. In this set of experiments, 15 nodes periodically broadcast beacon packets and synchronize with one another as in the synchronization experiments. In addition, each node sends and receives probe packets, which are unicast packets addressed to a particular node ID. The simultaneous sending of beacons and probes is meant to emulate different devices trying to discover their neighbors, synchronize with them, and send data packets to them in a multi-channel network. The traffic load on the network, however, is light. As a sender, a node periodically picks a random neighbor node ID (1 through 15) that is currently synchronized and schedules a probe to that neighbor. It schedules the packet to begin only within the receiver's Contention Window (see Sec. 4.1). To achieve this, the node converts its local clock time to the neighbor's time frame and picks a random Small Slot that falls within the Contention Window of the neighbor's next Big Slot. It then converts the target receive time back to its own time frame and schedules the probe to be sent at precisely that time. The node then immediately switches to the neighbor's home channel in the upcoming Big Slot and waits. The node does a carrier sense just before sending the probe. If the medium is busy, the probe is aborted. A probe is counted as sent only if it is actually transmitted over the air. The sender node then logs to the PC these relevant fields: srcID, destID, probeSeqNo, targetRxTime, where srcID is the sender node's ID and probeSeqNo is the per-receiver probe sequence number. Each node, upon receiving a probe, writes to the PC these fields: srcID, destID, probeSeqNo, where destID is its own node ID; srcID and probeSeqNo are copied from the incoming probe.
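The sender-side scheduling converts between clock frames using the linear map produced by the synchronization estimator. A sketch under assumed slot sizes (the Big Slot, Small Slot, and Contention Window lengths here are illustrative placeholders, not the protocol's actual values):

```python
import random

BIG_SLOT = 1024    # assumed ticks per Big Slot
SMALL_SLOT = 8     # assumed ticks per Small Slot
CW_SLOTS = 16      # assumed Contention Window length, in Small Slots

def to_neighbor(t_local, slope, offset):
    """Map a local clock value into the neighbor's time frame using the
    slope/offset produced by the synchronization estimator."""
    return slope * t_local + offset

def to_local(t_neighbor, slope, offset):
    """Inverse map: neighbor time frame back to the local clock."""
    return (t_neighbor - offset) / slope

def schedule_probe(now_local, slope, offset, rng=random):
    """Pick a random Small Slot inside the Contention Window of the
    neighbor's next Big Slot; return (local send time, neighbor target)."""
    now_n = to_neighbor(now_local, slope, offset)
    next_big = (int(now_n) // BIG_SLOT + 1) * BIG_SLOT
    target_n = next_big + rng.randrange(CW_SLOTS) * SMALL_SLOT
    return to_local(target_n, slope, offset), target_n
```

Round-tripping the scheduled send time through the clock map lands exactly on the chosen neighbor-frame slot, which is the property the probe experiment relies on.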

7.5 Channel Tracking Results

Table 3 shows the results of the channel tracking experiments. We vary the beacon interval and the probe interval for different columns. Each column shows the average result of 4 runs of 20 minutes each. Note that we have fixed the forgetting factor γ at (0.9)⁶ because the performance of our protocol is insensitive to it. Overall, the synchronization accuracies are comparable to those in the synchronization-only experiments (Tab. 2). Measured synchronization errors between nodes are less than 6 ticks for over 99% of beacon packets in all 4 runs. However, the probe reception rates are not as high as the synchronization accuracy would suggest. Through a series of additional experiments, we determined that a large percentage of lost probes are actually sent on the correct channel at the desired time, and even detected by the receiver. However, these packets are not eventually passed to the software, due either to a failed CRC check or to software bugs. We quantified the number of such packets by logging the time periods during which the radio is receiving a packet. Our radio has a dedicated pin, called the SFD pin, which tells the software whether the radio thinks it is in the process of receiving a packet. During the experiments, each node keeps track in flash memory of the periods during which the SFD pin is on. At the end of the experiment, these periods are sent to a PC for post-processing. For each missed packet, we compare the time when the packet was supposed to arrive with the periods when the receiver's radio was actually receiving. We discovered that out of the 5.27% to 7.24% of probe packets lost, a majority (3.39% to 5.02%) were actually detected by the hardware. These packets were detected within 5 ticks of their target receive time and have a matching length on the air. The hardware also confirms that the destination address matches that of the receiver. Therefore, we are confident that these detected packets are the matching lost probe packets. Since most lost probe packets are actually detected by the hardware, we conclude that the synchronization and rendezvous mechanisms are not the culprit for the majority of losses. Instead, we suspect a bug in the radio control software causes the errors. We include in the last two rows of Tab. 3 the number of such detected probes and the combined percentage of packets received or detected. Overall, we can confirm that over 96% of probe packets are sent at the right time on the correct channel. We therefore conclude that our synchronization algorithm achieves the desired level of synchronization for multi-channel MAC protocol operation.

Table 3: Channel Tracking Results

    No. of Nodes                 15      15      15      15
    No. of Channels               4       4       4       4
    Beacon Interval [s]           1       1       3       3
    Probe Interval [s]            1       3       1       3
    Forgetting Factor γ      (0.9)⁶  (0.9)⁶  (0.9)⁶  (0.9)⁶
    # sent beacons            17313   17484    5717    5778
    # received beacons        50237   50784   16446   16810
    % beacons accepted        99.05   99.24   96.05   96.65
    % beacons rejected         0.46    0.33    0.49    0.19
    % beacons (init)           0.43    0.42    3.38    3.16
    RMS Sync Error [ticks]    1.168   1.140   1.670   1.545
    % w/ ≤ 3 tick Error       99.54   99.62   98.52   98.78
    % w/ ≤ 6 tick Error       99.93   99.95   99.38   99.45
    # probe packets sent      14020    4659   13852    4674
    # probe packets received  13005    4382   12889    4428
    % probe received          92.76   94.04   93.05   94.73
    # lost probes detected      560     158     696     183
    % probe recv'd+detected   96.76   97.43   98.07   98.64
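The post-processing step that matches a missed probe against the logged SFD-busy periods can be sketched as an interval search; the 5-tick tolerance comes from the text above, while the function interface is illustrative:

```python
def match_lost_probe(target_rx, busy_periods, tol=5):
    """Return True if some logged SFD-busy period starts within tol ticks
    of the probe's target receive time, i.e., the radio detected a frame
    at the expected moment even though software never delivered it."""
    return any(abs(start - target_rx) <= tol for start, _end in busy_periods)
```

In the actual post-processing, the on-air length of the busy period and the destination address reported by the hardware are also checked before a lost probe is counted as detected.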

7.6 Generality of Solution

To discover practical problems in implementing a multi-channel MAC protocol, we had to implement our protocol on a particular hardware platform. However, our platform is a typical one, consisting of a watch-grade crystal oscillator connected to a low-power microcontroller and an ASIC radio. Therefore, we believe that the challenges we face are not unique to our particular hardware platform but are rather general. Our synchronization algorithm has 4 parameters: the maximum allowed drift rate between neighbors, the forgetting factor (γ), the beacon interval, and the neighbor timeout. These parameters should be adapted at design time to fit the hardware platform, but they need not be optimal for the multi-channel MAC protocol to operate. In this section, we discuss how these parameters should be adapted to a new platform and the tradeoffs between generality and synchronization performance.

Maximum Allowed Drift. Our synchronization algorithm uses the maximum drift parameter to help detect outliers, which are caused by occasional large time stamping errors in software. A large maximum allowable drift may cause a receiver to admit time stamps with a large error into its synchronization algorithm. However, a small allowable drift may cause a receiver to reject valid time stamps when the actual drift rate between two neighboring clocks is larger than expected (e.g., when the temperatures of the two nodes are vastly different in an outdoor environment). The tradeoff between a larger admissible drift range and better rejection of outliers is inherent. Fortunately, the accuracy of a crystal oscillator in terms of ppm deviation from real time can be obtained from the manufacturer of the crystal. This number can also be determined experimentally as the maximum rate difference between the fastest and slowest clocks. For better robustness, this maximum drift rate should be multiplied by a safety factor.

Forgetting Factor. The forgetting factor represents a tradeoff between using more sample points (i.e., a longer history) to reduce estimation error and using a shorter history to adapt more quickly to short-term variation in the relative drift on the order of minutes. If minimizing synchronization errors is the goal, one should tune the forgetting factor to match the stability characteristics of the crystal clocks used and the interval between receiving beacons. Fortunately, synchronization only has to be accurate enough for two neighbors to rendezvous with high probability. The results in Sec. 6.3 show that the effect of optimizing the forgetting factor is negligible relative to the resolution of the clock, so precise optimization is unnecessary, as shown in Sec. 7.4.

Beacon Interval and Neighbor Timeout. The beacon interval controls how often a node broadcasts (one-hop) an empty packet containing its current clock value and its hopping seed (same as its MAC address). The neighbor timeout controls how long a neighbor can remain in the neighbor table after the last packet is received from it. The results in Sec. 7.4 show that beaconing once every few seconds is more than sufficient for synchronization purposes. The real reason for more frequent beacons is to help neighbors detect link changes faster. Assuming that there are M channels and each node sends k beacons per timeout period, the probability of timing out a neighbor because all beacons sent by it are lost due to random hopping is:

    p_timeout = ((M − 1)/M)^k

One can easily calculate the minimum beacon interval required to achieve the desired probability of timeout. In an actual network carrying traffic, since each data packet can carry an extra 4-byte time stamp, the dependence on beacons is further reduced. In summary, even though there are four parameters available for tuning, optimal settings of these parameters are not necessary for correct multi-channel MAC operation.
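The timeout probability directly yields the minimum number of beacons per timeout period for a desired reliability; a small sketch of that calculation:

```python
import math

def p_timeout(M, k):
    """Probability that all k beacons from a neighbor are missed because of
    random hopping over M channels: ((M - 1) / M) ** k."""
    return ((M - 1) / M) ** k

def min_beacons(M, p_target):
    """Smallest k with p_timeout(M, k) <= p_target; dividing the timeout
    period by k then gives the maximum allowable beacon interval."""
    return math.ceil(math.log(p_target) / math.log((M - 1) / M))
```

With M = 4 channels, for example, 17 beacons per timeout period keep the probability of a spurious timeout below 1%.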

8. CONCLUSION

Multi-channel MAC protocols hold good promise for increasing the throughput of an ad hoc network in a practical way without any hardware changes. We argue that the implementation of these protocols is hindered by the lack of suitable synchronization schemes, because of the unique requirement that nodes are not all on the same channel. We have also shown that the synchronization requirements of these protocols are actually non-trivial, especially when the devices are limited in storage and computational power. We proposed a practical synchronization mechanism and implemented a simple version of a multi-channel MAC protocol using off-the-shelf sensor network hardware. Our results show that it is possible to synchronize one-hop neighbors as the nodes beacon periodically while hopping. The hardware platform we use is very limited in terms of computational power and memory capacity. The clocks have a coarse granularity of about 30.5µs. Furthermore, the radio is not optimized for high-speed hopping. Despite these constraints, we show that one-hop neighbors can track each other's clocks quite precisely, and that a sender can track its receiver's hopping sequence over 99% of the time while both are hopping over 4 channels. In summary, our proposed synchronization technique achieves a level of pair-wise synchronization that is sufficient for parallel rendezvous multi-channel MAC protocols. We therefore expect parallel rendezvous protocols to be a practical way to increase the network throughput of an ad hoc wireless network in which each node has only one radio.

9. REFERENCES

[1] Polarizone Technologies Sdn. Bhd. (600204-U). SDR Design Bench. http://www.polarizone.com/.
[2] Emmanuelle Anceaume and Isabelle Puaut. Performance evaluation of clock synchronization algorithms. Technical Report RR-3526, INRIA, 1998.
[3] Chipcon AS. CC2420 2.4 GHz IEEE 802.15.4 / ZigBee-ready RF Transceiver Data Sheet (rev. 1.3). http://www.chipcon.com/files/CC2420 Data Sheet 1 3.pdf.
[4] Chipcon AS. CC2500 Single Chip Low Cost Low Power RF Transceiver. http://www.chipcon.com/files/CC2500 data sheet 1 1.pdf.
[5] Chipcon AS. http://www.chipcon.com/.
[6] P. Bahl, R. Chandra, and J. Dunagan. SSCH: Slotted seeded channel hopping for capacity improvement in IEEE 802.11 ad-hoc wireless networks. In MobiCom, 2004.
[7] Jean-Marc Berthaud. Time synchronization over networks using convex closures. IEEE/ACM Trans. Netw., 8(2):265–277, 2000.
[8] J. Chen, S. Sheu, and C. Yang. A new multichannel access protocol for IEEE 802.11 ad hoc wireless LANs. In PIMRC, volume 3, pages 2291–2296, 2003.
[9] UC Berkeley EECS 150: Components and Design Techniques for Digital Systems. Data sheets for calinx/calinx2. http://www-inst.eecs.berkeley.edu/~cs150/sp06/Documents.php, 2006.
[10] Jeremy Elson, Lewis Girod, and Deborah Estrin. Fine-grained network time synchronization using reference broadcasts. In OSDI '02: Proceedings of the 5th Symposium on Operating Systems Design and Implementation, pages 147–163, New York, NY, USA, 2002. ACM Press.
[11] Saurabh Ganeriwal, Ram Kumar, and Mani B. Srivastava. Timing-sync protocol for sensor networks. In SenSys '03: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems, pages 138–149, New York, NY, USA, 2003. ACM Press.
[12] Simon Haykin. Adaptive Filter Theory. Prentice Hall, 4th edition, 2001.
[13] Wing-Chung Hung, K. L. Eddie Law, and A. Leon-Garcia. A Dynamic Multi-Channel MAC for Ad-Hoc LAN. In Proc. 21st Biennial Symposium on Communications, pages 31–35, Kingston, Canada, June 2002.
[14] Atheros Communications Inc. Atheros AR5002G 802.11b/g WLAN solution. http://www.atheros.com/.
[15] Miklos Maroti, Branislav Kusy, Gyula Simon, and Akos Ledeczi. The Flooding Time Synchronization Protocol. Technical Report ISIS-04-501, Institute for Software Integrated Systems, Vanderbilt University, 2004.
[16] David L. Mills. Internet time synchronization: The Network Time Protocol. In Zhonghua Yang and T. Anthony Marsland (Eds.), Global States and Time in Distributed Systems, IEEE Computer Society Press, 1994.
[17] Jeonghoon Mo, H. Wilson So, and Jean Walrand. Comparison of multi-channel MAC protocols. In the 8th ACM/IEEE International Symposium on Modeling, Analysis and Simulation of Wireless and Mobile Systems, October 2005.
[18] Joseph Polastre, Robert Szewczyk, and David Culler. Telos: Enabling Ultra-Low Power Wireless Research. In Proc. of the Fourth International Conference on Information Processing in Sensor Networks: Special Track on Platform Tools and Design Methods for Network Embedded Sensors (IPSN/SPOTS), April 25–27, 2005.
[19] Kay Romer, Philipp Blum, and Lennart Meier. Time synchronization and calibration in wireless sensor networks, October 2005.
[20] Jungmin So and Nitin H. Vaidya. A multi-channel MAC protocol for ad hoc wireless networks. Technical report, UIUC, 2003.
[21] IEEE Computer Society. ANSI/IEEE Std 802.11, 1999 Edition (R2003): Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications.
[22] IEEE Computer Society. IEEE 802.15.4: Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (LR-WPANs), 2003.
[23] The TinyOS Project. http://webs.cs.berkeley.edu/tos/.
[24] A. Tzamaloukas and J. Garcia-Luna-Aceves. Channel-hopping multiple access with packet trains for ad hoc networks. In Proc. IEEE Mobile Multimedia Communications (MoMuC '00), Tokyo, 2000.
[25] Asimakis Tzamaloukas and J. J. Garcia-Luna-Aceves. Channel-hopping multiple access. In ICC (1), pages 415–419, 2000.
[26] Maxim Integrated Products Inc., Sunnyvale, California, USA. MAX2820, MAX2820A, MAX2821, MAX2821A 2.4GHz 802.11b Zero-IF Transceivers Data Sheet, rev. 04/2004, 2004.
[27] Shih-Lin Wu, Chih-Yu Lin, Yu-Chee Tseng, and Jang-Ping Sheu.
A Dynamic Multi-Channel MAC for Ad-Hoc LAN. In Proc. International Symposium on Parallel Architectures, Algorithms and Networks (ISPAN ’00), page 232, Dallas/Richardson, Texas, USA, December 2000. Shih-Lin Wu, Yu-Chee Tseng, Chih-Yu Lin, and Jang-Ping Sheu. A Multi-channel MAC Protocol with Power Control for Multi-hop Mobile Ad Hoc Networks. The Computer Journal, 45:101–110, 2002.