Extensions To ARP For Cache Revalidation

Dan Ardelean
CS Dept., Purdue University
[email protected]

Douglas Comer
Cisco Systems
[email protected]

Draft of 2004/10/22 20:08

Abstract

Applications such as Voice over IP (VoIP) make it important that the end-to-end delivery service provided by the underlying network exhibit low average latency, and that the maximum latency encountered not differ significantly from the average (i.e., that variance be low). Furthermore, protocols such as TCP perform much worse when the network introduces significant changes in the round-trip time or reorders packets. Thus, it is desirable that the communication system maintain low variance in latency and avoid reordering packets. This paper observes that current implementations of the Address Resolution Protocol (ARP) can create significant jitter and can result in packet reordering, and that the potential for poor performance increases because cache revalidation can become synchronized. The paper proposes an extension to RFC 826 that uses early, randomized cache revalidation to reduce jitter and avoid reordering. We argue that although the extension is useful in most implementations of ARP, it is of special importance in systems such as network processors that separate control processing from fast-path processing.

1 Introduction

Because it fills a fundamental role in the TCP/IP protocol suite, the Address Resolution Protocol (ARP) [1] is used throughout the Internet in both hosts and routers. ARP translates the Internet Protocol addresses (IP addresses) used by applications and upper layers of the protocol stack into the physical addresses (also called MAC addresses) used by the underlying hardware. For a discussion of ARP and alternatives, see [2]. Although in practice ARP is used almost exclusively with Ethernet networks, the protocol is designed to accommodate heterogeneous address types. Furthermore, ARP provides late binding between a protocol address and an equivalent MAC address. That is, ARP delays address binding until a sender has generated an IP datagram to forward to another system (host or router) on the local network. Once a sender has used its IP routing table to compute S, the IP address of a system to which the datagram should be sent, the sender invokes ARP software to map S into an equivalent MAC address, M. We say that the software resolves the IP address S. Because address resolution is performed for each datagram being transmitted, the result of the address mapping can change at any time.

To resolve an IP address that has not been encountered previously, ARP uses the underlying network. ARP software broadcasts an ARP request message that contains the IP address S for which a binding is needed. All systems on the network receive the broadcast, but only the system that has been assigned address S answers the request. The answer consists of an ARP response message that specifies the pair (S, M). Once the reply arrives, the original host uses the MAC address to send the datagram to S.
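As a concrete illustration (the field names below are ours, not from RFC 826, which defines the fields but not C identifiers), an ARP message for Ethernet and IPv4 can be sketched in C as:

    #include <stdint.h>

    /* Sketch of an ARP message for Ethernet/IPv4 (RFC 826).
     * Field names are ours; real code would also pack the
     * struct and keep multi-byte fields in network byte order. */
    struct arp_eth_ipv4 {
        uint16_t htype;    /* hardware type: 1 for Ethernet      */
        uint16_t ptype;    /* protocol type: 0x0800 for IPv4     */
        uint8_t  hlen;     /* hardware address length: 6         */
        uint8_t  plen;     /* protocol address length: 4         */
        uint16_t oper;     /* operation: 1 = request, 2 = reply  */
        uint8_t  sha[6];   /* sender MAC address, M in a reply   */
        uint8_t  spa[4];   /* sender IP address, S in a reply    */
        uint8_t  tha[6];   /* target MAC (ignored in a request)  */
        uint8_t  tpa[4];   /* target IP: the address to resolve  */
    };

A request is broadcast with oper set to 1 and tpa set to S; the owner of S answers with oper set to 2 and its own MAC address in sha.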

2 ARP Caching

Sending an ARP request and waiting for the subsequent reply introduces significant latency; doing so for each datagram transmission would be intolerable. Thus, to reduce overhead, ARP maintains a cache of recent address bindings. Whenever it receives information about a binding (S, M), ARP places the pair in the cache. On subsequent requests for resolution of address S, ARP extracts the binding from the cache without sending additional request messages across the network. Although not mandated by the standard, most implementations of ARP buffer (i.e., hold) the outgoing datagram while waiting for ARP to resolve the address. Thus, the first datagram sent to a given destination experiences high latency because the datagram must wait while ARP sends a request and waits for a reply. Once ARP has used the network to contact a given system, however, subsequent traffic to the system proceeds with lower latency because resolution requires only a local computation and does not need a round trip across the network.

To ensure that ARP bindings remain dynamic (i.e., to allow a computer's MAC address to change or to allow a computer to move to a new network), entries in an ARP cache are not permanent. Instead, ARP only allows a binding to persist for a bounded time before the binding must be revalidated. In practice, implementations of ARP use soft state to manage the cache. That is, ARP associates a timer with each binding in the cache. The timer is set when the entry is created; when the timer expires, the entry is removed. Removing an entry from the cache means ARP will treat the address as unknown; the next request to resolve the address will result in the broadcast of a new ARP request.
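A minimal sketch of the soft-state cache entry, assuming nothing beyond what the text describes (names are ours):

    #include <stdint.h>
    #include <time.h>

    /* Hypothetical soft-state ARP cache entry. */
    struct arp_entry {
        uint32_t ip_addr;      /* protocol address S             */
        uint8_t  mac_addr[6];  /* hardware address M             */
        time_t   expires;      /* absolute expiration time       */
        int      valid;        /* nonzero once a binding exists  */
    };

    /* Install or refresh a binding: record M and restart the
     * timer; when 'expires' passes, the entry is removed and
     * the next lookup triggers a new broadcast request. */
    void arp_refresh(struct arp_entry *e, const uint8_t mac[6],
                     time_t now, time_t timeout)
    {
        for (int i = 0; i < 6; i++)
            e->mac_addr[i] = mac[i];
        e->expires = now + timeout;   /* e.g., 20 minutes */
        e->valid   = 1;
    }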

2.1 Consequence of ARP Caching

The ARP caching mechanism introduces three problems: packet reordering, minor jitter from a single ARP request, and significant jitter from multiple ARP requests.

Packet Reordering. To see how reordering can occur, assume application software on host A generates a datagram sent to H1 followed by a datagram sent to H2. Although the datagram to H1 is generated first, ARP may need to delay the transmission while using the network to obtain the necessary binding. While it waits for an ARP reply, the computer continues processing. Thus, the datagram to H2 can be processed. If address H2 can be resolved from the cache, the system will transmit the second datagram while still waiting for the reply (i.e., a datagram generated later will be sent before the datagram generated earlier).

Minor Jitter. The jitter that results from ARP cache management is obvious: the initial packet to any system will experience more delay than subsequent packets because the initial packet must wait for a round-trip time while ARP sends a request and receives a reply. It can be argued that for most applications, the additional latency (typically a few milliseconds) is so small that it does not have any measurable effect on the application.

Significant Jitter From Multiple ARP Requests. To understand how jitter can become significant, consider a worst case: assume a packet is sent along a path in the Internet and that none of the hops along the path has ever carried traffic. At each hop, the ARP cache will be empty, so ARP will need to broadcast a request for the next computer. That is, ARP on the initial source will broadcast a request to resolve the address of the first router, the first router will broadcast an ARP request to resolve the address of the second router, and so on. Thus, each hop along the path sends three packets across the network that leads to the next hop (the ARP request, the ARP reply, and the packet being sent). As a result, if we ignore the extra time required to manage the ARP cache and only measure network latency, the total latency will be three times the latency required to send a subsequent packet; the worked example below makes the claim concrete.
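To spell out the three-to-one claim under our own simplifying assumptions: let the path have h hops, let hop i have one-way latency L_i, and ignore cache-management time. Then

    D_first      = sum over i of (2*L_i + L_i) = 3 * (L_1 + ... + L_h)
    D_subsequent = L_1 + ... + L_h

Each hop first pays a full ARP round trip (2*L_i for the request and reply) before the datagram itself crosses the link (L_i), so the first packet takes three times as long as any later packet.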

2.2 ARP Cache Management And Worst Case

To see how the worst case can occur in practice, consider the ARP cache management described in the protocol standard [1]. Figure 1 lists the sequence of steps that ARP software performs when an ARP packet arrives.

    merge_flag = FALSE;
    extract the sender's protocol address, S, and MAC address, M, from the ARP message;
    if S is in the ARP cache {
        update the hardware address in the entry with M;
        reset the expiration timer of the entry;
        merge_flag = TRUE;
    }
    if the target address is one of the local protocol addresses {
        if (merge_flag == FALSE) {
            add the tuple (S, M) to the cache;
            set the entry expiration timer;
        }
        if this is a request {
            prepare a reply and send it directly to the requesting host;
        }
    }

Figure 1: The sequence of steps ARP performs when a message arrives.

It should be noted that the exact details of the timer management shown in Figure 1 are not specified in the standard [1]. In particular, although the RFC discusses the need for cache entry expiration, it does not mandate that the timer be reset when an entry is updated with fresh information. However, most operating systems use the semantics described above, which can be taken as a de facto standard. Interestingly, this straightforward timer management scheme allows worst-case jitter to occur. To see why, consider an initial transmission to a new destination. At each hop along the path, ARP sends a request and uses the reply to install a new cache entry for the next hop. Assuming each implementation uses a fixed timeout and chooses the same value for the cache timer, T, each entry along the path will expire T time units later. That is, the timers along the path will be synchronized. Thus, it is likely that once the timers expire, the next datagram sent along the path will experience a significant increase in latency. Furthermore, the change in latency can be large enough to cause TCP's congestion management mechanisms to back off or to be noticeable in VoIP applications.

There is one small exception to the synchronization described above. According to the standard, ARP updates the cache entry for the sender before checking the message type. Doing so allows the receiver to update an existing entry for system H whenever H sends either a request or a reply. Furthermore, the protocol contains an optimization: if computer X sends an ARP request for computer Y, then in addition to replying, computer Y places an entry in its ARP cache for computer X. The optimization is based on the observation that if X intends to send a datagram to Y, the probability is high that Y will send a datagram to X in the near future (because most applications involve a two-way exchange). Thus, synchronization can be broken in cases where other systems send ARP messages for unrelated flows (e.g., if X broadcasts a request for Y, the request will cause Y to update its cache).

3 Early Revalidation Of The ARP Cache

We propose using a form of early cache revalidation to eliminate the jitter caused by ARP timeout. That is, an ARP implementation can anticipate the impending need for an ARP request and send the request before the need arises. Anticipation means watching the use of each ARP cache entry: inactive entries are allowed to expire as usual, but early revalidation is applied to any entry that has been used in the last K time units. Sending the request means transmitting a normal ARP request message as if the entry had expired. Unlike the current algorithm, however, early revalidation leaves the entry marked valid, and resolution continues to use the entry while the request proceeds. If the time of revalidation is chosen carefully, the reply will arrive and the timer will be reset before the entry becomes invalid. Thus, when early revalidation is in place, ARP will not inject any jitter. A sketch of the idea appears below.
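The following minimal sketch reuses the arp_entry struct sketched earlier (all names are ours; arp_send_request() stands in for whatever routine the implementation uses to transmit a request):

    #include <stdint.h>
    #include <time.h>

    extern void arp_send_request(uint32_t ip_addr);

    /* Early revalidation: if a recently used entry is nearing
     * expiration, send a request now. The entry stays valid and
     * lookups keep using it while the reply is in flight; if the
     * reply arrives in time, the timer is simply reset. */
    void arp_early_revalidate(struct arp_entry *e, time_t now,
                              time_t last_used, time_t K,
                              time_t revalidate_at)
    {
        if (now - last_used > K)    /* idle: let it expire as usual */
            return;
        if (e->valid && now >= revalidate_at)
            arp_send_request(e->ip_addr);
    }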

3.1 Timing Of Early Revalidation

How early should an entry be revalidated? On one hand, revalidation should occur early enough to leave a safe margin before the timer expires (e.g., to account for operating system overhead and process scheduling). On the other hand, because revalidation introduces more network traffic, revalidating too often is undesirable (e.g., revalidating at approximately 10% of the timer value increases ARP traffic by an order of magnitude). Assume that the timer set for ARP cache entry expiration is T. Because revalidation involves sending an ARP request and receiving an ARP reply, revalidation requires at least RTT time units, where RTT is the round-trip time of the underlying network. Obviously, early revalidation must send the request no later than T - RTT (or the reply will arrive after the timer expires). In a practical situation, early revalidation must allow for other delays, such as the time required to form a request, variance in the round-trip time of the underlying network, processing time on the remote computer, and the time required to handle a reply. Although it may be possible to measure each source of latency, doing so is difficult. Moreover, on many systems the ARP cache timer, T, is set to a multiple of minutes (20 minutes is a popular value). As a result, processing overhead will be a small fraction of T. Thus, one way to choose an estimate for the overhead is to use a fraction of T. For instance, one might choose O = 0.1T and revalidate at time T - RTT - O.
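A worked example with assumed numbers of our own choosing: take T = 20 minutes (1200 s), O = 0.1T = 120 s, and a generous local-network RTT of 10 ms. Then

    T - RTT - O = 1200 s - 0.01 s - 120 s ≈ 1080 s = 18 minutes,

so the request goes out about two minutes before expiration, and RTT is negligible next to the overhead allowance O.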

3.2 Self-Clocking Revalidation

Timer management is arguably the most demanding aspect of ARP because the timing requirements mean that ARP cannot be implemented on a system that does not support timers. Adding early revalidation timers increases the requirements further. To ameliorate the problem, we propose a self-clocking scheme in which early revalidation requests are generated as a side effect of ARP cache lookup. That is, when a lookup is performed, check the time remaining before the entry expires. If no revalidation is in progress and the time remaining lies between O + RTT and 2(O + RTT), initiate a revalidation (i.e., send an ARP request) and mark the entry as being revalidated. Although it is possible to use a single timer for both early revalidation and conventional timeout, doing so introduces programming complexity. Self-clocking achieves the same effect with less mechanism; a sketch follows.
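A sketch of the self-clocking check, again reusing the earlier arp_entry struct and arp_send_request() (names ours):

    /* Performed as a side effect of every cache lookup; no
     * separate revalidation timer is required. O is the overhead
     * allowance and RTT the network round-trip time, in the same
     * units as the entry timer. */
    int arp_lookup(struct arp_entry *e, time_t now,
                   time_t O, time_t RTT, int *revalidating)
    {
        time_t remaining = e->expires - now;
        time_t window    = O + RTT;

        if (!*revalidating &&
            remaining >= window && remaining <= 2 * window) {
            arp_send_request(e->ip_addr);
            *revalidating = 1;    /* cleared when the reply arrives */
        }
        return e->valid;          /* entry remains usable meanwhile */
    }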

3.3 Applicability

Early revalidation can be used in an arbitrary host or router to avoid jitter and prevent reordering problems. Although it may be helpful in general situations, it is especially important in two special cases: architectures in which the fast data path is separate from the control path (e.g., a high-speed router or a network processor [4]), and devices that contact many other destinations (e.g., an access router). In architectures with separate control and data paths, ARP lookup for outgoing packets is handled in the fast path, often with assistance from hardware such as a TCAM. However, an ARP cache miss or an incoming ARP packet is treated as an exception that requires processing in the slow path. Thus, most ARP cache management proceeds through the slow path, which can become a bottleneck. As a result, an ARP timeout can cause substantially more delay than normal forwarding. In any device that contacts many other devices, ARP can become a bottleneck if multiple entries time out near the same time (e.g., within one RTT) because the device will send many requests and receive many ARP responses, which can cause buffer overflow. Using early revalidation avoids the problem while maintaining the soft state of ARP entries.

4 Iterative Prevalidation

The early revalidation described above is optimized for situations with high temporal locality of reference. That is, early revalidation only applies to a target that continues to receive traffic. The technique can also be extended to handle situations where a given destination receives only occasional traffic separated by long idle periods (e.g., an SNMP message to each system on a network once per hour). To do so, a host or router prevalidates a set of IP addresses by establishing an ARP binding for each address before network traffic arrives for the address. In the most straightforward approach to prevalidation, a system iterates through all possible addresses on each attached network and requests ARP information for each address. The needed information can be computed from the IP address and address mask assigned to each interface. For example, consider a router with an interface on network 192.168.10.0/24. From the mask and interface address, the router can determine that the potential address space of the network ranges from 192.168.10.1 through 192.168.10.254, because the all-0s (network) address and the all-1s (broadcast) address must be excluded from the range. The router iterates through the interval by sending an ARP request for each address; a sketch appears below. If a given address is currently in use, the target will send an ARP response, and the router will add the binding to its ARP cache. Prevalidation can be used periodically to maintain ARP information for arbitrary destinations on the network.
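A sketch of the iteration under our own naming (addresses in host byte order; arp_send_request() as before):

    #include <stdint.h>

    extern void arp_send_request(uint32_t ip_addr);

    /* Iterate over every possible host address on an attached
     * network, skipping the all-0s network address and the
     * all-1s broadcast address. For 192.168.10.0/24 this sends
     * requests for 192.168.10.1 through 192.168.10.254. */
    void arp_prevalidate(uint32_t if_addr, uint32_t mask)
    {
        uint32_t network = if_addr & mask;
        uint32_t bcast   = network | ~mask;

        for (uint32_t a = network + 1; a < bcast; a++)
            arp_send_request(a);
    }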

4.1 Advantages And Disadvantages Of Iterative Prevalidation

Prevalidation has the advantage of eliminating jitter for the initial transmission along a path. Doing so improves latency for the initial message of a connectionless transmission and provides a better round-trip estimate for the initial messages of a connection-oriented protocol such as TCP. However, iterative prevalidation has two disadvantages: it requires a larger ARP cache, and it can flood the network with broadcast requests. The first disadvantage can be important for systems in which the ARP cache is smaller than the number of systems attached to the network, because prevalidation can place an entry in the cache even if no datagrams will ever be sent to the destination. More to the point, prevalidation can cause a cache overflow situation in which superfluous entries (i.e., entries that will never be needed) displace entries that are being heavily used. Ironically, although prevalidation is intended to reduce jitter, displacing a heavily used entry causes unnecessary jitter.

To understand how bursts of ARP requests can become problematic, consider the load on the network and the load on attached hosts. Because each ARP request is broadcast, every computer attached to the network must receive and process every request. In particular, in addition to routers and desktop systems, small, slow devices such as embedded systems or sensors must also handle each broadcast. For a network with a handful of possible host addresses, the series of broadcasts will not be overwhelming. Consider, however, a /16 network (i.e., a network in which 16 bits specify the network prefix and 16 bits specify the host suffix). Such a network has 2^16 possible host addresses, which results in a burst of 2^16 packets from each router on the network; the arithmetic below gives a feel for the scale. Ironically, a high-speed router that implements prevalidation can cause more problems than a slow router, because the high-speed device will emit packets in a tighter burst.
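A rough back-of-the-envelope estimate with assumed numbers of our own: a minimum-size Ethernet frame occupies about 64 bytes, so the burst amounts to

    2^16 requests × 64 bytes × 8 bits/byte ≈ 33.6 Mbits,

which a router with a 1 Gb/s interface can emit in roughly 34 ms, leaving every attached host to absorb 65,536 broadcasts in a fraction of a second.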

4.2 Reducing Traffic Bursts

One way to solve the burst problem described in the previous section is to distribute ARP requests over a longer period of time. For example, suppose a network has n possible host addresses and the available time period for ARP prevalidation is t. Requests can be spread uniformly throughout interval t. That is, instead of sending a burst of back-to-back requests, one request is sent every t/n time units. Alternatively, t can be divided into a set of k intervals, and the sender can transmit a set of n/k probes at the beginning of each interval; the sketch below illustrates the uniformly paced form. Despite its advantage in eliminating the difference in latency between an initial packet and later packets, prevalidation does not work well in all situations, even if traffic is spread out uniformly. For example, consider a network configured as 10.1.0.0/16 that has only three hosts attached. Each host will be required to process all of the ARP probes even though the majority of them (i.e., all but three) will serve no purpose in establishing bindings. We observed that to maintain a complete and correct set of bindings at all times, prevalidation must be run periodically. As a result, prevalidation can be combined with conventional ARP revalidation: when prevalidating, give priority to addresses that correspond to active entries in the ARP cache. In particular, when sending a burst of requests, place requests for active addresses first (where they have a lower probability of being lost).
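A sketch of the uniformly paced form (names ours; usleep() assumes a POSIX-like environment, and a real implementation would use its own timer facility):

    #include <stdint.h>
    #include <unistd.h>

    extern void arp_send_request(uint32_t ip_addr);

    /* Spread n probes uniformly over t_usec microseconds by
     * sleeping t/n between consecutive requests, rather than
     * sending one back-to-back burst. */
    void arp_prevalidate_paced(uint32_t network, uint32_t bcast,
                               uint64_t t_usec)
    {
        uint32_t n   = bcast - network - 1;  /* usable addresses */
        uint64_t gap = t_usec / n;           /* t/n per request  */

        for (uint32_t a = network + 1; a < bcast; a++) {
            arp_send_request(a);
            usleep((useconds_t)gap);
        }
    }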

4.3 Randomized Probes

Another approach to reducing broadcast traffic during prevalidation involves randomization. If the network has n possible host addresses and t possible time units for validation, each address can be assigned a random time within t. Similarly, if an interval approach is used, randomization can be employed to spread requests across the beginning of each interval. Although randomization spreads requests, it may be better to use a uniform distribution, for two reasons. First, generating random values can be computationally expensive, particularly on hardware that lacks support for floating point computation. Second, from a networking point of view, distributing requests uniformly means the additional load on the network remains constant, making performance more predictable (and helping avoid driving transport protocols into congestion reaction). A randomized variant is sketched below.
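A sketch of the randomized variant (names ours; schedule_at() is a hypothetical timer facility that runs a callback at a given offset, and rand() keeps the arithmetic in integers, avoiding floating point):

    #include <stdint.h>
    #include <stdlib.h>

    extern void arp_send_request(uint32_t ip_addr);
    extern void schedule_at(uint64_t when_usec,
                            void (*fn)(uint32_t), uint32_t arg);

    /* Assign each host address a random transmission time
     * within the prevalidation period t_usec. */
    void arp_prevalidate_random(uint32_t network, uint32_t bcast,
                                uint64_t t_usec)
    {
        for (uint32_t a = network + 1; a < bcast; a++) {
            uint64_t when = (uint64_t)rand() % t_usec;
            schedule_at(when, arp_send_request, a);
        }
    }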

5 Implementation

We implemented the early revalidation mechanism presented above in the Xinu operating system as part of the protocol stack in [3]. Xinu is a simple and elegant operating system, well suited to experimentation because it does not impose many scheduling or protection restrictions on the protocol stack. Although some differences may occur, an implementation on other UNIX-like platforms should be straightforward. Although it might seem that significant resources are required to implement early ARP revalidation, counts of timer operations show that for reasonable networks, the additional facilities required are negligible compared to those required for conventional ARP. Figure 2 presents the time intervals involved in the revalidation process.

[Figure 2: Time intervals for early ARP revalidation. The original figure shows an entry passing through the REQUEST, VALID, and RENEW states, with the intervals Dlookup and Drenew and the times Tnorecord, Trevalid, and Texpire marked.]

We define three states for each entry in the ARP cache: REQUEST, VALID, and RENEW, and we associate a single timer with each ARP entry. An entry is in the VALID state when the binding has been verified and no further action is pending; lookup and use proceed normally. An entry is placed in the REQUEST state if the entry is not currently VALID, but an ARP request has been sent to the destination. Packets that need to use the entry must be queued to await the ARP reply. When a reply arrives, the entry becomes VALID, the timer is set to Trevalid, and the queued packets are sent. To provide usage information, each ARP lookup must record the status.

When a cache lookup finds an entry in the VALID state, the current value of the timer is compared to Trevalid. If the entry has less than Dlookup time remaining, a flag is set to indicate that the entry was used in the Dlookup time units before revalidation. In our implementation, Dlookup is set to 0.2 Trevalid; if Trevalid is twenty minutes, this yields a value of four minutes for Dlookup. Thus, if the entry is used in the last four minutes, an attempt will be made to revalidate the entry. Choosing the value of Dlookup represents a trade-off. A large value increases the probability of reducing problems such as jitter and reordering, but increases ARP traffic and uses more resources (more entries are likely to be present in the cache if all the used ones are revalidated). A small value for Dlookup decreases ARP traffic and resource usage, but jitter and reordering problems can occur more frequently.

Once the timer for an entry expires, the software checks the flag to determine whether the entry should be revalidated. If the entry is idle (i.e., the flag is not set), the entry is deleted from the cache. If the flag is set, the entry state changes to RENEW and an ARP request is unicast directly to the host. The timer is reset to expire at Drenew = RTT + O. If the timer expires again, the entry is deleted. When the ARP reply is received, the state of the entry changes to VALID, the lookup flag is cleared, and the entry's timer is reset to Trevalid. A sketch of the expiration handler appears below.

In comparison to conventional ARP software, our implementation contains only two important changes. First, an additional state is added for each ARP entry. Second, the code executed on timer expiration is slightly more complex. However, the overhead is constant (i.e., it does not depend on the number of entries) and is insignificant compared to the round-trip time on a typical local area network. Thus, we conclude that only a few additional lines of code are needed to implement the early revalidation described above. We have also implemented the revalidation technique on a network processor, the Agere APP550. Using revalidation and the APP550's ability to classify packets at wire speed, we have succeeded in performing most of the ARP lookup at wire speed.
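A sketch of the expiration handler for the three-state scheme (our names and structure, not the Xinu source):

    #include <stdint.h>
    #include <time.h>

    enum arp_state { ARP_REQUEST, ARP_VALID, ARP_RENEW };

    struct arp_rentry {
        uint32_t       ip_addr;
        enum arp_state state;
        int            used_flag;  /* set by lookups within Dlookup of expiry */
    };

    extern void arp_send_request(uint32_t ip_addr);  /* unicast or broadcast */
    extern void arp_delete(struct arp_rentry *e);
    extern void arp_set_timer(struct arp_rentry *e, time_t value);

    /* Called when an entry's timer expires. */
    void arp_timer_expired(struct arp_rentry *e, time_t Drenew)
    {
        switch (e->state) {
        case ARP_VALID:
            if (!e->used_flag) {          /* idle: delete as usual   */
                arp_delete(e);
            } else {                      /* recently used: renew    */
                e->state = ARP_RENEW;
                arp_set_timer(e, Drenew); /* Drenew = RTT + O        */
                arp_send_request(e->ip_addr);  /* unicast to host    */
            }
            break;
        case ARP_RENEW:      /* renewal went unanswered */
        case ARP_REQUEST:    /* initial request unanswered */
            arp_delete(e);
            break;
        }
    }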

6 Conclusion

This paper presents an early revalidation technique for managing entries in an ARP cache. It reports on an implementation showing that the mechanism can be added as a minor change to existing ARP implementations. Our revalidation is deterministic and can be self-clocking. We also analyze ways the technique can be extended to prevalidation, explaining the advantages and disadvantages of each approach. In architectures that separate ARP processing from other packet handling, we have shown that early revalidation eliminates jitter and keeps ARP lookup exclusively on the fast path.

References

[1] David C. Plummer, An Ethernet Address Resolution Protocol: Or Converting Network Protocol Addresses to 48.bit Ethernet Address for Transmission on Ethernet Hardware, RFC 826, November 1982. Status: Standard (STD 37).

[2] Douglas Comer, Internetworking with TCP/IP: Principles, Protocols, and Architectures, Fourth Edition, Prentice Hall, 2000.

[3] Douglas Comer and David L. Stevens, Internetworking with TCP/IP Vol. II: Design, Implementation, and Internals, Prentice Hall, 1998.

[4] Douglas Comer, Network Systems Design Using Network Processors, Pearson Education, 2004.
