
Proposal for Improving Linux XIA Routing Performance

Qiaobin Fu
qiaobinf@bu.edu
March 16, 2015

1 Abstract

Linux XIA is a native implementation of XIA in the Linux kernel that intrinsically supports network evolution, collaboration, and interoperability. XIA uses the eXpressive Internet Protocol (XIP) to incrementally introduce changes (i.e., principals) to its network layer. Principals, in turn, use eXpressive IDentifiers (XIDs) to forward packets. Specifically, each XID is the pairing of a principal type (32 bits) and a name or ID (160 bits). One of the most challenging issues in XIA is the lookup and update of the 160-bit addresses, whose addressing and routing are maintained per principal. Because principals are so diverse, it is hard to find a single efficient routing algorithm that handles all of them while supporting both fast lookup and fast update. In this proposal, I decompose the lookup of principal XIDs into three categories: longest prefix matching (e.g., IP lookup, name lookup in NDN), exact matching (e.g., MAC address tables), and range matching (e.g., packet classification, substring matching). Then, I introduce a group of Bloom filter-based routing algorithms to improve routing performance in Linux XIA. Finally, I propose some potential related projects and research problems.

2 Introduction

2.1 Background and Motivation

With the rapid growth of Internet technologies such as video applications and cloud computing, the demand for high-throughput routers is urgent. In addition, the growth of Forwarding Information Base (FIB) sizes has been accelerating. Together, growing FIB sizes and throughput demands pose significant challenges to routing algorithms, so it is highly desirable to explore routing algorithms that are both time-efficient and space-efficient.

2.2 Technical Challenges

Technically, routing performance in Linux XIA [1] is considerably more challenging than in traditional routing, for the following reasons.

The first technical challenge is handling much longer addresses, i.e., 160-bit addresses. For comparison, IPv4 has $2^{32}$ possible addresses, and even for IPv6 we generally look up only the first 64 bits. Even so, researchers are still working on better algorithms to reduce memory consumption and improve lookup speed for these address families. An XIA address has 160 bits, allowing $2^{160}$ addresses, or $2^{128}$ times as many as IPv4.

The second technical challenge is dealing with diverse routing schemes. On one hand, since Linux XIA aims to support both legacy and emerging network architectures, different network architectures may have completely different routing schemes. For instance, IPv4 uses the longest prefix matching rule, while LIPSIN [2] uses zFilters to transfer information. On the other hand, we may have to define specific routing tables to support principals with specific functions, such as QoS, load balancing, packet classification, and network measurement, all of which may require different routing schemes.

The third technical challenge is coping with multiple FIBs. In Linux XIA, each principal maintains its own “FIB” (e.g., addressing and routing state), and many principals can run in Linux XIA at the same time. In addition, we may need to support network virtualization in the near future, given recent advances in virtualization technologies. In that case, each principal may have more than one FIB. A major issue is therefore how to manage these FIBs on a single machine.

2.3 Solutions

I have a few preliminary ideas for the above three challenges, all of which are based on Bloom filters or cuckoo filters [3].

For the first challenge, I think 160-bit addresses are too long for most current general-purpose routing algorithms, which would consume a tremendous amount of memory. However, given the memory efficiency of Bloom filters, we may introduce Bloom filter-based routing algorithms to manage the FIBs.

For the second challenge, we may divide the FIBs into three categories: longest prefix matching, exact matching, and range matching. We can then explore different algorithms for each kind of routing scheme.

For the third challenge, we can maintain additional information, such as tags or bit-vectors [11], to distinguish different FIBs. Some IPv4 lookup algorithms already support multiple FIBs; we may implement them in XIA first and then move on to others.

3 Bloom Filters and Their Application

3.1 Bloom Filter

A Bloom filter is an array of $m$ bits representing a set $S = \{x_1, x_2, \ldots, x_n\}$ of $n$ elements. Initially, all bits in the filter are set to zero. The key idea is to use $k$ hash functions $h_i(x)$, $1 \le i \le k$, to map each item $x \in S$ to a number in the range $1, \ldots, m$. The hash functions are assumed to be uniform. An element $x \in S$ is inserted into the filter by setting the bits $h_i(x)$ to one for $1 \le i \le k$. Conversely, $y$ is assumed to be a member of $S$ if all the bits $h_i(y)$ are set, and is guaranteed not to be a member if any bit $h_i(y)$ is not set.
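The definition above leads to the standard estimate of the false positive rate, which the proposal does not spell out. The derivation below is a sketch under the usual assumption that the $k$ hash functions are independent and uniform; the symbol $p$ for the false positive probability is introduced here for illustration.

```latex
% Probability that a given bit is still 0 after inserting n elements with k hashes each:
\Pr[\text{bit} = 0] = \left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m}

% A non-member y is a false positive when all k probed bits are 1:
p \approx \left(1 - e^{-kn/m}\right)^{k}

% p is minimized at k = (m/n)\ln 2, which gives
p \approx 2^{-k} \approx (0.6185)^{m/n}

% Example: m/n = 10 bits per element and k = 7 give
p \approx \left(1 - e^{-0.7}\right)^{7} \approx 0.008
```

In other words, about 10 bits per element keep the false positive rate below 1%, regardless of how long the elements themselves are, which is what makes the structure attractive for 160-bit XIDs.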

Figure 1: Example of standard Bloom filter

Figure 1 shows an example of a Bloom filter [4]. The filter begins as an array of all 0s. Each item $x_i$ in the set $S$ is hashed $k$ times, with each hash yielding a bit location; these bits are set to 1. To check whether an element $y$ is in the set, hash it $k$ times and check the corresponding bits. The element $y_1$ cannot be in the set, since one of its bits is 0. The element $y_2$ is either in the set or a false positive.
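The insert and query operations described above can be sketched in a few lines of Python. This is a minimal illustration, not code from Linux XIA: the class name `BloomFilter`, the use of salted SHA-256 to derive the $k$ positions, and the one-byte-per-bit array are all assumptions made for readability.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter: an m-bit array probed by k hash functions."""

    def __init__(self, m: int, k: int):
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit, for readability only

    def _positions(self, item: bytes):
        # Derive k positions by salting SHA-256 with the hash index i.
        # (Assumption: any family of roughly uniform hashes would do.)
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        # Insertion sets the k bits selected by the hash functions.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: bytes) -> bool:
        # False means "definitely absent"; True means "probably present".
        return all(self.bits[pos] for pos in self._positions(item))
```

After `bf = BloomFilter(m=1024, k=3)` and `bf.add(b"x1")`, the test `b"x1" in bf` always succeeds, whereas a query for an item that was never inserted can succeed only as a false positive.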

3.2 Cuckoo Filter

A cuckoo filter [3] is a compact variant of a cuckoo hash table that stores only a fingerprint (a bit string derived from the item using a hash function) for each inserted item, instead of key-value pairs. The filter is densely filled with fingerprints (e.g., 95% of entries occupied), which confers high space efficiency. A set membership query for item $x$ simply searches the hash table for the fingerprint of $x$, and returns true if an identical fingerprint is found.

Figure 2: Illustration of cuckoo hashing

A basic cuckoo hash table consists of an array of buckets, where each item has two candidate buckets determined by hash functions $h_1(x)$ and $h_2(x)$. The lookup procedure checks both buckets to see if either contains the item. Figure 2(a) shows an example of inserting a new item $x$ into a hash table of 8 buckets, where $x$ can be placed in either bucket 2 or bucket 6. If either of $x$'s two buckets is empty, the algorithm inserts $x$ into that free bucket and the insertion completes. If neither bucket has space, as is the case in this example, the item selects one of the candidate buckets (e.g., bucket 6), kicks out the existing item (in this case “a”), and re-inserts this victim item into its own alternate location. In this example, displacing “a” triggers another relocation that kicks the existing item “c” from bucket 4 to bucket 1. This procedure may repeat until a vacant bucket is found, as illustrated in Figure 2(b), or until a maximum number of displacements is reached. If no vacant bucket is found, the hash table is considered too full to insert. Although cuckoo hashing may execute a sequence of displacements, its amortized insertion time is O(1).
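The fingerprint storage and the kick-out procedure described above can be combined in a toy cuckoo filter. The sketch below is an illustration under stated assumptions, not the implementation from [3]: the class name `CuckooFilter`, the 8-bit fingerprints, the bucket size of 4, and the SHA-256-based hashing are choices made here. The alternate bucket is computed by XOR-ing the current index with a hash of the fingerprint (the partial-key technique described in [3]), which requires the number of buckets to be a power of two.

```python
import hashlib
import random


class CuckooFilter:
    """Toy cuckoo filter: each item leaves a short fingerprint in one of two buckets."""

    def __init__(self, num_buckets=64, bucket_size=4, max_kicks=500, seed=0):
        # A power-of-two bucket count keeps the XOR-based alternate index in range.
        assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    @staticmethod
    def _hash(data: bytes) -> int:
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item: bytes) -> int:
        return self._hash(b"fp" + item) % 255 + 1  # 8-bit, never zero

    def _candidates(self, item: bytes):
        fp = self._fingerprint(item)
        i1 = self._hash(item) % self.num_buckets
        return fp, i1, self._alt_index(i1, fp)

    def _alt_index(self, index: int, fp: int) -> int:
        # Applying this twice returns the original index, so a stored
        # fingerprint can be relocated without knowing the original item.
        return index ^ (self._hash(bytes([fp])) % self.num_buckets)

    def insert(self, item: bytes) -> bool:
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # table considered too full (the last victim is dropped here)

    def __contains__(self, item: bytes) -> bool:
        fp, i1, i2 = self._candidates(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item: bytes) -> bool:
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Unlike the standard Bloom filter, `delete` removes one copy of the fingerprint, so an item can be withdrawn without rebuilding the structure. Deleting an item that was never inserted can remove another item's matching fingerprint, so callers should delete only what they inserted.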

Figure 3: Space and lookup cost of Bloom filters and three cuckoo filters

Figure 3 compares space-optimized Bloom filters and cuckoo filters with and without semi-sorting. In conclusion, cuckoo filters improve upon Bloom filters in three ways: (1) support for deleting items dynamically; (2) better lookup performance; and (3) better space efficiency for applications requiring low false positive rates (< 3%).

3.3 Hierarchical Bloom Filters

Figure 4: Example of inserting a string into a hierarchical Bloom filter

Shanmugasundaram et al. presented a data structure called the Hierarchical Bloom Filter to support substring matching [5][6]. This structure supports checking whether part of a string is contained in the filter, with low false positive rates. The filter works by splitting an input string into a number of fixed-size blocks, which are then inserted into a standard Bloom filter. With the Bloom filter, it is possible to check for substrings at block-size granularity. However, this substring matching may report combinations of strings as being in the set when they are not (false positives). For example, a concatenation of two blocks from different strings would be incorrectly recognized as an inserted substring.

Figure 4 illustrates the hierarchical nature of this construction. The hierarchical Bloom filter improves matching accuracy by inserting the concatenation of blocks into the filter in addition to inserting them separately. This means that two subsequent single-block matches can be verified by looking up their concatenation. The approach generalizes to a sequence of blocks; however, storage requirements grow as more block sequences are added to the structure.

This filter was used to implement a payload attribution system that associates excerpts of packet payloads with their source and destination hosts, using the filter to create compact digests of payloads. The system divides the payload of each packet into a set of blocks of a fixed size. Each block is appended with its offset in the payload, $(\text{content} \,\|\, \text{offset})$, and the blocks are then hashed and inserted into a Bloom filter.

A hierarchical Bloom filter is a collection of standard Bloom filters for increasing block sizes. When a string is inserted, it is first broken into blocks, which are inserted into the filter hierarchy starting from the lowest level. For the second level, two subsequent blocks are concatenated and inserted. This block-based concatenation continues for the remaining levels of the hierarchy. The resulting structure can then be used to verify whether a given string occurs in the payload. The search starts at the first level and continues upwards in the hierarchy to verify whether the substrings occurred together in the same packet or in different packets.
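The level-by-level insertion and verification described above can be sketched as follows. This is one possible reading of the construction, not the code from [5]: the names `Bloom` and `HierarchicalBloomFilter`, the doubling of the span at each level, the `@offset` key encoding, and the restriction to block-aligned excerpts are all assumptions made for illustration.

```python
import hashlib


class Bloom:
    """Small standard Bloom filter used as the building block of each level."""

    def __init__(self, m=4096, k=3):
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _positions(self, key: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key: bytes) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


class HierarchicalBloomFilter:
    """Level j stores concatenations of 2**j consecutive blocks, keyed with their offset."""

    def __init__(self, block_size=4, levels=3, m=4096, k=3):
        self.block_size = block_size
        self.levels = [Bloom(m, k) for _ in range(levels)]

    def _blocks(self, data: bytes):
        s = self.block_size
        return [data[i:i + s] for i in range(0, len(data), s)]

    @staticmethod
    def _key(chunk: bytes, offset: int) -> bytes:
        # (content || offset), with the offset counted in blocks.
        return chunk + b"@" + str(offset).encode()

    def insert(self, data: bytes) -> None:
        blocks = self._blocks(data)
        for level, bloom in enumerate(self.levels):
            span = 2 ** level  # number of blocks concatenated at this level
            for start in range(0, len(blocks) - span + 1, span):
                bloom.add(self._key(b"".join(blocks[start:start + span]), start))

    def query(self, excerpt: bytes, first_block: int = 0) -> bool:
        """Check a block-aligned excerpt that starts at block index first_block."""
        blocks = self._blocks(excerpt)
        if not blocks:
            return False
        last = first_block + len(blocks)
        for level, bloom in enumerate(self.levels):
            span = 2 ** level
            start = -(-first_block // span) * span  # first aligned group in range
            while start + span <= last:
                lo = start - first_block
                chunk = b"".join(blocks[lo:lo + span])
                if self._key(chunk, start) not in bloom:
                    return False
                start += span
        return True
```

After inserting `b"AAAABBBBCCCCDDDD"`, the excerpt `b"CCCCDDDD"` at `first_block=2` is confirmed both block by block and, at the second level, as a concatenated pair. An excerpt that splices blocks from different positions is rejected unless every probe happens to be a false positive.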

3.4 Bloom Filter-Based Routing Algorithms

Dharmapurikar et al. [7] proposed the PBF algorithm, which first uses Bloom filters in on-chip memory to find the longest matching prefix length and then uses hash tables in off-chip memory to find the next hop. Figure 5 illustrates the basic configuration of the PBF algorithm.
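The two-stage lookup described above can be sketched in Python. This is a simplified software illustration, not the hardware design from [7]: the names `Bloom`, `BloomLPM`, and `add_route` are hypothetical, plain dictionaries stand in for the off-chip hash tables, and the filters are queried sequentially rather than in parallel.

```python
import hashlib
import ipaddress


class Bloom:
    def __init__(self, m=4096, k=3):
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _positions(self, key: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key: bytes) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


class BloomLPM:
    """One Bloom filter per prefix length, in front of per-length hash tables."""

    def __init__(self, width=32):
        self.width = width   # address width in bits (32 for IPv4)
        self.filters = {}    # prefix length -> Bloom (stands in for on-chip memory)
        self.tables = {}     # prefix length -> {key: next hop} (stands in for off-chip)

    def _key(self, addr: int, length: int) -> bytes:
        prefix = addr >> (self.width - length)  # keep only the top `length` bits
        return length.to_bytes(2, "big") + prefix.to_bytes((self.width + 7) // 8, "big")

    def insert(self, addr: int, length: int, next_hop) -> None:
        key = self._key(addr, length)
        self.filters.setdefault(length, Bloom()).add(key)
        self.tables.setdefault(length, {})[key] = next_hop

    def lookup(self, addr: int):
        # Stage 1: ask every filter whether a prefix of its length may match.
        candidates = [n for n, f in self.filters.items() if self._key(addr, n) in f]
        # Stage 2: probe the hash tables from the longest candidate downwards;
        # a miss here means the filter reported a false positive.
        for n in sorted(candidates, reverse=True):
            hop = self.tables[n].get(self._key(addr, n))
            if hop is not None:
                return hop
        return None


def add_route(lpm: BloomLPM, cidr: str, next_hop) -> None:
    net = ipaddress.ip_network(cidr)
    lpm.insert(int(net.network_address), net.prefixlen, next_hop)
```

With routes for `10.0.0.0/8` and `10.1.0.0/16` installed, `lpm.lookup(int(ipaddress.ip_address("10.1.2.3")))` returns the next hop of the /16 route. Because every candidate length is verified against a hash table, false positives cost extra probes but never produce a wrong next hop.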

Figure 5: Basic configuration of Longest Prefix Matching using Bloom filters

Yu et al. [8] proposed BUFFALO, a Bloom Filter Forwarding Architecture for Large Organizations, which constructs one Bloom filter for each next hop (i.e., outgoing link) and stores in it all the addresses that are forwarded to that next hop. By checking which Bloom filter an address matches, the entire address lookup is performed within fast memory for all packets. Figure 6 shows the BUFFALO switch architecture, which combines standard Bloom filters with counting Bloom filters for fast updates.

Figure 6: BUFFALO Switch Architecture

Zhou et al. [12] proposed CuckooSwitch, a software-based Ethernet switch built upon a memory-efficient, high-performance, and highly concurrent cuckoo hash table for compact and fast FIB lookup. The paper claims that CuckooSwitch can process 92.22 million minimum-sized packets per second on a commodity server equipped with eight 10 Gbps Ethernet interfaces while maintaining one billion entries in the FIB, consisting of destination MAC addresses.

Fan et al. [13] proposed an optimistic cuckoo hashing scheme, applied it to the popular Memcached system, and substantially improved both its memory efficiency and throughput.

Lim et al. [9] proposed using a single Bloom filter to find the longest matching prefix length. However, this algorithm has limited scalability, since it relies on a trie-based data structure that is hard to extend to longer addresses. I do not discuss it here; interested readers may refer to the original paper.
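The per-next-hop idea behind BUFFALO, described earlier in this section, can be sketched as follows. This is a simplified illustration rather than the architecture from [8]: the names `CountingBloom` and `PerNextHopTable` are hypothetical, and a single counting filter per next hop serves both lookups and updates here, whereas the text describes a standard Bloom filter for lookups paired with a counting Bloom filter for updates.

```python
import hashlib


class CountingBloom:
    """Bloom filter with small counters instead of bits, so entries can be removed."""

    def __init__(self, m=4096, k=3):
        self.m, self.k = m, k
        self.counts = [0] * m

    def _positions(self, key: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.counts[pos] += 1

    def remove(self, key: bytes) -> None:
        # Only valid for keys that were previously added.
        for pos in self._positions(key):
            self.counts[pos] -= 1

    def __contains__(self, key: bytes) -> bool:
        return all(self.counts[pos] > 0 for pos in self._positions(key))


class PerNextHopTable:
    """One filter per next hop; a lookup asks which filters contain the address."""

    def __init__(self):
        self.filters = {}  # next hop -> CountingBloom

    def add(self, address: bytes, next_hop) -> None:
        self.filters.setdefault(next_hop, CountingBloom()).add(address)

    def remove(self, address: bytes, next_hop) -> None:
        self.filters[next_hop].remove(address)

    def lookup(self, address: bytes):
        # More than one next hop may match because of false positives;
        # the caller has to decide how to resolve that case.
        return [hop for hop, f in self.filters.items() if address in f]
```

Note that the table never stores the addresses themselves, only filter state per outgoing link, so its memory footprint depends on the number of entries and the target false positive rate rather than on the address length.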

4 Solutions

Here, I decompose the FIBs into three categories: longest prefix matching, exact matching, and range matching. Note that we can simply use cuckoo filters to support dynamic deletions, instead of combining standard Bloom filters with counting Bloom filters.

(A) For Longest Prefix Matching
1. We may take advantage of the PBF algorithm and extend it. We first define the matching unit, such as a bit in IP lookup or a name chunk in NDN [10]. Then, we can build an architecture similar to the one in Figure 5. Furthermore, to reduce the number of Bloom filters that a lookup must query, we can insert part of each prefix into all the Bloom filters that represent shorter prefixes. A lookup can then apply binary search over these Bloom filters, at the cost of increased memory.
2. We can extend the Hierarchical Bloom Filters to allow longest prefix matching. For example, we can treat each prefix as a substring of the whole string. In addition, we can allow each fixed-size block to support strings of variable length. An open issue is how to encode the length.

(B) For Exact Matching
1. We can apply the algorithms proposed by Yu et al. [8] to this situation.
2. We can apply the Hierarchical Bloom Filters.

(C) For Range Matching, we can potentially apply the Hierarchical Bloom Filters.

(D) For encoding additional information, we may explore different methods and compare them, for example, additional data structures, shifting Bloom filters, etc.

5 Project Outline

(A) Conduct research on Bloom filter-based routing algorithms.
(B) Implement an evaluation environment for different routing algorithms.
(C) Implement the algorithms to run in the evaluation environment.
(D) Evaluate all the implemented algorithms in the evaluation environment.
(E) Implement Bloom filter-based IPv4 and IPv6 lookup algorithms in Linux XIA, which may yield interesting performance results.
(F) Implement the general-purpose routing algorithms.
(G) Extend the XIP command to control the matching principals.
(H) Write papers to publish the above work.

6 Research Questions

(A) Defining a format for the addresses, instead of using arbitrary ones, may help. We may take advantage of the design of IPv6 addresses.
(B) For the different kinds of matching, we may learn from advances in packet classification algorithms.
(C) A potential fourth technical challenge is managing the dependencies among different principals. If we view a dependency as a next hop or a group, we may use Bloom filters that support groups to manage the dependencies dynamically.
(D) How can multiple FIBs be encoded in Bloom filter-based algorithms?
(E) Given the long addresses, can we propose new hashing schemes, such as modular hashing functions, to reduce the hashing computation?
(F) How should false positives be handled?
(G) Can Hierarchical Bloom Filter-based algorithms support more complex routing lookups, or should we look for better alternatives?

7 References

[1] https://github.com/AltraMayor/XIA-for-Linux/wiki
[2] Jokela, Petri, et al. “LIPSIN: line speed publish/subscribe inter-networking.” ACM SIGCOMM Computer Communication Review 39.4 (2009): 195-206.
[3] Fan, Bin, et al. “Cuckoo Filter: Practically Better Than Bloom.” Proceedings of the 10th ACM International on Conference on emerging Networking Experiments and Technologies. ACM, 2014.
[4] Broder, Andrei, and Michael Mitzenmacher. “Network applications of Bloom filters: A survey.” Internet Mathematics 1.4 (2004): 485-509.
[5] Shanmugasundaram, Kulesh, Hervé Brönnimann, and Nasir Memon. “Payload attribution via hierarchical Bloom filters.” Proceedings of the 11th ACM Conference on Computer and Communications Security. ACM, 2004.
[6] Tarkoma, Sasu, Christian Esteve Rothenberg, and Eemil Lagerspetz. “Theory and practice of Bloom filters for distributed systems.” IEEE Communications Surveys & Tutorials 14.1 (2012): 131-155.
[7] S. Dharmapurikar, P. Krishnamurthy, and D. E. Taylor. “Longest prefix matching using Bloom filters.” In Proc. ACM SIGCOMM, pages 201-212, 2003.
[8] Yu, Minlan, Alex Fabrikant, and Jennifer Rexford. “BUFFALO: Bloom filter forwarding architecture for large organizations.” Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies. ACM, 2009.
[9] H. Lim, K. Lim, N. Lee, and K.-H. Park. “On adding Bloom filters to longest prefix matching algorithms.” IEEE Transactions on Computers (TC), 63(2):411-423, 2014.
[10] http://named-data.net/index.html
[11] Layong Luo, Gaogang Xie, Kave Salamatian, Steve Uhlig, Laurent Mathy, and Yingke Xie. “A Trie Merging Approach with Incremental Updates for Virtual Routers.” IEEE INFOCOM, 2013.
[12] D. Zhou, B. Fan, H. Lim, D. G. Andersen, and M. Kaminsky. “Scalable, High Performance Ethernet Forwarding with CuckooSwitch.” In Proc. 9th International Conference on emerging Networking EXperiments and Technologies (CoNEXT), Dec. 2013.
[13] B. Fan, D. G. Andersen, and M. Kaminsky. “MemC3: Compact and concurrent memcache with dumber caching and smarter hashing.” In Proc. 10th USENIX NSDI, Lombard, IL, Apr. 2013.
