Proposal for Improving Linux XIA Routing Performance

Qiaobin Fu
qiaobinf@bu.edu
March 16, 2015

1 Abstract

Linux XIA is a native implementation of XIA in the Linux kernel that intrinsically supports network evolution, collaboration, and interoperability. XIA uses the eXpressive Internet Protocol (XIP) to incrementally introduce changes (i.e., principals) to its network layer. Principals, in turn, use eXpressive IDentifiers (XIDs) to forward packets. Each XID pairs a principal type (32 bits) with a name or ID (160 bits). One of the most challenging issues in XIA is the lookup and update of the 160-bit address, whose addressing and routing are maintained per principal. Because principals are so diverse, a single routing algorithm is unlikely to handle all of them efficiently while supporting both fast lookup and fast update. In this proposal, I decompose the lookup of principal XIDs into three categories: longest prefix matching (e.g., IP lookup, name lookup in NDN), exact matching (e.g., MAC address tables), and range matching (e.g., packet classification, substring matching). I then introduce a group of Bloom filter-based routing algorithms to improve routing performance in Linux XIA. Finally, I propose some related projects and research problems.

2 Introduction

2.1 Background and Motivation

With the rapid growth of Internet technologies such as video applications and cloud computing, routers with high throughput are urgently needed. In addition, the growth of Forwarding Information Base (FIB) sizes has been accelerating. Together, growing FIB sizes and throughput demands pose significant challenges to routing algorithms and call for algorithms that are both time-efficient and space-efficient.

2.2 Technical Challenges

Routing performance in Linux XIA [1] is considerably more challenging than in traditional routing, for the following reasons.

The first technical challenge is handling much longer addresses, i.e., 160-bit addresses. For comparison, IPv4 has $2^{32}$ possible addresses, and even for IPv6 we generally look up only the first 64 bits. Yet researchers are still working on better algorithms to reduce memory consumption and improve lookup speed for these protocols. An XIA address has 160 bits, allowing $2^{160}$ addresses, or $2^{128}$ times as many as IPv4.

The second technical challenge is dealing with diverse routing schemes. On the one hand, since Linux XIA strives to support both legacy and emerging network architectures, different architectures may have completely different routing schemes. For instance, IPv4 uses the longest prefix matching rule, while LIPSIN [2] uses zFilters to transfer information. On the other hand, we may have to define specific routing tables to support principals with specific functions, such as QoS, load balancing, packet classification, and network measurement, each of which may require a different routing scheme.

The third technical challenge is coping with multiple FIBs. In Linux XIA, each principal maintains its own “FIB” (e.g., addressing and routing), and many principals can run in Linux XIA at the same time. In addition, given recent advances in virtualization technologies, we may need to support network virtualization in the near future, in which case each principal may have more than one FIB. A major issue is therefore how to manage these FIBs on a single machine.

2.3 Solutions

I have a few preliminary ideas for the above three challenges, all of which are based on Bloom filters or cuckoo filters [3]. For the first challenge, 160-bit addresses are too long for most current general-purpose routing algorithms, which would consume a tremendous amount of memory. Because Bloom filters are memory-efficient, we may introduce Bloom filter-based routing algorithms to manage the FIBs. For the second challenge, we may divide the FIBs into three categories: longest prefix matching, exact matching, and range matching. We can then explore a different algorithm for each kind of routing scheme. For the third challenge, we can maintain additional information, such as tags or bit-vectors [11], to distinguish different FIBs. For example, some IPv4 lookup algorithms already support multiple FIBs; we may implement those in XIA first and then move on to others.

3 Bloom Filters and Their Application

3.1 Bloom Filter

A Bloom filter is an array of $m$ bits for representing a set $S = \{x_1, x_2, \ldots, x_n\}$ of $n$ elements. Initially, all bits in the filter are set to zero. The key idea is to use $k$ hash functions $h_i(x)$, $1 \le i \le k$, to map each item $x \in S$ to a number in the range $1, \ldots, m$. The hash functions are assumed to be uniform. An element $x \in S$ is inserted into the filter by setting the bits $h_i(x)$ to one for $1 \le i \le k$. Conversely, $y$ is assumed to be a member of $S$ if all the bits $h_i(y)$ are set, and is guaranteed not to be a member if any bit $h_i(y)$ is not set.

Figure 1: Example of standard Bloom filter

Figure 1 shows an example of a Bloom filter [4]. The filter begins as an array of all 0's. Each item $x_i$ in the set $S$ is hashed $k$ times, with each hash yielding a bit location; these bits are set to 1. To check whether an element $y$ is in the set, hash it $k$ times and check the corresponding bits. The element $y_1$ cannot be in the set, since one of its bits is 0. The element $y_2$ is either in the set or a false positive.
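The insert and query procedure described above can be sketched as follows. This is a minimal illustration, not code from Linux XIA: the class name `BloomFilter`, the use of SHA-256 salted with the index $i$ to simulate the $k$ hash functions, and the SHA-1-derived 20-byte identifiers are all assumptions made for the example. Bit positions are 0-based here rather than $1, \ldots, m$.

```python
import hashlib


class BloomFilter:
    """Standard Bloom filter with m bits and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit keeps the sketch readable

    def _positions(self, item: bytes):
        # Assumption: h_i(x) is simulated by hashing the index i together with x.
        for i in range(self.k):
            digest = hashlib.sha256(i.to_bytes(4, "little") + item).digest()
            yield int.from_bytes(digest[:8], "little") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: bytes) -> bool:
        # True means "possibly in the set"; False means "definitely not".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical 160-bit (20-byte) identifiers, generated here with SHA-1.
xids = [hashlib.sha1(name).digest() for name in (b"host-a", b"host-b", b"host-c")]
bf = BloomFilter(m=1024, k=4)
for xid in xids:
    bf.add(xid)
print(all(xid in bf for xid in xids))  # True: inserted items are never missed
```

A query on an item that was never inserted usually returns `False`, but it may return `True` when all $k$ of its positions happen to be set by other items, which is the false positive case of $y_2$ in Figure 1.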

3.2 Cuckoo Filter

A cuckoo filter [3] is a compact variant of a cuckoo hash table that stores only a fingerprint (a bit string derived from the item using a hash function) for each inserted item, instead of key-value pairs. The filter is densely filled with fingerprints (e.g., 95% of entries occupied), which confers high space efficiency. A set membership query for item $x$ simply searches the hash table for the fingerprint of $x$ and returns true if an identical fingerprint is found.

Figure 2: Illustration of cuckoo hashing

A basic cuckoo hash table consists of an array of buckets where each item has two candidate buckets determined by hash functions $h_1(x)$ and $h_2(x)$. The lookup procedure checks both buckets to see if either contains the item. Figure 2(a) shows an example of inserting a new item x into a hash table of 8 buckets, where x can be placed in either bucket 2 or bucket 6. If either of x's two buckets is empty, the algorithm inserts x into that free bucket and the insertion completes. If neither bucket has space, as is the case in this example, the item selects one of the candidate buckets (e.g., bucket 6), kicks out the existing item (in this case “a”), and re-inserts this victim item into its own alternate location. In this example, displacing “a” triggers another relocation that kicks the existing item “c” from bucket 4 to bucket 1. This procedure may repeat until a vacant bucket is found, as illustrated in Figure 2(b), or until a maximum number of displacements is reached. If no vacant bucket is found, the hash table is considered too full to insert. Although cuckoo hashing may execute a sequence of displacements, its amortized insertion time is $O(1)$.
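The displacement procedure above, combined with fingerprint storage, can be sketched as a small cuckoo filter. The sketch assumes the partial-key scheme of [3], in which the alternate bucket is derived from the current bucket and the fingerprint alone; the class name, the 16-bit fingerprint, the bucket size of 4, and the limit of 500 kicks are illustrative choices rather than values taken from this proposal.

```python
import hashlib
import random


def _h(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "little")


class CuckooFilter:
    def __init__(self, num_buckets=64, bucket_size=4, max_kicks=500, seed=0):
        # A power-of-two bucket count makes the XOR mapping an involution.
        assert num_buckets & (num_buckets - 1) == 0
        self.n = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    def _alt(self, index: int, fp: int) -> int:
        # Alternate bucket depends only on the current bucket and the fingerprint.
        return (index ^ _h(fp.to_bytes(2, "little"))) % self.n

    def _candidates(self, item: bytes):
        fp = _h(b"fp:" + item) & 0xFFFF  # 16-bit fingerprint
        i1 = _h(item) % self.n
        return fp, i1, self._alt(i1, fp)

    def insert(self, item: bytes) -> bool:
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a victim and relocate it, repeatedly.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Table considered too full; this sketch drops the last victim.
        return False

    def contains(self, item: bytes) -> bool:
        fp, i1, i2 = self._candidates(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item: bytes) -> bool:
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False


cf = CuckooFilter()
cf.insert(b"xid-1")
print(cf.contains(b"xid-1"))  # True
cf.delete(b"xid-1")
print(cf.contains(b"xid-1"))  # False: the only stored fingerprint was removed
```

Deletion is safe only for items that were actually inserted; removing an item that was never added could delete the fingerprint of a different item that happens to share it.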

Figure 3: Space and lookup cost of Bloom filters and three cuckoo filters

Figure 3 compares space-optimized Bloom filters and cuckoo filters with and without semi-sorting. In conclusion, cuckoo filters improve upon Bloom filters in three ways: (1) support for deleting items dynamically; (2) better lookup performance; and (3) better space efficiency for applications requiring low false positive rates (< 3%).

3.3 Hierarchical Bloom Filters

Figure 4: Example of inserting a string into a hierarchical Bloom filter

Shanmugasundaram et al. presented a data structure called the Hierarchical Bloom Filter to support substring matching [5][6]. This structure supports checking whether part of a string is contained in the filter, with low false positive rates. The filter works by splitting an input string into a number of fixed-size blocks, which are then inserted into a standard Bloom filter. With the Bloom filter, it is possible to check for substrings at block-size granularity. This substring matching may report combinations of blocks that were never inserted together (false positives). For example, a concatenation of two blocks from different strings would be incorrectly recognized as an inserted substring.

Figure 4 illustrates the hierarchical nature of this construction. The hierarchical Bloom filter improves matching accuracy by inserting concatenations of blocks into the filter in addition to inserting the blocks separately. This means that two consecutive single-block matches can be verified by looking up their concatenation. The approach generalizes to longer sequences of blocks; however, storage requirements grow as more block sequences are added to the structure.

This filter was used to implement a payload attribution system that associates excerpts of packet payloads with their source and destination hosts, using the filter to create compact digests of payloads. The system divides the payload of each packet into a set of blocks of a fixed size. Each block is appended with its offset in the payload: (content || offset). The blocks are then hashed and inserted into a Bloom filter.

A hierarchical Bloom filter is a collection of standard Bloom filters for increasing block sizes. When a string is inserted, it is first broken into blocks, which are inserted into the filter hierarchy starting from the lowest level. For the second level, two consecutive blocks are concatenated and inserted. This block-based concatenation continues for the remaining levels of the hierarchy. The resulting structure can then be used to verify whether a given string occurs in the payload. The search starts at the first level and continues upward in the hierarchy to verify whether the substrings occurred together in the same packet or in different packets.
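The level-by-level insertion and verification described above can be sketched as follows. This is a simplified illustration under several assumptions: each level doubles the span of the previous one, spans are aligned to multiples of their own size, a trailing partial block is ignored, and the query is given the offset of the excerpt. The names `HierarchicalBloomFilter` and `contains_at` are hypothetical.

```python
import hashlib


class BloomFilter:
    def __init__(self, m: int = 4096, k: int = 4) -> None:
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _positions(self, item: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))


class HierarchicalBloomFilter:
    def __init__(self, block_size: int = 4, levels: int = 2) -> None:
        self.block_size = block_size
        # One standard Bloom filter per level; level l covers 2**l blocks.
        self.filters = [BloomFilter() for _ in range(levels)]

    @staticmethod
    def _key(content: bytes, offset: int) -> bytes:
        # (content || offset): the block followed by its offset in the payload.
        return content + offset.to_bytes(4, "big")

    def insert(self, payload: bytes) -> None:
        for level, bf in enumerate(self.filters):
            span = self.block_size << level
            for offset in range(0, len(payload) - span + 1, span):
                bf.add(self._key(payload[offset:offset + span], offset))

    def contains_at(self, excerpt: bytes, offset: int) -> bool:
        """Check whether `excerpt` may occur at `offset` in some inserted payload."""
        end = offset + len(excerpt)
        for level, bf in enumerate(self.filters):
            span = self.block_size << level
            pos = -(-offset // span) * span  # first aligned position >= offset
            while pos + span <= end:
                chunk = excerpt[pos - offset:pos - offset + span]
                if self._key(chunk, pos) not in bf:
                    return False
                pos += span
        return True


hbf = HierarchicalBloomFilter(block_size=4, levels=2)
hbf.insert(b"AAAABBBB")
hbf.insert(b"CCCCDDDD")
print(hbf.contains_at(b"AAAABBBB", 0))  # True
```

In this example, the excerpt `b"AAAADDDD"` passes the first level because both of its blocks were inserted, one from each payload, but it is rejected at the second level because the concatenation was never inserted, barring a Bloom filter false positive.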

3.4 Bloom Filter-Based Routing Algorithms

Dharmapurikar et al. [7] proposed the PBF algorithm, which uses Bloom filters in on-chip memory to first find the longest matching prefix length and then uses hash tables in off-chip memory to find the next hop. Figure 5 illustrates the basic configuration of the PBF algorithm.
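The two-stage lookup described above can be sketched in software as follows. This is an illustrative model only, assuming one Bloom filter and one hash table per prefix length and addresses written as bit strings; the class name `PrefixBloomLPM`, the `table_probes` counter, and the parameter values are hypothetical, and Python dictionaries stand in for the off-chip hash tables.

```python
import hashlib


class BloomFilter:
    def __init__(self, m: int = 1024, k: int = 3) -> None:
        self.m, self.k, self.bits = m, k, bytearray(m)

    def _positions(self, item: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))


class PrefixBloomLPM:
    def __init__(self, max_len: int = 32) -> None:
        lengths = range(1, max_len + 1)
        self.filters = {length: BloomFilter() for length in lengths}  # "on-chip"
        self.tables = {length: {} for length in lengths}              # "off-chip"
        self.table_probes = 0

    def insert(self, prefix_bits: str, next_hop: str) -> None:
        length = len(prefix_bits)
        self.filters[length].add(prefix_bits.encode())
        self.tables[length][prefix_bits] = next_hop

    def lookup(self, addr_bits: str):
        # Stage 1: ask every filter whether the prefix of that length may exist.
        candidates = [
            length for length, bf in self.filters.items()
            if length <= len(addr_bits) and addr_bits[:length].encode() in bf
        ]
        # Stage 2: probe hash tables from the longest candidate to the shortest.
        for length in sorted(candidates, reverse=True):
            self.table_probes += 1
            hop = self.tables[length].get(addr_bits[:length])
            if hop is not None:
                return hop  # a miss here was a Bloom filter false positive
        return None


lpm = PrefixBloomLPM(max_len=8)
lpm.insert("10", "hop-A")
lpm.insert("1011", "hop-B")
print(lpm.lookup("10110000"))  # hop-B
print(lpm.lookup("10000000"))  # hop-A
```

Because the hash tables hold the exact prefixes, a Bloom filter false positive costs an extra table probe but never produces a wrong next hop.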

Figure 5: Basic configuration of Longest Prefix Matching using Bloom filters

Yu et al. [8] proposed BUFFALO, a Bloom Filter Forwarding Architecture for Large Organizations, which constructs one Bloom filter for each next hop (i.e., outgoing link) and stores in it all the addresses that are forwarded to that next hop. By checking which Bloom filter an address matches, BUFFALO performs the entire address lookup within fast memory for all packets. Figure 6 shows the BUFFALO switch architecture, which combines standard Bloom filters with counting Bloom filters for fast updates.

Figure 6: BUFFALO Switch Architecture

Zhou et al. [12] proposed CuckooSwitch, a software-based Ethernet switch built upon a memory-efficient, high-performance, and highly concurrent cuckoo hash table for compact and fast FIB lookup. The paper reports that CuckooSwitch can process 92.22 million minimum-sized packets per second on a commodity server equipped with eight 10 Gbps Ethernet interfaces while maintaining one billion entries in the FIB, consisting of destination MAC addresses.

Fan et al. [13] proposed an optimistic cuckoo hashing scheme, applied it to the popular Memcached system, and substantially improved both its memory efficiency and throughput.

Lim et al. [9] proposed using a single Bloom filter to find the longest matching prefix length. However, this algorithm has limited scalability, since it needs a trie-based data structure, which is hard to extend to longer addresses. I do not discuss it further here; interested readers may refer to the original paper.
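The per-next-hop arrangement used by BUFFALO [8] can be sketched as follows. This is a simplified model under stated assumptions: a single counting Bloom filter per next hop serves for both lookup and update, whereas the architecture in Figure 6 keeps separate standard and counting filters. The names `CountingBloomFilter`, `BuffaloFib`, and the port labels are hypothetical.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter with counters instead of bits, so entries can be removed."""

    def __init__(self, m: int = 1024, k: int = 4) -> None:
        self.m, self.k = m, k
        self.counts = [0] * m

    def _positions(self, item: bytes):
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.counts[pos] += 1

    def remove(self, item: bytes) -> None:
        # Only valid for items that were previously added.
        for pos in self._positions(item):
            self.counts[pos] -= 1

    def __contains__(self, item: bytes) -> bool:
        return all(self.counts[pos] > 0 for pos in self._positions(item))


class BuffaloFib:
    def __init__(self, next_hops) -> None:
        # One filter per next hop (outgoing link).
        self.filters = {hop: CountingBloomFilter() for hop in next_hops}

    def add_route(self, addr: bytes, hop: str) -> None:
        self.filters[hop].add(addr)

    def remove_route(self, addr: bytes, hop: str) -> None:
        self.filters[hop].remove(addr)

    def lookup(self, addr: bytes):
        # Usually one match; several matches indicate a false positive.
        return [hop for hop, bf in self.filters.items() if addr in bf]


fib = BuffaloFib(["port1", "port2"])
mac1 = bytes.fromhex("02aa00000001")  # hypothetical MAC addresses
mac2 = bytes.fromhex("02aa00000002")
fib.add_route(mac1, "port1")
fib.add_route(mac2, "port2")
print("port1" in fib.lookup(mac1))  # True
```

When `lookup` returns more than one next hop, the caller needs a policy for choosing among them; how to handle such false positives is left open in this sketch.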

4 Solutions

Here, I decompose the FIBs into three categories: longest prefix matching, exact matching, and range matching. Note that we can simply use cuckoo filters to support dynamic deletion, instead of combining standard Bloom filters with counting Bloom filters.

(A) For Longest Prefix Matching
1. We may take advantage of the PBF algorithm and extend it. We first define the matching unit, such as a bit in IP lookup or a name chunk in NDN [10]. Then, we can build an architecture similar to the one in Figure 5. Furthermore, to reduce the number of Bloom filters that must be queried, we can insert the leading part of each prefix into all the Bloom filters that represent shorter prefix lengths. A lookup can then apply binary search over these Bloom filters, at the cost of increased memory.
2. We can extend the Hierarchical Bloom Filters to allow longest prefix matching. For example, we can treat each prefix as a substring of the whole string. In addition, we can allow each fixed-size block to support strings of variable length. An open issue is how to encode the length.

(B) For Exact Matching
1. We can apply the algorithm proposed by Yu et al. [8] to this situation.
2. We can apply the Hierarchical Bloom Filters.

(C) For Range Matching, we can potentially apply the Hierarchical Bloom Filters.

(D) For encoding additional information, we may explore different methods and compare them, for example, additional data structures or shifting Bloom filters.

5 Project Outline

(A) Conduct research on Bloom filter-based routing algorithms.
(B) Implement an evaluation environment for different routing algorithms.
(C) Implement the algorithms to run in the evaluation environment.
(D) Evaluate all the implemented algorithms in the evaluation environment.
(E) Implement the IPv4 and IPv6 lookup algorithms based on Bloom filters in Linux XIA, which may yield interesting performance results.
(F) Implement the general-purpose routing algorithms.
(G) Extend the XIP command to control the matching principals.
(H) Write papers to publish the work above.

6 Research Questions

(A) Defining a format for the addresses, instead of using arbitrary ones, may help. We may take advantage of the design of IPv6 addresses.
(B) For the different kinds of matching, we may learn from advances in packet classification algorithms.
(C) A potential fourth technical challenge is managing the dependencies among different principals. If we view the dependencies as next hops or groups, we may use Bloom filters that support groups to manage them dynamically.
(D) How should multiple FIBs be encoded in Bloom filter-based algorithms?
(E) Given the long addresses, can we propose new hashing schemes to reduce the hashing computation, for example, modular hashing functions?
(F) How should false positives be handled?
(G) Can Hierarchical Bloom Filter-based algorithms support more complex routing lookups, or should we look for better structures?

7 References

[1] https://github.com/AltraMayor/XIA-for-Linux/wiki
[2] Jokela, Petri, et al. “LIPSIN: line speed publish/subscribe inter-networking.” ACM SIGCOMM Computer Communication Review 39.4 (2009): 195-206.
[3] Fan, Bin, et al. “Cuckoo Filter: Practically Better Than Bloom.” Proceedings of the 10th ACM International on Conference on emerging Networking Experiments and Technologies. ACM, 2014.
[4] Broder, Andrei, and Michael Mitzenmacher. “Network applications of Bloom filters: A survey.” Internet Mathematics 1.4 (2004): 485-509.
[5] Shanmugasundaram, Kulesh, Hervé Brönnimann, and Nasir Memon. “Payload attribution via hierarchical Bloom filters.” Proceedings of the 11th ACM Conference on Computer and Communications Security. ACM, 2004.
[6] Tarkoma, Sasu, Christian Esteve Rothenberg, and Eemil Lagerspetz. “Theory and practice of Bloom filters for distributed systems.” IEEE Communications Surveys & Tutorials 14.1 (2012): 131-155.
[7] Dharmapurikar, Sarang, Praveen Krishnamurthy, and David E. Taylor. “Longest prefix matching using Bloom filters.” In Proc. ACM SIGCOMM, pages 201-212, 2003.
[8] Yu, Minlan, Alex Fabrikant, and Jennifer Rexford. “BUFFALO: Bloom filter forwarding architecture for large organizations.” Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies. ACM, 2009.
[9] H. Lim, K. Lim, N. Lee, and K.-H. Park. “On adding Bloom filters to longest prefix matching algorithms.” IEEE Transactions on Computers (TC), 63(2):411-423, 2014.
[10] http://named-data.net/index.html
[11] Layong Luo, Gaogang Xie, Kave Salamatian, Steve Uhlig, Laurent Mathy, and Yingke Xie. “A Trie Merging Approach with Incremental Updates for Virtual Routers.” IEEE INFOCOM, 2013.
[12] D. Zhou, B. Fan, H. Lim, D. G. Andersen, and M. Kaminsky. “Scalable, High Performance Ethernet Forwarding with CuckooSwitch.” In Proc. 9th International Conference on emerging Networking EXperiments and Technologies (CoNEXT), Dec. 2013.
[13] B. Fan, D. G. Andersen, and M. Kaminsky. “MemC3: Compact and concurrent memcache with dumber caching and smarter hashing.” In Proc. 10th USENIX NSDI, Lombard, IL, Apr. 2013.
