
Multidimensional Visualization of Communication Networks

Yaron Singer, Ohad Greenshpan, Alfred Inselberg
School of Computer Science, Tel Aviv University
{yaronsin, greensh, aiisreal}@post.tau.ac.il

Abstract: In this paper we analyze the structure and functionality of communication networks using parallel coordinates visualization. We transform a network with n nodes into n points in an n-dimensional space by using topological measures on the nodes in the network. We study communication networks on different levels of abstraction. We begin with visualization of subgraphs of the Internet AS graph and continue with visualization of networks on the IP level. Our method provides a visual interpretation of network properties such as stability under node and link failures, node backup, node interdependence, and unique topological patterns common in communication networks. We show how these can be utilized in network study.

I. INTRODUCTION

Analyzing networks built from large data sets is often difficult because of their complexity. Information visualization techniques are applied in a wide spectrum of research areas, such as artificial intelligence [1], information and software systems [2] and biological sciences [3], to aid in their study. In network analysis, various methods and approaches have been introduced in different research areas [4], [5], [6], [7], all of which deal with the complexity and data of large-scale networks.

Visualization techniques for communication networks have been studied in various aspects. Methods such as SeeNet [4], which provides static displays, interactive controls and animation of the Internet, and Otter [8], which considers data transferred in the network, are specifically designed to meet the needs of Internet analysis. Other advanced graph visualization methods, such as the ones suggested by Ware et al. [9] and Gansner et al. [10], are topology based and have been used effectively for Internet analysis as well. Internet data visualization tools have also been introduced. Interesting examples are a visual tool for detection of anomalous origin AS changes suggested by Teoh et al. [11] and a tool for examination of visual fingerprints left by common network attacks introduced by Conti et al. [12].

Parallel coordinates were introduced by A. Inselberg [13] and have been used in a variety of applications [14], [15], [16], specifically for visualization of large data sets [17], [18]. While parallel coordinates have been applied to Internet data [12], we have not encountered a method which performs topological network analysis using multidimensional visualization other than the one presented here.

The Node Extraction Visualization (NEVIS) [19] introduced by Singer et al. transforms a given network into an equivalent representation in parallel coordinates. Here, we use similar techniques for the Internet, and show how, by transforming Internet autonomous system (AS) nodes into multidimensional points, we obtain a visual tool which allows analysis of the topology of the Internet AS graph model. We show how key traits such as backup, inter-relationship between nodes and unique topological patterns can be studied, as well as other properties relevant to research of communication networks.

The rest of this paper is organized as follows. The next section briefly discusses the Internet AS graph and the method's construction and algorithm on this Internet model. In Section III we display multidimensional visualization of subsets of the Internet AS graph and of communication networks on the IP level, and analyze their different topologies. We discuss our conclusions and future work in Section IV.

II. TRANSFORMATION OF THE DIRECTED AS GRAPH INTO PARALLEL COORDINATES

We introduce and discuss the directed AS graph, extracted networks, the χ function, and parallel coordinates, which are the basic elements of our visualization method. We continue with a detailed specification of the method's construction under the restrictions imposed by its application on the model of the Internet AS graph.

A. The Internet Modeled by the Directed AS Graph

The Internet today consists of thousands of subnetworks, each with its own administrative management, called autonomous systems (ASes). Each such AS uses an interior routing protocol (such as OSPF or RIP) inside its managed network, and communicates with neighboring ASes using the BGP routing protocol, which enables each administrative domain to decide which routes to accept and which to announce. Through the use of the BGP protocol the ASes select the best route, and impose the business relationships between them on top of the underlying connected topology.
As a result, paths in the Internet are not necessarily the shortest possible, but rather the shortest that conform to the ASes' policies. Such routing is called policy-based routing. Commercial agreements between the ASes create the following relationships: customer-provider, provider-customer, peer-to-peer, and sibling. As the customer pays the provider for transit services, the provider transfers data to and from the customer, though the customer does not transfer data for its provider. As a result, the AS graph is directed, though it does not necessarily maintain transitivity of paths. ASes which maintain a peering relationship agree to transfer data between one another, thus data transferred between peers can only continue its route down to the peer's customers. Instances of the sibling relationship often occur as a result of financial acquisitions, mergers or small ISPs which unify networking services. In such instances data can be transferred in all directions between sibling ASes.

Referring to customer-provider paths as uphill paths, and provider-customer paths as downhill paths, Lixin Gao has presented an algorithm which categorizes the relationships between ASes [20]. Gao has established that a legal AS path is an uphill path followed by a downhill path, or alternatively an uphill path, followed by a peering link, followed by a downhill path. Thus, as uphill paths cannot follow downhill paths, legal paths in the AS graph model are described as valley-free paths. The Internet therefore cannot be modeled by an ordinary graph, as its paths fail to obey transitivity. A recent study by Dolev et al. has reexamined the Internet's resiliency under these restrictions using valley-free connectivity algorithms [21]. In this work we consider valley-free routing to enable analysis of the subnetworks in the AS graph.

B. Extracted Networks

A key element of our method is the study of the network under the changing condition of node extraction. To this end, we define extracted networks as follows.

DEFINITION 2.1: For the network $S = (V, E)$ with nodes $V = \{v_1, v_2, \ldots, v_n\}$ and edges $E = \{(v_i, v_j) \in V \times V \mid v_i \text{ is connected to } v_j\}$, let the $i$th extracted network, denoted by $S_i$, be the original network $S$ without the node $v_i$ and without any edges connected to it:

\[
S_i := \bigl(V \setminus \{v_i\},\; E \setminus \{(v_j, v_k) \in V \times V \mid j = i \lor k = i\}\bigr) \tag{1}
\]

As explained in detail further on, transformation of a network with $n$ nodes as above requires producing a set of $n$ extracted networks $\{S_1, S_2, \ldots, S_n\}$, which produces an $n$-dimensional space. Although node extraction on the AS level is rare, we use simulated extractions to reveal which nodes in the AS graph have the greatest contribution to the functionality of data transfer in the network.

C. Measuring Effect of Extraction

Topological properties are frequently used to measure networks' functionality, which in turn depends on shortest paths and connectivity. Our interest here is in applying an appropriate measure which enables classification of nodes by these criteria. To this end the χ function below is used to determine the effect of extraction on the nodes in the network. For a network $S = (V, E)$, we apply $\chi : V \to \mathbb{R}$ on nodes in the extracted network $S_i$ as follows:

\[
\chi(v_j) =
\begin{cases}
\displaystyle\sum_{v_k \neq v_j} \frac{1}{\delta_i(v_j, v_k)} & v_j \neq v_i \\[2ex]
-\infty & v_j = v_i
\end{cases}
\]

where $\delta_i(v_j, v_k)$ denotes the shortest distance between $v_j$ and $v_k$ in $S_i$. We use the following conventions:

1) $\delta(v_i, v_i) = 0 \quad \forall i \in \{1, 2, \ldots, n\}$
2) $\delta(v_i, v_j) = \infty \iff v_j$ cannot be reached from $v_i$.
3) $\frac{1}{\infty} = 0$.

When applied on $S_i$, the χ measure allows us to examine the effect of extracting $v_i$ from the network. It is easy to see that χ increases monotonically with both the centrality and the connectivity of the node: a node that reaches more nodes, or reaches them over shorter paths, accumulates larger reciprocal-distance terms. Therefore it is a favorable candidate for measuring nodes' centrality in the network. With appropriate calculation of δ, it is possible to apply χ on both directed and undirected graphs, either weighted or unweighted. To apply it on the Internet AS graph, we have computed χ using valley-free shortest paths, which conform to the routing policy in the Internet as explained above.

1) Multidimensional Representation: In the interest of making this presentation self-contained, recall that in parallel coordinates an n-tuple is represented by a polygonal line. Figure 1 shows the points $p_1 = (1, 5, 3.3, 0, 2.5)$, $p_2 = (5, 6, 1, 2, 2)$ and $p_3 = (4, 2, 3, 1, 2.9)$ on a 5-dimensional coordinate system.

Fig. 1. Three points in parallel coordinates.
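The extraction of a node and the χ measure can be illustrated on a toy network. The following is only a minimal sketch: it assumes an undirected, unweighted graph stored as an adjacency map and uses plain breadth-first search for δ, whereas the paper computes δ over valley-free paths on the directed AS graph. The helper names `extract_node`, `bfs_distances` and `chi` and the 4-node example are hypothetical.

```python
from collections import deque


def extract_node(adj, v):
    """Return the extracted network: adj without node v and without its edges."""
    return {u: {w for w in nbrs if w != v} for u, nbrs in adj.items() if u != v}


def bfs_distances(adj, source):
    """Hop distances from source; unreachable nodes are simply absent."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def chi(adj, v):
    """Sum of 1/distance from v to every other node.

    Unreachable nodes contribute nothing, matching the convention 1/inf = 0.
    """
    dist = bfs_distances(adj, v)
    return sum(1.0 / d for node, d in dist.items() if node != v)


# Hypothetical 4-node path a - b - c - d (assumption: undirected, unweighted).
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

s_b = extract_node(path, "b")  # extracting b isolates a and leaves c - d
chi_values = {v: chi(s_b, v) for v in s_b}
```

In this toy case the extraction of `b` leaves `a` with no reachable nodes, so its χ value falls to 0, which is the kind of drop the visualization is meant to expose.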

D. Implementation on the AS Graph

Having introduced its key elements, we now offer a detailed construction specification of our method. For a network $S$ as described above, we produce $n$ extracted networks $S_1, S_2, \ldots, S_n$. By extracting the node $v_i$ from $S$, we produce the $i$th extracted network $S_i$, which consists of $n - 1$ nodes. For each extracted network, the ASBFS algorithm is used to obtain shortest path distances between AS nodes in accordance with valley-free routing. We denote by $\bar{\delta}_{ij}$ the vector of shortest distances from $v_j$ on $S_i$. This process produces a matrix of order $n \times n$. We give a formal specification of this procedure in the CREATEASMATRIX description below.

Algorithm 1. The CREATEASMATRIX procedure receives an input graph G with predefined AS relations between nodes.

CREATEASMATRIX
    for all v_i ∈ V do
        S_i ← ExtractNode(v_i, S)
        for all v_j ∈ V \ {v_i} do
            δ̄_ij ← ASBFS(S_i, v_j)
            χ(v_j) ← 0
            for all v_k ∈ V \ {v_i, v_j} do
                χ(v_j) ← χ(v_j) + 1 / δ_i(v_j, v_k)
            M_ij ← χ(v_j)
        M_ii ← −∞
    return M

The matrix is then transformed to its equivalent multidimensional representation by considering, for each node $v_j$, its χ values across the $n$ extracted networks (the entries $M_{1j}, \ldots, M_{nj}$ as indexed above) as an $n$-dimensional point. With parallel coordinates each point is represented as a polygonal line. We obtain a visualization with the networks $S_1, S_2, \ldots, S_n$ represented by the 1st, 2nd, ..., $n$th axes of the multidimensional coordinate system, respectively. In turn, the nodes $v_1, v_2, \ldots, v_n$ are represented by $n$ polygonal lines.

Summarizing, we transform a network with $n$ nodes into $n$ points in an $n$-dimensional space. The $n$ extracted networks serve as the components of the $n$-dimensional space, and the nodes' χ values are transformed into points in that space. The following section shows the implementation on the Internet and discusses some of the properties of this transformation and their use in network analysis.

III. APPLICATION ON THE INTERNET AS GRAPH

We now discuss some of the results and show examples of applying our method on the Internet on two levels of abstraction. We first apply our method on the Internet AS graph, where each AS represents a node in the network. We continue with visualization of the AS subnetworks on the IP level.

A. Visualization on the AS Level

A network such as the Internet AS graph is comprised of over 20,000 nodes and over 100,000 directed edges. We therefore select subsets of nodes for visualization, which reduces both computational and visual complexity. To this end, we have chosen to present analysis of subsets of nodes according to their geopolitical locations. We present visualizations of AS nodes in different European countries and discuss their topological traits using our visualization method.
Although extraction on the AS level is rare, simulated extractions can be used to reveal which nodes in the AS graph have the greatest contribution to functionality and data transfer in the network.
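Each simulated extraction relies on shortest paths that respect valley-free routing. The paper does not spell out ASBFS, so the following is only a sketch of how such a search might look, not the authors' algorithm. It assumes every directed link carries one of four hypothetical labels (`"c2p"` for customer to provider, `"p2c"` for provider to customer, `"p2p"` for peers, `"s2s"` for siblings) and runs breadth-first search over (node, phase) states, where the phase records whether the path may still climb. Treating sibling links as usable in either phase is also an assumption.

```python
from collections import deque

UP, DOWN = 0, 1  # UP: still climbing; DOWN: a peer or downhill link was already used


def valley_free_distances(rel, source):
    """Shortest valley-free hop counts from source.

    rel[u][v] is the (assumed) label of the link u -> v.
    Nodes without a legal path from source are absent from the result.
    """
    best = {}
    seen = {(source, UP)}
    queue = deque([(source, UP, 0)])
    while queue:
        node, phase, d = queue.popleft()
        # BFS pops states in non-decreasing distance, so the first visit is shortest.
        best.setdefault(node, d)
        for nxt, kind in rel.get(node, {}).items():
            if kind == "c2p" and phase == UP:
                nxt_phase = UP        # keep climbing
            elif kind == "s2s":
                nxt_phase = phase     # assumption: siblings do not change the phase
            elif kind == "p2p" and phase == UP:
                nxt_phase = DOWN      # at most one peering link, at the top
            elif kind == "p2c":
                nxt_phase = DOWN      # descend; climbing is no longer allowed
            else:
                continue              # would create a valley
            if (nxt, nxt_phase) not in seen:
                seen.add((nxt, nxt_phase))
                queue.append((nxt, nxt_phase, d + 1))
    return best


# Hypothetical topology: a is a customer of p; p, q and r form a chain of peers;
# c is a customer of q.
rel = {
    "a": {"p": "c2p"},
    "p": {"a": "p2c", "q": "p2p"},
    "q": {"p": "p2p", "c": "p2c", "r": "p2p"},
    "c": {"q": "c2p"},
    "r": {"q": "p2p"},
}
from_a = valley_free_distances(rel, "a")
from_r = valley_free_distances(rel, "r")
```

Here `a` reaches `c` by climbing to `p`, crossing one peering link to `q` and descending, but it cannot reach `r`, because that would need a second peering link. This is the sense in which connectivity in the AS graph is not transitive.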

Figure 2 displays a visualization of the Internet connections between ASes in Switzerland. The clear segmentation into layers shows the distinction in centrality values of AS nodes in the network. Note that the polygonal lines at the bottom are almost perfectly horizontal, which indicates their stability under extraction of nodes in the network. While several lines drop to a value of 0, indicating the failure of the corresponding nodes due to extraction of another node in the network, the network appears to be stable. The pattern displayed is common to other European networks we have studied: Spain, Austria, Sweden and other countries produced similar visual results.

Fig. 2. Visualization of AS connections in Switzerland.

In contrast to the example above, the AS network of Portugal displayed in Figure 3 appears to be far less stable. The drops of polygonal lines on the 5th, 6th and 20th axes show that the ASes which represent v5, v6 and v20 in the network have the greatest effect on its functionality. We examine the polygonal lines which correspond to the nodes whose extraction accounts for the drops (the polygonal line of an extracted node drops to −∞ on its own axis, where a discontinuity appears in the line). The green, grey and blue polygonal lines which are found at the high end of the centrality correspond to the extracted nodes. This is immediately apparent with interactive software tools. The node-link diagram confirms this, as it shows that the green, red and blue nodes are indeed highly connected. The conclusion which can be drawn in this case is that these nodes are not backed up in the network, as their extraction largely affects other nodes in the network. Furthermore, it seems that the topology of this network is largely centered around these nodes.

Fig. 3. Visualization of AS connections in Portugal.

Figure 4 displays a visualization of Germany's AS graph. Germany's communication network is significantly larger than that of an average European country. The node-link diagram displayed in this figure is an example of the difficulties which arise with node-link diagram visualization. Aside from its greater size, there is a clear distinction between this network and the others shown thus far. The drop of all polygonal lines on one of the axes indicates that there is one node in the network whose failure implies failure of the entire network. This is in complete contrast to the otherwise relatively stable structure of the network, which is manifested in relatively horizontal polygonal lines. It can be assumed that the network in this instance is structured in a manner where very small subnetworks connect to one main node, and all data is transferred through this provider.

Fig. 4. Visualization of AS connections in Germany.

Figure 4 also displays the difficulties in such visualization when applied on large networks. The advantage of our visualization method is in its ability to use both combinatoric and multivariate aggregation algorithms, which enable reduction of visual and computational complexity (see [19], [22], [23], [24], [25]).

B. Application on the IP Level

Networks on the IP level differ from the Internet AS graph in two significant ways. First, the size of such networks is relatively small in comparison with the Internet AS graph. Second, paths within an AS are not restricted by business agreements, unlike paths in the AS graph. We have applied our method on the IP networks of different ASes. We show examples of its application on AS 8121 and AS 22561, together with their analysis, in Figures 5 and 6.

1) Visualization of AS 8121: Clearly, the χ values are well differentiated, which mirrors the different hierarchies of centrality in the network. It is interesting to notice that these hierarchies are maintained throughout extractions in this network. The lower horizontal lines indicate that there is another connected component which is not affected by the extractions. Note also that there is an area which is clear of polygonal lines. We conclude that no node receives χ values in that area, and that it may testify to a segmentation into at least two separate connected components in the network. The horizontal polygonal lines in the lower values support this hypothesis. There are 3 nodes whose extraction has a significant effect on the network. This is manifested in the drops of polygonal lines on the 5th, 25th and 41st axes. This signifies that these nodes have relatively poor backup in the network, and that their extraction affects the network's functionality. An interesting fact is that both node 5 and node 41 in the graph, represented by the green and brown polygonal lines respectively, have average and low χ values. This shows that a node can be crucial for resilience and functionality while having relatively low centrality.

Fig. 5. Visualization of IP network of AS 8121. Each node represents a unique IP address in the AS subnetwork and links between IP addresses are represented by edges.

2) Visualization of AS 22561: Studying the visualization of the network of AS 22561, it is evident that its topology differs from that of AS 8121. Perhaps the most apparent difference is that there are more drops of polygonal lines throughout the axes, though the drops are less dense, namely, fewer polygonal lines drop on each axis. We conclude that the difference between the two networks is that while the network of AS 8121 has three nodes whose extraction is significant, this network has better backup, as the extraction of a node has a local effect on the network rather than a global one as in AS 8121. This is despite the χ values being well differentiated, as in AS 8121.

Fig. 6. Visualization of IP network of AS 22561.
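The comparison above rests on how many polygonal lines drop on each axis. In the paper this is read off the display; as a rough numerical counterpart, one could count, for every extracted network, the nodes whose χ value falls well below their typical value. The sketch below is an assumption-laden illustration, not the authors' criterion: the function name `drops_per_axis`, the use of the median across axes as the reference value and the `threshold` parameter are all hypothetical. The matrix follows the indexing of CREATEASMATRIX, with `M[i][j]` holding χ of node j in the i-th extracted network.

```python
from statistics import median

NEG_INF = float("-inf")


def drops_per_axis(M, threshold=0.5):
    """For each axis i, count nodes whose chi on S_i is below threshold * typical value.

    The typical value of node j is taken (as an assumption) to be the median of
    its chi values over all axes except its own, where it is -inf by definition.
    """
    n = len(M)
    typical = [median(M[i][j] for i in range(n) if i != j) for j in range(n)]
    counts = []
    for i in range(n):
        dropped = [j for j in range(n) if j != i and M[i][j] < threshold * typical[j]]
        counts.append(len(dropped))
    return counts


# Hypothetical star network: node 0 is the hub, nodes 1-3 are leaves.
# Extracting the hub isolates every leaf; extracting a leaf changes little.
M = [
    [NEG_INF, 0.0, 0.0, 0.0],   # axis 0: hub extracted
    [2.0, NEG_INF, 1.5, 1.5],   # axis 1: leaf 1 extracted
    [2.0, 1.5, NEG_INF, 1.5],   # axis 2: leaf 2 extracted
    [2.0, 1.5, 1.5, NEG_INF],   # axis 3: leaf 3 extracted
]
counts = drops_per_axis(M)
```

A single axis with a high count corresponds to a global effect of one extraction, while many axes with small counts correspond to the more local effects described for AS 22561.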

From the above, it is evident that the network of AS 8121 is composed of several stars and a few additional connected components with few members. The topology of AS 22561, however, is more complicated.

3) Data Sets Used: We used the combined data from the DIMES [26] and RouteViews [27] projects for March 2006. The AS graph is comprised of 20,103 ASes and 57,272 AS links. We approximate the AS relationship by comparing the k-core index [28] of two ASes and taking the one with the higher k-core index as the provider of the other. If the k-core index is the same, the ASes are treated as peers.

IV. CONCLUSIONS AND FUTURE WORK

We have introduced a new method for topological analysis of communication networks through the application of NEVIS on the directed AS graph model, and have shown how its properties can be utilized to deduce significant information about the network under study. Regarding its application, the advantages of this method are its simplicity and flexibility. Its modular implementation allows the application of graph pre-processing algorithms and of parallel coordinates methods suggested in the past.

Like all visualization methods for large data sets, our method is limited in the size of networks which can be studied effectively. For future research we intend to experiment with methods which reduce visual complexity, apply them on larger Internet AS subnetworks, and perform an analysis similar to the one done here. Also, one of our main goals is further research on the theory and application of the transformation of networks into parallel coordinates and multivariate data sets.

REFERENCES

[1] Nguyen T.D., Ho T.B., Shimodaira H. “A Visualization Tool for Interactive Learning of Large Decision Trees”. Proceedings of the 12th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2000). 2000. pp. 28-35.
[2] Schauer R., Keller R.K. “Pattern Visualization for Software Comprehension”. Proceedings of the 6th International Workshop on Program Comprehension (IWPC’99). pp. 4-12.
[3] Ciccarese P., Mazzocchi S., Ferrazzi F., Sacchi L. “Genius: A New Tool for Gene Networks Visualization”. Intelligent Data Analysis in Medicine and Pharmacology (IDAMAP) Proceedings (2004). pp. 107-111.
[4] Becker R.A., Eick S.G., Wilks A.R. “Visualizing Network Data”. IEEE Transactions on Visualization and Computer Graphics (1995), Vol. 1-1. pp. 16-21.
[5] Forman J.J., Clemons P.A., Schreiber S.L., Haggarty S.J. “SpectralNET - an Application for Spectral Graph Analysis and Visualization”. BMC Bioinformatics. Oct 2005 19;6:260.
[6] Bolser D., Dafas P., Harrington R., Park J., Schroeder M. “Visualisation and Graph-Theoretic Analysis of a Large-Scale Protein Structural Interactome”. BMC Bioinformatics 2003; 4:45.
[7] McGrath C., Hardt D.C., Blythe J. “Visualizing Complexity in Networks: Seeing Both the Forest and the Trees”. Connections 25(1) 2002. pp. 37-41.
[8] Huffaker B., Nemeth E., Claffy K. “Otter: A General-Purpose Network Visualization Tool”. Proceedings of the International Networking Conference (INET) 1999.
[9] Ware C., Bobrow R. “Supporting Visual Queries on Medium-Sized Node-Link Diagrams”. Information Visualization (2005) Vol. 4. pp. 49-58.
[10] Gansner E.R., Koren Y., North S.C. “Topological Fisheye Views for Visualizing Large Graphs”. IEEE Transactions on Visualization and Computer Graphics 2005 Jul-Aug; 11(4). pp. 457-68.
[11] Teoh S.T., Ma K.-L., Wu S.F., Zhao X. “A Visual Technique for Internet Anomaly Detection”. Proceedings of Computer Graphics and Imaging (CGIM) 2002.
[12] Conti G., Abdullah K. “Passive Visual Fingerprinting of Network Attack Tools”. ACM Conference on Computer and Communications Security’s Workshop on Visualization and Data Mining for Computer Security (VizSEC). October 2004.
[13] Inselberg A. “Parallel Coordinates: VISUAL Multidimensional Geometry and its Applications”. Springer-Verlag, New York, 2006.
[14] Shneiderman B. “The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations”. Proceedings of the 1996 IEEE Symposium on Visual Languages. pp. 336-343.
[15] Wegenkittl R., Loffelmann H., Groller E. “Visualizing the Behaviour of Higher Dimensional Dynamical Systems”. Proceedings of IEEE VIS 1997. pp. 119-126.
[16] Goel A., Backer C.A., Shaffer C.A., Grossman B., Watson L.T., Haftka R.T., Mason W.H. “VizCraft: A Problem-Solving Environment for Aircraft Configuration Design”. Computing in Science and Engineering 2001. 3-1. pp. 56-66.
[17] Inselberg A. “Visualization and Data Mining of High-Dimensional Data”. Chemometrics and Intelligent Laboratory Systems, 60-1, 2002. pp. 147-159.
[18] Keim D.A. “Information Visualization and Visual Data Mining”. IEEE TVCG, 7-1, 2002. pp. 100-107.
[19] Singer Y., Greenshpan O. “Network Visualization with Parallel Coordinates”. VISUAL Multidimensional Geometry and its Applications. Springer-Verlag, New York, 2006.
[20] Gao L. “On Inferring Autonomous System Relationships in the Internet”. IEEE Global Internet, Nov. 2000.
[21] Dolev D., Jamin S., Mokryn O., Shavitt Y. “Internet Resiliency to Attacks and Failures Under BGP Policy Routing”. Computer Networks.
[22] Fua Y.H., Ward M.O., Rundensteiner E.A. “Hierarchical Parallel Coordinates for Exploration of Large Datasets”. Proceedings of IEEE Visualization 1999. pp. 43-50.
[23] Ankerst M., Berchtold S., Keim D. “Similarity Clustering of Dimensions for an Enhanced Visualization of Multidimensional Data”. Proceedings of the 1998 IEEE Symposium on Information Visualization. pp. 52.
[24] Artero A.O., Ferreira de Oliveira M.C., Levkowitz H. “Uncovering Clusters in Crowded Parallel Coordinates Visualizations”. INFOVIS 2004. pp. 81-88.
[25] Andrienko G., Andrienko N. “Parallel Coordinates for Exploring Properties of Subsets”. Proceedings of the Second International Conference on Coordinated and Multiple Views in Exploratory Visualization 2004. Vol 0. pp. 93-104.
[26] Shavitt Y., Shir E. “DIMES: Let the Internet Measure Itself”. ACM SIGCOMM Computer Communication Review. vol. 35, no. 5, 2005.
[27] “University of Oregon Route Views Project”. http://www.antc.uoregon.edu/route-views/.
[28] Carmi S., Havlin S., Kirkpatrick S., Shavitt Y., Shir E. “Medusa: New Model of Internet Topology using K-Shell Decomposition”. Tech. Rep., arXiv, Jan. 2006.
