Visualization for Cybersecurity

Visual Correlation of Network Alerts

Stefano Foresti, James Agutter, Yarden Livnat, and Shaun Moon, University of Utah
Robert Erbacher, Utah State University

March/April 2006

The VisAlert visual correlation tool facilitates situational awareness in complex network environments by providing a holistic view of network security to help detect malicious activities.

Society’s dependence on information systems has made cybersecurity an increasingly important issue. Computer networks transport financial transactions, sensitive government information, power plant operations, and personal health information. The spread of malicious network activities poses great risks to the operational integrity of many organizations, imposes heavy economic burdens, and can even endanger life and health.

Of particular concern is the identification of sophisticated attacks. Naive attacks are easily detected and have a small likelihood of success; for instance, system administrators and network analysts aren’t very concerned with script kiddies or unsophisticated vulnerability exploits because intrusion detection systems (IDSs) readily detect them. Port scans are another example. These attacks try to identify the services running on a system by sending network packets to each service (a specified network port). A naive scan uses simple TCP connect packets sent as quickly as possible. An IDS can easily detect such port scans because of their close proximity in time and high volume.

Sophisticated attacks are harder to detect because they use stealthy mechanisms and more capable techniques. A sophisticated port scan can use alternatives to TCP connect packets or dilute the scan over time such that there is a delay of 0.4 seconds, 15 seconds, 5 minutes, or even longer between packets. This delay prevents easy algorithmic identification and can cause the activity to be lost in the noise.

IDSs analyze network traffic and host-based processes in an attempt to detect malicious activity. When they identify anomalous activity or activity matching known malicious behavior, these systems generate an alert to notify administrators or analysts of the potential threat. Each alert identifies the threat type using the alert type classification system. IDSs often store these alerts in stove-piped databases that aren’t easily correlated to other alerts or logs on the network. Thus, network analysts must use a myriad of tools that show different information in different formats, making it difficult for them to gain an overall understanding of the network’s security status. The high rate of false positives that these systems generate compounds this complexity. Because attacks are dynamic, if analysts can’t absorb and correlate the available data, it’s difficult for them to detect sophisticated attacks.

Developing tools that increase the situational awareness and understanding of all those responsible for the network’s safe operation can increase a computer network’s overall security. System administrators are typically limited to textual or simple graphical representations of network activity (Bejtlich1 describes many available capabilities and their applications). Information visualization techniques and methods have effectively increased operators’ situational awareness in many applications, letting them more effectively detect, diagnose, and treat anomalous conditions.2 A growing body of research validates the use of visualization to solve complex data problems3-5 (see the “Previous Work” sidebar).

Visualization elevates information comprehension by fostering rapid correlation and perceived associations. To that end, the display’s design must support the decision-making process: identify problems, characterize them, and determine appropriate responses. It must also present information in a way that’s easy for the user to process. Our visualization technique integrates the information in log and alert files into an intuitive, flexible, extensible, and scalable visualization tool, VisAlert, that presents critical information concerning network activity in an integrated manner, increasing the user’s situational awareness.
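The diluted port scan described earlier shows why simple rate rules fail. A minimal sliding-window detector makes this concrete; the window size and threshold below are purely illustrative, not taken from any real IDS.

```python
from collections import deque

def scan_alert(packet_times, window_s=1.0, threshold=20):
    """Return True if any sliding window of `window_s` seconds
    contains at least `threshold` probe packets (a naive rate rule)."""
    window = deque()
    for t in packet_times:
        window.append(t)
        # Evict packets that have aged out of the window.
        while window and t - window[0] > window_s:
            window.popleft()
        if len(window) >= threshold:
            return True
    return False

# A naive scan: 100 probes fired 1 ms apart -> easily flagged.
naive = [i * 0.001 for i in range(100)]

# A diluted scan: the same 100 probes spread 15 s apart -> at most
# one probe ever sits in the window, so the rule stays silent.
stealthy = [i * 15.0 for i in range(100)]

print(scan_alert(naive))     # True
print(scan_alert(stealthy))  # False
```

The stealthy scan defeats the rule not by changing what it sends but only when it sends, which is exactly why correlation across time and data sources matters.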

Published by the IEEE Computer Society
0272-1716/06/$20.00 © 2006 IEEE

Objectives and assumptions

We based our research and development on several premises to ensure that visualization for cybersecurity reflects the needs of operational environments. In general, the visualization techniques must be scalable and robust, and must effectively and intuitively represent the data and relationships that are relevant to decision making. The objective is to overcome the limitations of existing cybersecurity tools and visualizations that focus on narrow problems, work on small data sets, or don’t effectively map to the human visual and decision-making processes. To this end, our premises include:

■ Analyst involvement. We worked with security analysts with experience in large government networks. Their continual interactive involvement has ensured our work’s value and validity, and thus a good fit between problem and solution, based on user needs.
■ Realistic data. We developed a realistic scenario to validate the design and used simulated data for testing.
■ Data size and completeness. The visualization handles an organization’s subnets and hosts, numerous data sets, and disparate relationships across multiple logs. Our scalability solution has widespread applicability in visualization research.
■ Holistic view. Providing a visual holistic view of the network’s status—the least fulfilled need in state-of-the-art technology—helps analysts quickly decide how pervasive and severe problems are, and how to direct further attention.
■ Environment extensibility. We gave users the ability to add new data sources, alert types, attack signatures, and data views, as well as to enrich the visualization with user suggestions.

Our goal is to aid analysts’ decision making by providing a visual correlation mechanism. We don’t try to solve the entire intrusion-detection problem, nor do we aim to make decisions for the user.

Previous Work

Historically, visualization has been applied to network monitoring and analysis, primarily for monitoring network health and performance. Initial visualization techniques for intrusion detection system (IDS) environments focused on simple scales and color representations to indicate state or threat level. The need for better analysis mechanisms for security and IDS-related data has motivated the exploration of more advanced visualization techniques. Many of these techniques effectively visualize malicious activities such as worm or denial-of-service (DoS) attacks. However, these visualization techniques tend to focus on specific problems rather than general alert correlation for an entire enterprise.

Other techniques have focused on visual pattern matching, that is, the representation of known attacks. Teoh et al.1,2 analyze worms and other large-scale attacks on Internet routing data. Similarly, McPherson et al.3 developed a technique for visualizing port activity that’s geared toward monitoring large-scale networks for naive port scans and DoS attacks. Yin et al.4 and Lakkaraju et al.5 focus on representing netflows and associated link relationships. Such techniques are critical for analyzing attacks and IDS data, but they quickly suffer scalability issues and are limited in the number of parameters they can represent.

Wood6 describes basic graph-based visualization techniques, such as pie charts and bar graphs, and how analysts can apply them to typical network data available to all system administrators. This work describes how users can implement visualization and apply it to such data, as well as the meaning behind the identified results. The technique is limited by the visualization’s simplicity, which can’t handle the high-volume, high-dimensional data generated by today’s environments. This remains a major challenge for IDS data analysis in general.

Traditional representations and network alert-reporting techniques tend to use a single sensor-single indicator display paradigm. Each sensor uniquely represents its information (indicator) and doesn’t depend on information gathered by other sensors. The benefit of such an approach lies in the separation of the various sensors. The user can thus optimize each sensor’s indicator for the data it produces, and then choose which sensors to use in an analysis. Furthermore, the failure of one sensor doesn’t impact the rest of the system’s capability. However, the separation between sensors is also the weakness of this representation technique. Because each indicator is isolated, the user must observe, condense, and integrate information generated by the independent sensors across the entire enterprise. This process of sequential, piecewise data gathering makes it difficult to develop a coherent, real-time understanding of the interrelationships in the displayed information, particularly for identifying malicious attacks.

References

1. S. Teoh et al., “Case Study: Interactive Visualization for Internet Security,” Proc. IEEE Conf. Visualization, IEEE CS Press, 2002, pp. 505-508.
2. S. Teoh, K. Ma, and S. Wu, “Visual Exploration Process for the Analysis of Internet Routing Data,” Proc. IEEE Conf. Visualization, IEEE CS Press, 2003, pp. 523-530.
3. J. McPherson et al., “PortVis: A Tool for Port-Based Detection of Security Events,” Proc. CCS Workshop Visualization and Data Mining for Computer Security, ACM Press, 2004, pp. 73-81.
4. X. Yin et al., “VisFlowConnect: Netflow Visualizations of Link Relationships for Security Situational Awareness,” Proc. CCS Workshop Visualization and Data Mining for Computer Security, ACM Press, 2004, pp. 26-34.
5. K. Lakkaraju, W. Yurcik, and A. Lee, “NVisionIP: Netflow Visualizations of System State for Security Situational Awareness,” Proc. CCS Workshop Visualization and Data Mining for Computer Security, ACM Press, 2004, pp. 65-72.
6. A. Wood, Intrusion Detection: Visualizing Attacks in IDS Data, Global Information Assurance Certification (GIAC) Practical, SANS Inst., 2003.

Interdisciplinary design process

We employed a user-centered interdisciplinary methodology6 for developing information displays that promotes design as a function of human behavior and of the interaction between subject and object. We drew our research techniques from several disciplines, including

IEEE Computer Graphics and Applications


architecture, cognitive psychology, and computer science (see Figure 1). We loosely based our design approach on Snodgrass and Coyne’s hermeneutical circle concept,7 which is an iterative process of implementing a design, learning and understanding from discussion and feedback, and subsequently refining the design.

1 Interdisciplinary design methodology using techniques from cognitive psychology, architecture and design, and computer science.

Domain analysis

Our domain analysis study aims to identify the most important objects and operations in the chosen domain, these objects’ attributes, the relationships among objects, and how people in the domain interact with them.8 The result is a conceptual model representing the system scenarios and the functional relationships and criticality among variables, whether objective or subjective (in the user’s mental model). This is necessary to design the software and the visual displays that fulfill a group of people’s needs for a particular purpose.9 Systems that have been designed or modified with a solid task analysis at the onset tend to be consistently more usable, lead to better human performance, and require less training.10

In the study, we used the knowledge of intrusion-detection analysts, network administrators, and security-assessment professionals. The goal of the analysis is to ensure that the intended users will find the visualizations meaningful and intuitive, identify the components from a list of alternatives, and extract useful information from the domain-specific design. To achieve an understanding of the user’s mental model, we

■ performed background analysis, including a literature review and informal consultations with researchers;
■ conducted semistructured interviews with administrators, security analysts, and decision makers;
■ made unstructured naturalistic observations of problem solving; and
■ organized and reported the data into workflow diagrams.

During the domain analysis, we attempted to gain understanding in these key areas:

■ rules of thumb or tricks of the trade that guide reasoning;
■ empirical knowledge gained by experience, drawing on laws and relationships;
■ experts’ overall model of the problem; and
■ tasks, including control, prediction, diagnosis, planning, monitoring, instruction, and interpretation.


We’ll submit the specifics of the procedure used for domain analysis and the details of the cognitive analysis studies to a cognitive and human-factor studies publication.

Decision-making process

The domain analysis work identified six discrete steps in the decision-making process. These steps identify critical areas where analysts need additional support, and where visualization can provide the greatest benefit.

1. Identify an incident related to the computer network that the individual is responsible for (that is, detect that an incident occurred).
2. Evaluate the incident to determine whether it’s a benign alarm or an indication that further investigation is needed (that is, is the detected incident suspicious?).
3. Determine how prevalent the problem is and what else is being affected. The analyst determines the problem’s boundaries by analyzing other information to gain knowledge about the problem’s criticality. Analysts also explore what other machines are experiencing these problems.
4. Drill down into the data to identify patterns and test hypotheses. The analyst tests multiple hypotheses with detailed information about the questionable matter.
5. Report and mark results to communicate information to others. After identifying a problem, the analyst records and describes it within the larger context.
6. Direct a response. The analyst directs the responsible individuals to respond appropriately to the problem.

2 A portion of the network analysts’ workflow diagram resulting from the domain analysis.

Figure 2 shows the workflow diagram section resulting from the domain analysis. These workflow diagrams help designers determine the most relevant information to visualize at different stages of the decision-making process.

Visualization design

Relevant factors in data analysis and user requirements

The domain analysis work also let us identify the data analysis priorities and process. Our relevant findings include:

■ A false-positive alert shouldn’t appear correlated to other alerts, but a sustained attack will likely raise several alerts. Furthermore, real attack activities will likely generate multiple alerts of different types.
■ Users need a primary view of the destination IPs in their network of responsibility. The source IP might become an object of interest and investigation after they detect a problem.
■ Detecting potentially dangerous attacks requires the query and correlation of large, enterprise-wide data sets. Users want access to all sorts of data, but need the capability to filter and remove clutter.

Our findings provided guidelines and priorities for designing the visualization.

The first step in the design phase is to develop a set of visual metaphors and descriptors along with rules defining why, how, and where to use each descriptor. The objective is to represent information by exploiting perceptual abilities innate to human beings and embedding them into a set of objects’ graphic properties, behaviors, and relationships. We use basic 2D and 3D design principles such as

■ mapping data values to 1D, 2D, and 3D geometrical primitives;
■ assigning graphic attributes such as color and texture;
■ using graphic associations such as proximity, location, similarity, and contrast; and
■ assigning transformations such as changes in the design geometry or organization.

For instance, the application of perceptual grouping (using color, similarity, connectedness, motion, sound, and so on) can facilitate the understanding of the relationships between individual pieces of data. Proper presentation of information also affects the speed and accuracy of higher-level cognitive operations. Modern human factors theory suggests that for effective data representation we must present information in a manner consistent with the user’s perceptual, cognitive, and response-based mental representations. When the information is consistent with cognitive representation, performance is often more rapid, accurate, and consistent. Conversely, failure to use perceptual principles appropriately can lead to erroneous information analyses. It’s therefore imperative that we present information in a manner that facilitates the user’s ability to process it and minimizes any mental transformations that must be applied to the data. This qualitative filtering and depiction of information toward achieving a clear end essentially constitutes representation design.11,12

W3 concept

The main problem in correlating alerts from disparate logs is the seeming lack of mutual grounds on which to base any kind of comparison between alerts. We’ve determined that alerts must possess what we term the W3 premise: the when, where, and what attributes. This concept lets us visually correlate multiple alerts.

■ When refers to the point in time at which the alert occurred.
■ Where refers to the local network node—for example, an IP address—that the alert pertains to.
■ What refers to an indication of the alert type—for example, (log = snort, gid = 1, sid = 103).

3 The VisAlert W3 visualization concept: a line connecting an alert type (what) at time (when) to a resource (where) represents an alert instance.

We typically correlate alerts based on their when or what attributes. If we group the alerts based on their what attributes, we correlate them within each group based on additional attributes associated with that type. However, the alert’s real value relates to the local resources it pertains to. Preserving the resources’ status and integrity is in fact an IDS’s main focus. The alerts’ what and when attributes have little if any inherent value by themselves. Consequently, visually correlating alerts with respect to resources is the key factor of this work. A discussion of prior work and issues of correlating alerts is available elsewhere.13

The need to correlate the who attribute is secondary in the decision-making process. Using the W3 concept lets us simplify the representation, avoiding the visual clutter that would arise from such a huge domain as remote IPs. We can thus concentrate on the local resources, which are what analysts try to protect. However, we incorporate the who to obtain a full representation of who, when, where, and what (W4) using the virtual log, which we describe later.

Visualization concept

Figure 3 shows our design layout, which maps an alert’s where attribute into the center of the circle. We represent this using a topology map of the network under scrutiny. The layout maps an alert instance’s what attribute to the different sections of the outside circular element. This arrangement allows for flexibility in the number of alert types as well as easy integration of new alert types. The layout maps the when attribute of an alert instance to the circle’s radial sections, moving from most recent (closest to the topology map) to least recent as it radiates outward.

We can now visualize alert instances as lines from ρ(what, when) → (angle, radius) on the outer ring, to Ψ(where) → (x, y) in the inner circle, where ρ and Ψ are general projections of the alerts into our two domains. Our system lets the user dynamically control and configure these two projections as necessary. To reduce the possible visual clutter when showing all alerts simultaneously, we divide the when space into varying intervals and show only the alert instances for the most recent history period. The remaining history periods show only the number of alert instances that occurred during that period.
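The ρ and Ψ projections can be read as a small coordinate mapping. The sketch below is one illustrative way to realize them; the alert-type list, ring constants, and field names are hypothetical, not VisAlert’s actual configuration.

```python
import math

# Hypothetical configuration: alert types partition the outer ring into
# angular sectors, and history periods form radial "when" rings that
# grow outward from the central topology map.
ALERT_TYPES = ["snort", "windows_event", "ftp", "http"]
INNER_RADIUS = 100.0   # radius of the central topology map
RING_WIDTH = 30.0      # thickness of each history ring

def rho(what, when_period):
    """Project (what, when) to (angle, radius) on the outer ring.
    Period 0 is the most recent (innermost) ring."""
    sector = ALERT_TYPES.index(what)
    angle = 2 * math.pi * sector / len(ALERT_TYPES)
    radius = INNER_RADIUS + (when_period + 0.5) * RING_WIDTH
    return angle, radius

def psi(node_positions, where):
    """Project a local resource (where) to its (x, y) on the topology map."""
    return node_positions[where]

# An alert instance becomes a line between the two projections.
nodes = {"10.0.0.5": (12.0, -7.0)}
angle, radius = rho("snort", when_period=0)   # most recent ring
tip = (radius * math.cos(angle), radius * math.sin(angle))
base = psi(nodes, "10.0.0.5")
print(tip, base)
```

Because both projections are ordinary functions, letting the user reconfigure them, as the text describes, amounts to swapping in different ρ and Ψ implementations.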

Additional visual indicators

We incorporated additional visual indicators that encode information to increase the user’s situational awareness. In the design’s first iterations, we used color to identify alert classifications. In current display implementations, color indicates that user-determined thresholds have been exceeded. For instance, red indicates high priority, while green indicates low priority.

We’ve also adopted a method of increasing the icon size for nodes experiencing several alerts. The assumption is that a resource or node on the topology that’s experiencing multiple unique alerts from both host- and network-based sources has a higher probability of malicious activity than one experiencing only one alert. A scan of a particular machine is an example. Although the scan might generate a Snort alert, the activity might be benign; a standard IDS will catch this simple probe and reject the traffic. If, on the other hand, a machine is receiving a Snort alert in addition to a Windows log alert, that machine might be experiencing an intrusion attempt or even a successful attack. The node’s size is a clear indicator and easily distinguishes the node from other machines, thus attracting the attention of the user, who can correct the problem on the suspect machine.

The alert beams encode a problem’s persistence. If many of the same alerts are triggered on a particular machine over a given time interval, the line thickens to show the number of alerts (see Figure 4). In this manner, continual or recurring problems quickly become evident, letting the user take swift action. A beam’s color encodes the alert’s severity when available; for example, Snort associates a severity level with each alert. Thus, more severe problems become immediately distinguishable from other alerts.
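The two encodings above can be sketched as simple functions over alert records. The scaling factors and field names here are illustrative assumptions, not the tool’s actual values.

```python
# Hypothetical encoding rules: node icon size grows with the number of
# *distinct* alert types a host receives, and beam width grows with how
# often one alert type recurs within a time interval.

BASE_NODE_SIZE = 6.0
BASE_BEAM_WIDTH = 1.0

def node_size(alerts_for_host):
    """Scale the node icon by the count of unique alert types."""
    distinct_types = len({a["what"] for a in alerts_for_host})
    return BASE_NODE_SIZE * (1 + 0.5 * max(0, distinct_types - 1))

def beam_width(alerts_for_host, what, interval):
    """Thicken the beam with the number of identical alerts in an interval."""
    t0, t1 = interval
    count = sum(1 for a in alerts_for_host
                if a["what"] == what and t0 <= a["when"] < t1)
    return BASE_BEAM_WIDTH * count

alerts = [
    {"what": "snort", "when": 1.0},
    {"what": "snort", "when": 2.5},
    {"what": "windows_event", "when": 2.7},
]
print(node_size(alerts))                    # two distinct types -> enlarged node
print(beam_width(alerts, "snort", (0, 5)))  # two snort alerts -> doubled width
```

Keying node size to distinct alert types rather than the raw alert count mirrors the rationale in the text: many copies of one alert suggest noise, while several different alert types on one host suggest a real intrusion.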

Visual filters

VisAlert provides many ways to filter the data to reduce visual clutter or help network analysts focus on particular events of interest. Users can turn the alert beams on and off globally, resulting in small lines that indicate, using color and orientation, which alert has been triggered on a particular node. Users can selectively turn particular alert beams on or off by clicking the desired beam. Users can also turn alert groupings and individual alerts on or off through a dialog box. This helps users fine-tune the display to show only alerts that are relevant and of high priority to their organization, eliminating many instances they would otherwise observe. In addition, users can filter the data to show machines experiencing a certain number of alert types, machines within specific IP ranges, machines experiencing the same alerts, or machines that have the same outside IP associated with them or with a particular alert.
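These filters compose naturally as predicates over host records. The sketch below assumes hypothetical field names and is only one way such filtering could be expressed.

```python
# Composable filter predicates over host records (field names hypothetical).

def min_alert_types(n):
    """Keep hosts experiencing at least n distinct alert types."""
    return lambda host: len({a["what"] for a in host["alerts"]}) >= n

def in_ip_range(prefix):
    """Keep hosts whose IP falls in a given prefix."""
    return lambda host: host["ip"].startswith(prefix)

def visible_hosts(hosts, *filters):
    """Keep only hosts passing every active filter."""
    return [h for h in hosts if all(f(h) for f in filters)]

hosts = [
    {"ip": "10.1.0.5", "alerts": [{"what": "snort"}, {"what": "windows_event"}]},
    {"ip": "10.2.0.9", "alerts": [{"what": "snort"}]},
]
flagged = visible_hosts(hosts, min_alert_types(2), in_ip_range("10.1."))
print([h["ip"] for h in flagged])  # ['10.1.0.5']
```

Toggling a filter in the interface then corresponds to adding or removing one predicate from the active set.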

Simulated attack scenario

We used a simulated attack scenario to validate the display’s efficacy prior to implementation. The sequence of images in this scenario shows how a malicious attack emerges out of the background noise in our visualization design, helping users rapidly detect and identify the attack. The attacks consisted of exploiting a vulnerable host to gain access to more secure machines.

A security assessment expert developed the scenario. He generated an attack using different methods and broke the attack into different stages. To add sufficient noise, we fed this information into a data set polluted with other network traffic. We characterize this scenario as an external attacker with five distinct stages. During the five stages, as the attack moves from normal network activity to data exfiltration, the visualization shows how the node under attack slowly emerges out of the background because of the number of types of alerts it receives.

4 VisAlert exhibiting multiple alerts and additional relevant visual indicators, including alert type using color coding, larger node size showing more alert types, and larger beam size for persistence of a particular problem.

Stage 1: reconnaissance

Reconnaissance is the identification of hosts and services on a targeted network. This form of reconnaissance often involves simple Web queries, social engineering, and dumpster diving. Figure 5 shows the network’s status during the reconnaissance stage. Given the attacker’s lack of presence on the network, this can also be considered normal network activity with multiple instances of Snort alerts tripped at a particular time. In this initial attack stage, the attacker is generally passive with respect to the network. At this time, identifying an attack in the noisy normal network activity is unlikely.

5 In stage 1, the attacker is doing reconnaissance—that is, looking for hosts and services on the network. VisAlert exhibits normal activity.

Stage 2: probe

In this context, a probe is an attacker’s attempt to gather information about services on a targeted host or hosts discovered during the reconnaissance phase. Analysts could see the Internet Protocol Communication (IPC) violations during this phase because of a particular Snort alert that was tripped on a machine on their network topology. An IPC violation occurs when a connection attempts to violate defined TCP or IP interface requirements. This often indicates a forged packet, that is, a packet an attacker created that doesn’t conform to a proper connection. This could indicate an attempt to hijack a session, scan a system, or attack a vulnerability.

The line’s thickness indicates the persistence of the same Snort alert over time. A persistent Snort alert indicates its recurrence. This is typical of naive scans in which an attacker begins scanning a sequence of ports on a single machine or on multiple machines. In this case, the attacker has targeted a single host with a long-running scan. Such a scan can not only identify what services are running but can also potentially identify what versions of the services are in use, as well as the version of the operating system. An attacker can use this type of detailed information to identify specific vulnerabilities for known attacks—that is, it can identify a version of a service with a known buffer-overflow vulnerability.

The environment’s extensibility lets the visualization represent any alert, no matter what instrument generated it. In other words, if a new instrument generates other types of alerts, VisAlert can directly incorporate its results through a plug-in architecture. Figure 6 shows a probe and a connection (correlation) between the IPC interface (shown with a higher-priority Snort alert) and a Windows VMTools alert. Such a correlation between events indicates a progressing attack.

6 In stage 2, the attacker probes the network. VisAlert exhibits persistence of an alert on a host. Simultaneously, the attacker triggers a second alert type.

Stage 3: attack

In this context, an attack on a vulnerable system is an attempt to gain unauthorized access to a network host, usually by exploiting a vulnerable network service. We captured several attacks during this simulation.

The first attack was an attempt to access the vulnerable system by guessing the administrator password, a common brute-force attempt to break into a system. Computer logs record the repeated failed passwords as attempted logins.

The second attack exploited a vulnerability in the Windows Local Security Authority Subsystem Service (LSASS). LSASS has a known buffer-overflow vulnerability in several of its versions. Snort uses pattern recognition to identify packets containing the exploit code for this vulnerability and generates an alert on identifying such a packet. MS Windows uses LSASS for all authentication; thus it appears in this attack multiple times.

Figure 7 shows another attack, which involves generating heavy scanning activity on another host on the network as a diversion. Sophisticated attackers often create noise to cover their tracks. Generating many alerts through port scanning makes it far more difficult for an analyst to pick out and identify the more noteworthy alerts. The heavy lines that emerge out of the background represent two machines experiencing persistent indications of a scan.
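The brute-force attempt described above leaves a simple signature in host logs: many failed logins for one account. A minimal sketch of flagging it (the threshold and field names are hypothetical):

```python
# Flag accounts whose failed-login count exceeds a threshold, the way
# repeated password guesses surface in the computer logs described above.

def brute_force_suspects(log_events, max_failures=5):
    """Return accounts with more than `max_failures` failed logins."""
    failures = {}
    for event in log_events:
        if event["type"] == "login_failure":
            user = event["user"]
            failures[user] = failures.get(user, 0) + 1
    return [user for user, n in failures.items() if n > max_failures]

log = [{"type": "login_failure", "user": "administrator"}] * 8 + \
      [{"type": "login_success", "user": "alice"}]
print(brute_force_suspects(log))  # ['administrator']
```

An indicator like this is exactly the kind of host-based alert that, correlated with a network-based Snort alert on the same node, raises the node’s visual prominence in the display.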

7 In stage 3, the attacker attempts to access a vulnerable system and triggers multiple alerts on the host while diverting attention by heavily probing another host.

Stage 4: dig-in

Dig-in is a catch-all term describing actions an attacker takes to leverage newly gained privileges on the compromised system. This could include downloading toolkits or modifying files on the compromised system to hide malicious activity. The end goal is installing a rootkit, which will let the attacker gain easy access in the future, cover his or her tracks, provide complete access to all system resources, and let the attacker identify and attack additional systems using the just-compromised system as a jumping-off point.

In this simulation, the attacker generated a Trivial File Transfer Protocol (TFTP) GET command, commonly generated by compromised systems and by automated attack tools and worms. (This TFTP command is part of the first LSASS attack described earlier.) The attack’s goal here is to download the appropriate rootkit and the attacker’s toolkit for use against other systems in the network. The attack then redirected a Windows command prompt, followed by multiple TFTP GET commands. This redirection let the attacker execute commands from a file and subsequently download an entire set of files in rapid succession.

Figure 8 illustrates this attack. In this stage, the attacked node begins to expand, which might indicate to a network analyst the need for action. The node’s size indicates the number of alerts associated with that host. A large number of distinct alerts suggests a progressing attack.

8 In stage 4, the attacker attempts to access other systems, triggering multiple alerts on the already compromised system.

Stage 5: migration

Migration is a human attacker’s attempt to use a compromised system to attack other systems within the targeted network. Migration relies on the fact that the attacker has gained access to a host on the secure side of the firewall and will be able to see hosts and services not visible from an external host. In this simulation, the attacker generated a successful attack on the victim, followed by a TFTP session to download a toolkit, followed immediately by rapid scans for other vulnerable hosts. Figure 9 shows the correlation of these almost simultaneous alert triggers of different kinds on the same host, while other hosts have triggered alerts of only one kind. The node’s increasing size lets analysts focus their attention on the host that’s actually being attacked, while the diversionary or normal activity remains in the background, cause for lesser concern.

Testing VisAlert

To test our system’s capabilities with larger and more complex data, we used a data set generated by Skaion Corporation for use by the Intelligence Community Advanced Research and Development Activity (ICARDA) research projects. This data set, which contained numerous disparate logs and alerts from various sensors and hardware, simulated attack scenarios in large notional unclassified intelligence community environments. Because of the research’s sensitive nature, we can’t provide additional details on the specifics of the data or attack scenarios.

Figure 10 shows alerts for the Snort, dragon, and firewall logs. The firewall generated numerous alerts (blocked traffic), but of only two types. On the other hand, the Snort log had thousands of alert types, but few were actually triggered in the tests we present here. In contrast, the dragon log provides a rich set of alerts, many of which were triggered.

IEEE Computer Graphics and Applications


Visualization for Cybersecurity

(Figure 9, panels a and b: radial views showing Window event alerts, FTP alerts, HTTP alerts, and Snort alerts arranged around the network topology.)

Note the correlation between alerts from the Dragon (blue), Snort (green), and firewall (orange) logs. Figure 10c shows a large attack on many nodes. This view includes the virtual log (top talkers), which shows the attack's who attribute. These outside IPs show in one view what alerts they've generated, at what time, and on which local machine. Using this view, a user could easily see a distributed attack on one node of their system.

We deployed the VisAlert prototype at the Air Force Research Laboratory (AFRL) in Rome, New York. We worked with system analysts with a decade of experience and network-wide responsibility for specific AFRL sites. These key analysts have been a focal point in developing our new technology and analyzing the network data. In this installation, VisAlert generated a positive response. Users specifically noted its effectiveness, simplicity, and flexibility. They stated that it might increase situational awareness by letting them see a holistic view of their network security status. AFRL staff want to integrate VisAlert with their tools because it lets them see information that their systems might not currently identify. Specifically, they used VisAlert as a visualization front end to demonstrate their Air Force Enterprise Defense system to the US Department of Defense.

To a great extent, we've incorporated the analysts' suggestions, resulting in a more usable and useful tool. AFRL continues to evaluate the tool, and we incorporate analysts' suggestions as we receive them. Evaluation and testing are scheduled at the Army Research Laboratory and the US National Security Agency. We presented VisAlert at the Information Assurance Workshop (Philadelphia, February 2005) and other meetings, where it was exposed to analysts and higher-level officials within the intelligence community and other US Department of Defense organizations. They expressed interest in performing formal testing in operational environments, including VisAlert in a software bundle for their customers, and further developing the tool, including its incident-reporting functionality.

VisAlert features and limitations


9 In stage 5, (a) the attacker attempts to access a vulnerable system, triggering multiple alerts on that host while diverting attention by heavily probing another host; and (b) the analyst has filtered out activities of hosts that aren't of interest.

Figure 10 shows the visualization in different scenarios. Figure 10a shows normal traffic: a few machines are experiencing alerts, but the alerts are uncorrelated, as expected. Figure 10b shows an attack on several local machines.


March/April 2006

The VisAlert software already has several interactive features that let users filter out or expand details, including virtual logs (see "The Virtual World" sidebar) and the level of detail of the when and what attributes. On the when axis, VisAlert lets users configure different time increments to explore potential patterns at different time scales. On the what axis, VisAlert lets users collapse and expand alert groupings, allowing varying detail levels in the log hierarchies. In its current implementation, VisAlert's ability to interact with the where attribute space is limited. We're currently implementing automatic topology generation, which is a priority for testing in different environments. Future research includes the detail level in the topology display and the representation of dynamic networks.

We distilled the domain analysis underpinning VisAlert's visualization concept into a decision-making process that's common among many of the analysts we observed. However, VisAlert might be limited in its ability to enhance, or be inappropriate for, some problem types experienced by certain analysts and organizations.
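Configuring time increments on the when axis amounts to binning alert timestamps into concentric rings. The sketch below illustrates the idea; the function and tuple layout are our own illustration under assumed names, not VisAlert's actual API:

```python
from collections import defaultdict

def bin_alerts_by_ring(alerts, now, increment_s, num_rings=3):
    """Group (timestamp_s, alert_type, host) tuples into concentric time rings.

    Ring 0 is the innermost (most recent) period; alerts older than
    num_rings * increment_s fall off the display. Illustrative names,
    not VisAlert's actual API.
    """
    rings = defaultdict(list)
    for ts, alert_type, host in alerts:
        ring = (now - ts) // increment_s  # 0 = most recent period
        if 0 <= ring < num_rings:
            rings[ring].append((alert_type, host))
    return dict(rings)

# Re-binning the same alerts at a coarser increment can reveal a slow scan
# whose packets are too spread out to cluster in any fine-grained ring.
alerts = [(995, "portscan", "10.0.0.5"),
          (940, "portscan", "10.0.0.5"),
          (700, "tftp-get", "10.0.0.7")]
fine = bin_alerts_by_ring(alerts, now=1000, increment_s=60)
coarse = bin_alerts_by_ring(alerts, now=1000, increment_s=300)
```

At the 60-second increment the TFTP alert falls outside the three displayed rings; at the 300-second increment all three alerts are visible, which is why exploring multiple time scales matters.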

Future work

Ongoing and future work spans several areas. First, we plan to design additional visualization structures to let analysts perform analysis and hypothesis testing of alert details, and to let decision makers view incident reports (the VisAlert system will evolve along a visual continuum, allowing a seamless transition from a holistic view of the system to detailed drill-down). We'll also develop feature enhancements that let users encode and correlate their own alert algorithms, and enhanced capabilities for selecting and displaying detail level. In addition, we'll deploy VisAlert in an operational environment.

Finally, we'll perform formal testing (measuring performance with respect to existing tools on equivalent scenarios) in a simulated environment. Formal testing will show whether VisAlert improves recognition and identification of a compromised computer network or workstation. We'll use various simulated network states, both threatened and nonthreatened, to assess the visualization tool's applicability. We'll test users individually in two experimental sessions, counterbalancing network conditions to control for order effects. We also hypothesize that the visualization tool will reduce analysts' workload, as workload assessments measured by NASA's task load index should indicate. We believe the anticipated difference in workload will derive from the integrated and intuitive presentation of information the visualization tool affords. ■


Acknowledgments

We thank the network security experts and managers from Battelle, the AFRL, and the University of Utah (Information Security Office, NetCom, Center for High Performance Computing, and Scientific Computing and Imaging Institute), who significantly contributed to the domain analysis work. We also thank AFRL and the NSA for hosting tests of the VisAlert system, and the Skaion Corporation and IC-ARDA for providing us with their attack simulation data set. Special thanks to Jeff Thomas for creating the simulated attack described in this article, Kirsten Whitley for providing valuable feedback and access, and Marty Sheppard for providing continuous feedback and suggestions on the technology development. A grant from IC-ARDA (with contracting and technical management by the AFRL Information Directorate) and the Utah State Center of Excellence Program partially supported this work.


References

1. R. Bejtlich, The Tao of Network Security Monitoring: Beyond Intrusion Detection, Addison-Wesley Professional, 2004.
2. E. Tufte, The Visual Display of Quantitative Information, Graphics Press, 1983.
3. K. Lakkaraju, W. Yurcik, and A. Lee, "NVisionIP: Netflow Visualizations of System State for Security Situational Awareness," Proc. CCS Workshop Visualization and Data Mining for Computer Security, ACM Conf. Computer and Comm. Security, ACM Press, 2004, pp. 65-72.


10 Visualization of alerts. (a) Normal activity. (b) Attack on specific machines; a purple log represents the attack's who attribute. (c) Multiple attacks on many machines and a firewall blocking scan activity.


The Virtual World

To expand the domain over which VisAlert operates, we introduce the notion of a virtual world: a domain of information, or metadata, about the logs and alerts stored in the database. In accordance with our general approach, we don't generate new alerts based on alerts in the database. Other intrusion detection systems (IDSs) perform data mining and create new types of logs and alerts. The key difference is that these IDSs generate persistent data that are stored in a database. Our virtual world extension is temporary: the information is gathered on the fly, depends on the current user setup, and isn't archived.

Virtual alerts

A virtual alert represents any kind of information that occurs during a particular time period and can be gathered from the alerts. We call this information an alert because we provide it to VisAlert via the regular alert mechanism. For example, a key issue raised by the analysts we collaborated with is the notion of top talkers. In the context of our discussion, top talkers are nodes outside the installation that generate the most alerts during a specific time period (for example, the most recent history period, or the innermost ring). Obviously, such information can be computed and gathered from the database, but it isn't explicitly stored or computed ahead of time. To facilitate this top-talker example, we define new alerts whose type indicates a remote machine. The alert contains the number of alerts that the remote node generated in the specified time period with respect to our local nodes. Given a specific time period, we aggregate the alerts in the database by remote machine, sort them by the number of alerts per machine, and then select the top 10 talkers.
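The aggregate-sort-select step can be sketched in a few lines of Python. Both ranking criteria discussed in this sidebar are shown; the function names and the (timestamp, source IP, signature) tuple format are assumptions for illustration, not VisAlert's implementation:

```python
from collections import Counter, defaultdict

def top_talkers(alerts, start, end, n=10):
    """Top n remote hosts by alert count within [start, end).

    Each alert is a (timestamp, source_ip, signature) tuple; a sketch of
    the aggregation described in the sidebar, not VisAlert's actual code.
    """
    counts = Counter(src for ts, src, sig in alerts if start <= ts < end)
    return counts.most_common(n)

def top_talkers_by_signatures(alerts, start, end, n=10):
    """Alternative definition: rank remote hosts by unique signatures triggered."""
    sigs = defaultdict(set)
    for ts, src, sig in alerts:
        if start <= ts < end:
            sigs[src].add(sig)
    ranked = sorted(sigs.items(), key=lambda item: len(item[1]), reverse=True)
    return [(src, len(s)) for src, s in ranked[:n]]

alerts = [(10, "198.51.100.9", "scan"), (12, "198.51.100.9", "scan"),
          (15, "203.0.113.4", "tftp"), (90, "198.51.100.9", "scan")]
```

For the window [0, 60), `top_talkers` counts two alerts for 198.51.100.9 and one for 203.0.113.4, while the signature-based ranking scores both hosts equally at one unique signature each — which is exactly why the sidebar treats them as two different virtual logs.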

Virtual views

The top talkers in particular, and virtual alerts in general, extend the model domain and increase the number of alert types. As such, we can use the same presentation methods we applied to the regular persistent alerts, such as hierarchical grouping and multiple views.

For example, we can group the top talkers based on their IP addresses, or, if we list the top 100 talkers, we can organize them in groups of 10. We can also use a view in which we place the top talkers in order along the circle based on the number of alerts. The problem with this approach is that, in the likely event that a top talker in a particular time period is also one of the top talkers in the next period, its relative position might differ. In this case, the user might lose track of the top talker and not notice the problem's persistence.

An alternative view might consider the top talkers in the previous time period. Once a top talker is assigned a position around the circle, it stays in that position for as long as it's part of the top-talker group. This approach provides consistency, but requires the user to notice when a top talker drops out of the top group and is replaced by a new one. To help the user notice such changes, we add a dark red background to the top talker's name (its IP address). If the top talker remains in place after the next clock cycle, the background becomes brighter, signaling this top talker's persistence.

We can also ask for the top talkers with respect to the number of types of alerts (signatures) these remote machines triggered, rather than the number of alerts they generated. In this case, the top-talker definition differs (total number of alerts versus number of unique signatures), and thus these two views are essentially two different (virtual) logs. However, because these virtual logs represent two views of the same concept (top talker), we can regard them as two views of a single log.
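The consistent-placement scheme amounts to a small state update each clock cycle. The data model below (a mapping from IP to a slot index and a persistence count that drives background brightness) is a hypothetical sketch under a fixed group size, not VisAlert's actual code:

```python
def update_positions(prev, current_top):
    """Keep each top talker at its circle slot while it stays in the group.

    prev maps ip -> (slot, cycles). A talker that persists keeps its slot
    and its cycle count grows (drawn with a brighter background); a
    newcomer takes the lowest freed slot with a dark background (count 0).
    Hypothetical sketch assuming a fixed top-talker group size.
    """
    kept = {ip: (slot, cycles + 1)
            for ip, (slot, cycles) in prev.items() if ip in current_top}
    used = {slot for slot, _ in kept.values()}
    free = sorted(set(range(len(current_top))) - used)
    for ip in current_top:
        if ip not in kept:
            kept[ip] = (free.pop(0), 0)
    return kept

# 203.0.113.4 drops out of the top group; 192.0.2.77 takes the freed
# slot with a dark background, while 198.51.100.9 keeps slot 0 and brightens.
state = {"198.51.100.9": (0, 2), "203.0.113.4": (1, 0)}
state = update_positions(state, ["198.51.100.9", "192.0.2.77"])
```

Assigning newcomers the lowest freed slot keeps persistent talkers stationary, which is the consistency property the sidebar argues for.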

W4 and top talkers

Top talkers are an example of how to correlate relevant who attribute information, thus filtering the immense source-IP data set. The who information might also be of interest when requesting event details: the source IP can be included in a pop-up display.

4. K. Vicente, K. Christoffersen, and A. Pereklita, "Supporting Operator Problem Solving through Ecological Interface Design," IEEE Trans. Systems, Man, and Cybernetics, vol. 25, 1995, pp. 529-545.
5. J. Agutter et al., "Evaluation of a Graphic Cardiovascular Display in a High Fidelity Simulator," Anesthesia and Analgesia, vol. 97, 2003, pp. 1403-1413.
6. J. Bermudez et al., "Interdisciplinary Methodology Supporting the Design Research & Practice of New Data Representation Architectures," Proc. European Assoc. for Architectural Education/Architectural Research Centers Consortium (EAAE/ARCC) Research Conf., Dublin Inst. of Technology, 2004, pp. 223-230.
7. A. Snodgrass and R. Coyne, "Models, Metaphors, and the Hermeneutics of Designing," Design Issues, vol. 9, no. 1, 1992, pp. 56-74.
8. D. Monarchi and G. Puhr, "A Research Typology for Object-Oriented Analysis and Design," Comm. ACM, vol. 35, no. 9, 1992, pp. 35-47.
9. R. Prieto-Díaz, "Domain Analysis: An Introduction," ACM SIGSOFT Software Eng. Notes, vol. 15, no. 2, 1990, pp. 47-54.
10. W. Zachary, J. Ryder, and J. Hicinbothom, "Building Cognitive Task Analyses and Models of a Decision-Making Team in a Complex Real-Time Environment," Cognitive Task Analysis, Lawrence Erlbaum Assoc., 2000, pp. 365-384.
11. C. Ware, Information Visualization: Perception for Design, Morgan Kaufmann, 2000.
12. A. Treisman, "Preattentive Processing in Vision," Computer Vision, Graphics, and Image Processing, vol. 31, 1985, pp. 156-177.
13. Y. Livnat et al., "A Visualization Paradigm for Network Intrusion Detection," Proc. IEEE Workshop Information Assurance and Security, IEEE CS Press, 2005, pp. 92-99.

Stefano Foresti is cofounder and director of the Center for the Representation of Multi-Dimensional Information (CROMDI), senior scientist at the Center for High-Performance Computing at the University of Utah, and president of Intellivis. His research interests include visualization, user-interaction design, security, distributed computing, intellectual property, and technology commercialization. Foresti has a doctorate in mathematics from the University of Pavia, Italy. Contact him at [email protected]. James Agutter is an assistant research professor in the College of Architecture + Planning, University of Utah, and assistant director of CROMDI. His research interests include information visualization, human– computer interaction, user interface design, and technology transfer. Agutter has an MS in architecture from the University of Utah. Contact him at [email protected]. Yarden Livnat is a research scientist at the Scientific Computing and Imaging Institute at the University of Utah. His research interests include visual analytics with emphasis on situational awareness, scientific visualization, and software common components architecture. Livnat has a PhD in computer science from the University of Utah. Contact him at [email protected].

Robert Erbacher is an assistant professor in the Computer Science Department at Utah State University. His research interests include computer security, intrusion detection, computer forensics, data visualization, and computer graphics. Erbacher has an ScD in computer science from the University of Massachusetts-Lowell. Contact him at [email protected]. Shaun Moon is a research assistant at CROMDI and is pursuing an MS in computational design at Carnegie Mellon University. His research interests include communication design and information visualization. Moon has a BS in architectural studies from the University of Utah. He is a student member of the IEEE and the Information Architecture Institute. Contact him at [email protected].

