This article was published in an Elsevier journal. The attached copy is furnished to the author for non-commercial research and education use, including for instruction at the author’s institution, sharing with colleagues and providing to institution administration. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier’s archiving and manuscript policies are encouraged to visit: http://www.elsevier.com/copyright

Author's personal copy

Future Generation Computer Systems 24 (2008) 452–459 www.elsevier.com/locate/fgcs

A framework for distributed content-based web services notification in Grid systems☆

Andres Quiroz ∗, Manish Parashar

Department of Electrical and Computer Engineering, Rutgers University, 94 Brett Road, Piscataway, NJ 08854, United States

Received 1 March 2007; received in revised form 23 May 2007; accepted 10 July 2007. Available online 14 July 2007

Abstract

We describe a content-based distributed notification service for Publish/Subscribe communication that implements the WS-Notification (WSN) standards. The notification service is built on a messaging framework that supports an associative rendezvous messaging mechanism on a structured overlay of peer nodes. The entire system acts as a notification broker, so that notification publishers and subscribers outside the network can achieve loosely coupled communication through a decentralized, scalable service, by interacting with any of the broker peers. Self-optimizing mechanisms are built into the framework to reduce notification traffic within as well as from the peer network. We describe the framework and its performance and evaluate the effectiveness of the self-optimizing mechanisms.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Distributed systems; Web-based services; Network communication

☆ The research presented in this paper is supported in part by the National Science Foundation via grant numbers ACI 9984357, EIA 0103674, EIA 0120934, ANI 0335244, CNS 0305495, CNS 0426354 and IIS 0430826, and by the Department of Energy via grant number DE-FG02-06ER54857.
∗ Corresponding author. E-mail addresses: [email protected] (A. Quiroz), [email protected] (M. Parashar).
0167-739X/$ - see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.future.2007.07.001

1. Introduction

Web services have emerged as one of the key enabling technologies for Grid systems, providing platform-independent interactions between distributed applications and resources. The WS-Notification (WSN) specification [1] is a set of web service standards that define protocols for realizing the publish/subscribe communication pattern. Most existing implementations of this emerging standard (see Section 6) are essentially bindings to programming languages that include extensible APIs for application developers, and are thus purposefully generic. As such, they are not meant to address some important issues not regulated by the standard, such as mechanisms for the matching of publishers to subscribers, and the efficient and scalable management of subscriptions and routing of notifications. Of these issues, the former pertains to the degree of coupling between publishers and subscribers, while the latter deals with the distribution and interaction between the providers of the notification service.

With respect to these issues, the nature of pervasive Grid environments creates specific challenges. Grids are large-scale systems where potentially many publishers and subscribers (sensors, computation services, agents, etc.) need to exchange information. As a result, it is critical that the notification service be distributed, decentralized, and scalable. The degree of coupling between Grid elements can be characterized in the following way. Tight coupling implies that interacting elements have specific information about each other and are able to communicate directly and efficiently to achieve a given task or workflow. In general, however, the tasks and workflows that run on Grids are highly dynamic and may span application and organizational domains, requiring decoupled interactions without global synchronization of addresses or naming conventions. However, some tasks or workflows might still be more tightly coupled than others, given the physical or logical proximity (control) of computing elements.

This paper describes the design and implementation of a notification broker service for subscription management and notification dissemination targeting highly dynamic pervasive Grid environments that adopt the WSN standards.


A. Quiroz, M. Parashar / Future Generation Computer Systems 24 (2008) 452–459

A notification broker is an entity defined by the WS-BrokeredNotification (WSBrN) specification for mediating decoupled interactions between publishers and subscribers, but it is left up to particular implementations to determine how this mediation will take place. Our service design is based on a distributed and decentralized architecture that supports loosely coupled communication by semantic synchronization between communicating entities (i.e. a content-based [2] communication mechanism). Additionally, the notification service builds on a self-managing overlay that can handle coupling within and between separately communicating node clusters for enacting different kinds of tasks or workflows. To further support the efficiency and scalability of our approach, we design self-optimization mechanisms for reducing the number of messages transmitted by the system. These optimizations are meant to alleviate the overhead of notification flows. We show experimental data that demonstrates the effectiveness of the self-optimizing behavior.

The rest of the paper is organized as follows. Section 2 presents a brief overview of the WSN family of specifications. Section 3 then presents our system design and functionality, and Section 4 provides the details of our distributed framework. Section 5 then describes and evaluates the system’s self-optimizing mechanisms. We outline related work in Section 6 and finally present our conclusions and directions for further work in Section 7.

2. Overview of WSN

WSN was developed within the context of the WS-Resource Framework (WSRF), which describes how to implement OGSA capabilities using Web services [3]. The WSN specification consists of three interrelated standards: WS-BaseNotification (WSBN) [4], WS-BrokeredNotification (WSBrN) [5], and WS-Topics (WST) [6].
WSBN specifies the basic elements of the notification pattern: the NotificationConsumer (NC) that accepts notification messages, the NotificationProducer (NP) that formats and generates these messages, and the Subscription, a consumer-side initiated relation between a producer and a consumer. The only fixed field in a subscription is the consumer reference, which by itself implies the consumer’s interest in all of the notifications generated by the producer to which the subscription was made. Optionally, a subscription can contain a filter, specified as a FilterType element, that forces a producer to send only those notifications that pass (match) the filter. WSBN does not regulate the syntax or use of the FilterType element, but suggests three basic types: topic filters, message content filters, and producer properties filters. WSBN also regulates subscription management, which the consumer can perform given the reference it receives in response to a subscription. This reference is meant to contain enough information to enable it to contact and interact directly with the subscription as a resource, as defined by WSRF. In addition to the push-style pattern of notification, where producers send notifications directly to consumers, WSBN defines a pull-style pattern, where messages are stored at a predefined location (a pull-point) until they are retrieved by the consumer.
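As a rough illustration of the roles described above, the following sketch models a producer, a subscription with an optional filter, and a pull-point in plain Python. It is not the WSN message format (which is XML over SOAP) and not any existing WSN binding; the class and method names (`Producer`, `Subscription`, `PullPoint`, `get_messages`) are hypothetical and chosen only to mirror the vocabulary of the specification.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]
Filter = Callable[[Message], bool]


@dataclass
class PullPoint:
    """Pull-style delivery: notifications are stored until retrieved."""
    stored: List[Message] = field(default_factory=list)

    def notify(self, message: Message) -> None:
        self.stored.append(message)

    def get_messages(self) -> List[Message]:
        # Hand over everything accumulated so far and empty the store.
        messages, self.stored = self.stored, []
        return messages


@dataclass
class Subscription:
    consumer: Callable[[Message], None]      # the only mandatory field
    message_filter: Optional[Filter] = None  # no filter means "everything"


class Producer:
    """Push-style producer that honours each subscription's filter."""

    def __init__(self) -> None:
        self.subscriptions: List[Subscription] = []

    def subscribe(self, consumer, message_filter=None) -> Subscription:
        subscription = Subscription(consumer, message_filter)
        self.subscriptions.append(subscription)
        return subscription  # stands in for the subscription reference

    def publish(self, message: Message) -> None:
        for sub in self.subscriptions:
            if sub.message_filter is None or sub.message_filter(message):
                sub.consumer(message)
```

In this sketch a push-style consumer is any callable, while a pull-style consumer subscribes with `pull_point.notify` and later calls `get_messages()` to collect what has accumulated.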


WSBrN defines the NotificationBroker (NB) entity and its expected functionality. A notification broker is an intermediary Web service that decouples NCs from NPs [5]. A broker is capable of subscribing to notifications on behalf of consumers and of disseminating notifications on behalf of producers. Consumers and producers thus interact dynamically through the NB without the need for explicit knowledge of each other’s identities or locations. Management of this knowledge is delegated to the broker. An NB essentially implements the NC, NP, and other interfaces defined in WSBN. As a specific functionality, a notification broker can accept producer registrations, which are meant to realize the demand-based publishing pattern. Using this pattern, publishers avoid the (possibly expensive) task of creating notifications when no subscriptions (and, thus, no consumers) exist for them. To this end, an NP must register with the NB, providing a set of topics. When subscriptions are made that correspond to or include topics in a particular producer’s registration, the NB subscribes to the producer for those topics. Only then does the producer start sending notifications.

Finally, WST tries to standardize the way in which topics are defined, related, and expressed. It defines the notion of a topic space, where all of the topics for an application domain should be defined and organized, possibly in a hierarchical way. A topic expression is the representation of a particular topic or range of topics. The syntax of a topic expression is identified by the topic expression’s dialect. WST defines three dialects: Simple, Concrete, and Full. A simple topic expression is just the qualified topic name. A concrete topic expression is used for hierarchical topic spaces, and is given in a path notation, such as in myNamespace:news/tv/cnn. Here myNamespace identifies the topic space, and each of the subsequent identifiers belongs to a successively deeper level in the hierarchy.
A full topic expression is the same as a concrete expression, except that it uses special operators and wildcard sequences for spanning multiple topics within the topic hierarchy.

3. Distributed content-based notification broker

Our system, referred to henceforth as the notification service, is a distributed and decentralized notification broker. Each of the nodes within the system is a peer that implements the NotificationBroker interface, and an external client (producer or consumer) can interact with any of them. Thus, the whole system essentially acts as a single NB, as illustrated in Fig. 1. This is important because the interface is not a bottleneck, and the system has no single point of failure. Service providers participate by making nodes available to the notification system, and can in turn make use of the system through these nodes. A peer-to-peer design avoids the need for centralized control and gives the service providers the flexibility to join or leave the system at will.

Fig. 1. Layout of the broker network. Matching subscriptions and notifications will be routed to the same rendezvous node, which will perform the matching and relay the notification.

Through the NB interface, clients and brokers realize the message exchanges defined in the WSN specifications, using XML messages that represent subscriptions, notifications, etc. Topic expressions are used by the notification service as identifiers for these messages, and provide the means for matching between them. Unlike a purely topic-based system, such as WSN with topic filtering, topics in the notification service are meant to be content-based. However, unlike message content filtering as described in WSN, which requires parsing the entire payload content, the idea is to remain as close as possible to the notation and semantics defined in WS-Topics for topic expressions. This is meant both to support applications that implement this standard and to simplify content-based indexing.

Topics in the notification service are expressed using the same notation as in the Concrete dialect of WST (see Section 2); the difference is that, while normally the topic is an atomic unit that should correspond to an actual path in the topic space definition, here each of the identifiers is taken as a value from a separate dimension in a multidimensional information space, where only the type of each dimension is defined. Thus, the range of topics is limited only by the possible combinations of the values along each dimension.

To observe the difference, consider a weather monitoring application that subscribes to sensor data. In this example, the application may define the information space with three dimensions: geographic area, barometric pressure, and temperature. An application may define a content-based interest, such as weatherService:*,NJ/>25/<50, for which a matching content-based topic could be weatherService:Piscataway, NJ/29/48.9. Defining a hierarchical topic space to enable this type of subscription would not be practical, since individual topic identifiers would be needed for each geographic location, and, worse still, for each numeric value.

The system uses a rendezvous-based messaging model [7], in which matching messages “meet” at some node within the network, referred to as a rendezvous node.
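The dimension-wise matching in the weather example can be sketched as follows. This is only an illustration of the matching semantics, not the matching code of the notification service: the function names are hypothetical, string dimensions are compared with shell-style wildcards from Python's `fnmatch` module, numeric constraints are limited to the `>` and `<` forms that appear in the example, and spaces inside values are ignored so that `*,NJ` matches `Piscataway, NJ` (an assumption about how the example is meant to be read).

```python
from fnmatch import fnmatchcase


def parse_topic(expression):
    """Split 'namespace:v1/v2/...' into (namespace, [v1, v2, ...])."""
    namespace, _, path = expression.partition(":")
    # Assumption: whitespace inside a value is not significant.
    return namespace, [part.replace(" ", "") for part in path.split("/")]


def dimension_matches(pattern, value):
    """Match one dimension: '>x' / '<x' numeric bounds, else a wildcard."""
    if pattern.startswith((">", "<")):
        try:
            bound, number = float(pattern[1:]), float(value)
        except ValueError:
            return False
        return number > bound if pattern[0] == ">" else number < bound
    return fnmatchcase(value, pattern)


def topic_matches(interest, topic):
    """True if every dimension of the topic satisfies the interest."""
    ns_interest, dims_interest = parse_topic(interest)
    ns_topic, dims_topic = parse_topic(topic)
    return (
        ns_interest == ns_topic
        and len(dims_interest) == len(dims_topic)
        and all(dimension_matches(p, v)
                for p, v in zip(dims_interest, dims_topic))
    )
```

With these definitions, `topic_matches("weatherService:*,NJ/>25/<50", "weatherService:Piscataway, NJ/29/48.9")` holds, while the same interest rejects a topic whose pressure value is 20.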
The matching and routing of messages to service nodes is done by parsing topic expressions and mapping their constituent values to the node identifier space, as explained in the next section. The mapping used ensures that matching topics will be routed to at least one common rendezvous node. The messaging model also applies the concept of reactive behaviors, by which the behaviors at rendezvous nodes are determined by actions embedded in the message request (in this case, subscribe, notify, etc.). The application of this model to the main operations of WSN is described below.

Subscribe and notify: Both subscription and notification messages handled by the notification service must contain a topic expression within the FilterType element as explained in Section 2. Subscriptions are first received and routed to one or more rendezvous nodes (a subscription topic can contain wildcards or ranges, so that the subscription may span multiple topics, which may correspond to one or several nodes), which store each subscription under a unique identifier. Since notifications with matching topic expressions will be routed to an intersecting set of rendezvous nodes, matches can be determined at each node, which relays notifications to subscribers based on the contact data provided in each subscription message. Fig. 1 illustrates this rendezvous process.

Subscription management: As mentioned earlier, subscriptions are stored with a unique identifier at the rendezvous nodes. This identifier is returned to the subscriber when the subscription is completed, and can thus be used by the subscriber to realize the operations defined in the SubscriptionManager and PausableSubscriptionManager interfaces. Note that, since service nodes can enter and leave the system at any time, in which case stored subscriptions are offloaded to other nodes, the full topic expression must still be used for routing to the rendezvous node(s) currently holding the subscription. This differs from the use of endpoint references in WSRF, which are not suitable for this dynamic context.

Pull-style notification: Pull-style notification in the notification service is handled in much the same way as regular subscriptions.
The only difference is that, when a pull-point creation request is made to a broker, a message repository is also created at each rendezvous node where the subscription is stored. Notifications are stored in these repositories rather than being relayed directly to the consumers. Finally, when a consumer invokes the GetMessages command on a broker, the broker queries the network with the subscription reference to obtain the notifications stored at the repositories, constructs a single response with all of these notifications, and sends it back to the client.

Demand-based publishing: Publisher registration occurs in the notification service in exactly the same way as a subscription. The registration topic is used to route a registration message to a node or nodes in the network. In order to accommodate demand-based publishing, however, the procedure for a subscription detailed above must now include a query for publisher registrations that match (or, rather, overlap with) the subscription’s topic expression. If such registrations exist, then the NB that received the original subscription subscribes in turn to the producer(s) for the topic(s) given in the registration(s). Fig. 2 illustrates this mechanism.

Fig. 2. Publisher registration and subsequent subscription. Notice that a subscription and notification are not necessarily the same, but as long as they overlap at some node, the registration will be retrieved.

4. Framework components

The infrastructure that supports the notification service is made up of three layers, shown in the left side of Fig. 3. These layers support each of the requirements of the notification service as follows. First, the WS-Interface is responsible for parsing and formatting messages received and sent by the broker nodes, as per the WSN specification. Correspondingly, this layer also constructs and interprets the messages exchanged within the peer network. These messages are handled by the Rendezvous layer, so that matching topics are delivered to intersecting sets of nodes, as described in Section 3. This is ensured by the overlay layer and the content-based routing mechanism. The following sections give further details of the functionality and implementation of each of these layers.

Fig. 3. Architecture and behavior of the distributed system. An XML message is received by the WS interface, and the action and topic expression are extracted. The topic expression is mapped from its multidimensional space to a node ID, which is then used to route the message on the overlay to the rendezvous node, which must execute the corresponding action.

4.1. WS-interface

The Web Service interface for the notification broker was implemented in Java using the JWSDP 2.0 API and development tools. First, the XML schemas for base notification and brokered notification, provided in [1], were transformed into Java objects using the JAXB binding tools, modifying some optional elements to conform to both the Java platform and our own implementation architecture. After the Java objects were created from the schema, the WSDL documents were used to create the Java service interfaces and implementing classes, which were then deployed as a service endpoint for an NB. This endpoint can then be run on an Apache or similar web service container to receive and respond to client requests like the one shown in Fig. 3(1).

4.2. Rendezvous layer

The implementation of the rendezvous messaging model and reactive behaviors in terms of which the WSN operations were defined is based on the Meteor content-based communication middleware [8]. Meteor defines the exchange of messages of the form (header, action, data) (Fig. 3(2)), where header contains the keywords extracted from topic expressions and used for routing to the appropriate rendezvous node(s) (Section 4.3); action specifies the operation (subscribe, notify, etc.) to be carried out at the receiving node; and data holds the rest of the fields and payload associated with each WSN message.

4.3. Content-based routing

To support the content-based routing necessary to ensure that messages are delivered to peers based on topic expressions, we use the Squid routing engine [9]. Squid uses Hilbert Space Filling Curves (SFCs) [10] to realize the mapping between the multidimensional topic space and the one-dimensional identifier space used by peers in the overlay. Fig. 3 shows how a topic in the multidimensional space is mapped to a node identifier. Notice that, with this mapping, continuous ranges in the identifier space correspond to continuous regions in the multidimensional space. This enables Squid to efficiently handle queries with partial keywords, wildcards, and ranges, as these queries map to a reduced number of peer nodes. Notice also that matching topics, corresponding to intersecting regions in the multidimensional space, will map to intersecting ranges in the ID space, which ensures that matching topics will “meet” at rendezvous nodes. SFCs are discrete mappings, so that continuous data types used for any of the dimensions must be discretized by rounding or truncating values to a fixed precision. Squid’s routing mechanism ensures that queries will be resolved with bounded costs in terms of number of messages and number of nodes involved. Both Squid and Meteor (which builds on Squid) have been evaluated on a university-wide Grid, as well as on PlanetLab, in terms of these performance bounds as well as end-to-end latency and load distribution and balancing (see [9,8,11]).

4.4. Network overlay

Our overlay design is based on the Chord overlay [12]. Chord has a ring-based topology, where every node is randomly assigned an identifier in the range [0, 2^m). As nodes dynamically arrive and depart, they self-organize according to their identifier to form the ring. Nodes join by contacting any other node already in the overlay, from which they obtain references to their successor and predecessor in the ring, as well as a number of extra references (chords or fingers) to allow for faster traversal of the ring (O(log n) hops). References are kept in a data structure of size m called a finger table. Successor and predecessor references are enough for successful routing in Chord.

We have implemented a two-level overlay design, in which nodes are organized into independent Chord rings (clusters) based on physical or logical proximity or locality, so that most interactions take place within these clusters. For each cluster, nodes’ finger tables are no different than in a single-level Chord implementation. However, each node additionally keeps a reference to its successor in each of the other clusters, which can be done by querying any node in a remote cluster as in a regular join operation. The result of this design is a fully connected higher level overlay at relatively small cost (assuming the number of individual clusters is much smaller than the number of nodes), where any node can be used as a link between clusters. Thus, there are no specialized nodes or converging paths as potential bottlenecks or single points of failure. Because the successor–predecessor relationship is maintained for nodes between clusters, routing between clusters is expected to take a better than average number of hops, where the average case corresponds to starting a search from an arbitrary node in the remote cluster [13]. This is because a query will already be closer to the destination node when it enters the remote cluster than if an arbitrary node were used. Inter-cluster routing in the two-level overlay requires an extra parameter that indicates the cluster within which the search is to be carried out.
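To make the path from a topic to a rendezvous node in Sections 4.3 and 4.4 more concrete, the following sketch maps a discretized two-dimensional topic to a Hilbert curve index and then to the responsible node on a ring. It is a simplified illustration under several assumptions: Squid works in n dimensions, whereas the sketch uses the standard two-dimensional Hilbert mapping for brevity; the cluster names and node identifiers are made up; and the successor is found by a lookup in a sorted list rather than by Chord's distributed finger-table routing.

```python
from bisect import bisect_left


def hilbert_index(order, x, y):
    """Position of cell (x, y) on a Hilbert curve covering a
    2**order x 2**order grid (standard iterative 2-D algorithm)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def successor(ring, key):
    """First node ID >= key on a sorted ring, wrapping past the largest ID."""
    return ring[bisect_left(ring, key) % len(ring)]


def rendezvous_node(clusters, cluster, order, x, y):
    """Node responsible for the discretized topic (x, y) in one cluster."""
    return successor(clusters[cluster], hilbert_index(order, x, y))


# Hypothetical clusters with node IDs in [0, 2**6), i.e. order = 3.
clusters = {"A": [5, 20, 41, 60], "B": [12, 33, 50]}
```

Because neighbouring cells receive neighbouring indices, a range along one dimension maps to a small number of contiguous index ranges, which is what keeps the set of rendezvous nodes for a wildcard or range query small. The `cluster` argument of `rendezvous_node` plays the role of the extra routing parameter that selects the cluster to search.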
Two wildcard values are accepted for the cluster parameter: ANY, which is used to route to a node on any cluster, starting with the local cluster, and ALL, which is used to route to nodes on all clusters. Both values first initiate routing normally to a node on the local cluster. For ANY, a query is propagated further only if it cannot be resolved on that node. For ALL, the query is sent in parallel to every other cluster. Note that in both cases, because the access point to remote clusters is the successor of the responsible rendezvous node in the local cluster, routing in remote clusters is expected to be resolved in a better than average number of hops, as explained above.

5. Self-optimization mechanisms

The number of messages sent within the system can be reduced at the notification level, which is important because any reduction in the number of messages leads to a reduction in the overhead involved in packaging and delivering each individual message, and to an improvement in scalability. In the case of Web Services, this overhead is incurred mainly by XML and SOAP headers. In addition, the messaging within the JXTA framework that supports our current overlay implementation also adds considerable overhead. To see how much bandwidth is actually consumed by overhead in one implementation, a sniffer program was used to capture the packet flows between the nodes in the network for notifications. For messages between network nodes, the combined overhead of XML and JXTA for each message is just over 3.5 KB, which amounts to about 28 Kbps in a message flow of one message per second. The following mechanisms are used by the system to reduce the number of individual messages in the network.

5.1. Grouping of notifications by buffering

This optimization is meant to reduce the flows of small and frequent notifications. A simple way to deal with these notification flows is to buffer and group several notifications within a single notification message, a mechanism which is allowed by the WSN XML schema. This way, the headers that would have been transmitted with every individual message are reduced to a single header on a grouped message. Determining when and how many messages to buffer, however, depends on several factors, and thus it is worthwhile to equip the system with logic that allows it to autonomously determine the most appropriate level of message aggregation based on high-level constraints.

Without application-specific considerations, messages can be grouped based on two criteria. The first is on messages that correspond to the same topic, and the second is on messages that match the same subscription. These criteria are not necessarily the same, since, depending on how broad a subscription is made (with wildcards or ranges), several different topics may match a single subscription. The system can benefit from applying both criteria, since grouping based on topic equality can be done when messages enter the system at an interface node, which does not necessarily know about subscriptions for that topic, and then subscription-based grouping can be determined at the rendezvous nodes. The mechanism, however, is the same in both cases, so we will describe grouping based on topic equality. The mechanism for grouping and packaging of notifications is as follows.
Each interface node keeps a separate buffer of messages for every topic it receives (garbage collection can be employed to eliminate buffers for which no messages arrive for a period of time). Each buffer is configurable by setting the length of the period during which messages are accumulated. This buffering level is determined by managers associated with each buffer, the design of which is described below.

If the buffering period is determined only with respect to bandwidth utilization (the number of messages), then the solution is trivial because a higher buffering level (more messages grouped together) always increases the saving achieved. If a limit is set on the buffering period, according to the maximum latency allowed for each individual message, then the solution would always be set to this limit. However, a more balanced solution should consider the tradeoff between bandwidth utilization and message latency. An optimal point can be found between a buffering period of zero (minimum latency, maximal bandwidth consumption) and one equal to the maximum allowed latency (highest buffering level, minimal bandwidth consumption). This is the range used in Eq. (1) below, although the reciprocal of the incoming rate is used as a lower bound instead of zero (any period set smaller than the incoming message period would result in no buffering). Instead of manually assigning a weight to each extreme, a dynamic solution is determined based on the relative size of the payload with respect to the total message size (Eq. (2) below). The rationale behind this is that the relative saving in bandwidth is greater for small messages because the overhead constitutes a larger fraction of the total data sent, whereas for large messages the overhead becomes relatively insignificant. In the former case, there is greater payoff for sacrificing latency, and thus buffering should have a larger weight. For the latter case, the reverse is true. Finally, the period is calculated by obtaining a value within the range determined by the weight, using Eq. (3). If the incoming rate is very low, with a period higher than the maximum latency, then Eq. (3) is not used, and rather the period is set directly to zero.

\[ \mathit{range} = \mathit{maxLatency} - \mathit{avgIncomingRate}^{-1} \tag{1} \]

\[ \mathit{weight} = \frac{\mathit{avgPayload}}{\mathit{overhead} + \mathit{avgPayload}} \tag{2} \]

\[ \mathit{period} = \mathit{maxLatency} - \mathit{weight} \times \mathit{range} \tag{3} \]
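Eqs. (1)–(3) translate directly into a few lines of code. The sketch below is one straightforward reading of the equations and of the low-rate special case described in the text; the function and parameter names are illustrative, and the overhead value in the usage note (3.5 KB taken as 3584 B) is an assumption rather than a figure reported for the experiments.

```python
def buffering_period(max_latency, avg_incoming_rate, avg_payload, overhead):
    """Buffering period in seconds, following Eqs. (1)-(3).

    max_latency       -- maximum latency allowed per message (s)
    avg_incoming_rate -- average incoming rate (messages/s)
    avg_payload       -- average payload size (bytes)
    overhead          -- per-message header overhead (bytes)
    """
    if avg_incoming_rate <= 0:
        return 0.0  # nothing arriving, so nothing to buffer
    incoming_period = 1.0 / avg_incoming_rate
    if incoming_period > max_latency:
        # Rate too low for buffering to help: Eq. (3) is not used.
        return 0.0
    value_range = max_latency - incoming_period        # Eq. (1)
    weight = avg_payload / (overhead + avg_payload)     # Eq. (2)
    return max_latency - weight * value_range           # Eq. (3)
```

For instance, `buffering_period(1.0, 20, 10_000, 3584)` yields a period well below the 1 s limit because the payload dominates the message, whereas a 10 B payload at the same rate yields a period close to the full second. In every case where Eq. (3) applies, the result lies between the incoming message period and the maximum latency.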

As a proof-of-concept, experiments were conducted for single message flows of different incoming rates and payload sizes. The maximum latency allowed for messages at each node is 1 s. The buffering period, as well as the actual groups of messages transmitted by the system, were observed to obtain the overhead bandwidth consumption. Fig. 4 plots the results. Notice that savings in bandwidth utilization are substantial, even though buffering periods are distributed within the range of allowable latencies. The lowest buffering period set in this case is 373 ms, for a message rate of 20 messages per second and 10 000 B per message.

For irregular notification flows, possibly originating from several producers publishing notifications on the same topic at different time intervals, several complications are possible, such as short bursts of notifications at high rates, high variability in the incoming rate, and concurrency. A number of mechanisms were used to reduce the sensitivity of the system to these conditions. To emphasize the self-managing aspect of the system, the use of fixed low level parameters was avoided. For example, instead of using a fixed threshold for the minimum change in the buffering period, the threshold is calculated dynamically based on whether or not the change in buffering would cause at least one message more or less to be buffered at the current estimated incoming message rate. This new parameter (the change in the number of messages for which a change in period is allowed) is at a higher level and is more meaningful than the period alone.

To test the behavior of the system under these conditions, an interface node was set to receive messages with the same topic from 32 different producers, each one of which sent messages of random payload size between 10 and 500 B at random intervals of up to 5 s. The combined effect of these notifications produces a high message rate, with high variability. Fig. 5 shows the changes in the buffering period during the time of the test.

Fig. 4. Results for buffering with different incoming rates and payload sizes. Lines correspond to the different payload sizes, bracketed by the minimum and maximum values for each measure. Top: Overhead bandwidth, grows with payload size. Bottom: Buffering period set, decreases with payload size.

Fig. 5. Change in the buffering period with highly variable incoming periods.
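The dynamic threshold on period changes described in Section 5.1 can be read as a simple test on the expected change in the number of buffered messages. The sketch below is one possible interpretation, not the manager logic used in the experiments; the function name and the `min_message_change` parameter are hypothetical.

```python
def should_update_period(current_period, proposed_period,
                         estimated_rate, min_message_change=1):
    """Accept a new buffering period only if, at the current estimated
    incoming rate (messages/s), it would change the number of buffered
    messages by at least `min_message_change`."""
    change_in_messages = abs(proposed_period - current_period) * estimated_rate
    return change_in_messages >= min_message_change
```

At 20 messages per second, for example, a change from 0.50 s to 0.52 s affects less than one message and would be ignored, while a change to 0.60 s affects about two messages and would be applied.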

5.2. Demand-based notification relay

Ideally, notifications should not be sent if no subscribers exist for them. Demand-based publishing, explained in Section 3, is WSN’s provision for dealing with this issue. However, demand-based publishing depends on producers registering their topics with the notification broker, which particular publishers may choose not to do, or may not be able to do if they do not implement the NP interface. To further optimize messaging, the system implements a mechanism that is similar to demand-based publishing but is based on the topics of individual notifications. The idea is that interface nodes should determine when not to relay notifications to rendezvous nodes based on the existing subscriptions.

Unlike publisher registrations, which define beforehand the topics that will be produced, an interface node has no way of knowing which topics it will have to handle. Registering for every topic received would also be inefficient. The mechanism is therefore implemented as follows. Each interface node keeps subscription caches associated with particular topics. If no cache is associated with a particular topic when a notification for it is received, the interface node queries the network for subscriptions for that topic. If any are found, they are placed in the subscription cache; otherwise, the cache is marked as empty. Subsequent notifications with the same topic are relayed only if the corresponding subscription cache is not empty. To avoid making a query for every topic received, locality is exploited by checking a topic against all cached subscriptions. New queries are made only if no matching subscriptions exist in these caches (note that for these topics, the notification is relayed in any case).

Meanwhile, at the rendezvous nodes that responded to the query, a temporary registration is kept of the interface nodes and their corresponding queries. This ensures that if a subscription did not exist at the time of the query, a matching subscription made thereafter can be made known to the interested nodes, so that a notification for which a subscription exists is never dropped. The same happens for the cancellation of subscriptions. Because these registrations are potentially more numerous than publisher registrations, they are not kept indefinitely but are deleted once they are used. Thus, interface nodes must requery the network once a cache for a particular topic becomes empty. The overhead of this mechanism for each topic consists of the query and its corresponding response (two messages), plus one message per update of a subscription or its cancellation. New queries are triggered only after cancellations.
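The relay decision at an interface node can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, subscriptions are modeled as simple wildcard topic patterns matched with Python's `fnmatch`, and the network query and the relay to rendezvous nodes are passed in as plain callables rather than overlay messages.

```python
from fnmatch import fnmatchcase
from typing import Callable, Dict, Iterable, Set


class InterfaceNode:
    """Hypothetical interface node that keeps per-topic subscription caches."""

    def __init__(self,
                 query_network: Callable[[str], Iterable[str]],
                 relay: Callable[[str, str], None]) -> None:
        self.query_network = query_network  # topic -> matching subscription patterns
        self.relay = relay                  # forwards (topic, payload) to rendezvous nodes
        self.caches: Dict[str, Set[str]] = {}

    def _matches_any_cached(self, topic: str) -> bool:
        # Locality: test the topic against every subscription already cached.
        return any(fnmatchcase(topic, pattern)
                   for subs in self.caches.values() for pattern in subs)

    def on_notification(self, topic: str, payload: str) -> bool:
        """Relay or drop a notification; returns True if it was relayed."""
        if topic in self.caches:
            if not self.caches[topic]:
                return False  # cache marked empty: no known subscribers
            self.relay(topic, payload)
            return True
        if not self._matches_any_cached(topic):
            # No cached evidence for this topic: query once and cache the answer.
            self.caches[topic] = set(self.query_network(topic))
        # A notification for a topic without its own cache is relayed in any case.
        self.relay(topic, payload)
        return True

    def on_subscription_update(self, topic: str, pattern: str,
                               cancelled: bool) -> None:
        """Apply an update pushed by a rendezvous node that holds a registration."""
        subs = self.caches.get(topic)
        if subs is None:
            return
        if cancelled:
            subs.discard(pattern)
            if not subs:
                del self.caches[topic]  # forces a requery on the next notification
        else:
            subs.add(pattern)
```

In this sketch, removing a cache when its last subscription is cancelled stands in for the requery described above, while a cache left empty by a query keeps suppressing notifications until a rendezvous node reports a new subscription.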
The overhead of this mechanism is therefore small, and it is easily recovered when large notification flows are not relayed while no subscriptions exist, unless the rates of subscriptions and cancellations are of the same order as the rate of notifications.

6. Related work

To date, there are several implementations of WSN, including Apache’s Pubscribe [14], for Java; WSRF.NET from the University of Virginia’s Grid Computing Group [15], for Microsoft’s development platform; pyGridWare [16], a Python-based implementation; and GT4 from the Globus Toolkit [17], with bindings for both Java and C. Apache’s project is derived from GT4-Java. The primary focus of these implementations is WSRF and, as a result, they provide different levels of functionality for WSN. For example, the pyGridWare and GT4 implementations of WSN are meant primarily for providing notifications about the state of resource properties. Pubscribe extends these capabilities and fully supports both WS-BaseNotification and WS-Topics, but does not implement WS-BrokeredNotification. WSRF.NET, which was developed using ASP.NET and the IIS infrastructure, supports all of the specifications. A thorough comparison of these implementations can be found in [18].

The above implementations are meant to be development tools, providing technology-specific bindings of the standards and extensible APIs. Like the standards themselves, they do not address the issues that arise when actually composing systems that make use of the notification protocols and standards, such as service discovery and the efficient and scalable routing of requests and messages. These issues have been addressed in the context of messaging infrastructures such as the Enterprise Service Bus (ESB) architecture [19,20], which mediates the interactions of different web services, including WSN service implementations, by service virtualization. An ESB hides implementation and location details of the services that register with it and is capable of spanning wide area networks and involving multiple infrastructure servers. However, as its name suggests, an ESB is an enterprise-level solution, and there is no reference implementation for it.

NaradaBrokering [21] is a distributed middleware framework that supports peer-to-peer systems and message-oriented interactions. It implements a variety of messaging technologies, and an implementation of WSN is currently in progress. The objectives and approach of the NaradaBrokering framework are essentially the same as those that underlie the present work: it manages a network of brokers through which end systems can interact, providing scalability, location independence, and efficient routing. The difference is that Narada brokers are organized in a hierarchical structure, which must be maintained through tighter coupling and control mechanisms that do not allow uncontrolled connections and disconnections.

Content-based publish/subscribe over Distributed Hash Tables (DHTs) is a topic on which there is much current work. DHT functionality is usually built using some sort of structured overlay network, the most popular of which are Chord [12], used here, Pastry [22], and CAN [23], because they provide scalability, search guarantees and bounds on messaging within the network, as well as some degree of self-management and fault tolerance with respect to the addition and removal of nodes. With this foundation, designing content-based publish/subscribe systems requires an efficient mapping between content descriptors and nodes in the overlay network, as well as efficient techniques for routing and matching based on these content descriptors, which can contain wildcards and ranges for complex queries. The work in [24–27] addresses these issues to some extent. Meteor and Squid differ from these approaches mainly in the locality-preserving mapping used. The Meghdoot system [27] also uses a locality-preserving mapping, but the multidimensional address space used by its overlay, CAN, is costlier to maintain than Chord’s one-dimensional overlay. To our knowledge, none of these systems has yet been used to implement the WSN standards. Of these systems, [24] also proposes buffering of messages, but it does so statically by setting the buffering level as a multiple of the incoming period.

Another messaging optimization designed for web services message exchanges is described in [28]. Here, minimum overhead is achieved by setting up headerless message streams from messages with common headers. These headers are instead stored separately in a context-store, implemented according to the WS-Context
standard, and used to disambiguate the messages in the streams. While this solution is optimal in terms of overhead, the management of the context-store adds to system complexity and, more importantly, streams have to be known and set up a priori, which is contrary to the dynamicity and burstiness of notification traffic.

7. Conclusion

We have described the implementation of a distributed content-based notification broker service for WS-Notification in the context of large-scale, dynamic Grid environments. The issues of scalability and dynamism are addressed by our system design and by our implementation, which is based on a scalable and self-managing underlying infrastructure. We also described and evaluated self-optimization mechanisms to reduce the number of notification messages transmitted within the network. The two-level overlay organization of the messaging framework can support different degrees of coupling within and between physically or logically organized clusterings of Grid elements that execute tasks or workflows. In this sense, remote communication is highly decoupled and event-driven, while local communication, although more tightly coupled, must still be created dynamically and requires content-based interactions to deal with this dynamism.

This system is yet to be tested in a WAN setting with real application data. As described in [18], compatibility with other implementations must be ascertained due to differences in the tools and development platforms used. We are currently developing a more lightweight overlay implementation that does not depend on JXTA. Other optimization mechanisms, as well as improvements to those presented here, can also be explored.

References

[1] OASIS WSN Technical Committee. http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsn.
[2] P.T. Eugster, P.A. Felber, R. Guerraoui, A.-M. Kermarrec, The many faces of publish/subscribe, ACM Computing Surveys 35 (2) (2003) 114–131.
[3] Globus Alliance OGSA webpage. http://www.globus.org/ogsa/.
[4] S. Graham, D. Hull, B. Murray, Web Services Base Notification 1.3 (WS-BaseNotification), Oasis Public Review Draft 01, July 2005.
[5] D. Chappell, L. Liu, Web Services Brokered Notification 1.3 (WS-BrokeredNotification), Oasis Public Review Draft 01, July 2005.
[6] W. Vambenepe, Web Services Topics 1.3 (WS-Topics), Oasis Public Review Draft 01, December 2005.
[7] I. Stoica, et al., Internet indirection infrastructure, in: Proceedings of ACM SIGCOMM, Pittsburgh, PA, 2002, pp. 73–86.
[8] N. Jiang, C. Schmidt, V. Matossian, M. Parashar, Enabling applications in sensor-based pervasive environments, in: Basenets 2004, San Jose, CA, 2004.
[9] C. Schmidt, M. Parashar, Flexible information discovery in decentralized distributed systems, in: 12th IEEE International Symposium on High Performance Distributed Computing, HPDC-12’03, 2003.
[10] H. Sagan, Space-Filling Curves, Springer-Verlag, 1994.
[11] N. Jiang, C. Schmidt, M. Parashar, A decentralized content-based aggregation service for pervasive environments, in: Proceedings of the International Conference of Pervasive Services, ICPS, June 2006.
[12] I. Stoica, et al., Chord: A scalable peer-to-peer lookup service for internet applications, in: Proceedings of ACM SIGCOMM, San Diego, CA, 2001, pp. 149–160.
[13] A. Quiroz, Two-level structured overlay design for cluster management in peer-to-peer networks, Technical Report TR-275, CAIP Center, Rutgers University, 2006.
[14] Apache Pubscribe project home. http://ws.apache.org/pubscribe/.
[15] WSRF.NET project homepage. http://www.cs.virginia.edu/gsw2c/wsrf.net.html.
[16] pyGridWare project homepage. http://dsd.lbl.gov/gtg/projects/pyGridWare/.
[17] GT4 tutorial. http://gdp.globus.org/gt4-tutorial/multiplehtml/index.html.
[18] M. Humphrey, et al., State and events for web services: A comparison of five WS-Resource framework and WS-notification implementations, in: 14th IEEE International Symposium on High Performance Distributed Computing, HPDC-14, Research Triangle Park, NC, 2005, pp. 24–27.
[19] P. Niblett, S. Graham, Events and service-oriented architecture: The OASIS web services notification specifications, IBM Systems Journal 44 (4) (2005) 869–886.
[20] M.T. Schmidt, B. Hutchison, P. Lambros, R. Phippen, The enterprise service bus: Making service-oriented architecture real, IBM Systems Journal 44 (4) (2005) 781–797.
[21] S. Pallickara, G. Fox, NaradaBrokering: A middleware framework and architecture for enabling durable peer-to-peer grids, in: Proceedings of ACM/IFIP/USENIX International Middleware Conference, 2003, pp. 41–61.
[22] A. Rowstron, P. Druschel, Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems, in: Proceedings of IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), Heidelberg, Germany, 2001, pp. 329–350.
[23] S. Ratnasamy, et al., A scalable content-addressable network, in: Proceedings of ACM SIGCOMM, San Diego, CA, 2001, pp. 161–172.
[24] R. Baldoni, C. Marchetti, A. Virgillito, R. Vitenberg, Content-based publish-subscribe over structured overlay networks, in: Proceedings of the 25th International Conference on Distributed Computing Systems, ICDCS ’05, Columbus, OH, June 2005.
[25] D. Tam, R. Azimi, H.-A. Jacobsen, Building Content-based Publish/Subscribe Systems with Distributed Hash Tables, in: Lecture Notes in Computer Science, vol. 2944, 2004, pp. 138–152.
[26] I. Aekaterinidis, P. Triantafillou, Internet scale string attribute publish/subscribe data networks, in: Proceedings of the ACM 14th Conference on Information and Knowledge Management, CIKM, Bremen, Germany, October 2005.
[27] A. Gupta, O.D. Sahin, D. Agrawal, A.E. Abbadi, Meghdoot: Content-Based Publish/Subscribe over P2P Networks, in: Lecture Notes in Computer Science, vol. 3231, 2004, pp. 254–273.
[28] S. Oh, G. Fox, Optimizing web service messaging performance in mobile computing, Future Generation Computer Systems 23 (4) (2007) 623–632.

Andres Quiroz is a Ph.D. student of Computer Engineering at Rutgers University. He obtained an M.S., also in Computer Engineering, from Rutgers in 2007, and a B.S. in Systems Engineering at Eafit University, Colombia, in 2004. His research interests are in the area of distributed systems, currently in middleware for peer-to-peer, grid, and autonomic computing.

Manish Parashar is Professor of Electrical and Computer Engineering at Rutgers University, where he is also co-director of the Center for Advanced Information Processing (CAIP) and director of the Applied Software Systems Laboratory (TASSL). He received a B.E. degree in Electronics and Telecommunications from Bombay University, India, and M.S. and Ph.D. degrees in Computer Engineering from Syracuse University. He has received the Rutgers Board of Trustees Award for Excellence in Research (2004–2005), the NSF CAREER Award (1999) and the Enrico Fermi Scholarship from Argonne National Laboratory (1996). His research interests include autonomic computing, parallel and distributed computing, scientific computing, and software engineering.
