Jennifer Macaulay Subject Review Paper September 17, 2006

System Interoperability

Information systems analysis and design is a complex area of study with many subtopics. One critical subtopic is interoperability. Today, most institutions run several disparate systems that need to communicate with each other and pass data back and forth seamlessly. In a 2004 article in ACM Transactions on Information Systems entitled “Information Systems Interoperability: What Lies Beneath?”, Jinsoo Park and Sudha Ram discuss interoperability and propose a prototype that lets heterogeneous systems interact so that users can query them at the same time.

Data integration is an important concept for those who work with information systems. Many institutions have heterogeneous systems whose data stores exist in multiple, incompatible formats. To leverage existing information structures, it is important to find ways for these disparate systems to communicate effectively. Park and Ram write “. . . establishing semantic interoperability among heterogeneous and distributed information sources has been a critical issue attracting significant attention from research and practice” (Park and Ram, 2004). Semantic interoperability is an issue because of the various ways in which databases and systems store information. Separate databases often contain the same fields, but the fields may be named differently or the data may be stored in different formats. Dates are a good example of data that is often stored in different formats: sometimes as mm-dd-yy, mm-dd-yyyy or dd-mm-yyyy, and sometimes written out as month, day, year. In order for systems to communicate with each other successfully, they need to be aware of these types of incompatibilities.
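The date example can be made concrete with a short Python sketch. It is only an illustration of the incompatibility described above, not something taken from Park and Ram's article; the source names and the format assigned to each source are invented for the example.

```python
from datetime import date, datetime

# Hypothetical sources and the date format each one is assumed to use.
SOURCE_DATE_FORMATS = {
    "circulation_db": "%m-%d-%y",   # mm-dd-yy
    "billing_db": "%d-%m-%Y",       # dd-mm-yyyy
    "archive_db": "%B %d, %Y",      # written out: month day, year
}


def to_common_date(source: str, raw: str) -> date:
    """Parse a source-specific date string into one shared representation."""
    return datetime.strptime(raw, SOURCE_DATE_FORMATS[source]).date()


# The same day, as three different systems might store it.
same_day = [
    to_common_date("circulation_db", "09-17-06"),
    to_common_date("billing_db", "17-09-2006"),
    to_common_date("archive_db", "September 17, 2006"),
]
```

Once each system's format is known, all three strings resolve to the same date. Without that knowledge, a string such as 09-10-06 cannot be interpreted reliably.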

Before Park and Ram propose their prototype for designing an interoperability schema, they discuss the existing approaches to semantic interoperability. “Previous research in semantic interoperability can be categorized into three broad areas: mapping-based, intermediary-based and query-oriented approaches” (Park and Ram, 2004). In the mapping-based approach, schemas are developed to construct one-to-one mappings between semantically similar fields in separate systems. This is accomplished by developing a federated (or global) schema to establish the mappings between local systems. The biggest drawback to this approach is that the federated schema is tied to the local systems that it is developed to map; it is not independent and cannot easily be used to map other disparate systems. The intermediary-based approach depends upon intermediaries, or mediators, to allow for semantic interoperability. The mediators often consist of ontologies which allow for standardized vocabularies and protocols by which systems communicate and pass data. While this approach is theoretically possible, it is not feasible in reality. The intermediary-based approach creates shared ontologies at the domain level, but not at the local level, which makes the ontologies infeasible to maintain given the dynamic and complex nature of most local systems. The third approach is the query-oriented approach, which is based upon interoperable languages. The idea is to develop queries that can span the different systems. The biggest drawback to this approach is that users need to understand the differences in the systems being queried. The end user is responsible for solving semantic conflicts.
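The mapping-based approach and its main drawback can be illustrated with a minimal Python sketch. The system names, field names and the to_federated helper are all hypothetical; the article does not prescribe this code.

```python
# Hypothetical one-to-one mappings from local field names to federated names.
FIELD_MAPPINGS = {
    "library_db": {"patron_name": "name", "dob": "birth_date"},
    "registrar_db": {"student_nm": "name", "birthdate": "birth_date"},
}


def to_federated(source: str, record: dict) -> dict:
    """Rename a local record's fields to the federated schema's names."""
    # A system that was not mapped in advance raises KeyError here, which is
    # the lack of independence that the mapping-based approach suffers from.
    mapping = FIELD_MAPPINGS[source]
    return {mapping[f]: value for f, value in record.items() if f in mapping}


library_row = to_federated("library_db", {"patron_name": "Ada", "dob": "1990-01-01"})
registrar_row = to_federated("registrar_db", {"student_nm": "Ada", "birthdate": "1990-01-01"})
```

Both records come out with the same field names, but only because each system was mapped by hand. Adding a third system means writing a third mapping.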

In order to explain how they intend to resolve semantic conflicts, Park and Ram next discuss the types of semantic conflicts that exist. Such conflicts may exist at the data level or the schema level. Conflicts at the data level are differences in data caused by different representations or interpretations of similar data. These conflicts may be data-value conflicts (the values may mean different things), data representation conflicts (the same data is represented in different ways), data-unit conflicts (miles versus kilometers), and data precision conflicts (differences in rankings, etc.). Schema-level conflicts, on the other hand, are caused by actual structural differences or inconsistencies in the different systems or databases. Examples of schema-level conflicts are naming conflicts (the same fields have different names in different databases), entity-identifier conflicts (different primary keys are assigned to the same concepts in different databases), schema-isomorphism conflicts (the same concept is described by different attributes), generalization conflicts (the same type of data is modeled differently), aggregation conflicts (data that is aggregated in one system is represented in or corresponds to data in another system), and schematic discrepancies (conflicts in the logical structures of data or databases). Those who design systems or create architectures must take all of these types of semantic conflicts into account when attempting to create system interoperability.
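Two of these conflict types, one at the schema level and one at the data level, can be sketched in Python. The FieldInfo type and the detect_conflicts and miles_to_km functions are assumptions made for this illustration only and cover just the naming and data-unit cases.

```python
from dataclasses import dataclass

KM_PER_MILE = 1.609344  # exact definition of the international mile


@dataclass(frozen=True)
class FieldInfo:
    """Hypothetical description of one field in a local database."""
    name: str  # local field name
    unit: str  # unit in which the values are stored


def detect_conflicts(a: FieldInfo, b: FieldInfo) -> list[str]:
    """Label the conflicts between two fields that hold the same concept."""
    conflicts = []
    if a.name != b.name:
        conflicts.append("naming conflict")     # schema level
    if a.unit != b.unit:
        conflicts.append("data-unit conflict")  # data level
    return conflicts


def miles_to_km(miles: float) -> float:
    """Resolve a miles-versus-kilometers conflict by converting to kilometers."""
    return miles * KM_PER_MILE


found = detect_conflicts(FieldInfo("distance", "mi"), FieldInfo("dist_km", "km"))
```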

Given the problems with the three existing approaches to semantic interoperability (mapping-based, intermediary-based and query-oriented), the authors develop a generic approach that they label CREAM (Conflict Resolution Environment for Autonomous Mediation). Park and Ram explain that “CREAM . . . allows users to access a large number of autonomous information systems without prior knowledge of their information content” (Park and Ram, 2004). In order for CREAM to work, a local schema must be defined for each local database or system. This local schema describes the data organization in the database in a very detailed manner. Additionally, a federated or global schema must be constructed using the same type of semantic data model used to create all of the local schemas. The federated schema must then be mapped to the local schemas, and all fields that contain similar data must be mapped to each other. This mapping is the schema mapping knowledge, which is the key to the entire CREAM architecture. The schema mapping knowledge is also used to generate valid local queries that each database can understand.
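The idea of generating local queries from schema mapping knowledge can be sketched as follows. This is a simplified illustration and not the CREAM implementation; the table names, column names and the local_queries function are hypothetical.

```python
# Hypothetical schema mapping knowledge: for each local system, the table to
# query and the local column that corresponds to each federated field.
SCHEMA_MAPPING = {
    "library_db": {
        "table": "patrons",
        "columns": {"name": "patron_name", "birth_date": "dob"},
    },
    "registrar_db": {
        "table": "students",
        "columns": {"name": "student_nm", "birth_date": "birthdate"},
    },
}


def local_queries(federated_fields: list[str]) -> dict[str, str]:
    """Translate one federated field list into one SQL query per local system."""
    queries = {}
    for source, mapping in SCHEMA_MAPPING.items():
        # Alias every local column back to its federated name.
        selected = [f"{mapping['columns'][f]} AS {f}" for f in federated_fields]
        queries[source] = "SELECT " + ", ".join(selected) + " FROM " + mapping["table"]
    return queries


queries = local_queries(["name", "birth_date"])
```

The user asks for federated fields only; the mapping supplies the local table and column names, so the user does not need to know how each database is organized.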

The authors developed SCROL (Semantic Conflict Resolution Ontology) in order to provide a dynamic method for “comparing and manipulating contextual knowledge about each information source, which is useful in facilitating semantic interoperability” (Park and Ram, 2004). In the CREAM approach to interoperability, it is SCROL that actually resolves the semantic conflicts that arise during queries. SCROL distinguishes between concepts and instances, both of which are nodes on a SCROL tree. This tree structure, along with the ontology relationship knowledge, is the key to semantic conflict resolution in Park and Ram’s generic approach to system interoperability. The ontology relationship knowledge is the actual reasoning process for semantic conflict resolution, which is based upon the local and federated mapping schemes created previously. When queries are executed in Park and Ram’s system, it is the semantic mediation service and the semantic mediators that parse the query and send it to the appropriate local schema. A mediator is a “software module that exploits encoded knowledge about a particular dataset to bring the source information into a common form for a higher layer of application” (Park and Ram, 2004). The semantic mediators developed by Park and Ram include the following: coordinator, conflict detector, selector, query generator, data collector, conflict resolver, and message generator. These are the semantic mediators that identify conflicts and actually resolve them, or achieve semantic transformation.
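A tree whose nodes are either concepts or instances can be sketched in a few lines of Python. The Node class below is an assumption made for illustration and does not reproduce SCROL's actual structure or contents; the length-unit example is likewise invented.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical tree node that is either a concept or an instance."""
    name: str
    kind: str  # "concept" or "instance"
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: list["Node"] = field(default_factory=list, repr=False, compare=False)

    def add(self, name: str, kind: str) -> "Node":
        """Attach a child node and return it."""
        child = Node(name, kind, parent=self)
        self.children.append(child)
        return child

    def siblings(self) -> list["Node"]:
        """Return the other nodes that share this node's parent."""
        if self.parent is None:
            return []
        return [n for n in self.parent.children if n is not self]


length_unit = Node("length unit", "concept")
mile = length_unit.add("mile", "instance")
kilometer = length_unit.add("kilometer", "instance")
```

Here mile and kilometer are sibling instances under one concept, which is the kind of relationship a conflict resolver could use to recognize that two fields differ only in unit.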

In this article, Jinsoo Park and Sudha Ram present a strategy for building a system that achieves semantic interoperability – a highly desirable but difficult-to-achieve state for disparate information systems. Park and Ram developed their own architecture, named CREAM. In presenting it, they discuss key information-systems concepts such as semantic interoperability; the mapping-based, intermediary-based and query-oriented approaches to achieving it; data-level and schema-level conflicts; schema mapping knowledge; and ontology relationship knowledge. All of these concepts are important parts of developing semantic interoperability in information systems analysis and design.

Reference

Park, Jinsoo and Sudha Ram. (2004). Information Systems Interoperability: What Lies Beneath? ACM Transactions on Information Systems. 22(4), 595-632.

List of Tools/Concepts

Semantic interoperability – Semantic interoperability is the ability of disparate systems to understand the semantics, or meanings, of each other despite incompatibilities in data formats, data meanings, etc. Semantic interoperability exists at the knowledge level, and problems with it result from incompatibilities in implicit meanings, perspectives and assumptions. This is in contrast to syntactic interoperability, which exists at the application level. Syntactic interoperability problems often take the form of software conflicts.

Mapping-based approach – This is an approach to developing systems with semantic interoperability. In this approach, it is necessary to develop or construct mappings between the information sources that are related semantically. This is accomplished by developing a federated or global schema, then constructing mappings between the federated schema and the local schemas for each information system. The problem with this approach is that it is not independent of the federated and local schemas for which it is developed. This means the solution is not portable and does not adapt well to the addition of new systems.

Intermediary-based approach – This approach relies upon the development of intermediary mechanisms such as mediators, agents or ontologies in order to achieve interoperability. Most often, this approach relies upon created ontologies which allow the use of shared standardized vocabularies or protocols so that systems or databases can communicate with each other. The ontology is domain specific, but is independent of local schemas and applications. As such, it is not feasible to maintain such ontologies due to the dynamic, autonomous and heterogeneous nature of local schemas.

Query-oriented approach – The query-oriented approach depends upon interoperable languages (usually logic-based languages or extended SQL). This approach stands out for its ability to formulate queries that span several databases. The main drawback to this approach is that a heavy burden is placed upon users to understand the differences in the different databases and to resolve semantic conflicts themselves.

Data level conflicts – One level where semantic conflict can occur is the data level. Generally, data level conflicts are differences in data which can be caused by multiple representations and interpretations of similar data. Examples of data level conflicts are data-value conflicts, data representation conflicts, data-unit conflicts, and data precision conflicts. Data-value conflicts are conflicts in data values. Data values may mean different things depending on their relationships to other factors. Data representation conflicts happen when the same data is represented in different ways (dates can be represented as 9/17/2006, 17-9-2006 and/or September 17, 2006). Data-unit conflicts are those where the same values are represented in different units – feet, yards, meters, etc. Data precision conflicts happen when the same type of data is represented in ways that differ conceptually. For example, different systems may rate the same item, but use different rating schemes.

Schema level conflicts – Schema level conflicts involve differences at the structural level of the systems. Examples of schema level conflicts are naming conflicts, entity-identifier conflicts, schema-isomorphism conflicts, generalization conflicts, aggregation conflicts, and schematic discrepancies. Naming conflicts happen when labels of the same schema elements are different from local schema to local schema. Entity-identifier conflicts arise when different primary keys are assigned to the same concepts in different databases. Schema-isomorphism conflicts happen when the same concept is described by different, non-compatible attributes. Generalization conflicts occur when concepts or data values are modeled differently in various databases. As an example, the category of students can be classed in different ways – by year of graduation, school affiliation, etc. Aggregation conflicts happen when “aggregation is used in one database to identify a set of entities in another database” (Park and Ram, 2004). Schematic discrepancies arise when data that is structured one way in one local schema is structured differently in another.

Schema mapping knowledge – The schema mapping knowledge is created by establishing mappings between the disparate local schemas and then mapping the local schemas to the federated schema. It is essential that semantically similar concepts, ideas and data are identified. Park and Ram point out that human intervention is essential in this part of the system development process. This makes the schema mapping knowledge one of the most important parts of the CREAM model developed by Park and Ram.

Ontology relationship knowledge – This knowledge is the foundation of the reasoning process for semantic resolution. In this knowledge structure there are three different types of relationships: parenthood, sibling and domain-value relationships. The parenthood relationship is a vertical relationship (parent to child). The sibling relationship is a horizontal relationship between constructs or concepts. The domain-mapping relationship is used by the “semantic mediators to determine whether the actual data values that are mapped to instances can be transformed from one value to another and vice versa” (Park and Ram, 2004).
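The check described in that quotation, whether values can be transformed from one instance to another and back, can be sketched in Python. The temperature instances, the CONVERSIONS table and both function names are assumptions chosen for illustration; the article does not specify them.

```python
from typing import Callable

# Hypothetical domain-mapping knowledge: a conversion function for each
# ordered pair of sibling instances.
CONVERSIONS: dict[tuple[str, str], Callable[[float], float]] = {
    ("fahrenheit", "celsius"): lambda f: (f - 32) * 5 / 9,
    ("celsius", "fahrenheit"): lambda c: c * 9 / 5 + 32,
}


def can_transform(a: str, b: str) -> bool:
    """True when values can be transformed from a to b and vice versa."""
    return (a, b) in CONVERSIONS and (b, a) in CONVERSIONS


def transform(value: float, source: str, target: str) -> float:
    """Convert a value between two instances, or fail if no mapping exists."""
    if source == target:
        return value
    if (source, target) not in CONVERSIONS:
        raise ValueError(f"no mapping from {source} to {target}")
    return CONVERSIONS[(source, target)](value)


boiling_c = transform(212, "fahrenheit", "celsius")
```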
