A DATA MODEL FOR KNOWLEDGE CONTENT OBJECTS

Wernher Behrendt, Nitin Arora
Salzburg Research/The Metokis Project, Salzburg, Austria
{wernher.behrendt | nitin.arora}@salzburgresearch.at

Abstract. We report on work in progress aimed at developing a standardised container for knowledge-enriched multimedia content. A Knowledge Content Object (KCO) contains nine components. Several of these components are subdivided into further KCO elements. At the "atomic" level, each leaf element is intended to be associated with well-defined operational semantics, so that organisations can quickly deploy KCOs as part of their information infrastructure. This is in contrast to much of the current effort in the semantic web community, where the modelling of ontologies is usually focused on taxonomic or terminological interoperation. While KCOs are also rooted in semantic web technology (they are defined using an extension of the DOLCE foundational ontology), our application interest is more strongly geared towards the information and knowledge exchange processes that semantic web technology can actively support. In particular, we are interested in how traditional digital "media" content can be enhanced to qualify as "knowledge" content.

1. Introduction

METOKIS (Methodology and Tools Infrastructure for the creation of Knowledge Units) [7] can be seen as an attempt to create an environment for software objects that include media for human consumption as well as a translation of those media for machine consumption. In this way, the new software objects become a (surrogate) means of communication. The analogy is that the actors (humans or machines) are writing a special sort of letter to each other. The inner structure of these letters helps the machines to separate what is meant for them from what is for the humans to interpret. This special sort of "letter", which can be exchanged between humans and humans, humans and machines, as well as machines and machines, we call a knowledge content object (KCO).

In this article, we first discuss the structure of this special object in detail, then a generic content carrier architecture (Knowledge Content Carrier Architecture; KCCA [9]) which enables the interoperation of heterogeneous content repositories, and finally the application areas (as in METOKIS) that validate this structure.

Section 2 discusses the structure of a KCO in detail. The first and probably the most complex component is what we call the propositional content description. Here, we represent media data (e.g. image, text, video or audio files) through media tokens and then associate concepts from a domain ontology with these tokens. The result is a graph which connects media with their signifiers (the tokens) and, in turn, connects the tokens with ontological concepts and relations. The second component is the specification of time-based spatial rendition. Given the media tokens, we specify on one or more temporal tracks when the associated media data will be rendered, and how (in terms of spatial arrangement). This component uses elements of the SMIL [10] multimedia synchronisation language. The third component deals with interaction and dialogue. Here, the semantic annotation specifies whether the presentation is entirely pre-programmed, whether it is entirely open (e.g. web-based navigation) or whether it follows some dialogue pattern in which humans and system take conversational turns in order to navigate the knowledge/information structure. The fourth component contains interfaces to existing metadata standards, notably those in the cataloguing and rights management areas. Its purpose is to enable the migration of media data and their associated metadata into the KCO structure. The fifth component describes the usage context of the media content.
This covers user tasks (formally described by reference to a task ontology); user community, describing the roles that users would take in order to manipulate or consume the content; and usage history, keeping traces of previous use in order to support workflow systems as well as collaborative filtering systems. The sixth component contains a specification of the business semantics associated with the KCO. This comprises the elements License (terms of use); Contract (what you have to sign); Pricing (what the user or the broker has to pay); Negotiation (the protocol that is being used to close a deal); and Trading (the history of how the knowledge content object has been bought and sold in the past). The seventh component contains the trust semantics of the KCO. This can range from formal trust measures (e.g. through clearinghouses) to informal ones (e.g. "recommended by 1000 happy readers"). Component number eight holds the actual access semantics for the KCO: user authorisation specifies who is allowed to do what with this KCO, and under which circumstances; processing policies specify which services, processes or agents are allowed to do what with this KCO. Finally, the ninth component is the KCO self-description, which holds meta-level information such as the KCO schema itself, or the location at which the KCO resides.

In Section 3, we explore a distributed architecture (KCCA) of information managing nodes that can act on KCOs and can implement a variety of knowledge exchange markets by making use of the functionality that is structurally supported by KCOs. First application demos are expected for Spring 2005. In Section 4, we discuss the three application cases that validate the KCO.
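To make the nine-component structure concrete, the following is a minimal Python sketch of a KCO as a plain record. It is an illustration only, not the METOKIS schema: the class name `KCO`, the field names and the dictionary layout of the sub-elements are assumptions chosen to mirror the component and element names listed above.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


def _usage_context() -> Dict[str, Any]:
    # Three elements named in the text: user task, user community, usage history.
    return {"user_task": None, "user_community": None, "usage_history": []}


def _business_semantics() -> Dict[str, Any]:
    # Five elements named in the text; "trading" is a history, hence a list.
    return {"license": None, "contract": None, "pricing": None,
            "negotiation": None, "trading": []}


def _access_semantics() -> Dict[str, Any]:
    # Two elements named in the text.
    return {"user_authorisation": {}, "processing_policies": {}}


@dataclass
class KCO:
    """Hypothetical container with one slot per KCO component (facet)."""
    propositional_content: Dict[str, Any] = field(default_factory=dict)
    time_based_spatial_rendition: Dict[str, Any] = field(default_factory=dict)
    interaction_based_rendition: Dict[str, Any] = field(default_factory=dict)
    multimedia_metadata: Dict[str, Any] = field(default_factory=dict)
    usage_context: Dict[str, Any] = field(default_factory=_usage_context)
    business_semantics: Dict[str, Any] = field(default_factory=_business_semantics)
    trust_semantics: Dict[str, Any] = field(default_factory=dict)
    access_semantics: Dict[str, Any] = field(default_factory=_access_semantics)
    self_description: Dict[str, Any] = field(default_factory=dict)


def facet_names(kco: KCO) -> List[str]:
    """Return the facet names in declaration order."""
    return [f.name for f in fields(kco)]
```

An empty `KCO()` then exposes all nine facets, and each instance gets its own fresh sub-element dictionaries because the defaults are built by factory functions.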

2. KCO Facets

In this section we explore the structure of a KCO in detail. A KCO contains nine components, or facets. Several of these components are subdivided into further KCO elements.

2.1. Propositional Content

This is a graph structure which allows the definition of a semantic network over a set of media assets. For example, we can express that a specific scene in a film is a parody of a section of text in a novel and that the protagonist of the novel is called "Don Quixote". The structure is also self-describing in terms of the knowledge representation language used for the specification of the semantic network. The meta model distinguishes between media resources, segments of such resources, semantic annotations which describe the resource or the segment, and semantic relationships between the resources / segments.
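The film and novel example can be written down as a small set of subject-predicate-object statements. The sketch below is an assumption-laden illustration, not the KCO representation language: the resource identifiers (`film.mpg`, `novel.txt`), the segment fragments and the predicate names (`hasSegment`, `isParodyOf`, `hasProtagonist`) are all hypothetical.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical media resources, segments, annotations and relationships.
GRAPH: List[Triple] = [
    Triple("film.mpg", "hasSegment", "film.mpg#scene12"),
    Triple("novel.txt", "hasSegment", "novel.txt#chapter8"),
    Triple("film.mpg#scene12", "isParodyOf", "novel.txt#chapter8"),
    Triple("novel.txt", "hasProtagonist", "Don Quixote"),
]


def objects(graph: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` via `predicate`, in graph order."""
    return [t.obj for t in graph
            if t.subject == subject and t.predicate == predicate]
```

Querying `objects(GRAPH, "film.mpg#scene12", "isParodyOf")` follows the semantic relationship from the film segment to the novel segment, which is the kind of traversal the semantic network is meant to support.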

2.2. Time based spatial rendition

This component of the KCO specifies how each of the identified media items (full items or specified segments) will be rendered in time and space. For instance, if we had a semantic structure for telling jokes then we would probably choose a rendition that starts with the intro, sets up an expectation, and then delivers the punch line. Our jokes could be semantically annotated in this fashion. In order to be rendered correctly, we would now need a specification that states for all jokes, that they are best told in the order intro, expectation, and punch-line.
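The joke example amounts to a rule that orders annotated segments by their semantic role. As a minimal sketch, assuming each segment has been annotated with exactly one role and that the role names and file names below are hypothetical, such a rendition rule could look like this:

```python
from typing import Dict, List

# Hypothetical rendition rule: the order in which joke segments are best told.
JOKE_ORDER: List[str] = ["intro", "expectation", "punch_line"]


def render_sequence(segments: Dict[str, str], order: List[str]) -> List[str]:
    """Return the media items in the order prescribed by the rendition rule.

    `segments` maps a semantic role to a media item. A missing role is an
    error, since the rule cannot be satisfied without it.
    """
    missing = [role for role in order if role not in segments]
    if missing:
        raise ValueError(f"segments lack required roles: {missing}")
    return [segments[role] for role in order]
```

The rule is stated once for the whole class of jokes; any joke annotated with the three roles can then be rendered without per-item sequencing information.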

A further specification mechanism could be to use a mapping between the knowledge structures and, for example, the SMIL description language [10], thus creating a multimedia presentation by applying rendering rules to the knowledge structure that overlays the media network.

2.3. Interaction based rendition

This component of the KCO specifies how each of the identified media items interacts with an end user (if such interaction is defined for the type of media and/or for the knowledge structure described by the logic description). For example, if we had a psychological study on computer games and we wanted to let users play an episode of a game and then let them answer questions before carrying on playing, then we would need to be able to specify that rendition, e.g. "for each episode e, ask the user to answer the questionnaire q(e)".
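The rule "for each episode e, ask the user to answer the questionnaire q(e)" can be read as a plan of alternating steps. The following is a minimal sketch under the assumption that episodes are identified by strings and that `q` is a function supplied by the study designer; the step labels `"play"` and `"ask"` are hypothetical.

```python
from typing import Callable, List, Tuple


def interaction_plan(episodes: List[str],
                     q: Callable[[str], str]) -> List[Tuple[str, str]]:
    """Expand the rule into an ordered list of (action, target) steps."""
    plan: List[Tuple[str, str]] = []
    for e in episodes:
        plan.append(("play", e))    # the user plays episode e
        plan.append(("ask", q(e)))  # then answers the questionnaire q(e)
    return plan
```

A rendering system would walk this plan step by step, which keeps the interaction pattern declarative and separate from the media items themselves.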

2.4. Multimedia metadata

The intuition behind this component is that KCOs should not re-invent information models that have already been developed, but should offer ways in which existing information standards can be linked to the world of the semantic web without having to re-invest much effort into new formats. A KCO is primarily a description of some (mixed!) media content but it is not necessarily tied to any specific instances of the media referred to in the description. Whether or not a tight coupling is desirable depends on the intended usage of the media in conjunction with the description. The description then specifies how tight the coupling is intended to be and how strongly that coupling is intended to be enforced.

It is a technical option for a KCO to carry the actual multimedia content (audio, video, documents etc.) "on-board". This should be done by setting the MediaURI relative to the KCO's Root URI. For each MediaURI, it is possible to have a Metadata Description which will be at least partially compliant with one or more of various standards (e.g. MPEG-7 [5], MPEG-21 [4], Dublin Core [2], or a metadata standard for image files of digital cameras). However, we propose that KCOs comply with a unified metadata ontology that allows mediation between the different, overlapping content description standards. The most likely starting point for such a unified metadata ontology is the ABC model [6]. We must strive for information-preserving mapping, i.e. information that the KCO Model does not understand must be retained so as to enable interpretation by an external application.

The Metadata Description intends to address the following aspects:

2.4.1. Intellectual and original provenance of the content
2.4.2. Properties of the media that encode the content
2.4.3. Classification of the content according to traditional description schemes

Note that this aspect of the KCO tries to answer questions such as: "who is your creator?", "what are you made of?", and "what are you about?". These questions address endurant aspects of the KCO.
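The requirement of an information-preserving mapping can be illustrated with a small import routine that maps the fields it knows and keeps everything else verbatim. This is a sketch under stated assumptions: `dc:creator`, `dc:title` and `dc:subject` stand for Dublin Core elements, while the target names, the `FIELD_MAP` table and the `unmapped` slot are hypothetical and not part of any KCO specification.

```python
from typing import Any, Dict

# Hypothetical mapping from source metadata fields to unified field names.
FIELD_MAP: Dict[str, str] = {
    "dc:creator": "creator",
    "dc:title": "title",
    "dc:subject": "subject",
}


def import_metadata(record: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Map known fields and retain unknown ones unchanged.

    Fields without a mapping are kept under "unmapped" so that an external
    application can still interpret them later.
    """
    mapped: Dict[str, Any] = {}
    unmapped: Dict[str, Any] = {}
    for key, value in record.items():
        if key in FIELD_MAP:
            mapped[FIELD_MAP[key]] = value
        else:
            unmapped[key] = value
    return {"mapped": mapped, "unmapped": unmapped}
```

Because nothing is discarded, the original record can be reassembled from the two parts, which is the property the text asks for.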

2.5. Usage context

With the usage context, we intend to give formally described "hints" to another system, indicating what the content of this KCO could, in principle, be used for. However, it is not intended to be used as a prescriptive workflow specification, although it may be used as a descriptive workflow specification as long as the decision about what to do with the KCO content is made outside the KCO. There are three elements of usage context:

2.5.1. User task

This is a formal description of the tasks in which this content can be used (and how).

2.5.2. User Community

This is the intended community of users, the roles that would use the content and the rights one would give to the roles. It is not to be confused with the actual access rights, which are defined elsewhere.

2.5.3. Usage history

This is the aggregation of information on how the content has been used and manipulated in the past (trails, history).

2.6. Business semantics

All business-relevant attributes are described by business ontologies. This layer includes information such as applicable pricing schemes and negotiation schemas. It also contains information on contractual issues and links to the corresponding contracts. Industry-specific business ontologies are currently emerging and will provide the basis for libraries of business ontologies. Business conceptualizations such as those provided by UDDI [11] and ebXML [3] are considered as starting points. Both are meta-models for business ontologies that are instantiated by domain specifications.

Legal (Rights management, IPR, copyright): Legal or regulatory ontologies contain information about legal aspects such as intellectual property rights and copyrights.

2.7. Trust

The purpose of trust semantics is twofold: on the one hand, users can evaluate the trustworthiness of a resource if other users can leave (genuine and verifiable) endorsements. On the other hand, any kind of quality feedback is also of interest to the owner of the KCO as it allows improvements on the basis of trustworthy evaluation by users.

Evaluation (deals with the quality of information represented by a KCO): Ratings, reviews, and other qualifications about the content of an information object are described by evaluation ontologies. Evaluations are highly context-dependent. For instance, ratings that are applicable in one domain might be irrelevant in another domain. Background mappings between evaluation ontologies, and in particular between ratings, are required.

There is a relationship between user trails and user feedback. Both will need to be used to express various dimensions of trust. At present, research into trustable knowledge and content objects is in its infancy and therefore the specification of the KCO trust semantics will be deferred to a later version of KCOs. It should be noted that the issue is of interest and importance, e.g. in the application case of assessments of news stories by senior executives.

2.8. Access Semantics

KCO access semantics are not related to the content and utility of a KCO but to control and technical processing issues.

Security/Access Permissions: The security layer contains information about how a KCO can be used regarding security issues. If a KCO is encoded, the security layer describes the kind of security protocol by which it can be accessed.

2.9. KCO self description

This layer carries a self-description of the KCO (i.e. the meta-level description or ontology schema of KCOs).

3. Distributed Architecture for KCOs

Having defined the structure of KCOs, we now need an architecture that complements the richness of KCOs with appropriate functionality for interoperability amongst heterogeneous systems. The architecture consists of two key parts: the KCCA Platform and the KCTP (Knowledge Content Transfer Protocol) [9]. The KCCA Platform acts as middleware, providing support for building content management applications. The KCTP provides interaction and communication support between multiple systems.

The KCCA Platform provides the basic middleware support for exchanging KCOs and for defining operations on KCOs. It provides support for the semantic definition of tasks and the following services:

- The Repository Service provides interfaces to databases for the storage of content, metadata and ontologies. It also acts as storage for KCOs.
- The KCTP Service gives other, external KCCA systems access to the KCCA Middleware. The KCCA Middleware system can exchange KCOs with other KCCA-aware systems by a simple request/response protocol.
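The interplay of the Repository Service and the request/response exchange can be sketched in a few lines. The message layout below is purely hypothetical: this paper does not define the KCTP wire format, so the `op`, `kco_id`, `kco` and `status` keys, the `Repository` class and the `handle_request` function are assumptions made for illustration only.

```python
from typing import Any, Dict, Optional


class Repository:
    """Hypothetical in-memory stand-in for the Repository Service."""

    def __init__(self) -> None:
        self._kcos: Dict[str, Dict[str, Any]] = {}

    def store(self, kco_id: str, kco: Dict[str, Any]) -> None:
        self._kcos[kco_id] = kco

    def fetch(self, kco_id: str) -> Optional[Dict[str, Any]]:
        return self._kcos.get(kco_id)


def handle_request(repo: Repository, request: Dict[str, Any]) -> Dict[str, Any]:
    """Answer one request with exactly one response (assumed message format)."""
    op = request.get("op")
    kco_id = request.get("kco_id")
    if op == "put" and kco_id is not None and "kco" in request:
        repo.store(kco_id, request["kco"])
        return {"status": "ok"}
    if op == "get" and kco_id is not None:
        kco = repo.fetch(kco_id)
        if kco is None:
            return {"status": "not_found"}
        return {"status": "ok", "kco": kco}
    return {"status": "unsupported"}
```

One node would send a `put` to deposit a KCO and another would send a `get` to retrieve it; every request yields a single response, which is all that "simple request/response" is taken to mean here.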

The KCCA presupposes that generic reasoning engines (reasoning service) and task execution (task service) environments will be available for inclusion in the near future. The KCO services implement the semantics of the KCO facets and offer service-oriented access to the KCO functionality. Figure 1 shows a diagrammatic representation of the KCCA as discussed.

[Figure not reproduced. Labelled components: non KCO related Applications / Services; GUI; Business Logic; KCO API; KCO API ext.; Env. API; Domain Specific Ser.; App. Spec. Services; KCO Registry; KCCA Manager; KCO Services (Business, Usage, Multimedia Metadata, License, Propositional Content, ...); Message Handler; Reas. Service; Task Service; Reasoning Engine; Task Execution Env.; KCTP Service (Ser, DSer, Send, Rec); Network; Data Source.]

Figure 1: Knowledge Content Carrier Architecture (KCCA)

4. Application Cases

4.1. Clinical Trial Design Objects

The Clinical Trial Objects contain information consisting of the procedure and the data related to a specific clinical trial. The Clinical Trial ontology (e.g. HL7) describes the structured content of the Clinical Trial Objects. External Templates, together with the Clinical Trial Ontology, enable the populating of specific databases with clinical trial data. An External Template is an ontology describing the compliance of clinical trial data, e.g. which data fields are mandatory and which are optional, and the rules for building the particular user interface so that a user fills in the data in a particular sequence. When the data is shared by multiple systems, it is shared as a KCO.

The Propositional Content of the KCO in Clinical Trial Objects will contain the actual clinical trial data, referring to the particular ontology which has been used for building the Clinical Trial Object. The KCO Usage Semantics will contain the context in which this data can be used, e.g. only for private use or for government compliance. The KCO Business Semantics will provide the contract and copyright information. The KCO Trust Semantics will define the quality of the Clinical Trial Object, e.g. whether it is suitable for use under government regulations. The Access Semantics will define who can access the Clinical Trial KCOs.

4.2. Klett Education Platform

A KCO in the Klett Education Domain will consist of multimedia resources stored in a repository and any business plan objects that take part in the workflow of the Educational Metokis Platform. In the case of learning objects, the KCO will contain the LOM [1] or SCORM [8] metadata. It will also contain a classification of media (assets) via an ontology. The different syllabuses will be tagged via an educational domain-specific ontology.

4.3. Senior Executive Information Objects

The KCO in the Senior Executive domain consists of news articles which can be shared by multiple systems. A single KCO will consist of multiple multimedia documents comprising text, audio, video or other documents. The Metadata Description of the KCO will contain a multimedia description of the news articles using metadata standards such as MPEG-7 [5] and Dublin Core [2]. The actual logic description of the multimedia will be part of the Propositional Content. It is envisaged that the richness of the description will vary from provider to provider and that some providers may in future supply KCOs for news articles directly. The KCO Usage context will contain the context in which this data can be used, e.g. only for personal use, for public use, or for use by aggregators. The KCO Business Semantics will provide the contract and copyright information (e.g. a Creative Commons license for digital items). The KCO Trust Semantics will define the quality of the news articles; the actual rating can, for example, depend on external rating systems. The Access Semantics will define who can access a particular KCO and in which fashion.

5. Summary

With the KCO we have formulated a standardized object for knowledge-enriched multimedia content. KCOs should foster knowledge exchange both where structured data and communication over structured data are involved, as in Web Services, and where unstructured data such as multimedia content and multimedia documents are kept within content repositories and shared amongst systems. The focus for the KCO is to hold enough semantics to foster knowledge exchange amongst middleware platforms. The application cases described here are domain-specific KCOs and give reasonable hints to experts in other domains on how to define, model and use their own KCOs.

6. Acknowledgements

The authors gratefully acknowledge fruitful discussions with Wolfgang Maass, Aldo Gangemi and Rupert Westenthaler, notably on the ontological framework for KCOs. The METOKIS project is part-funded by the EC under the Sixth Framework Programme, under Strategic Objective 2.4.7 - Semantic Based Knowledge Systems.

References

[1] Draft Standard for Learning Object Metadata, http://ltsc.ieee.org/wg12/files/LOM_1484_12_1_v1_Final_Draft.pdf.
[2] Dublin Core, http://dublincore.org/.
[3] ebXML Specifications, http://www.ebxml.org/specs/index.htm#technical_specifications.
[4] Jan Bormans, Keith Hill: MPEG-21 Overview v.5, http://www.chiariglione.org/mpeg/standards/mpeg-21/mpeg-21.htm.
[5] José María Martínez Sanchez, Rob Koenen, Fernando Pereira: MPEG-7: The Generic Multimedia Content Description Standard, Part 1. IEEE MultiMedia 9(2): 78-87 (2002).
[6] Lagoze, C. and J. Hunter (2001): "The ABC Ontology and Model". Journal of Digital Information, 2 (2), http://metadata.net/harmony/JODI_Final.pdf.
[7] METOKIS Web site, http://metokis.salzburgresearch.at
[8] Sharable Content Object Reference Model (SCORM) 2004, http://www.adlnet.org/index.cfm?fuseaction=DownFile&libid=648&bc=false.
[9] Sunil Goyal, Wernher Behrendt, Rupert Westenthaler (2004): Knowledge Content Carrier Architecture, METOKIS Deliverable D10.
[10] Synchronized Multimedia Integration Language (SMIL 2.0), www.w3.org/TR/smil20/.
[11] UDDI Specifications, http://www.uddi.org/specification.html.
