Digital Enterprise Research Institute

www.deri.ie

Linked Data and Live Querying for Enabling Support Platforms for Web Dataspaces

Jürgen Umbrich¹, Marcel Karnstedt¹, Josiane Xavier Parreira¹, Axel Polleres², Manfred Hauswirth¹

¹ DERI, National University of Ireland, Galway, Ireland
² Siemens AG Österreich, Vienna, Austria

© Copyright 2010 Digital Enterprise Research Institute. All rights reserved.

Outline

• Web as a set of interlinked Web dataspaces
• Enabling DSSP for Web dataspaces
  • Linked Data
  • Missing components
  • Challenges
• Efficient query processing
  • Challenges
  • Index consistency study
  • Hybrid query processing mechanism

The World Wide Web

[Figure: Web of Documents. HTML pages, CSS, CSV, PNG and PDF files connected by href, rel and img links over HTTP.]

The World Wide Web

[Figure: Web of Documents, annotated.]

• Unstructured
• Heterogeneous
• Data integration mostly manual

The World Wide Web

[Figure: Web of Data. RDF documents alongside HTML pages, CSS, CSV, PNG and PDF files, connected by href, rel and img links over HTTP.]

The World Wide Web

[Figure: Web of Data, annotated.]

• Standards
• URIs as identifiers
• Typed links
• Web as a heterogeneous distributed database

Dataspace Support Platforms

[Figure: Franklin 2005]

Dataspace Support Platforms

• Data management for small-scale, loosely connected, heterogeneous sources
• Services hide the complexity of data management

[Franklin 2005]

Two directions, similar goals

• Web of Data: web-scale heterogeneous distributed database
• Dataspace: data management for small-scale, loosely connected, heterogeneous sources

Proposed Solution

[Figure: Web of Data and Dataspace.]

Linked Data for enabling support platforms for Web dataspaces

Web Dataspaces and support platforms

[Figure: a Web dataspace and its support platform, labelled with the following terms.]

• Standards: RDF, SPARQL, OWL, RDFa, RIF, SKOS, GRDDL
• Sources and access: RDF, CSV, HTML, PDF, HTTP, REST API
• Services: search, query, catalogs, indexes, discovery, enhancement
• Characteristics: no guarantees, no central control, dynamic, incomplete knowledge, heterogeneous administration

DSSP -> Linked Data

• Participants/Relationships -> Resources/Links
• XML for interchanging data -> RDF
• Standardised access method, common query language -> HTTP/SPARQL
• Global keys -> URIs
• Discovery -> crawling/reasoning
• Integration of other dataspaces -> entity recognition, ontologies

Open Challenges

• Graph-based data model to scale to the size of the Web
  • Efficient processing methods (index, query)
• Search and query
  • Structured queries with keyword search
  • Ranking (different levels, typed links, trust, etc.)
• Guarantees
  • Full guarantees not possible; assessment of possible guarantees is needed

Query Processing

• Catalogs for query planning/processing
  • Key component of a DSSP
  • Linked Data: vocabularies, metadata descriptions as catalogs
  • Complete Web catalogs not feasible: scale and dynamics
  • Indexing also affected by dynamics
• Distributed query processing approaches
  • Work for a small number of large repositories
  • Web of Data: large number of small repositories

Query Processing

• Alternative approach: "live" querying
  • Link traversal query approaches
  • Exploit Linked Data principles (dereferenceable URIs)
  • Guarantee "live" results
  • Query time in the range of seconds
• Our vision: hybrid query processing
  • Combine offline (static) and online (dynamic) processing
  • Trade-off between performance, completeness and freshness
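The link traversal idea can be illustrated with a minimal sketch. It assumes a small in-memory dictionary (`WEB`) standing in for the Web, so `dereference` replaces a real HTTP lookup; all URIs, the `foaf:` predicates and the function names are hypothetical and chosen only for illustration, not taken from the talk.

```python
from collections import deque

# Hypothetical in-memory stand-in for the Web: dereferencing a URI returns
# the RDF triples published at that URI. All URIs and data are made up.
WEB = {
    "http://example.org/alice": [
        ("http://example.org/alice", "foaf:name", "Alice"),
        ("http://example.org/alice", "foaf:knows", "http://example.org/bob"),
    ],
    "http://example.org/bob": [
        ("http://example.org/bob", "foaf:name", "Bob"),
        ("http://example.org/bob", "foaf:knows", "http://example.org/carol"),
    ],
    "http://example.org/carol": [
        ("http://example.org/carol", "foaf:name", "Carol"),
    ],
}


def dereference(uri):
    """Stand-in for an HTTP GET on a dereferenceable URI."""
    return WEB.get(uri, [])


def traverse(seed, predicate, max_lookups=10):
    """Follow `predicate` links from `seed`, collecting every triple seen."""
    seen, queue, triples = set(), deque([seed]), []
    while queue and len(seen) < max_lookups:
        uri = queue.popleft()
        if uri in seen:
            continue
        seen.add(uri)
        for s, p, o in dereference(uri):
            triples.append((s, p, o))
            # Only links of the requested type are followed.
            if p == predicate and o not in seen:
                queue.append(o)
    return triples


def names_reachable(seed):
    """Names of all resources reachable from `seed` via foaf:knows links."""
    triples = traverse(seed, "foaf:knows")
    return sorted(o for _, p, o in triples if p == "foaf:name")
```

Every lookup costs one round trip in a real deployment, which is why such approaches return fresh results but with query times in the range of seconds; `max_lookups` is one simple way to bound that cost.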

Index Consistency Study

• Two Linked Data Web indexes (SPARQL endpoints)
  • Sindice (RDF, RDFa, Microformats; ~20 billion triples)
  • Openlink (LOD cache; ~20 billion triples)
• 16,616 distinct entity queries
  • Sampled from the BTC 2011 dataset
• Number of entities found and execution time

                   Web        Sindice    Openlink
Entities found     16616      5007       13096
Avg. query time    3261 ms    136 ms     86 ms
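As a worked reading of these numbers, the short sketch below derives the share of entities each endpoint covers and how much faster it answers. It assumes the Web column reports direct lookups on the live Web and serves as the baseline; the helper names `coverage` and `speedup` are introduced here for illustration only.

```python
# Figures copied from the slide's table.
entities_found = {"Web": 16616, "Sindice": 5007, "Openlink": 13096}
avg_query_ms = {"Web": 3261, "Sindice": 136, "Openlink": 86}


def coverage(source):
    """Share of the entities found on the Web that `source` also finds."""
    return entities_found[source] / entities_found["Web"]


def speedup(source):
    """How many times faster `source` answers than the Web baseline."""
    return avg_query_ms["Web"] / avg_query_ms[source]


for source in ("Sindice", "Openlink"):
    print(f"{source}: {coverage(source):.1%} of entities, "
          f"about {speedup(source):.0f}x faster")
# Sindice: 30.1% of entities, about 24x faster
# Openlink: 78.8% of entities, about 38x faster
```

The endpoints answer far faster than live lookups but miss a substantial share of the entities, which is the trade-off the hybrid approach targets.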

Index Consistency Study

• Web Recall: % of Web results found in the endpoints
• Openlink: consistent information for 50% of the entities
• Sindice: consistent information for 30% of the entities
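A minimal sketch of how Web recall could be computed per entity, under stated assumptions: results are compared as sets of triples, and an entity counts as "consistent" when its recall is exactly 1.0. The slides do not spell out either choice, and the sample triples are made up.

```python
def web_recall(web_results, endpoint_results):
    """Fraction of the Web results that the endpoint also returns."""
    web = set(web_results)
    if not web:
        return None  # recall is undefined when the Web returns nothing
    return len(web & set(endpoint_results)) / len(web)


def share_consistent(recalls):
    """Share of entities with full recall (assumed meaning of 'consistent')."""
    defined = [r for r in recalls if r is not None]
    if not defined:
        return None
    return sum(1 for r in defined if r == 1.0) / len(defined)


# Made-up example: the endpoint holds three of four current triples
# plus one stale triple that is no longer on the Web.
web = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
    ("ex:alice", "ex:city", "Galway"),
}
endpoint = {
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:knows", "ex:carol"),
    ("ex:alice", "ex:city", "Vienna"),  # stale value
}
recall = web_recall(web, endpoint)
```

Note that recall only measures missing Web results; the stale triple in the endpoint lowers recall here only because the current value is absent.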

Hybrid Query Model

[Figure: hybrid query engine. A SPARQL query enters the query planner, which uses knowledge of dynamics to issue (sub)queries to two interfaces and returns the combined query results.]

• Live query interface over the Linked Data Web: guarantees fresh results
• Index interface over repositories: provides fast query times
• Query planning guided by dynamic knowledge

Query planning

• Knowledge of dynamics
  • Mining and statistical approaches
• Query planner
  • Incorporate dynamics as cost factor
  • Latency and availability of sources?
  • Selectivity based on statistics or rules?
• Query execution
  • Split query into static and dynamic parts
  • Only update potentially outdated results
  • Consider user requirements (freshness vs. speed)
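The split into static and dynamic parts can be sketched as follows. This is an illustrative sketch only, not the planner from the talk: it assumes dynamics are summarised as a per-predicate change rate (the `CHANGE_RATE` table, with made-up values) and that the user's freshness requirement is expressed as a single threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriplePattern:
    subject: str
    predicate: str
    obj: str


# Hypothetical knowledge of dynamics: estimated probability that values of a
# predicate changed since the index was last refreshed. Values are made up.
CHANGE_RATE = {
    "foaf:name": 0.01,
    "ex:currentLocation": 0.90,
}


def plan(patterns, freshness_threshold=0.5, default_rate=1.0):
    """Split patterns into (static, dynamic) parts.

    Static patterns are answered from the index; dynamic patterns are
    evaluated live. Predicates without statistics are treated as dynamic.
    """
    static, dynamic = [], []
    for pattern in patterns:
        rate = CHANGE_RATE.get(pattern.predicate, default_rate)
        (dynamic if rate >= freshness_threshold else static).append(pattern)
    return static, dynamic


QUERY = [
    TriplePattern("?p", "foaf:name", "?name"),
    TriplePattern("?p", "ex:currentLocation", "?loc"),
    TriplePattern("?p", "ex:unknownProperty", "?x"),
]
static_part, dynamic_part = plan(QUERY)
```

Raising `freshness_threshold` sends more patterns to the index (faster, possibly stale); lowering it sends more to live lookups (fresher, slower). A fuller planner would also weigh source latency, availability and selectivity, which the slide leaves as open questions.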

Conclusion

• Enabling DSSP for Web dataspaces via Linked Data
  • Common data representation
  • Standard access methods, global keys
  • Still open challenges (e.g. search and query)
• Study shows that repositories lack completeness and freshness
• Hybrid query processing
  • Combine offline (static) and online (dynamic) processing
  • Trade-off between performance, completeness and freshness
