Linked Data and Live Querying for Enabling Support Platforms for Web Dataspaces
Jürgen Umbrich (1), Marcel Karnstedt (1), Josiane Xavier Parreira (1), Axel Polleres (2), Manfred Hauswirth (1)
(1) DERI, National University of Ireland, Galway, Ireland
(2) Siemens AG Österreich, Vienna, Austria
Outline
• Web as a set of interlinked Web dataspaces
• Enabling DSSP for Web dataspaces
• Linked Data
• Missing components
• Challenges
• Efficient query processing
• Challenges
• Index consistency study
• Hybrid query processing mechanism
www.deri.ie
The World Wide Web
Digital Enterprise Research Institute
[Figure: the Web of Documents. HTML, CSS, CSV, PNG, and PDF documents connected over HTTP by href, rel, and img links.]
The World Wide Web: Web of Documents
• Unstructured
• Heterogeneous
• Data integration mostly manual
The World Wide Web
[Figure: the Web of Data. RDF documents appear alongside the HTML, CSS, CSV, PNG, and PDF documents, connected over HTTP by href, rel, and img links.]
The World Wide Web: Web of Data
• Standards
• URIs as identifiers
• Typed links
• Web as a heterogeneous distributed database
Dataspace Support Platforms [Franklin 2005]
• Data management for small-scale, loosely connected, heterogeneous sources
• Services hide the complexity of data management
Two directions, similar goals
• Web of Data: web-scale heterogeneous distributed database
• Dataspace: data management for small-scale, loosely connected, heterogeneous sources
Proposed Solution
• Bring the Web of Data and dataspaces together: Linked Data for enabling support platforms for Web dataspaces
Web Dataspaces and support platforms
[Figure: overview of a Web dataspace. Labels in the figure: standards (RDF, RDFa, SPARQL, OWL, RIF, SKOS, GRDDL); no guarantees; no central control; search; catalogs; query; indexes; RDF, CSV, HTML, PDF; HTTP; REST API; discovery; dynamic; incomplete knowledge; heterogeneous; administration; enhancement.]
DSSP -> Linked Data
• Participants/Relationships -> Resources/Links
• XML for interchanging data -> RDF
• Standardised access method, common query language -> HTTP/SPARQL
• Global keys -> URIs
• Discovery -> crawling/reasoning
• Integration of other dataspaces -> entity recognition, ontologies
Open Challenges
• Graph-based data model
  • Must scale to the size of the Web
  • Efficient processing methods (index, query)
• Search and query
  • Structured queries with keyword search
  • Ranking (different levels, typed links, trust, etc.)
  • Guarantees: a full guarantee is not possible; an assessment of possible guarantees is needed
Query Processing
• Catalogs for query planning/processing
  • Key component of a DSSP
  • Linked Data: vocabularies and metadata descriptions as catalogs
  • Complete Web catalogs are not feasible: scale and dynamics
  • Indexing is also affected by dynamics
• Distributed query processing approaches
  • Work for a small number of large repositories
  • Web of Data: a large number of small repositories
Query Processing (continued)
• Alternative approach: "live" querying
  • Link traversal query approaches
  • Exploit Linked Data principles (dereferenceable URIs)
  • Guarantee "live" results
  • Query time in the range of seconds
• Our vision: hybrid query processing
  • Combine offline (static) and online (dynamic) processing
  • Trade-off between performance, completeness, and freshness
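The link traversal idea can be illustrated with a minimal sketch. It is not the authors' engine: the `TOY_WEB` dictionary, the `ex:` URIs, and the functions `dereference` and `link_traversal` are hypothetical stand-ins, and the HTTP lookup is simulated in memory. Starting from a seed URI, the sketch dereferences each URI, collects triples matching one predicate, and follows newly discovered URIs up to a lookup budget.

```python
from collections import deque

# Hypothetical in-memory stand-in for the Web: dereferencing a URI
# returns the RDF triples published at that URI.
TOY_WEB = {
    "ex:alice": [("ex:alice", "foaf:knows", "ex:bob"),
                 ("ex:alice", "foaf:name", "Alice")],
    "ex:bob": [("ex:bob", "foaf:knows", "ex:carol"),
               ("ex:bob", "foaf:name", "Bob")],
    "ex:carol": [("ex:carol", "foaf:name", "Carol")],
}


def dereference(uri, web=TOY_WEB):
    """Simulate an HTTP lookup of `uri`; unknown URIs yield no triples."""
    return web.get(uri, [])


def is_uri(term):
    """Assumption for this toy example: every URI uses the `ex:` prefix."""
    return isinstance(term, str) and term.startswith("ex:")


def link_traversal(seed, predicate, max_lookups=10, web=TOY_WEB):
    """Return (subject, object) pairs for `predicate`, found by following links from `seed`."""
    seen, queue, results = {seed}, deque([seed]), []
    lookups = 0
    while queue and lookups < max_lookups:
        uri = queue.popleft()
        lookups += 1
        for s, p, o in dereference(uri, web):
            if p == predicate:
                results.append((s, o))
            # Queue every not-yet-visited URI found in object position.
            if is_uri(o) and o not in seen:
                seen.add(o)
                queue.append(o)
    return results


names = link_traversal("ex:alice", "foaf:name")
```

The `max_lookups` budget reflects the trade-off named on the slide: every lookup returns current data, but each one adds latency, so a smaller budget gives faster but less complete answers.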
Index Consistency Study
• Two Linked Data Web indexes (SPARQL endpoints)
  • Sindice (RDF, RDFa, Microformats; ~20 billion triples)
  • Openlink (LOD cache; ~20 billion triples)
• 16,616 distinct entity queries
  • Sampled from the BTC 2011 dataset
• Number of entities found and execution time

                   Web        Sindice    Openlink
Entities found     16616      5007       13096
Avg. query time    3261 ms    136 ms     86 ms
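As a worked reading of these figures, the short sketch below derives two ratios from the reported numbers only: the share of the 16,616 queried entities that each endpoint returned, and how much faster each endpoint answered than the Web column. The Web column is read here as live lookups against the sources, which is an assumption, and the variable names are illustrative.

```python
# Figures as reported in the index consistency study.
TOTAL_QUERIES = 16616
entities_found = {"Web": 16616, "Sindice": 5007, "Openlink": 13096}
avg_query_ms = {"Web": 3261, "Sindice": 136, "Openlink": 86}

# Share of queried entities that each source returned at all.
coverage = {src: n / TOTAL_QUERIES for src, n in entities_found.items()}

# Speed-up of each endpoint relative to the Web column.
speedup = {
    src: avg_query_ms["Web"] / ms
    for src, ms in avg_query_ms.items()
    if src != "Web"
}

for source in ("Sindice", "Openlink"):
    print(f"{source}: {coverage[source]:.1%} of entities found, "
          f"{speedup[source]:.1f}x faster than the Web column")
```

Under these numbers Sindice returns roughly 30% of the entities and Openlink roughly 79%, while both answer more than twenty times faster than the Web column.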
Index Consistency Study: Web Recall
• Web recall: percentage of Web results found in the endpoints
[Figure: Web recall plots for the endpoints]
• Openlink: consistent information for 50% of the entities
• Sindice: consistent information for 30% of the entities
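Web recall as defined above can be sketched as a set computation. This is a minimal illustration, assuming results are compared as exact triples; the function name `web_recall` and the example triples are hypothetical and not taken from the study.

```python
def web_recall(web_results, endpoint_results):
    """Percentage of the live Web results that the endpoint also returned.

    Returns None when the Web returned nothing, since recall is undefined then.
    """
    web = set(web_results)
    if not web:
        return None
    found = web & set(endpoint_results)
    return 100.0 * len(found) / len(web)


# Hypothetical results for one entity query.
live = {
    ("ex:e1", "rdfs:label", "Entity one"),
    ("ex:e1", "ex:homepage", "http://example.org/e1"),
    ("ex:e1", "ex:status", "active"),
    ("ex:e1", "ex:updated", "2012-03-01"),
}
indexed = {
    ("ex:e1", "rdfs:label", "Entity one"),
    ("ex:e1", "ex:homepage", "http://example.org/e1"),
    ("ex:e1", "ex:status", "active"),
    ("ex:e1", "ex:updated", "2011-11-15"),  # stale value, not counted as found
}

recall = web_recall(live, indexed)
```

Triples that exist only in the endpoint, such as the stale `ex:updated` value, do not raise recall; they would matter for a precision-style measure instead.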
Hybrid Query Model
[Figure: hybrid query engine. A SPARQL query enters the query planner, which uses knowledge of dynamics to send (sub)queries to a live query interface over the Linked Data Web and to an index interface over repositories; the results of both are returned as the query results.]
• Live query interface: guarantees fresh results
• Index interface: provides fast query times
• Query planning guided by dynamic knowledge
Query planning
• Knowledge of dynamics
  • Mining and statistical approaches
• Query planner
  • Incorporate dynamics as a cost factor
  • Latency and availability of sources?
  • Selectivity based on statistics or rules?
• Query execution
  • Split the query into static and dynamic parts
  • Only update potentially outdated results
  • Consider user requirements (freshness vs. speed)
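One way to picture the split into static and dynamic parts is the sketch below. It is not the planner described in the talk: the per-predicate change rates, the `freshness_threshold` parameter, and the function `split_query` are assumptions made for illustration. Patterns whose predicate changes often are routed to live lookups, and the rest are answered from the index.

```python
# Hypothetical knowledge of dynamics: estimated probability that the
# values of a predicate have changed since the index was last updated.
CHANGE_RATE = {
    "foaf:name": 0.01,
    "ex:location": 0.4,
    "ex:stockPrice": 0.9,
}


def split_query(patterns, change_rate, freshness_threshold=0.3, default_rate=1.0):
    """Split triple patterns into (static, dynamic) parts.

    Patterns whose predicate change rate exceeds the threshold go to the
    live (dynamic) part. Predicates without statistics are treated as
    highly dynamic via `default_rate`.
    """
    static, dynamic = [], []
    for pattern in patterns:
        _, predicate, _ = pattern
        rate = change_rate.get(predicate, default_rate)
        if rate > freshness_threshold:
            dynamic.append(pattern)
        else:
            static.append(pattern)
    return static, dynamic


query = [
    ("?c", "foaf:name", "?n"),
    ("?c", "ex:stockPrice", "?p"),
    ("?c", "ex:unknown", "?x"),
]

static_part, dynamic_part = split_query(query, CHANGE_RATE)
```

Raising `freshness_threshold` models a user who prefers speed, since more patterns are answered from the index; lowering it models a user who prefers fresh results.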
Conclusion
• Enabling DSSP for Web dataspaces via Linked Data
  • Common data representation
  • Standard access methods, global keys
  • Still open challenges (e.g. search and query)
• Study shows that repositories lack completeness and freshness
• Hybrid query processing
  • Combine offline (static) and online (dynamic) processing
  • Trade-off between performance, completeness, and freshness