LODQA
Natural language query processing for SPARQL generation
http://lodqa.dbcls.jp (demo system)

Jin-Dong Kim (DBCLS)
Kevin Cohen (University of Colorado)

Licensed under a Creative Commons Attribution 3.0 Unported License - DBCLS

Motivation
● Linked Open Data (LOD) increasing.
● SPARQL emerging as a query language.
● SPARQL is hard to write
  ✔ even for experts
● Natural language query conversion to SPARQL can provide public users with a gentle interface layer.

Demo
● LODQA prototype system
  ✔ http://lodqa.dbcls.jp

LODQA - two-step approach
● Pseudo SPARQL generation
  ✔ Linguistic processing
● SPARQL instantiation
  ✔ Adaptation to the target DB

Pseudo SPARQL generation
● Analysis
  ✔ Parsing
  ✔ Base Noun Chunks (BNC)
  ✔ Targeting (typing)
    ➔ Find the BNC modified by the wh-determiner, what.
● Generation (graph conversion)
  ✔ Entity instantiation
  ✔ Conditioning
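
As a rough illustration of the analysis and generation steps above, the following Python sketch turns an already-chunked question into a pseudo SPARQL string. The example question, the `Chunk` type, the `:is [...]` notation for entity chunks, and the function `to_pseudo_sparql` are assumptions made for illustration, not part of LODQA. Real parsing and chunking are replaced by hand-written input.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Chunk:
    """A base noun chunk (BNC); hypothetical structure for illustration."""
    text: str
    wh_modified: bool = False  # True if modified by a wh-determiner such as "what"


def to_pseudo_sparql(chunks: List[Chunk], relations: List[Tuple[int, str, int]]) -> str:
    """Build a pseudo SPARQL string from chunks and relations between them.

    relations holds (subject index, relation words, object index) triples.
    """
    # Every chunk becomes a variable ?t1, ?t2, ...
    var = {i: f"?t{i + 1}" for i in range(len(chunks))}
    # Targeting: the wh-modified chunk is what the query selects.
    targets = [var[i] for i, c in enumerate(chunks) if c.wh_modified]
    lines = []
    for i, c in enumerate(chunks):
        if c.wh_modified:
            # Typing: the target is constrained by the class named in the chunk.
            lines.append(f"{var[i]} :isa [{c.text}] .")
        else:
            # Entity placeholder (assumed notation), resolved to a URI later.
            lines.append(f"{var[i]} :is [{c.text}] .")
    for s, words, o in relations:
        # Conditioning: the linguistic relation is kept as a bracketed placeholder.
        lines.append(f"{var[s]} [{', '.join(words.split())}] {var[o]} .")
    body = "\n  ".join(lines)
    return f"SELECT {' '.join(targets)} WHERE {{\n  {body}\n}}"


# "What genes are associated with Kabuki syndrome?" (hand-chunked, hypothetical input)
chunks = [Chunk("genes", wh_modified=True), Chunk("Kabuki syndrome")]
pseudo = to_pseudo_sparql(chunks, [(0, "associated with", 1)])
print(pseudo)
```

The bracketed parts are deliberately left unresolved: they are what the later SPARQL instantiation step would replace with URIs and predicates of the target DB.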

SPARQL instantiation
● URL lookup
  ✔ OntoFinder (REST service)
    ➔ http://ontofinder.dbcls.jp
    ➔ Receives a list of terms, and a (list of) ontology IDs
    ➔ Returns matching URIs to every term.
  ✔ “Kabuki Syndrome” in OMIM
    ➔ http://purl.bioontology.org/ontology/OMIM/147920
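
The lookup contract described above (terms and ontology IDs in, URIs per term out) can be sketched without the real service. The following uses an in-memory table as a stand-in for OntoFinder. The function name `lookup_uris`, the table layout, and the case-insensitive matching are assumptions, and the actual REST interface is not shown here.

```python
from typing import Dict, List

# Stand-in index: ontology ID -> normalized term -> URIs.
# Only the OMIM entry comes from the slide; the layout is hypothetical.
INDEX: Dict[str, Dict[str, List[str]]] = {
    "OMIM": {
        "kabuki syndrome": ["http://purl.bioontology.org/ontology/OMIM/147920"],
    },
}


def lookup_uris(terms: List[str], ontology_ids: List[str]) -> Dict[str, List[str]]:
    """Return the matching URIs for every term across the given ontologies."""
    result: Dict[str, List[str]] = {}
    for term in terms:
        key = term.strip().lower()
        uris: List[str] = []
        for oid in ontology_ids:
            uris.extend(INDEX.get(oid, {}).get(key, []))
        result[term] = uris  # terms with no match map to an empty list
    return result


matches = lookup_uris(["Kabuki Syndrome", "genes"], ["OMIM"])
print(matches)
```

Returning an entry for every input term, even when nothing matches, keeps the caller able to tell which chunks still need a different resolution strategy.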

SPARQL instantiation
● Typing
  ✔ ?t1 :isa [genes] ➔ ?t1 umls:tui "T028"^^xsd:string
● Relation instantiation
  ✔ ?t1 [associated, with] ?t2
    ➔ ?t1 ?r1 ?t2.
    ➔ ?t1 ?r1 ?x1. ?x1 ?r2 ?t2.

[Figure: Maximum sensitivity. Two graph patterns: a direct edge from ?t1 to ?t2, and a two-edge path from ?t1 to ?t2 through ?x1.]
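
The relation expansion above can be sketched as a small generator of graph patterns. This is a minimal illustration assuming paths of at most `max_hops` edges; the function name `relation_patterns` and its parameters are hypothetical.

```python
from typing import List


def relation_patterns(subj: str, obj: str, max_hops: int = 2) -> List[str]:
    """Expand a pseudo relation between subj and obj into SPARQL triple patterns.

    A path of n hops uses predicates ?r1..?rn and intermediate nodes ?x1..?x(n-1).
    """
    patterns = []
    for hops in range(1, max_hops + 1):
        # Node sequence: subject, intermediate variables, object.
        nodes = [subj] + [f"?x{i}" for i in range(1, hops)] + [obj]
        triples = [
            f"{nodes[i]} ?r{i + 1} {nodes[i + 1]}."
            for i in range(hops)
        ]
        patterns.append(" ".join(triples))
    return patterns


patterns = relation_patterns("?t1", "?t2")
for p in patterns:
    print(p)
```

With the default of two hops this yields exactly the two patterns listed above; since the predicates are left as variables, any relation connecting the two nodes matches.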

SPARQL instantiation
● Strategy
  ✔ maximize the sensitivity first,
    ➔ Trivial, Generic
  ✔ then, improve the specificity (filtering).
    ➔ Might be more difficult
    ➔ Relation classification
    ➔ Mapping linguistic and schematic representation of relations
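
One way to picture the second step is as a filter over the predicates bound by the maximally sensitive query. The sketch below assumes a hand-made mapping from a linguistic relation to acceptable predicate URIs. The mapping, the example URIs under `http://example.org/`, and the function `filter_bindings` are all hypothetical, and a real system might use a relation classifier instead.

```python
from typing import Dict, List, Set

# Hypothetical mapping from a linguistic relation to schema predicates.
RELATION_MAP: Dict[str, Set[str]] = {
    "associated with": {"http://example.org/associatedWith"},
}


def filter_bindings(relation: str, bindings: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only solutions whose ?r1 predicate is accepted for the relation."""
    allowed = RELATION_MAP.get(relation)
    if allowed is None:
        # No mapping known: stay at maximum sensitivity and keep everything.
        return list(bindings)
    return [b for b in bindings if b.get("r1") in allowed]


# Made-up solutions of the one-hop pattern "?t1 ?r1 ?t2."
candidates = [
    {"t1": "http://example.org/geneA", "r1": "http://example.org/associatedWith"},
    {"t1": "http://example.org/geneB", "r1": "http://example.org/seeAlso"},
]
kept = filter_bindings("associated with", candidates)
print(kept)
```

Falling back to the unfiltered results when no mapping exists keeps the sensitivity-first behavior intact for relations that have not been classified yet.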

LODQA
● Pseudo SPARQL generation
  ✔ Base noun chunking: Pattern
  ✔ Parsing: Enju
  ✔ Targeting: Pattern
  ✔ Conditioning: Shortest path
● SPARQL instantiation
  ✔ Entity instantiation: OntoFinder
  ✔ Relation instantiation: Any path
    ➔ granularity, Filtering, Classification

Conclusions
● LODQA
  ✔ Is at its beginning stage
    ➔ We are seeking a best starting point for open collaborative development.
  ✔ Toward an open source project.
    ➔ We are now writing unit-tests.
  ✔ Prototype system is available
    ➔ http://lodqa.dbcls.jp

Future directions 1
● Parser
  ✔ Other parsers for other languages
● Graph conversion
  ✔ Linguistic structure processing, e.g., coordinations, sortal coreferences.
● URL lookup
  ✔ MetaMap, Bioportal annotator for Bio-domain
  ✔ DBpedia Spotlight for general domain
  ✔ Quantity expressions
● Target LODs
  ✔ Bioportal LODs, Bio2RDF LODs
  ✔ DBpedia

Future directions 2
● Federated search

[Figure: Language processing ➔ Pseudo SPARQL ➔ Adaptation to LODs (… Multiple LODs) ➔ answer aggregation]

Future directions 3
● To make it an open source project
● To collect queries
  ✔ To direct linguistic challenges to NLP community.
  ✔ To direct LOD challenges to LOD community.
