Introduction to XML CSE532: Theory of Database Systems Fusheng Wang Department of Biomedical Informatics Department of Computer Science

What’s XML • The eXtensible Markup Language (XML) defines a generic syntax used to mark up data with simple, human-readable tags • Has been standardized by the World Wide Web Consortium (W3C) as a format for computer documents • Data in XML documents is represented as strings of text • This data is surrounded by text markup that describes the data • A particular unit of data and markup is called an element • XML specifies the exact syntax of how elements are delimited by tags, what a tag looks like, what names are acceptable, and so on
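To make the terms "element", "tag", and "data" concrete, here is a minimal sketch using Python's standard-library `xml.etree.ElementTree`. The sample document and its element names (`patient`, `name`, `phone`) are illustrative assumptions, not taken from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: each element is a start tag, data, and an end tag.
doc = "<patient><name>Jane Doe</name><phone>555-0100</phone></patient>"

root = ET.fromstring(doc)  # parse the text into a tree of elements

# Collect each child element's tag name and the text data it surrounds.
pairs = [(child.tag, child.text) for child in root]

print(root.tag)  # name of the outermost element
print(pairs)     # tag/data pairs for the nested elements
```

The data itself is plain text; the tags around it are what tell a program that one string is a name and the other a phone number.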

Evolution of XML • Both HTML and XML are descendants of the Standard Generalized Markup Language (SGML) • SGML is an extremely powerful markup language

• Unfortunately, it is also extremely complicated (no one has ever implemented it fully) • HTML is a small application of SGML used specifically for creating web pages • XML is a larger, more general subset of SGML that tries to solve some of the same problems as SGML (without the complexity of SGML)

XML Usage Scenarios 1. Industry standards and data exchange applications 2. Web services, SOA data transport and message persistence

3. Business object / transaction record 4. Integration of diverse data sources 5. Forms and workflow processing 6. Document storage and querying 7. XML Feeds and Web 2.0 Syndication

8. Mapping XML in relational applications 9. Better data model for certain types of data 10. Rapid application prototyping and development

11. …

Who Uses XML Today?

AIM, PAIS

XML as a Better Data Model • XML provides a better data model for many new apps – Flexibility, schema versatility, hierarchical nature

• Semi-structured or unstructured data – E.g. healthcare records, biological data, contracts, insurance claims, etc.

• Inherently hierarchical, nested or complex data – E.g. manuals, books, catalogs, bills of materials, land records, etc.

• Data with changing or evolving schemas, e.g. forms, changing industry standard documents, new product versions, etc. • Data with Null, Multiple or Unknown values – e.g., Phone numbers (home, office, mobile), in patient records, etc.

XML Basics

XML = eXtensible Markup Language

From delimited flat file:

What’s XML

Data

Attributes vs. Elements • Design choice • Elements can be repeated, e.g. “keyword”, “author”. Attributes cannot. • Elements can be extended (made deeper), e.g. “author”. • Attributes are shorter and can often be stored/processed more efficiently
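The trade-off above can be sketched with Python's standard-library `xml.etree.ElementTree`. The `article` document, its `id` attribute, and the `first`/`last` sub-elements are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "author" and "keyword" repeat, and "author" is nested deeper.
doc = """
<article id="a1">
  <author><first>Ann</first><last>Lee</last></author>
  <author><first>Bo</first><last>Kim</last></author>
  <keyword>xml</keyword>
  <keyword>database</keyword>
</article>
"""
article = ET.fromstring(doc)

# Elements may repeat, so findall returns every occurrence.
keywords = [k.text for k in article.findall("keyword")]
# Elements may be extended with sub-elements such as <last>.
last_names = [a.findtext("last") for a in article.findall("author")]
# An attribute is a single flat name/value pair on its element.
article_id = article.get("id")

# Repeating an attribute on one element is not well-formed; the parser rejects it.
try:
    ET.fromstring('<article id="a1" id="a2"/>')
    duplicate_allowed = True
except ET.ParseError:
    duplicate_allowed = False

print(keywords, last_names, article_id, duplicate_allowed)
```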

The XML Document Tree

What’s XML?

XML versus Relational

Schema Evolution

Well-formed XML Documents • An XML document is well-formed, if:

“Well-formed” or “Valid”? • An XML document is well-formed, if… – it complies with the rules on the previous page – i.e. it can be parsed by an XML parser without error

• An XML document is valid, if… – it is well-formed AND – it complies with a specific DTD or XML Schema • XML Parsers can optionally perform “validation”

• DTDs (Document Type Definitions) and XML Schema define a specific XML document structure
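A well-formedness check amounts to asking whether a parser accepts the text. The sketch below uses Python's standard-library `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are assumptions for illustration. It covers only well-formedness: ElementTree is a non-validating parser, so checking a document against a DTD or XML Schema would need a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text without error (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Illustrative inputs, each breaking one commonly cited well-formedness rule.
ok = is_well_formed("<a><b>fine</b></a>")
bad_nesting = is_well_formed("<a><b>oops</a></b>")  # tags overlap instead of nesting
two_roots = is_well_formed("<a>1</a><a>2</a>")      # more than one root element
unquoted = is_well_formed("<a x=1/>")               # attribute value is not quoted

print(ok, bad_nesting, two_roots, unquoted)
```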

The XML Data Model: Node Types

Text Nodes and Mixed Content

Problem: Name Collision • Three different XML elements:

• Same element name, but different meaning! • Can result in processing/application errors.

• Need to distinguish between different domains.

Solution: Namespaces • A prefix identifies the domain (“namespace”), and distinguishes between duplicate element names

• Namespaces need to be uniquely identified, which is done with URIs • URI = Uniform Resource Identifier

• URIs typically look like a URL; they may point to a web page, but don’t have to!

Namespace Declaration • “xmlns” declares a namespace and (optionally) assigns it to a namespace prefix • The declaration is in scope for the current element and all subelements and attributes that it contains • A namespace declaration without a prefix defines a default namespace, which applies implicitly to all unprefixed elements in scope (but not to unprefixed attributes)
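The following sketch shows a default namespace and a prefixed namespace side by side, using Python's standard-library `xml.etree.ElementTree`. The `http://example.com/...` URIs and the element names are made up for illustration; they are identifiers only and need not resolve to a web page. ElementTree reports each name in `{namespace-URI}localname` form, which makes the effect of the declarations visible.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two different "title" elements, told apart by namespace.
doc = """
<order xmlns="http://example.com/orders"
       xmlns:c="http://example.com/customers">
  <title>Rush order</title>
  <c:customer status="gold">
    <c:title>Dr.</c:title>
  </c:customer>
</order>
"""
root = ET.fromstring(doc)

# Every element name is expanded to {namespace-URI}localname.
tags = [el.tag for el in root.iter()]

# Look up elements through a prefix-to-URI map; the prefix itself is arbitrary.
ns = {"c": "http://example.com/customers"}
customer = root.find("c:customer", ns)
customer_title = root.find("c:customer/c:title", ns).text

# The unprefixed attribute is not placed in the default namespace.
attr_names = list(customer.attrib)

print(tags, customer_title, attr_names)
```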

XML Manipulation: File Based • DOM (Document Object Model) tree based navigation • Streaming event based parsing – Streaming pull parsing: Streaming API for XML (StAX)

– Streaming push parsing: SAX (Simple API for XML) javax.xml.parsers.SAXParserFactory

• XSLT based transformation – XSLT interface or TrAX – javax.xml.transform

• Java XML binding: map XML to Java and vice versa – Java Architecture for XML Binding (JAXB) – javax.xml
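The difference between tree-based, push, and pull processing can be sketched in a few lines. The slides name the Java APIs; the example below uses Python standard-library counterparts as stand-ins (`xml.dom.minidom` for DOM, `xml.sax` for SAX, and `ElementTree.iterparse` as a pull-style iterator in the spirit of StAX). The `library`/`book` document and the `BookHandler` class are hypothetical.

```python
import io
import xml.sax
import xml.etree.ElementTree as ET
from xml.dom import minidom

doc = "<library><book>A</book><book>B</book></library>"  # illustrative input

# 1. DOM: load the whole document into memory, then navigate the tree.
dom = minidom.parseString(doc)
dom_titles = [n.firstChild.data for n in dom.getElementsByTagName("book")]

# 2. SAX (push): the parser calls our handler as it meets each event.
class BookHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_book = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "book":
            self._in_book = True
            self._buf = []

    def characters(self, content):
        if self._in_book:
            self._buf.append(content)  # text may arrive in several chunks

    def endElement(self, name):
        if name == "book":
            self.titles.append("".join(self._buf))
            self._in_book = False

handler = BookHandler()
xml.sax.parseString(doc.encode("utf-8"), handler)
sax_titles = handler.titles

# 3. Pull-style: our loop asks the parser for the next event when it is ready.
stream = io.BytesIO(doc.encode("utf-8"))
pull_titles = [el.text for _event, el in ET.iterparse(stream, events=("end",))
               if el.tag == "book"]

print(dom_titles, sax_titles, pull_titles)
```

All three approaches extract the same data; they differ in who drives the parse and in how much of the document must be held in memory at once.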

XML Manipulation: Database Based • Native XML database storage • XPath or XQuery based queries • SQL/XML functions: XMLTABLE
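To give a feel for XPath-style path expressions, here is a minimal sketch that runs them against an in-memory tree with Python's standard-library `xml.etree.ElementTree`. This is not a database query: ElementTree supports only a limited subset of XPath, and the `patients` document is a made-up example. An XML database would evaluate similar expressions over stored documents.

```python
import xml.etree.ElementTree as ET

# Hypothetical patient records; note that a patient may have several phones or none.
doc = """
<patients>
  <patient id="p1">
    <name>Ann</name>
    <phone type="home">555-0100</phone>
    <phone type="mobile">555-0101</phone>
  </patient>
  <patient id="p2">
    <name>Bo</name>
  </patient>
</patients>
"""
root = ET.fromstring(doc)

# Child steps with an attribute predicate: mobile numbers only.
mobile = [p.text for p in root.findall("./patient/phone[@type='mobile']")]
# Predicate on a child element: patients that have at least one phone.
with_phone = [p.get("id") for p in root.findall("./patient[phone]")]
# Descendant step: every <name> anywhere below the root.
all_names = [n.text for n in root.findall(".//name")]

print(mobile, with_phone, all_names)
```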

Alternatives to XML • JSON (JavaScript Object Notation) • HDF5 (Hierarchical Data Format) • Google Protocol Buffers

• Thrift
