MCCC CIS 177 - Markup Languages

Namespaces and Schemas

Namespaces

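One of the greatest advantages of XML is the ability to create custom-named tags. This freedom becomes a challenge when you want to merge information from different sources into a common document. The result is often a name collision: elements from two or more documents share the same name, or authors working independently use the same tag name for different kinds of content. Namespaces solve this problem by associating each name with a unique identifier. One way to declare a namespace, taken from an early working draft of the namespaces specification, is a processing instruction in the document prolog. Note that this form did not survive into the final W3C Namespaces in XML Recommendation, which uses the xmlns attribute form covered later in these notes. The draft syntax is: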

<?xml:namespace ns="URI" prefix="prefix"?>

Example

<?xml:namespace ns="http://uhosp/patients/ns" prefix="pat"?>

URI stands for Uniform Resource Identifier. A URI often looks like a URL (a Web address), but the two should not be confused. A URI is a text string that identifies a physical or abstract resource. A URL is one kind of URI, one that gives the physical location of a resource. A URI used in a namespace, by contrast, may refer to an abstract resource and serve purely as an identifier; a parser does not look anything up at that address. You can use any unique text string as a namespace identifier, but a URI works especially well when you want to share the XML document on the Web, because it ties the namespace to a particular domain and so keeps it unique.

A URN, or Uniform Resource Name, has also been proposed as an alternative form of URI. A URN is a persistent resource identifier that does not depend on location. The user only needs to know the name of the resource, which can then be retrieved from multiple sites no matter where it is stored. A good example of a URN is a book's ISBN. This is an area under development, but it may very well be the way of the future.
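As a minimal sketch of a URN used as a namespace identifier, the fragment below binds the pat prefix to a name instead of a Web address. The identifier urn:example:uhosp:patients is hypothetical and chosen only for illustration; it is not taken from the lab files.

```xml
<?xml version="1.0"?>
<!-- Hypothetical URN; it names the namespace but says nothing about location -->
<pat:Patients xmlns:pat="urn:example:uhosp:patients">
  <pat:Patient>
    <pat:Name>Joe Smith</pat:Name>
  </pat:Patient>
</pat:Patients>
```

To a namespace-aware parser the URN is just a string to compare, exactly as an http-style URI would be.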

The prefix is the string of letters that associates each element or attribute in the document with the declared namespace. Once a namespace is declared, you insert the prefix, followed by a colon, into the name of every element that belongs to the namespace:

<pat:Patients>
<pat:Patient>
<pat:Name>Joe Smith</pat:Name>
</pat:Patient>
</pat:Patients>

Declare a Namespace as Element Attribute

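It is common to declare a namespace as an attribute of the element where it is first used, in the form xmlns:prefix="URI". The declaration applies to that element and its descendants, and every element that belongs to the namespace still carries the prefix:

<pat:Patients xmlns:pat="URI">
<pat:Patient>
<pat:Name>Joe Smith</pat:Name>
</pat:Patient>
</pat:Patients>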

To avoid using prefixes, you can declare a default namespace with a plain xmlns attribute. The element and all of its unprefixed child elements then belong to that namespace.

<Patients xmlns="URI">
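A complete version of the default-namespace form might look like the sketch below. It reuses the patients URI from the lab; the single patient record is illustrative only.

```xml
<?xml version="1.0"?>
<!-- No prefixes: Patients, Patient and Name all belong to the default namespace -->
<Patients xmlns="http://uhosp/patients/ns">
  <Patient>
    <Name>Joe Smith</Name>
  </Patient>
</Patients>
```

A namespace-aware parser sees these elements as having the same names as the pat:-prefixed ones shown earlier, provided the prefix is bound to the same URI.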

Use Namespaces with Attributes

You can qualify element attributes with namespaces as well. No element can contain two attributes with the same name, whether or not the name is qualified. In addition, no element may contain two qualified attribute names that have the same local part and whose prefixes are bound to the same namespace, even if the prefixes themselves are different.

prefix:attribute="value"
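The attribute rules can be illustrated with a short sketch. The second prefix p2 and the pat:Scale attribute are hypothetical, introduced only to show which combinations are allowed.

```xml
<?xml version="1.0"?>
<pat:Patients xmlns:pat="http://uhosp/patients/ns"
              xmlns:p2="http://uhosp/patients/ns">
  <!-- Allowed: Scale has no namespace, pat:Scale is qualified, so the names differ -->
  <pat:Performance Scale="Karnofsky" pat:Scale="Karnofsky">80</pat:Performance>
  <!-- Not allowed: pat and p2 are bound to the same URI, so these two collide:
       <pat:Performance pat:Scale="Karnofsky" p2:Scale="Bell">80</pat:Performance> -->
</pat:Patients>
```

The rejected element is kept inside a comment so that the fragment itself stays well-formed.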

Namespaces and DTDs

A document that uses namespaces can still be validated against a DTD, but DTDs are not namespace-aware. A DTD treats a qualified name such as pat:Patients as one literal name, so every prefixed element and every xmlns declaration must be declared in the DTD exactly as it appears in the document. This prevents namespaces and DTDs from working well together. Since there is no way to associate a specific DTD with a namespace, authors must create a new DTD each time they want to combine two or more documents.
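The sketch below shows the idea on a small scale, using an internal DTD subset instead of a separate file. The element list is deliberately cut down from the lab's DTD and is an assumption made for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE pat:Patients [
  <!-- The DTD sees "pat:Patients" as one literal name; the URI plays no part -->
  <!ELEMENT pat:Patients (pat:Patient+)>
  <!ELEMENT pat:Patient (pat:Name)>
  <!ELEMENT pat:Name (#PCDATA)>
  <!-- The namespace declaration is an attribute, so it must be declared too -->
  <!ATTLIST pat:Patients xmlns:pat CDATA #FIXED "http://uhosp/patients/ns">
]>
<pat:Patients xmlns:pat="http://uhosp/patients/ns">
  <pat:Patient>
    <pat:Name>Joe Smith</pat:Name>
  </pat:Patient>
</pat:Patients>
```

If the prefix in the document body were changed from pat to anything else, the namespace would be the same but the document would no longer match the element names declared in the DTD.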

Work with Namespaces

Download the lab file.

Open UHosptxt.xml, rename it UHosp.xml and edit the code:

<?xml version="1.0" ?>
<!DOCTYPE Report SYSTEM "UHDTD.dtd">
<Report>

<pat:Patients xmlns:pat="http://uhosp/patients/ns">
...
</pat:Patients>

<std:Study xmlns:std="http://uhosp/studies/ns">
<std:Name>
<std:Title>Tamoxifen Breast Cancer Study</std:Title>
<std:Subtitle>Randomized Phase 3 Clinical Trial</std:Subtitle>
</std:Name>
<std:PI>Dr. Diane West</std:PI>
</std:Study>

</Report>

Save the file.

Open UHDTDtxt.dtd, rename it UHDTD.dtd and edit the code:

<!ELEMENT Report (pat:Patients, std:Study)>

<!ELEMENT pat:Patients (Patient+)>
<!ELEMENT Patient (Name, ID, DOB, Age, Stage, Comment*, Performance)>
<!ELEMENT Name (#PCDATA)>
<!ELEMENT ID (#PCDATA)>
<!ELEMENT DOB (#PCDATA)>
<!ELEMENT Age (#PCDATA)>
<!ELEMENT Stage (#PCDATA)>
<!ELEMENT Comment (#PCDATA)>
<!ELEMENT Performance (#PCDATA)>
<!ATTLIST Performance Scale (Karnofsky | Bell) #REQUIRED>

<!ELEMENT std:Study (std:Name, std:PI)>
<!ELEMENT std:Name (std:Title, std:Subtitle)>
<!ELEMENT std:Title (#PCDATA)>
<!ELEMENT std:Subtitle (#PCDATA)>
<!ELEMENT std:PI (#PCDATA)>

<!ATTLIST pat:Patients xmlns:pat CDATA #FIXED "http://uhosp/patients/ns">
<!ATTLIST std:Study xmlns:std CDATA #FIXED "http://uhosp/studies/ns">

Save the file. Open UHosp.xml in the browser.

Schemas

Schemas offer one possible solution to the difficulty of combining namespaces with DTD validation rules. DTDs have other limitations as well. They lack data types, so there is no way to require, for example, that an element contain a numeric value. It is also complicated to enforce structure when using mixed content. Finally, DTDs use a different syntax than XML, so a validator must be able to work with both and an author must be conversant in both. Schemas may help to solve these problems too. A schema is an XML document that defines the content and structure of one or more XML documents.

It is important to understand that this is an area of development: there are currently several schema dialects, and not all of them work with every validator. In any case, XML Schema supports complex and simple element types. A complex type is an element that has one or more attributes or is the parent of one or more child elements. A simple type contains only character data and has no attributes.
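The difference between the two kinds of element can be sketched in a few lines of XML Schema. The element names are borrowed from the lab documents, but this fragment is a standalone illustration with no target namespace, not one of the lab files.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Simple type: character data only, no attributes and no child elements -->
  <xsd:element name="Age" type="xsd:positiveInteger"/>

  <!-- Complex type: Name is the parent of two child elements -->
  <xsd:element name="Name">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Title" type="xsd:string"/>
        <xsd:element name="Subtitle" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```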

XML Schema also supports two categories of data types: built-in and user-derived. Built-in data types are classified as either primitive or derived. There are 19 primitive data types and 25 built-in data types that are derived from them (see Fig. 4-14 and 4-15 in the text).
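A user-derived type is typically built by restricting a built-in one. The sketch below assumes two hypothetical type names, ScaleType and AgeType, and an arbitrary upper age limit of 120; none of these appear in the lab files.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Derived from the primitive type string: only two values are allowed -->
  <xsd:simpleType name="ScaleType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Karnofsky"/>
      <xsd:enumeration value="Bell"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Derived from the built-in derived type positiveInteger, with an upper bound -->
  <xsd:simpleType name="AgeType">
    <xsd:restriction base="xsd:positiveInteger">
      <xsd:maxInclusive value="120"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

The enumeration plays the same role as the (Karnofsky | Bell) attribute list in a DTD, while the numeric bound is something a DTD cannot express at all.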

Create an XML Schema

Start Notepad and create a new XSD file:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Schema for breast cancer patient data -->

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                       xmlns="http://uhosp/patients/ns"
                       targetNamespace="http://uhosp/patients/ns">
   <xsd:element name="Patients">
     <xsd:complexType>
         <xsd:sequence>
            <xsd:element name="Patient" maxOccurs="unbounded">
               <xsd:complexType>
                  <xsd:sequence>
                     <xsd:element name="Name" type="xsd:string"/>
                     <xsd:element name="ID" type="xsd:string"/>
                     <xsd:element name="DOB" type="xsd:date"/>
                     <xsd:element name="Age" type="xsd:positiveInteger"/>
                     <xsd:element name="Stage" type="xsd:string"/>
                     <xsd:element name="Comment" type="xsd:string" minOccurs="0" maxOccurs="unbounded"/>
                     <xsd:element name="Performance">
                        <xsd:complexType>
                           <xsd:simpleContent>
                              <xsd:extension base="xsd:decimal">
                                 <xsd:attribute name="Scale" type="xsd:string" use="required"/>
                              </xsd:extension>
                           </xsd:simpleContent>
                        </xsd:complexType>
                     </xsd:element>
                  </xsd:sequence>
                </xsd:complexType>
             </xsd:element>
         </xsd:sequence>
     </xsd:complexType>
   </xsd:element>
</xsd:schema>

Save the file as PSchema1.xsd.

Attach an instance document to the schema: open Pattxt.xml, rename it Patient.xml and edit the code:

<?xml version="1.0"?>
<pat:Patients xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                       xmlns:pat="http://uhosp/patients/ns"
                       xsi:schemaLocation="http://uhosp/patients/ns PSchema1.xsd">

...

</pat:Patients>

Save and view the file in the browser.