MCCC CIS 177 - Markup Languages

Create a Valid XML Document

Document Type Definitions and Declarations

XML documents must be well-formed or the parser (for example, the one built into a browser) will report an error. You may optionally specify validation rules, known as a document type definition (DTD). A DTD ensures that all required elements are present; prevents undefined elements from being included; enforces a specific data structure; specifies element attributes and permissible values; defines default values; and describes how the parser should access non-XML or non-text content.

You can have only one DTD per XML document. To create a DTD, enter a document type declaration in the XML document after the XML declaration, if there is one, and before the root element. Although each document has a single DTD, you may divide that DTD into internal and external subsets. The syntax for an internal subset is as follows:

<!DOCTYPE root
[
   declarations
]>

Root is the name of the document's root element and declarations are the statements that comprise the DTD. The root value must match the name of the root element or a validating parser will report an error. For external subsets, the syntax is as follows:

<!DOCTYPE root SYSTEM "URL">

OR

<!DOCTYPE root PUBLIC "identifier" "URL">

Root is the name of the document's root element, identifier is a text string that an application can use to locate the external subset, and URL is the location and filename of the external subset. The PUBLIC form is used when the DTD is known by a public identifier that the application resolves on its own, for example a standard DTD or one cataloged on an internal system, or when the XML document is part of an older SGML application. Unless your application requires a public identifier, you should use the SYSTEM form for locating the external subset. To include both an internal and external subset, the syntax is as follows:

<!DOCTYPE root SYSTEM "URL"
[
   declarations
]>

OR

<!DOCTYPE root PUBLIC "identifier" "URL"
[
   declarations
]>

The main advantage of using an internal subset is that it is easier to compare the DTD to the document content. However, by using an external subset, you can apply the same DTD to many XML documents, which can then be shared among multiple authors. If the two subsets conflict, the internal subset takes precedence: the parser reads it first, and for attribute and entity declarations the first declaration it encounters is the one that applies.
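
As a rough sketch of the combined form, the document below points to an external subset and adds one declaration in an internal subset. The file name note.dtd, the element names, and the content are all made up for illustration; the first comment shows what the external file is assumed to contain.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Note SYSTEM "note.dtd"
[
   <!-- Assumed contents of the hypothetical external file note.dtd:
           <!ELEMENT To (#PCDATA)>
           <!ELEMENT Body (#PCDATA)>
   -->
   <!-- Internal subset: declares the root element for this document -->
   <!ELEMENT Note (To, Body)>
]>
<Note>
   <To>Lab group</To>
   <Body>Bring the DTD handout.</Body>
</Note>
```

Keeping the shared declarations in the external file lets other documents reuse them, while the internal subset holds only what is specific to this one document.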

 

Declaring Document Elements

For a document to be valid, every element it uses must be declared in the DTD. An element type declaration specifies the name of the element and indicates what type of content it can contain. You can even specify the order of elements within the document. The syntax for an element declaration is as follows:

<!ELEMENT element content-model>

Element is the name of the element and is case sensitive. You cannot use spaces or reserved symbols in an element name. The content-model specifies what type of content the element can contain. The five types of content are as follows:

Type: Any elements. No restrictions.
Syntax: <!ELEMENT element ANY>
Example: <!ELEMENT Product ANY>

Type: Empty elements. The element cannot store any content.
Syntax: <!ELEMENT element EMPTY>
Example: <!ELEMENT img EMPTY>

Type: Character data. The element can contain only text strings.
Syntax: <!ELEMENT element (#PCDATA)>
Example: <!ELEMENT Name (#PCDATA)>

Type: Elements. The element can contain only child elements.
Syntax: <!ELEMENT element (child_elements)>
Example: <!ELEMENT Customer (Phone)>
Note: If multiple child elements are separated by commas, all child elements must appear in the listed sequence, each exactly once. If multiple child elements are separated by vertical bars, the content must include exactly one of the choices. See Modifying Symbols.

Type: Mixed. The element can contain both a text string and child elements.
Syntax: <!ELEMENT element (#PCDATA | child1 | child2 | ...)*>
Example: Varies. Using mixed data limits your control over the document structure.
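
The sequence, choice, and mixed content models can be seen side by side in a short sketch like the one below. Contact, Fax, Note, and Emph are hypothetical element names introduced only for this illustration, and the values are invented.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Customer
[
   <!-- Sequence: Name, then Contact, then Note, each exactly once -->
   <!ELEMENT Customer (Name, Contact, Note)>
   <!ELEMENT Name (#PCDATA)>
   <!-- Choice: exactly one of Phone or Fax -->
   <!ELEMENT Contact (Phone | Fax)>
   <!ELEMENT Phone (#PCDATA)>
   <!ELEMENT Fax (#PCDATA)>
   <!-- Mixed: text interleaved with any number of Emph children -->
   <!ELEMENT Note (#PCDATA | Emph)*>
   <!ELEMENT Emph (#PCDATA)>
]>
<Customer>
   <Name>Sample Customer</Name>
   <Contact><Phone>555-0100</Phone></Contact>
   <Note>Prefers delivery <Emph>before noon</Emph>.</Note>
</Customer>
```

Swapping the order of Name and Contact, or placing both Phone and Fax inside Contact, would make this document invalid, while Note accepts its text and Emph children in any order.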

Modifying Symbols

You may use modifying symbols to indicate the number of occurrences of each element. Modifying symbols are as follows:

? Allow zero or one of the item
+ Allow one or more of the item
* Allow zero or more of the item

Examples:

<!ELEMENT Customer (Name+)>
<!ELEMENT Customer (Name, Address, Phone, E-mail?, Order+)>
<!ELEMENT Order (Order_date, Items)+>
<!ELEMENT Customer (Name | Company)+>
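
To see how the symbols play out in a document, the sketch below combines two of the declarations above with a matching instance. Declaring Items as character data is an assumption made to keep the example short, and all values are invented.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Customer
[
   <!ELEMENT Customer (Name, Address, Phone, E-mail?, Order+)>
   <!ELEMENT Order (Order_date, Items)+>
   <!ELEMENT Name (#PCDATA)>
   <!ELEMENT Address (#PCDATA)>
   <!ELEMENT Phone (#PCDATA)>
   <!ELEMENT E-mail (#PCDATA)>
   <!ELEMENT Order_date (#PCDATA)>
   <!ELEMENT Items (#PCDATA)>
]>
<Customer>
   <Name>Sample Customer</Name>
   <Address>1 Example Street</Address>
   <Phone>555-0100</Phone>
   <!-- E-mail is omitted, which ? permits -->
   <Order>
      <Order_date>2024-01-15</Order_date>
      <Items>Sample item A</Items>
   </Order>
   <!-- A second Order satisfies +; inside it the (Order_date, Items) pair repeats -->
   <Order>
      <Order_date>2024-02-01</Order_date>
      <Items>Sample item B</Items>
      <Order_date>2024-02-03</Order_date>
      <Items>Sample item C</Items>
   </Order>
</Customer>
```

Removing every Order element would break the + requirement, and adding a second E-mail element would break the ? limit.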


Create DTD and Declarations

Download lab file.

Open Ordertxt.xml and examine the code.

Add the following code immediately below the XML declaration:

<!-- document type declaration follows -->

<!DOCTYPE Customers
[
   <!ELEMENT Customers (Customer+)>
   <!ELEMENT Customer (Name, Address, Phone, E-mail?, Orders)>
   <!ELEMENT Name (#PCDATA)>
   <!ELEMENT Address (#PCDATA)>
   <!ELEMENT Phone (#PCDATA)>
   <!ELEMENT E-mail (#PCDATA)>
   <!ELEMENT Orders (Order+)>
   <!ELEMENT Order (Order_date, Items)>
   <!ELEMENT Items (Item+)>
   <!ELEMENT Order_date (#PCDATA)>
   <!ELEMENT Item (#PCDATA)>
]>

<Customers>....

Save the file as Order.xml. Note that the document is not yet valid because the attributes used in the document have not yet been declared in the DTD.
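
To picture what these element declarations require, here is a small hypothetical document that conforms to them. The values are invented and are not the contents of the lab file. The sketch carries no attributes, so the element declarations alone are enough for it, which would not be the case for a document that uses attributes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Customers
[
   <!ELEMENT Customers (Customer+)>
   <!ELEMENT Customer (Name, Address, Phone, E-mail?, Orders)>
   <!ELEMENT Name (#PCDATA)>
   <!ELEMENT Address (#PCDATA)>
   <!ELEMENT Phone (#PCDATA)>
   <!ELEMENT E-mail (#PCDATA)>
   <!ELEMENT Orders (Order+)>
   <!ELEMENT Order (Order_date, Items)>
   <!ELEMENT Items (Item+)>
   <!ELEMENT Order_date (#PCDATA)>
   <!ELEMENT Item (#PCDATA)>
]>
<Customers>
   <Customer>
      <Name>Sample Customer</Name>
      <Address>1 Example Street</Address>
      <Phone>555-0100</Phone>
      <!-- E-mail is optional and left out here -->
      <Orders>
         <Order>
            <Order_date>2024-01-15</Order_date>
            <Items>
               <Item>Sample item</Item>
            </Items>
         </Order>
      </Orders>
   </Customer>
</Customers>
```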