MCCC CIS 177 - Markup Languages

Create an XML Document

Work with XML

XML Structure

An XML document consists of the prolog (optional, but considered good form), the document body and the epilog. The prolog provides information about the document itself and includes the XML declaration, the DTD and miscellaneous statements or comments. When present, the XML declaration must be the first line of the file, with nothing before it:

<?xml version="version number" encoding="encoding type" standalone="yes/no" ?>

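The version number is normally "1.0"; XML 1.1 also exists but is rarely used. The encoding attribute names the character encoding of the file, which allows you to use XML with a variety of languages. If you omit it, the parser assumes a Unicode encoding, normally "UTF-8", which can represent the characters of virtually every language. The standalone attribute tells the parser whether the document depends on declarations in an external file, such as an external DTD. If you omit the standalone attribute and the document does refer to external declarations, the parser assumes no.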
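
As a brief illustration, here is a minimal sketch of a complete document whose declaration sets all three attributes. The <greeting> element is a made-up placeholder and is not part of the exercise files.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!-- The attributes must appear in this order: version, encoding, standalone. -->
<greeting>Hello, XML</greeting>
```

Only the version attribute is required inside the declaration; encoding and standalone may be left out.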

Elements and Attributes

Elements in XML are similar to those in HTML and may be open or closed. There are several rules you must be familiar with regarding XML elements:

  • Names are case sensitive
  • Names must begin with a letter or underscore
  • Names cannot contain blank spaces
  • Names cannot begin with the letters xml, in any mix of upper and lower case (these names are reserved)
  • The name of the closing tag must exactly match the name of the opening tag
  • Elements can be nested as child elements
  • You must close a child element before closing the parent element (much stricter than HTML)

The nesting concept applies to the whole document. All elements are nested within a single document element, also called the root element.
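
The rules above can be sketched in a small document. The <catalog> and <_note> names are hypothetical and chosen only for illustration. The comment after the closing root tag lists tags that break the rules, so they stay out of the live markup.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Every element is nested inside the single root element <catalog>. -->
<catalog>
  <!-- A name may begin with a letter or an underscore. -->
  <_note>Names may begin with an underscore</_note>
  <cd>
    <!-- The child <artist> is closed before its parent <cd>. -->
    <artist>Miles Davis</artist>
  </cd>
  <!-- <CD> and <cd> are different names because names are case sensitive. -->
  <CD>
    <artist>John Coltrane</artist>
  </CD>
</catalog>
<!--
  Each of the following would be rejected by a parser:
    <1track>So What</1track>                name begins with a digit
    <track name>So What</track name>        name contains a space
    <Track>So What</track>                  closing tag does not match
    <cd><artist>Miles Davis</cd></artist>   parent closed before child
-->
```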

An open or empty element is one that contains no content. The syntax of an empty element tag is:

<emptyelement/>
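
As a small sketch, the two forms below are equivalent to a parser. The <bonusdisc> element name is hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<cd>
  <!-- Empty-element tag: -->
  <bonusdisc/>
  <!-- Start tag and end tag with nothing between them: -->
  <bonusdisc></bonusdisc>
</cd>
```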

Attributes describe a characteristic of an element. The syntax for adding an attribute is the same as in HTML, except that the value must always be enclosed in quotation marks. Attribute names are case sensitive, must begin with a letter or underscore, must not include spaces and can appear only once in the same starting tag.

<element attribute="value">Content</element>

<track length="9:22">So What</track>

There is some debate about the use of attributes. Some say you should never use attributes because they add complexity to the document and the information is better placed within an element. Most agree it is best to use attributes only for information that is processed by the parser but does not need to be viewed.
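
To make the two sides of the debate concrete, the sketch below stores the same track length first as an attribute and then as a child element. The <tracks> wrapper and the <name> and <length> elements are hypothetical and do not appear in the exercise file.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<tracks>
  <!-- Length stored as an attribute, as in this tutorial: -->
  <track length="9:22">So What</track>

  <!-- The same information stored in child elements instead: -->
  <track>
    <name>So What</name>
    <length>9:22</length>
  </track>
</tracks>
```

Both versions are well formed; which one to use is a design decision rather than a syntax rule.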

Create an XML Document

Create Prolog and Elements

Start a new Notepad document and add the following code:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!-- This document contains data on Jazz Warehouse special offers -->

<specials>
<title>Monthly Specials at the Jazz Warehouse</title>
     <cd>Kind of Blue
              <artist>Miles Davis</artist>
              <priceus>US: $11.99</priceus>
              <priceuk>UK: &#163;8.39</priceuk>
              <track length="9:22">So What</track>
              <track length="9:46">Freddie Freeloader</track>
              <track length="5:37">Blue in Green</track>
              <track length="11:33">All Blues</track>
              <track length="9:26">Flamenco Sketches</track>
     </cd>
     <cd>Cookin'
              <artist>Miles Davis</artist>
              <priceus>US: $7.99</priceus>
              <priceuk>UK: &#163;5.59</priceuk>
              <track length="5:57">My Funny Valentine</track>
              <track length="9:53">Blues by Five</track>
              <track length="4:22">Airegin</track>
              <track length="13:03">Tune-Up</track>
     </cd>
     <cd>Blue Train
              <artist>John Coltrane</artist>
              <priceus>US: $8.99</priceus>
              <priceuk>UK: &#163;6.29</priceuk>
              <track length="10:39">Blue Train</track>
              <track length="9:06">Moment's Notice</track>
              <track length="7:11">Locomotion</track>
              <track length="7:55">I'm Old Fashioned</track>
              <track length="7:03">Lazy Bird</track>
     </cd>
</specials>

Save the file as jazz.xml and view in the browser.

CDATA

Occasionally, you'll need to include large blocks of text in your document that contain special markup characters such as < and &. It would be very tedious to replace each one with an entity reference such as &lt; or &amp;. In these cases, it's a good idea to use a CDATA section within your document:

<![CDATA[
     Text area
]]>

The CDATA section may contain most markup and parser characters and may be placed anywhere that text occurs. However, CDATA sections cannot be nested within one another, so the text cannot contain the sequence ]]>, which always ends the section.
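
For comparison, the sketch below writes the same text twice, once with entity references and once in a CDATA section. The element names <examples>, <escaped> and <literal> are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<examples>
  <!-- With entity references, every < and & must be escaped. -->
  <escaped>if (a &lt; b &amp;&amp; b &lt; c) { swap(a, c); }</escaped>

  <!-- With a CDATA section, the same text is written literally. -->
  <literal><![CDATA[if (a < b && b < c) { swap(a, c); }]]></literal>
</examples>
```

A parser should report identical text content for both elements.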

Add the CDATA section to jazz.xml as follows, just below the <title> element:

<message>
<![CDATA[
   Here are some of the latest specials from the Jazz Warehouse. Please note that all Miles Davis & John Coltrane CDs will be on sale for the month of March.

]]>
</message>

Save and view in the browser.

Format an XML Document

The easiest way to format an XML document is to link it to a style sheet. You can use CSS (Cascading Style Sheets) or XSL (Extensible Stylesheet Language). XSL is more powerful but also more complex, so you'll use CSS to format jazz.xml. In a style sheet written for XML, the selector is the name of an element, such as artist. The syntax for creating a style in CSS is:

selector {property1:value1; property2:value2; ...}

Use CSS with XML

Download the CSS for use with jazz.xml.

Open and examine jw.css.

Switch to jazz.xml and add the link as follows, immediately below the comment:

<?xml-stylesheet type="text/css" href="jw.css" ?>

Save the file and view in the browser.
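
For reference, after this step the beginning of jazz.xml might look like the sketch below. The href value is written in lowercase here to match the file name jw.css; use whichever spelling matches the file as you saved it. The placeholder comment stands in for the rest of the document body.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!-- This document contains data on Jazz Warehouse special offers -->
<?xml-stylesheet type="text/css" href="jw.css" ?>

<specials>
<title>Monthly Specials at the Jazz Warehouse</title>
<!-- the <message> and <cd> elements follow here, unchanged -->
</specials>
```

The style sheet link belongs in the prolog, so it must appear before the opening <specials> tag.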