
Platinum Edition Using HTML 4, XML, and Java 1.2
(Publisher: Macmillan Computer Publishing)
Author(s): Eric Ladd
ISBN: 078971759x
Publication Date: 11/01/98



Types of XML Markup

Five types of markup exist in XML. Some of these might be familiar from your knowledge of HTML. If you know some SGML, all of them should ring a bell. The great thing is that no single one of them is much harder to learn than HTML; therefore, XML should be much more accessible than SGML.

The five classes of markup in XML are as follows:

  Elements. XML elements describe the meaning of the text they contain. Elements typically occur in pairs with a start tag and an end tag that enclose the text they mark up. Inside the start tag, a keyword indicates the meaning of the markup. The end tag contains the same keyword with a forward slash (/) in front of it. Both kinds of tag start with the less than sign (<) and end with the greater than sign (>).


NOTE:  Although subtle differences exist between what constitutes an element and what constitutes a tag, the words “element” and “tag” are sometimes used interchangeably. Specifically, <ADDRESS> is a tag, but the notion of an <ADDRESS>…</ADDRESS> container captures the idea of an element.
Some elements do not occur in pairs. These elements are said to be empty. Because it is important for parsers to know whether an element is empty, the tag for the element ends with /> rather than >. A line break element, for example, might look like
<BR/>
rather than the <BR> tag you are used to in HTML. The additional forward slash makes it clear to parsers that they should not look for a corresponding end tag.


NOTE:  The XML 1.0 recommendation allows an empty element to be written with an end tag, provided the end tag immediately follows the start tag. Under this provision, you could use
<BR></BR>

rather than

<BR/>

This addition makes XML a bit more like HTML and will help ease the transition from authoring HTML documents to authoring XML documents.


Some elements take attributes that modify or expand on the meaning they impart to the content they contain. Attributes are set equal to values that must be enclosed in quotation marks. You could add an attribute to the previously mentioned <BR/> tag, for example, to make it read
<BR CLEAR="LEFT"/>
This makes it break to the first clear left margin.
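
The element rules above can be checked with a parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module stands in for "the parser"; the helper name is_well_formed is made up for illustration and is not part of any library.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Both spellings of an empty element produce an element with no content.
short_form = ET.fromstring("<BR/>")
long_form = ET.fromstring("<BR></BR>")
same_empty = (
    short_form.tag == long_form.tag
    and len(short_form) == 0 and len(long_form) == 0
    and not short_form.text and not long_form.text
)

# A quoted attribute value is read back as a plain string.
clear_value = ET.fromstring('<BR CLEAR="LEFT"/>').get("CLEAR")

# The HTML habit of leaving <BR> unclosed is rejected, as is an unquoted value.
html_style_ok = is_well_formed("<P>line one<BR>line two</P>")
unquoted_ok = is_well_formed("<BR CLEAR=LEFT/>")

print(same_empty, clear_value, html_style_ok, unquoted_ok)
```

The two rejected inputs are exactly the habits an HTML author is most likely to carry over: an unclosed empty tag and an attribute value without quotation marks.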
  Entities. Entities in XML are very similar to entities in HTML. Recall that in HTML you need entities to represent reserved characters such as < or >. The same idea applies in XML, and you would use the same entities—&lt; and &gt;—to render these characters. XML also enables you to use any Unicode character you want; thus, producing documents in languages other than English is less of a chore. Finally, you can define your own entities right inside your XML code and reference them later on.
  XML entities can also reside externally to the document. You can incorporate a separate XML file by mapping it to an entity name and then referencing the entity in your main file.
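
To make the entity mechanisms concrete, here is a small sketch, again assuming Python's xml.etree.ElementTree as the parser. The NOTE element and the entity name pub are invented for this example. It covers the built-in entities, a numeric reference to a Unicode character, and an entity defined inside the document itself. External entities are left out because this particular parser does not load them by default.

```python
import xml.etree.ElementTree as ET

# The internal DTD subset defines one custom entity, "pub" (a hypothetical name).
doc = """<!DOCTYPE NOTE [
  <!ENTITY pub "Macmillan Computer Publishing">
]>
<NOTE>4 &lt; 5 &amp; 5 &gt; 4; caf&#233;; &pub;</NOTE>"""

# By the time the application sees the text, every reference has been expanded.
note_text = ET.fromstring(doc).text
print(note_text)
```

The application never sees the entity references, only the characters and text they stand for.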
  Comments. Commenting your code is always prudent, and XML supports you in commenting with the <!-- and --> tags for enclosing comments. These are the same tags you use for comments in both HTML and SGML.
  You can place any text you like between the <!-- and --> tags, except for the double hyphen (--). This character sequence is reserved because it forms part of the comment delimiters themselves.
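
The comment rules can be tried out in the same way. This is a sketch under the same assumption (Python's xml.etree.ElementTree as the parser); the NOTE element is invented for the example. Note that this parser discards comments by default, which is one common choice but not the only one a parser can make.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The comment is dropped; only the surrounding text reaches the application.
with_comment = ET.fromstring("<NOTE><!-- reviewed by the editor -->Final text</NOTE>")
visible_text = with_comment.text
child_count = len(with_comment)

# A double hyphen inside the comment body makes the document ill-formed.
double_hyphen_ok = is_well_formed("<NOTE><!-- draft -- do not print -->Final text</NOTE>")

print(visible_text, child_count, double_hyphen_ok)
```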
  Processing instructions. Processing instructions (PIs) enable you to embed information to be passed to an application right in your XML document. All PIs have the following syntax:
<?name data?>
The name, or PI target, should be one that an application will recognize. You can give the target any name you like, but targets beginning with “XML” (in any combination of uppercase and lowercase letters) are reserved for standardization purposes.
The data component of the PI can be anything that the processing application understands. Therefore, it is important for processing applications to act only on PIs whose targets they recognize.
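
The following sketch shows one way an application might act only on the PIs it recognizes. It assumes Python's standard xml.dom.minidom module, which keeps PIs in the parsed tree. The targets printer and spooler, their data, and the handlers table are all hypothetical names chosen for illustration.

```python
from xml.dom import minidom

# Two PIs with made-up targets; this application only knows about "printer".
doc = minidom.parseString(
    "<LETTER><?printer eject-page?><?spooler hold?>Dear reader</LETTER>"
)

# Map each recognized target to the function that handles its data.
handlers = {"printer": lambda data: "printer received: " + data}

results = []
for node in doc.documentElement.childNodes:
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE:
        handler = handlers.get(node.target)
        if handler is not None:          # unrecognized targets are skipped
            results.append(handler(node.data))

print(results)
```

Skipping unknown targets silently lets several applications share one document, with each one reading only the instructions addressed to it.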
  Ignored sections. The XML 1.0 recommendation calls these CDATA sections. Sometimes you need to pass text that contains XML reserved characters. In these cases, you can define a section whose contents the XML parser will not interpret as markup and will instead pass unchanged to a processing application. A good example of this is mathematical code, which is likely to contain greater than (>) or less than (<) signs. A parser would normally treat these characters as parts of a start or end tag, but if you put them into an ignored section, like this:
<![CDATA[
4 < 3 is FALSE.
]]>
the expression with the less than sign passes to the application. All ignored sections start with <![CDATA[ and end with ]]>. You can put any text you want between these containers except for the ]]> combination.


NOTE:  Text that looks like a comment inside an ignored section is not treated as a comment. It gets passed to the processing application as literal text, delimiters included.
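
A short sketch of this behavior, once more assuming Python's xml.etree.ElementTree as the parser; the MATH element is invented for the example.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Inside the section, the less than sign is ordinary text.
math_text = ET.fromstring("<MATH><![CDATA[4 < 3 is FALSE.]]></MATH>").text

# Without the section, the same text is rejected as broken markup.
bare_ok = is_well_formed("<MATH>4 < 3 is FALSE.</MATH>")

# Comment delimiters inside the section are passed through as literal text.
comment_text = ET.fromstring("<MATH><![CDATA[<!-- not a comment -->]]></MATH>").text

print(math_text, bare_ok, comment_text)
```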

With a sense of what the major types of markup are, you could probably get started with some XML yourself. You could mark up a letter, for example, as follows:

<LETTER>
<DATE ALIGN="RIGHT">
September 29, 1998
</DATE>
<INSIDEADDRESS>
Trans Union Corporation<BR/>
Consumer Disclosure Center<BR/>
P.O. Box 390<BR/>
Springfield, PA  19064-0390<BR/>
</INSIDEADDRESS>
<SALUTATION>
Dear Customer Relations Representative:
</SALUTATION>
<BODY>
<P ALIGN="JUSTIFY">
Please send me a copy of my credit file.  Enclosed please find a page of
 personal data and a check for $8.00 for your services.
</P>
<P ALIGN="JUSTIFY">
You may send the report to the address indicated on the personal
data page.
</P>
<P>
Thank you for your assistance.
</P>
</BODY>
<CLOSING>
Very truly yours,
</CLOSING>
<SIGNATURE>
Mary Consumer
</SIGNATURE>
</LETTER>

Because you are probably familiar with the structure of a business letter, the preceding markup makes sense. Note how the elements describe the nature of the text they contain, rather than how it should be presented. In fact, the only reference to presentation you see is the ALIGN attribute in the <DATE> and <P> tags.
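
Because the elements name what the text is, a program can pick out parts of the letter by meaning. The sketch below assumes Python's xml.etree.ElementTree and uses a shortened copy of the letter, with straight quotation marks around the attribute values and without the address block; the variable names are chosen for illustration.

```python
import xml.etree.ElementTree as ET

letter = """<LETTER>
<DATE ALIGN="RIGHT">September 29, 1998</DATE>
<SALUTATION>Dear Customer Relations Representative:</SALUTATION>
<BODY>
<P ALIGN="JUSTIFY">Please send me a copy of my credit file.</P>
<P>Thank you for your assistance.</P>
</BODY>
<SIGNATURE>Mary Consumer</SIGNATURE>
</LETTER>"""

root = ET.fromstring(letter)

# Ask for content by what it is, not by where it appears on the page.
signature = root.findtext("SIGNATURE")
paragraphs = [p.text for p in root.iter("P")]

# The only presentational hints are the ALIGN attributes.
presentational = [(e.tag, e.get("ALIGN")) for e in root.iter() if e.get("ALIGN")]

print(signature, len(paragraphs), presentational)
```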



Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.
All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited. Read EarthWeb's privacy statement.