Platinum Edition Using HTML 4, XML, and Java 1.2
(Publisher: Macmillan Computer Publishing)
Author(s): Eric Ladd
ISBN: 078971759x
Publication Date: 11/01/98

CAUTION:  

No official method exists yet for resolving public identifiers in XML. An SGML method has been used for years, however, and because most of the major XML tools come from SGML developers, the same method has quietly been implemented in their XML tools without any real discussion about whether it was needed.


SGML uses a public identifier resolution mechanism commonly known as the SGML Open Catalog (SOC), which relies on a catalog file located in the same directory as the document (the application is free to change this location, of course). This file is usually called catalog.soc or, more frequently, just catalog.

There is little point in going into all the technical details because the catalog file is really an SGML facility. As far as we are concerned, the catalog file is basically an ASCII file consisting of lines that couple a public identifier (officially a "Formal Public Identifier" (FPI)) with a system object identifier. A system object identifier is usually a filename, but it could also be some other kind of identifier that the system is able to convert into something meaningful. A typical catalog file looks like the following:

-- catalog: SGML Open style entity catalog for HTML --
-- $Id: catalog,v 1.3 1995/09/21 23:30:23 connolly Exp $ --
-- Hacked by jjc --
-- Ways to refer to Level 2: most general to most specific --
PUBLIC "-//IETF//DTD HTML//EN"                      "html.dtd"
PUBLIC "-//IETF//DTD HTML 2.0//EN"                  "html.dtd"
PUBLIC "-//IETF//DTD HTML Level 2//EN"              "html.dtd"
PUBLIC "-//IETF//DTD HTML 2.0 Level 2//EN"          "html.dtd"

-- Ways to refer to Level 1: most general to most specific --
PUBLIC "-//IETF//DTD HTML Level 1//EN"              "html-1.dtd"
PUBLIC "-//IETF//DTD HTML 2.0 Level 1//EN"          "html-1.dtd"

-- Ways to refer to Strict Level 2: most general to most specific --
PUBLIC "-//IETF//DTD HTML Strict//EN"               "html-s.dtd"
PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN"           "html-s.dtd"
PUBLIC "-//IETF//DTD HTML Strict Level 2//EN"       "html-s.dtd"
PUBLIC "-//IETF//DTD HTML 2.0 Strict Level 2//EN"   "html-s.dtd"

-- Ways to refer to Strict Level 1: most general to most specific --
PUBLIC "-//IETF//DTD HTML Strict Level 1//EN"       "html-1s.dtd"
PUBLIC "-//IETF//DTD HTML 2.0 Strict Level 1//EN"   "html-1s.dtd"

-- ISO latin 1 entity set for HTML --
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN//HTML"  "ISOlat1.sgm"

Note that this example is an SGML entity catalog file modified for use with XML; in XML the filename has to be enclosed in quotes, but in SGML it does not.

Nothing prevents you from creating this file by hand using a text editor, but a few free catalog management packages (also called entity management packages) are available on the Internet. Some software packages have their own built-in facility, often called an entity manager, for resolving entities.
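To make the lookup concrete, here is a minimal Python sketch of how an application might read such a catalog and resolve a public identifier. It handles only PUBLIC entries and -- ... -- comments; the function names (parse_catalog, resolve), the first-entry-wins rule, and the whitespace normalization are assumptions made for illustration, not part of any catalog specification or existing tool.

```python
import re

# One token is a "-- comment --", a double-quoted string, or a bare word.
TOKEN = re.compile(r'--.*?--|"[^"]*"|\S+', re.DOTALL)


def _unquote(token):
    """Strip one pair of surrounding double quotes, if present."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    return token


def _normalize(public_id):
    """Collapse runs of whitespace so lookups ignore spacing differences."""
    return " ".join(public_id.split())


def parse_catalog(text):
    """Return a dict mapping public identifiers to system identifiers."""
    # Drop comment tokens, keep everything else in order.
    tokens = iter(t for t in TOKEN.findall(text) if not t.startswith("--"))
    mapping = {}
    for token in tokens:
        if token.upper() != "PUBLIC":
            continue  # this sketch ignores every other catalog keyword
        try:
            public_id = _unquote(next(tokens))
            system_id = _unquote(next(tokens))
        except StopIteration:
            break  # truncated entry at the end of the file
        # Assumption: the first entry for an identifier wins.
        mapping.setdefault(_normalize(public_id), system_id)
    return mapping


def resolve(mapping, public_id, default=None):
    """Look up a public identifier; return default when it is not listed."""
    return mapping.get(_normalize(public_id), default)


SAMPLE = '''
-- Ways to refer to Level 2 --
PUBLIC "-//IETF//DTD HTML//EN"          "html.dtd"
PUBLIC "-//IETF//DTD HTML 2.0//EN"      "html.dtd"
-- ISO latin 1 entity set for HTML --
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN//HTML"  ISOlat1.sgm
'''

catalog = parse_catalog(SAMPLE)
print(resolve(catalog, "-//IETF//DTD HTML 2.0//EN"))
```

The sketch accepts both quoted and unquoted system identifiers, so it reads SGML-style and XML-style entries alike. A real entity manager would also have to turn the system identifier into an actual storage location.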

Parameter Entities

Parameter entity references may only appear in a DTD. To keep them distinct from general entities (and to prevent them from being used in a document), parameter entities are declared and referenced with a percent sign (%):

<!ENTITY % sections "front | body | back">

Parameter entities are extremely useful as shortcuts for parts of declarations that occur often in a DTD. They are not, however, allowed to contain markup (complete declarations); they can only contain parts of declarations:

<!ENTITY % common "(para | body | text)">
<!ELEMENT chapter ((%common;)*, section+)>
<!ELEMENT section (%common;)>

When a parameter entity reference is resolved, one leading and one trailing space character are added to the replacement text to make sure that it contains an integral number of grammatical tokens.
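The padding rule is easy to see in a small text-substitution sketch. The Python below is not an XML parser: it collects parameter entity declarations with a regular expression and replaces each %name; reference with the entity's value wrapped in one leading and one trailing space. The function name expand_parameter_entities and the entity name common are illustrative assumptions, and the sketch ignores declaration order, nested references, and external entities.

```python
import re

# <!ENTITY % name "value">  (double-quoted values only in this sketch)
PE_DECL = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>')
# %name;
PE_REF = re.compile(r'%([\w.-]+);')


def expand_parameter_entities(dtd_text):
    """Return dtd_text with parameter entity references expanded."""
    entities = {}
    for name, value in PE_DECL.findall(dtd_text):
        # If an entity is declared more than once, the first one is kept.
        entities.setdefault(name, value)

    # Remove the declarations themselves, then expand what is left.
    body = PE_DECL.sub("", dtd_text)

    def replace(match):
        name = match.group(1)
        if name not in entities:
            raise KeyError("undeclared parameter entity: " + name)
        # One leading and one trailing space around the replacement text.
        return " " + entities[name] + " "

    return PE_REF.sub(replace, body).strip()


DTD = '''<!ENTITY % common "(para | body | text)">
<!ELEMENT chapter ((%common;)*, section+)>
<!ELEMENT section (%common;)>'''

expanded = expand_parameter_entities(DTD)
print(expanded)
```

Because of the added spaces, the replacement text can never fuse with a neighboring name or keyword, which is the point of the rule.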

Entity Resolution

The rules governing entity resolution (when entity references are interpreted and when they are ignored) can be quite complicated.

Table 15.1 shows what happens to entity references and character references. The leftmost column describes where the entity reference appears:

Table 15.1 Entity Resolution

                                                 Entity Type
Where Referenced         Parameter               Internal General  External Parsed General  Character Reference

Inside an element        ignored                 replaced          replaced if validating   replaced
In an attribute value    ignored                 replaced          not allowed              replaced
Name in attribute value  ignored                 not allowed       tell application         ignored
In an entity value       replaced*               ignored           not allowed              replaced
In the DTD               replaced if validating  not allowed       not allowed              not allowed

*When an entity reference appears in an attribute value or a parameter entity reference appears in an entity value, single and double quotes in the replacement text are ignored so that the value isn't prematurely terminated.
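Two cells of the table can be checked directly with a parser. The following Python sketch uses the standard library's xml.etree.ElementTree module, which is built on the non-validating Expat parser, to show an internal general entity and a character reference being replaced both inside an element and in an attribute value. The document, the element name note, and the entity name pub are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# An internal general entity declared in the internal DTD subset,
# referenced in an attribute value and in element content.
DOC = """<!DOCTYPE note [
  <!ENTITY pub "Macmillan">
]>
<note publisher="&pub;">&pub; &#65;</note>"""

root = ET.fromstring(DOC)

# Inside an element: the entity and the character reference are replaced.
print(root.text)

# In an attribute value: the internal general entity is replaced.
print(root.get("publisher"))
```

The parser hands the application only the replaced text; the original references are no longer visible in the resulting tree.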

