HTML 4.0 Sourcebook
The Document Object Model

The Microsoft and Netscape approaches are similar in goal, but very different in implementation. However, both approaches depend on a model of the browser and the displayed page as a collection of objects that can be manipulated and processed. This model has come to be known as the Document Object Model, or DOM. Unfortunately, Netscape and Microsoft employ slightly different models, so it is becoming practically impossible to write scripting software for one browser that will also work on the other.

Fortunately, the World Wide Web Consortium has formed a DOM working group, which is working towards defining an open DOM to be used by all browser vendors. Both Microsoft and Netscape are active participants in the DOM working group, and with much hard work and some luck, it is likely that this group will define an open DOM standard sometime in mid-1998. If this standard is adopted by both Netscape and Microsoft, we can look forward to a much more exciting future of compatible scripting and dynamic HTML content in all browsers.

Internationalization of HTML

Internationalization refers to the process of modifying software and software systems to support the world's languages and to operate with an interface customized for any of these languages. For HTML, this means the ability to use character sets other than ISO Latin-1, as well as the ability to support truly multilingual documents, that is, documents containing more than one language.
HTML 4 incorporated several major changes in support of internationalization. First, the document character set for HTML was changed from ISO Latin-1 to another set that supports more of the characters used by the world's languages. The required characters or symbols (glyphs) number in the tens of thousands, a far cry from the 200-odd characters possible with ISO Latin-1. The character set is known as Unicode. When Unicode is the selected character set, every character reference refers to a Unicode character; thus the reference &#948; refers to the character at position 948 (decimal) in the Unicode character set, which is the Greek lowercase letter δ. Second, new elements and attributes were added to specify the language used within a particular element or the direction in which the characters should be drawn on the display.

Internationalized Character Sets

To support truly international applications, HTML must support a character set that in turn supports all the characters and symbols of all the world's languages. This is a tall order! Unfortunately, the specified character set of HTML 3.2 was ISO Latin-1, an 8-bit character set that supports only some 200-odd characters common in Western European languages. This is, to say the least, insufficient: ISO Latin-1 is useless for non-European languages, while any standard 8-bit character set is clearly insufficient for languages such as Chinese or Japanese, where the required repertoire of characters far exceeds the 256-character limit of an 8-bit character set.
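
The relationship between a numeric character reference and its Unicode position, and the limits of an 8-bit set such as ISO Latin-1, can be checked with a few lines of Python. This is only an illustrative sketch and is not taken from the book: it assumes Python 3 and its standard html module, and the helper name fits_in_latin1 and the sample strings are hypothetical choices made for the example.

```python
import html

# "&#948;" is a numeric character reference: the decimal number 948
# identifies a position in the Unicode character set.
reference = "&#948;"
character = html.unescape(reference)  # resolve the reference to a character

print(character)       # the Greek lowercase letter delta
print(ord(character))  # its Unicode position as a decimal integer


def fits_in_latin1(text: str) -> bool:
    """Return True if every character of text exists in ISO Latin-1.

    Hypothetical helper for this example: it simply tries to encode the
    text with Python's built-in "latin-1" codec.
    """
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Sample strings chosen for illustration only: a Western European word,
# the Greek letter from the reference above, and a Japanese word.
for sample in ["café", character, "日本語"]:
    print(repr(sample), fits_in_latin1(sample))
```

Only the Western European sample fits in ISO Latin-1. The Greek and Japanese samples need a larger repertoire, which is the motivation for moving the document character set to Unicode.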
Copyright © 1996-2000 EarthWeb Inc. All rights reserved.