HTML 4.0 Sourcebook
Introduction and Book Outline
In 1995, in the introduction to the first edition of this book, I stated that the World Wide Web had taken the Internet by storm. At the time I hoped this was appropriate, but felt in my heart that I was probably overstating things. In retrospect, of course, it was a gross understatement. Over the past three years, and three editions of this book, the Web has grown far beyond everyone's expectations (well, perhaps not those of Marc Andreessen), to become one of the core technologies of the 1990s. Hundreds of thousands of companies now offer products and services via the Web, and the trickle of Web-related products available in late 1994 has grown, by late 1997, into a torrent. Purely Web-based companies such as Netscape Communications and Yahoo! are now market-valued in the billions of dollars, and these companies did not exist three years ago! Meanwhile, traditional software companies such as Oracle, Sun, and Microsoft have totally redesigned their product and business models, while noncomputer-related businesses, from news and entertainment to financial services, high technology, and manufacturing, are adopting these new technologies as a new paradigm for internal operations, and as a new way of communicating with clients and customers.
This is not the simple swell of a storm, but a tidal wave that threatens, and promises, to change our society in ways that we cannot yet imagine. This is because the World Wide Web model makes distributing and accessing any form of digital data easy and inexpensive for anyone, company or consumer, with profound implications for business, culture, and society. Thus, it is no surprise that seemingly everyone is now buying or downloading the latest in Web tools and is madly learning how to build pages so that they too can join this new electronic world. Indeed, this is probably why you have picked up this book: to learn about the tools, and how to build Web pages!
The Web Model
A tool may be easy to use, but usually requires skill and training to be used well. This is certainly true of the tools involved in preparing and distributing information via hypertext documents and Internet Web servers. Just as designing a book or magazine requires experience and knowledge in the tools of design and typography, preparing well-designed, useful, and reliable Web resources requires an in-depth understanding of how the tools that deliver these resources work, and how to use them well. The intention of this book, as with the first three editions, is to help you develop this understanding. Given a basic feeling for what the Internet is (simply a system, rather like a courier service, for communicating digital information from one place to another), there are four essential concepts that you need to understand: Uniform Resource Locators, the HyperText Transfer Protocol, server-side resource processing, and the HyperText Markup Language.
The goal of this book and its companion Web site (www.wiley.com/compbooks/graham) is to explain these main concepts, and give you the tools you need to develop your own high-quality World Wide Web products. The remainder of this introduction looks briefly at these components and explains their basic features, and outlines the organization of the book. A figurative summary of these different components and the relationships between them is found in Figure I.1.
Uniform Resource Locators
Uniform Resource Locators, or URLs, are a naming scheme for specifying how and where to find any Internet server resource, such as those available from HTTP, FTP, or WAIS servers. For example, the URL that references the important file bunny_hop.zip in the directory /pub/web/browsers on the FTP server ftp.banzai.net is simply:
ftp://ftp.banzai.net/pub/web/browsers/bunny_hop.zip
World Wide Web hypertext documents use URLs to reference other hypertext resources.
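To make the "how and where" of a URL concrete, here is a minimal Python sketch that takes apart the example URL above using the standard library's urllib.parse module. The variable names are chosen for illustration only; the sketch shows one way to read the pieces of a URL, not how any particular browser does it.

```python
from urllib.parse import urlsplit

# The example URL from the text above.
url = "ftp://ftp.banzai.net/pub/web/browsers/bunny_hop.zip"

parts = urlsplit(url)

scheme = parts.scheme      # "how": the protocol used to fetch the resource
server = parts.hostname    # "where": the server that holds the resource
path = parts.path          # "where": the location of the resource on that server

# Split the path into the directory and the file name.
directory, _, filename = path.rpartition("/")

print("scheme:   ", scheme)
print("server:   ", server)
print("directory:", directory)
print("file:     ", filename)
```

The scheme is the part a client reads first, since it decides which protocol to use when contacting the server.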
The HyperText Transfer Protocol
The HyperText Transfer Protocol, or HTTP, is an Internet communications protocol designed expressly for the rapid distribution of hypertext documents. Like other Internet tools such as FTP, WAIS, and Gopher, HTTP is a client-server protocol. In the client-server model a client program, running on the user's machine, sends a message requesting service to a server program running on another machine on the Internet. The server responds to the request by sending a message back to the client. In exchanging these messages, the client and server use a well-understood protocol. FTP, WAIS, and Gopher are other examples of Internet client-server protocols, all of which are accessible to a World Wide Web browser. However, the HTTP protocol was designed expressly for hypertext document delivery. Today, almost all Web services are delivered via HTTP servers.
Server-Side Resource Processing
At the simplest level, HTTP servers simply serve up files when clients request them. However, HTTP servers support additional important features:
The special server-side utilities that implement these features are often called gateway programs, as they usually act as a gateway between the HTTP server and other resources accessible to the Web server, such as databases. Just as a server can access many files, an HTTP server can access many different gateway programs; in both cases you specify which resource (file or program) you want through a URL. The interaction between the server and these gateway programs is governed by the Common Gateway Interface (CGI) specifications. Using the CGI specifications, a programmer can easily write simple programs or scripts to process user queries, interrogate databases, make images that respond to mouse clicks, and so on. Many servers also let you program gateway-like functionality directly into the server, for increased speed and performance.
The HyperText Markup Language
The HyperText Markup Language, or HTML, is the language used to prepare Web hypertext documents. These are the documents you distribute on the World Wide Web and are what your human clients actually see. HTML contains commands, called elements or tags, to mark text as headings, paragraphs, lists, quotations, and so on. It also has tags for including images within the documents, for including fill-in forms that accept user input, and, most importantly, for including hypertext links connecting the document being read to other documents or Internet resources such as WAIS databases or anonymous FTP sites. It is this last feature that allows the user to click on a string of highlighted text and access a new document, an image, or a movie file from a computer thousands of miles away. And how does the HTML document specify where this other document is? Through a URL, which is included in the HTML markup instructions and which is used by the user's browser to find the designated resource. What resources can URLs point to?
They can be other HTML documents, pictures, sound files, movie files, or even database search engines. They can be downloadable programs in Java or other languages. They can be located on the user's computer or anywhere on the Internet. They can be accessed from HTTP servers or from FTP, Gopher, WAIS, or other servers. The URL is an immensely flexible scheme, and in combination with HTML, yields an incredibly powerful package for preparing a web of hypertext documents linked to each other and to Internet resources around the world. This image of interlinked resources is in fact the vision that gave rise to the name, World Wide Web.
Overview of the Book
This book is an introduction to HTML, URLs, HTTP, and the CGI interface and to the design and preparation of resources for delivery via the World Wide Web. It begins with the HTML language. Almost every resource you prepare will be presented through an HTML document, so that your HTML presentation is your face to the world. It is crucial that you know how to write accurate HTML, and that you understand the design issues involved in creating attractive, useful documents, if you are to make a lasting impression on your audience and present your information clearly and concisely. It won't matter if your Internet resources are the best in the world if your presentation of them is badly designed, frustratingly slow to access, or difficult to use.
HTML is also an obvious place to start. You can write simple HTML documents and view them with a Web browser such as Internet Explorer, Netscape, Mosaic, or lynx without having to worry about CGI programs, HTTP servers, or other advanced features. You can also easily include, in your documents, URLs pointing to server resources around the world, and get used to how the system works: Browsers understand HTML hypertext anchors and the URLs they contain and have built-in software to talk to Internet servers using the proper protocols.
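As a rough illustration of that last point, the Python sketch below does the first half of what a browser does with a page: it finds the hypertext anchors, pulls out the URL in each HREF attribute, and reads the scheme that names the protocol to use. The sample page and the class name AnchorCollector are invented for this example, and a real browser does a great deal more; this only shows the idea, using the standard library's html.parser and urllib.parse modules.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class AnchorCollector(HTMLParser):
    """Collect the HREF value of every A (anchor) element in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase,
        # so uppercase markup such as <A HREF=...> is handled too.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


# A hypothetical page containing two hypertext anchors.
page = """
<HTML><BODY>
<H1>Example Links</H1>
<P>Download <A HREF="ftp://ftp.banzai.net/pub/web/browsers/bunny_hop.zip">the archive</A>
or visit the <A HREF="http://www.wiley.com/compbooks/graham/">companion site</A>.</P>
</BODY></HTML>
"""

collector = AnchorCollector()
collector.feed(page)
hrefs = collector.hrefs

# The scheme of each URL names the protocol needed to fetch the resource.
schemes = [urlsplit(href).scheme for href in hrefs]

for scheme, href in zip(schemes, hrefs):
    print(scheme, href)
```

The same document can therefore mix links to FTP and HTTP servers; the anchor markup is identical and only the URL changes.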
You can accomplish a lot just by creating a few pages of HTML.
Chapter 1 is an elementary introduction to HTML and to the design issues involved in preparing HTML documents. This nontechnical chapter combines a brief overview of HTML with a discussion of some aspects of document design. The details of the HTML language and more sophisticated client-server issues are left to later chapters.
Design issues are very important in developing good World Wide Web presentations. HTML documents are not like text documents, nor are they like traditional hypertext presentations, since they are limited by the varied capabilities of browsers and by the speed with which documents can be transported across the Internet. Chapter 2 discusses what this means in practice and gives guidelines for avoiding major HTML authoring mistakes. In most cases this is done using examples, with the important issues being presented in point form, so that you can easily extract the main points on first reading.
The issue of images and graphics also comes up often in Web page design: Images are an important addition to any Web page, either as simple images or as clickable imagemaps. However, they must be carefully processed to make them Web-friendly: The image files must be small, in the right format, and of the right style for display by computer. These and other image-related issues are discussed in Chapter 3.
At the same time, designing an HTML document collection is more than just writing pages: the design of a collection is critically important, and involves design issues that are not always apparent from the point of view of a single page. Chapter 4 looks in detail at the issues surrounding document collection design, and will help you through the process of designing a real document web.
Chapter 5 takes a more practical look at Web design issues, and describes how to go about planning and implementing a site (determining why you are building a site, defining your audience, planning the site layout, etc.), cost analysis (how to estimate the costs of different site components), and maintenance (how to maintain the site, and how to estimate the costs of this process). This chapter helps to connect the theory of Chapters 1 through 4 with the practical realities of designing, building, and maintaining a Web collection.
One point that is emphasized throughout the book is the importance of using correct HTML markup constructions when you create your HTML documents. Although HTML is a relatively straightforward language, there are many important rules specifying where tags can be placed. Ensuring that your documents obey these rules is the only way you can guarantee that they will be properly displayed on the many different browsers your site visitors may use. All too often, writers prepare documents that look wonderful on one browser but end up looking horrible, or even unviewable, on others.
Although some general rules for constructing valid HTML are included in Chapters 1 and 5, Chapters 6 and 7 and the references therein should be used as detailed guides to correct HTML. In particular, Chapter 6 presents a detailed exposition of the current definitive version of HTML, known as HTML 4, and of the allowed nesting of the different HTML markup instructions. Chapter 7 continues along this line, but looks at more advanced features, such as framed documents, advanced HTML forms and tables, proprietary HTML extensions by browser vendors, document scripting (JavaScript), cascading style sheets, font embedding, and experimental HTML features that are not yet formally part of the standard HTML language, or that are not yet widely supported.
You can use Chapter 6 as a guide for writing universally viewable HTML documents, and Chapter 7 as a guide to advanced features, and as a preview of coming attractions.
Of course HTML is only a beginning. To truly take advantage of the Web you need to understand the interaction between browsers and HTTP servers, and be able to write server-side gateway programs that take advantage of this interaction. These topics are covered in Chapters 8 through 11. Chapter 8 describes the URL syntax in detail, while Chapter 9 delves into the specifics of the HTTP protocol used to communicate with HTTP servers, and discusses the basics of HTTP server operation. Chapter 10 then describes the details of the Common Gateway Interface (CGI) specification for writing server-side programs that interface with an HTTP server. Chapter 11 gives several concrete and clearly explained examples of real-world CGI programs, to show how the issues from Chapters 8 through 10 affect gateway program design. This chapter also contains a detailed reference list of resources useful in developing CGI or other server-based applications; many of these resources are available right over the Web, just waiting for you to go and get them.
Chapters 6 through 11 are the technical core of this book, and will be useful reference material when you are writing HTML documents, JavaScript scripts, or CGI programs.
Book Notation
In this book, HTML element names are generally given in boldface capital letters, for example, DIV. Similarly, the names of URL schemes are given in a boldface lowercase type, as in the phrase http URLs. A monospace font is generally used for explicit examples of HTML or other code, as in <DIV CLASS=foo> to denote a specific DIV element tag. Also, JavaScript and Cascading Style Sheet code, as well as system environment variable names, are given in a Courier font. Program, directory, and file names are often given in italics, to make them stand out from the text and to reduce confusion.
However, this is not always the case, and in many situations the names are given in a regular, non-italicized font, to make the text easier to read.
URL references are written using the standard text font. However, to make the text shorter and easier to read, the http:// portion has been omitted from all http URLs. Most browsers (in particular, Netscape Navigator and Internet Explorer) assume that strings typed into the Location (Netscape) or Address (Microsoft) windows are http URLs, if no other protocol is specified. With some other browsers you will need to explicitly add the http:// portion. And, of course, you always need to add the http:// when you use a URL as an HREF value in an HTML document!
The Companion Web Site
www.wiley.com/compbooks/graham/
For those of you familiar with the previous editions, this fourth edition has been both significantly expanded and brought up-to-date. Indeed, there was so much new material that not everything made it into print! Instead, the companion Web site, available at either of the URLs listed at the beginning of this section, has been used as an adjunct to the book, containing the more time-sensitive material (such as lists of software resources and descriptions, or lists of defined MIME types), plus additional content that simply didn't fit, or that simply worked better on the Web. For more information on what the site contains, go to the About the Web Site section at the end of the book.
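The Book Notation convention of dropping http:// means that a URL copied from these pages has to be completed before it can be used as an HREF value. A minimal Python sketch of that step follows. The function name as_href is hypothetical, and the rule it applies (treat any string without "://" as an http URL) is a simplifying assumption modeled on the browser behavior described above, not a full URL parser.

```python
def as_href(location: str) -> str:
    """Return `location` with http:// prepended when no scheme is given.

    Assumption: a string that already contains "://" names its own
    protocol (ftp://, http://, ...) and is returned unchanged.
    """
    if "://" in location:
        return location
    return "http://" + location


# A URL as printed in this book, and one with an explicit scheme.
print(as_href("www.wiley.com/compbooks/graham/"))
print(as_href("ftp://ftp.banzai.net/pub/web/browsers/bunny_hop.zip"))
```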
Copyright © 1996-2000 EarthWeb Inc. All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited.