HTML 4.0 Sourcebook
(Publisher: John Wiley & Sons, Inc.)
Author(s): Ian S. Graham
ISBN: 0471257249
Publication Date: 04/01/98
Chapter 9 The HTTP Protocol
To develop truly interactive HTML-based applications, a Web designer must understand how a Web client program, such as a browser, interacts with an HTTP server. This interaction involves two distinct but related issues. The first is HTTP, the protocol by which a client program sends information to an HTTP server and vice versa. HTTP supports mechanisms for communicating information about the transaction, such as the transaction status (successful or not) and the nature of the data being sent (i.e., the MIME type of the data), on top of mechanisms for sending data from client to server or from server to client. The protocol also supports several communication methods (for example, GET, POST, or HEAD) for specifying how message data is being sent by the browser or how the request should be handled by the server. This chapter presents a detailed description of these mechanisms and of how they work.
The second issue is the manner in which servers handle a request. If the requested resource is a file, the server locates the file and sends it back to the client or sends an appropriate error message if the file is unavailable. However, the requested resource (specified by the URL) in some cases is not a file, but rather a request for special processing at the server end of the transaction, such as a database query. In most cases, the Web server does not do this processing, since such tasks are specific to the applications running at the Web site and do not reflect generic functionality that can be easily incorporated in a universal server. Instead, most servers hand off these application-specific tasks to other programs, called gateway programs. These programs run independently of (but can communicate with) the HTTP server and are designed explicitly for the special processing required at a Web site. The Common Gateway Interface (CGI) specification, described in Chapter 10, defines how HTTP servers communicate with these gateway programs.[1]
[1] Most servers now support compiled modules that can be dynamically linked to the server and that support gateway-like functionality, with significant performance improvements over the CGI approach. This mechanism and the advantages and disadvantages of this approach are also discussed in Chapter 10.
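The hand-off between file requests and gateway requests described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name handle_request, the gateways mapping, and the use of in-process callables in place of separate gateway programs are all hypothetical, and the sketch omits the path checks and error handling a real server needs.

```python
from pathlib import Path

def handle_request(url_path, doc_root, gateways):
    """Return (status code, body bytes) for a requested URL path.

    gateways is a hypothetical mapping from URL prefixes to callables.
    Each callable stands in for a gateway program; a real server would
    start a separate program rather than call a function.
    """
    # Application-specific processing is handed off, not done by the server.
    for prefix, program in gateways.items():
        if url_path.startswith(prefix):
            return 200, program(url_path[len(prefix):])

    # Otherwise the resource is treated as a file under the document root.
    # Assumption: no protection against ".." in the path is attempted here.
    target = Path(doc_root) / url_path.lstrip("/")
    if target.is_file():
        return 200, target.read_bytes()

    # The file is unavailable, so an error status is returned instead.
    return 404, b"Not Found"
```

The prefix lookup is just one simple way to tell the two kinds of resource apart; how an actual server decides is a matter of its configuration.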
This chapter first outlines the general principles of the HTTP protocol and then illustrates its operation using seven example transactions. The chapter concludes with a detailed list of the control messages that can be sent from client to server, and vice versa.
Like HTML, HTTP is evolving, with new features being added in later versions of the protocol. This chapter primarily discusses HTTP 1.0, as this is the current, universally supported protocol: Netscape Navigator 4 does not support HTTP 1.1, while Internet Explorer 4 supports HTTP 1.1 only for certain types of client-server connections. Note, however, that a few HTTP 1.1 features, particularly important for controlling the client-server connection, were incorporated as extensions to HTTP 1.0 and are supported by most Web browsers and HTTP 1.0 Web servers. These extensions are also discussed here. A brief description of the other major differences between HTTP 1.1 and 1.0 is presented later in this chapter.
HTTP Protocol Overview
HTTP, or HyperText Transfer Protocol, is an Internet client-server protocol designed for the rapid and efficient delivery of hypertext materials. HTTP is a stateless protocol, which means that once a server has delivered the requested data to a client, the client-server connection is broken, and the server retains no memory of the event that just took place.
All HTTP communication transmits data as a stream of 8-bit characters or octets. This ensures the safe transmission of all forms of data, including images, executable programs, and HTML documents.
A typical HTTP 1.0 session has four stages:
1. Client opens the connection. The client program (for example, a Web browser) contacts the server at the specified Internet address and port number (the default port is 80).
2. Client makes the request. The client sends a message to the server requesting service. The request has either one or two parts. The first part is an HTTP request header, specifying the HTTP method to be used during the transaction and providing information about the capabilities of the client and about the data being sent to the server (if any). Typical HTTP methods are GET, for getting an object from a server, and POST, for posting (sending) data to a resource (e.g., a gateway program) on the server. The second part of the message consists of the data being sent by the client to the server; this part is absent if no data are being sent.
3. Server sends a response. The server sends a response to the client. This consists of either one or two parts. The first part is the response header describing the state of the transaction (e.g., whether the transaction was successful) and the type of data being sent (if any), and the second part is the data being returned (if any).
4. Server closes the connection. The connection is closed; the server does not retain any knowledge of the transaction just completed.
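The four stages above can be sketched with Python's standard socket module. This is a minimal sketch, not a complete client: the helper names build_request, split_response, and fetch are hypothetical, and the Content-Length header added for the data part is an assumption about what a typical client sends.

```python
import socket

def build_request(method, path, headers=None, body=b""):
    """Build an HTTP 1.0 request: header part, blank line, optional data part."""
    lines = [f"{method} {path} HTTP/1.0"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # Assumption: the length of the data part is announced in the header.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body

def split_response(raw):
    """Split a raw response into (status line, header dict, data part)."""
    head, _, data = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, data

def fetch(host, path="/", port=80):
    """Run one complete transaction against a server (not called here)."""
    # Stage 1: the client opens the connection.
    with socket.create_connection((host, port), timeout=10) as conn:
        # Stage 2: the client makes the request (header part only for GET).
        conn.sendall(build_request("GET", path))
        # Stage 3: the server sends its response; read until no data remains.
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                # Stage 4: the server has closed the connection.
                break
            chunks.append(chunk)
    return split_response(b"".join(chunks))
```

Calling fetch with a host name of your choice performs one transaction per call; the two helper functions can be tried without any network access.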
This procedure means that each connection processes a single transaction and can therefore download only a single data file to the client, while the stateless nature means that each connection knows nothing about previous connections. The implications of these features are shown in the following two illustrations.
NOTE: HTTP 1.0 Extension: HTTP Keep-alive
Most HTTP 1.0-capable browsers and servers support an HTTP 1.0 extension known as keep-alive, which keeps the client-server connection open whenever a client requests multiple resources from the same server. In terms of the preceding model for HTTP transactions, this means that step 4 (the closing of the connection) is deferred; instead, the next request starts at step 2 and does not require the reopening of the connection. Note, however, that the server still does not retain knowledge of a transaction after it sends its response (step 3). Keep-alive is an integrated core feature of HTTP 1.1.
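As a rough sketch of how the keep-alive extension is negotiated: the client asks for it with a Connection: Keep-Alive header field in its request, and the connection stays open only if the server echoes the same field in its response. The function names below are hypothetical, and the check is simplified to this one header field.

```python
def wants_keep_alive(headers):
    """True if a header dict carries 'Connection: Keep-Alive' (any letter case)."""
    for name, value in headers.items():
        if name.lower() == "connection":
            tokens = [token.strip().lower() for token in value.split(",")]
            return "keep-alive" in tokens
    return False

def connection_stays_open(request_headers, response_headers):
    """Step 4 is deferred only when both sides agree to keep-alive."""
    return wants_keep_alive(request_headers) and wants_keep_alive(response_headers)
```

If either side omits the field, the transaction falls back to the plain HTTP 1.0 model and the connection is closed after the response.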
Illustration: Single Transaction per Connection
Assume that HTTP is used to access an HTML document that contains, via IMG element references, 10 inline images. Displaying the document requires 11 distinct connections to the HTTP server: one to retrieve the HTML document itself and 10 others to retrieve the 10 image files.
If this transaction is repeated using keep-alive, the browser still retrieves 11 distinct files, but the connection between browser and server is only broken after the last image has been downloaded. This can significantly speed up the downloading of composite documents, since the connection is only opened and closed once, not 11 times. However, as far as the browser and server are concerned, each request is a single transaction, and the server does not retain knowledge of each transaction after it is completed, even if the connection to the client is still open.
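The arithmetic in this illustration can be written out as a small Python function. The name connections_needed is hypothetical, and the sketch assumes that the document and all of its images come from the same server.

```python
def connections_needed(inline_images, keep_alive=False):
    """Return (requests, connections) for one document plus its inline images."""
    # One request for the HTML document, plus one per image file.
    requests = 1 + inline_images
    # Without keep-alive, every request opens its own connection;
    # with keep-alive, a single connection carries all of them.
    connections = 1 if keep_alive else requests
    return requests, connections

print(connections_needed(10))                   # (11, 11)
print(connections_needed(10, keep_alive=True))  # (11, 1)
```

In both cases the number of transactions is the same; only the number of times the connection is opened and closed changes.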