Platinum Edition Using HTML 4, XML, and Java 1.2
(Publisher: Macmillan Computer Publishing)
Author(s): Eric Ladd
ISBN: 078971759x
Publication Date: 11/01/98



Of course, the choice of programming language greatly affects each of these variables. A tight little C program hardly makes an impact, whereas a Visual Basic program, run from a wrapper and talking to an SQL Server back end, will gobble up as much memory as it can. (Using Active Server Pages (ASP) on Microsoft Internet Information Server addresses this problem by reducing the load on the server. See Chapter 33, “Active Server Pages and VBScript,” for more details on ASP.) Visual Basic and similar development environments are optimized for ease of programming and runtime speed, not for small code and quick loading. If your program loads seven DLLs, an OLE control, and an ODBC driver, you may notice a significant delay.

UNIX

UNIX machines are usually content with significantly less RAM than Windows NT computers, for a number of reasons. First, most of the programs, including the operating system and all its drivers, are smaller. Second, it’s unusual, if not impossible, to use an X Window System program as a CGI script. This means that fewer resources are required, although with the prices of processors, disk space, and memory falling, the difference in hardware cost is not that great. The maintenance effort and the system knowledge required, however, are far greater. Trade-offs occur in everything, and what UNIX gives you in small size and speed, it takes back in complexity. In particular, setting Web server permissions and getting CGI to work properly can be a nightmare for the UNIX novice. Even experienced system administrators often trip over the unnecessarily arcane configuration details. Things are getting better, though. You can buy preconfigured servers, for example, and many do-it-yourself Linux administrators are glad for Red Hat. After a UNIX-based system is set up, however, adding new CGI scripts usually goes smoothly and seldom requires adding memory.

If you give your UNIX computer 16MB of RAM and a reasonably fast hard disk, it will run quickly and efficiently for any reasonable number of hits. (Of course, you may not want to skimp on RAM when memory prices are low.) Database queries will slow it down, the same as they would if the program weren’t CGI. Because of UNIX’s multiuser architecture, the number of logged-on sessions (and what they’re doing) can significantly affect performance. It’s a good idea to let your Web server’s primary job be servicing the Web rather than interactive users. Of course, if you have capacity left over, there is no reason not to run other daemons, but it’s best to choose processes that consume resources predictably so that you can plan your site.

A large, popular site—one that receives several hits each minute, for example—will require more RAM, the same as on any platform. The more RAM you give your UNIX system, the better it can cache, and therefore, the faster it can satisfy requests.

CGI Script Structure

When your script is invoked by the server, the server passes information to the script via environment variables and, in the case of POST, via STDIN. GET and POST are the two most common request methods you’ll encounter, and probably the only ones you’ll need. (HEAD and PUT are also defined but seldom used for CGI.) The request method tells your script how it was invoked; based on that information, the script can decide how to act. The request method is passed to your script using the environment variable called, appropriately enough, REQUEST_METHOD.

  GET is a request for data, the same method used for obtaining static documents. The GET method sends request information as parameters tacked onto the end of the URL. These parameters are passed to your CGI program in the environment variable QUERY_STRING.
If your script is called myprog.exe, for example, and if you invoke it from a link with the form
<A HREF="cgi-bin/myprog.exe?lname=blow&fname=joe">

the REQUEST_METHOD will be the string GET, and the QUERY_STRING will contain lname=blow&fname=joe.
The question mark separates the name of the script from the beginning of the QUERY_STRING. On some servers the question mark is mandatory, even if no QUERY_STRING follows it. On other servers, a forward slash may be allowed instead of or in addition to the question mark. If the slash is used, the server passes the information to the script using the PATH_INFO variable instead of the QUERY_STRING variable.
  A POST operation occurs when the browser sends data from a fill-in form to the server. With POST, the QUERY_STRING may or may not be blank, depending on your server.
The data from a POSTed query gets passed from the server to the script using STDIN. Because STDIN is a stream and the script needs to know how much valid data is waiting, the server also supplies another variable, CONTENT_LENGTH, to indicate the size in bytes of the incoming data. The format for POSTed data is
variable1=value1&variable2=value2&etc

Your program must examine the REQUEST_METHOD environment variable to know whether to read STDIN. The CONTENT_LENGTH variable is typically useful only when the REQUEST_METHOD is POST.
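The dispatch described above can be sketched in a few lines. This is only an illustration, not a complete CGI program: the function name read_request_data and its two parameters are hypothetical, and the environment is passed in as a dictionary so that the logic can be tried without a Web server.

```python
import io
import os
import sys


def read_request_data(environ=os.environ, stdin=None):
    """Return the raw, still URL-encoded request data for a CGI script."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # CONTENT_LENGTH tells the script how many bytes of valid data
        # are waiting on STDIN; a missing or malformed value counts as 0.
        try:
            length = max(0, int(environ.get("CONTENT_LENGTH", "0")))
        except ValueError:
            length = 0
        stream = stdin if stdin is not None else sys.stdin.buffer
        return stream.read(length).decode("ascii", errors="replace")
    # GET (and, in this sketch, any other method): the parameters
    # arrive in the QUERY_STRING environment variable.
    return environ.get("QUERY_STRING", "")


if __name__ == "__main__":
    # Simulated POST request; a real server would set these variables.
    fake_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "20"}
    fake_stdin = io.BytesIO(b"lname=blow&fname=joe")
    print(read_request_data(fake_env, fake_stdin))
```

Reading exactly CONTENT_LENGTH bytes, rather than reading until end of file, matters because the server is not required to close the stream after the data.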


URL Encoding

The HTTP 1.0 specification calls for URL data to be encoded in such a way that it can be used on almost any hardware and software platform. Information specified this way is called URL-encoded; almost everything passed to your script by the server will be URL-encoded.

Parameters passed as part of QUERY_STRING or PATH_INFO will take the form variable1=value1&variable2=value2 and so forth, for each variable defined in your form.

Variables are separated by the ampersand. If you want to send a real ampersand, it must be escaped, that is, encoded as a two-digit hexadecimal value representing the character (%26 for the ampersand). Escapes are indicated in URL-encoded strings by the percent (%) sign. Thus, %25 represents the percent sign itself. (25 is the hexadecimal representation of the ASCII value for the percent sign.) All characters above 127 (7F hexadecimal) or below 33 (21 hexadecimal) are escaped by the server when it sends information to your CGI program. This includes the space character, which is escaped as %20. Also, the plus sign (+) needs to be interpreted as a space character.
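To make the escaping rule concrete, here is a rough sketch of the encoding direction. The name url_escape and the RESERVED set are assumptions made for this example; real browsers and servers escape a larger set of characters, and encoding the text as UTF-8 bytes first is also an assumption, not something the text above prescribes.

```python
# Characters with a special meaning in URL-encoded data (assumed set,
# limited to the ones discussed in the text).
RESERVED = "&=%+"


def url_escape(text):
    """Escape text following the rule described above (simplified)."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        # Escape anything below 33 or above 127, plus the reserved set.
        if byte < 33 or byte > 127 or ch in RESERVED:
            out.append("%{:02X}".format(byte))
        else:
            out.append(ch)
    return "".join(out)


print(url_escape("AT&T 100%"))
```

Note that this sketch writes a space as %20 rather than as a plus sign; a decoder has to accept both forms.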

Before your script can deal with the data, it must parse and decode it. Fortunately, these are fairly simple tasks in most programming languages. Your script scans through the string looking for an ampersand. When it is found, your script chops off the string up to that point and calls it a variable. The variable’s name is everything up to the equal sign in the string; the variable’s value is everything after the equal sign. Your script then continues parsing the original string for the next ampersand, and so on, until the original string is exhausted.
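The splitting step might look like the following sketch. The function name split_pairs is hypothetical, and the handling of a field with no equal sign (treated here as a name with an empty value) is a choice made for this example rather than a rule from the specification.

```python
def split_pairs(encoded):
    """Split a URL-encoded string into (name, value) pairs, still encoded."""
    pairs = []
    for field in encoded.split("&"):
        if not field:
            continue  # skip empty fields such as a trailing ampersand
        # The name is everything before the first equal sign; the value
        # is everything after it.
        name, _, value = field.partition("=")
        pairs.append((name, value))
    return pairs


print(split_pairs("lname=blow&fname=joe"))
```

The names and values returned here are still URL-encoded, which is what makes it safe to split on the ampersand first: a real ampersand inside a value is still in its escaped form at this point.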

After the variables are separated, you can safely decode them, as follows:

1.  Replace all plus signs with spaces.
2.  Replace all %## (percent sign followed by two hexadecimal digits) with the corresponding ASCII character.

It’s important that you scan through the string linearly rather than recursively because the characters you decode may be plus signs or percent signs.
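A single left-to-right pass is one way to meet that requirement, because every input character is examined exactly once and decoded output is never rescanned. The sketch below assumes each escape stands for one character code, as the text describes; the name url_decode is hypothetical, and leaving malformed escapes untouched is a choice made for this example.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def url_decode(value):
    """Decode one URL-encoded name or value in a single linear pass."""
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "+":
            out.append(" ")  # a plus sign stands for a space
            i += 1
        elif (ch == "%" and i + 2 < len(value)
              and value[i + 1] in HEX_DIGITS
              and value[i + 2] in HEX_DIGITS):
            # %## becomes the character with that hexadecimal code.
            out.append(chr(int(value[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(ch)  # ordinary character or malformed escape
            i += 1
    return "".join(out)


print(url_decode("joe+blow"))
print(url_decode("a%2Bb"))
```

Because the output is built separately from the input, a decoded plus sign (from %2B) or percent sign (from %25) is never mistaken for another escape.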

When the server passes data to your script with the POST method, check the environment variable called CONTENT_TYPE. If CONTENT_TYPE is application/x-www-form-urlencoded, your data needs to be decoded before use.
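A check along these lines could be written as follows. The function name is_form_urlencoded is hypothetical, and the environment is passed in as a dictionary for easy testing. The sketch also allows for a parameter after the media type (for example, a charset), which some clients append.

```python
def is_form_urlencoded(environ):
    """Report whether the POSTed body is URL-encoded form data."""
    content_type = environ.get("CONTENT_TYPE", "")
    # Ignore any parameters after the media type and compare
    # case-insensitively.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/x-www-form-urlencoded"


print(is_form_urlencoded(
    {"CONTENT_TYPE": "application/x-www-form-urlencoded"}))
```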





Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.
All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited. Read EarthWeb's privacy statement.