Platinum Edition Using HTML 4, XML, and Java 1.2:Programming CGI Scripts

Platinum Edition Using HTML 4, XML, and Java 1.2
(Publisher: Macmillan Computer Publishing)
Author(s): Eric Ladd
ISBN: 078971759x
Publication Date: 11/01/98

Planning Your Script

Now that you’ve seen a script’s basic structure, you’re ready to learn how to plan a script from the ground up:

1. Take your time defining the program’s task. Think it through thoroughly, write it down, and trace the program logic. Proceed only when you’re satisfied that you understand the input, the output, and the transformation that turns one into the other.

2. Order a pizza and a good supply of your favorite beverage, lock yourself in for the night, and come out the next day with a finished program. This sounds cute, but it is oddly good advice. Sometimes, it seems as if more bugs stem from interruptions while programming—which cause loss of concentration—than from any other source. And while you’re sequestered, don’t forget to document your code as you write it.

3. Test, test, test. Use every browser known to mankind and every sort of input you can think of. Test especially for users who enter 32KB of data in a 10-byte field (setting MAXLENGTH in your input tag does not protect you from receiving more input than expected, because that limit is enforced only by the browser) and for users who enter control codes where you’re expecting plain text.

4. Document the program as a whole, too—not just the individual steps within it—so that others who have to maintain or adapt your code will understand what you were trying to do.

Step 1, of course, is this section’s topic, so we’ll look at that process in more depth:

• If your script will handle form variables, plan out each one: its name, expected length, and data type.

• As you copy variables from QUERY_STRING or STDIN, check each one for proper type and length. A favorite trick of UNIX attackers is to overflow the input buffer on purpose. Depending on how the language or program handling the input allocates memory for variables (fixed-size buffers are the usual culprit), this sometimes gives the attacker access to areas of memory that should be protected, enabling them to place executable instructions in your script’s heap or stack space.

• Use sensible variable names. A pointer to the QUERY_STRING environment variable should be called something such as pQueryString, not p2. This not only helps debugging at the beginning but makes maintenance and modification much easier. No matter how brilliant a coder you are, chances are good that a year from now you won’t remember that p1 points to CONTENT_TYPE and p2 points to QUERY_STRING.

• Distinguish between system-level parameters that affect how your program operates and user-level parameters that provide instance-specific information. In a script to send email, for example, don’t let the user specify the IP address of the SMTP host. This information shouldn’t even appear on the form in a hidden variable. It’s instance-independent and should therefore be a system-level parameter. In Windows NT, store this information in the Registry or an .ini file. In UNIX, store it in a configuration file or system environment variable.

• If your script will shell out to the system to launch another program or script, don’t pass user-supplied variables unchecked. Especially in UNIX systems, where the string given to system() is interpreted by a shell and can therefore contain pipe or redirection characters, leaving variables unchecked can spell disaster. Clever users and malicious hackers can copy sensitive information or destroy data this way. If you can’t avoid system() calls altogether, plan for them carefully. Define exactly what can get passed as a parameter and know which parts will come from the user. Include a routine that scans each parameter for suspect character strings and rejects them.

• If your script will access external files, plan how you’ll handle concurrency. You may lock part or all of a data file, you may establish a semaphore, or you may use a file as a semaphore. If you take chances, you’ll be sorry. Never assume that you don’t need to worry about concurrency just because your script is the only program that accesses a given file. Five copies of your script might be running at the same time, satisfying requests from five users.

• If you lock files, use the least-restrictive lock required. If you’re only reading a data file, lock out writes while you’re reading and release the file immediately afterward. If you’re updating a record, lock just that one record (or byte range). Ideally, your locking logic should immediately surround the actual I/O calls. Don’t open a file at the beginning of your program and lock it until you terminate. If you must do this, open the file but leave it unlocked until you’re actually about to use it. This will enable other applications or other instances of your script to work smoothly and quickly.

• Prepare graceful exits for unexpected events. If, for instance, your program requires exclusive access to a particular resource, be prepared to wait a reasonable amount of time and then die gracefully. Never code a wait-forever call. When your program dies from a fatal error, make sure that it reports the error first. Error reports should use plain, sensible language. When possible, also write the error to a log file so the system administrator knows of it.

• If you’re using a GUI language (for example, Visual Basic) for your CGI script, don’t let untrapped errors result in a message box onscreen. This is a server application; chances are excellent that no one will be around to notice and clear the error, and your application will hang until the next time an administrator chances by. Trap all errors! Work around those you can live with and treat all others as fatal.

• Write pseudocode for your routines at least to the point of general logical structure before firing up the editor. It often helps to build stub routines so that you can use the actual calls in your program while you’re still developing. A stub routine is a quick and dirty routine that doesn’t actually process anything; it just accepts the inputs the final routine will be expecting and outputs a return code consistent with what the final routine would produce.

• For complex projects, a data flow chart can be invaluable. Data flow should remain distinct from logic flow; your data travels in a path through the program and is “owned” by various pieces along the way, no matter how it’s transformed by the subroutines.
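The advice above about planning each form variable and checking its type and length can be sketched roughly as follows. This is a minimal illustration, not a complete validator: the field names, the length limits, and the 4096-character overall cap are all assumptions made for the example.

```python
import re
from urllib.parse import parse_qs

# Hypothetical field plan: name -> (maximum length, pattern the value must match).
FIELD_PLAN = {
    "name": (40, re.compile(r"[A-Za-z .'-]+")),
    "age": (3, re.compile(r"[0-9]+")),
}


def read_form(query_string, max_total=4096):
    """Return the planned fields, or raise ValueError on anything unexpected."""
    # Refuse oversized input before parsing it at all.
    if len(query_string) > max_total:
        raise ValueError("input too long")
    parsed = parse_qs(query_string, keep_blank_values=True)
    clean = {}
    for field, (max_len, pattern) in FIELD_PLAN.items():
        values = parsed.get(field, [])
        if len(values) != 1:
            raise ValueError(f"expected exactly one value for {field!r}")
        value = values[0]
        if len(value) > max_len:
            raise ValueError(f"{field!r} is longer than {max_len} characters")
        # fullmatch rejects control codes and anything else outside the plan.
        if not pattern.fullmatch(value):
            raise ValueError(f"{field!r} contains unexpected characters")
        clean[field] = value
    return clean
```

Calling `read_form("name=Ada+Lovelace&age=36")` returns both fields as strings, while an overlong name or an embedded control code raises `ValueError` before the value is used anywhere else.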
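The distinction between system-level and user-level parameters might look like this in practice. The environment variable `MAILFORM_SMTP_HOST`, the `[mail]` section, and the `smtp_host` key are hypothetical names chosen for this sketch; the point is only that the value never comes from the form.

```python
import configparser
import os


def load_smtp_host(config_path, environ=os.environ):
    """Return the SMTP host from the environment or a configuration file."""
    # MAILFORM_SMTP_HOST is a hypothetical variable name for this example.
    host = environ.get("MAILFORM_SMTP_HOST")
    if host:
        return host
    parser = configparser.ConfigParser()
    parser.read(config_path)  # a missing file is skipped without error
    # The [mail] section and smtp_host key are likewise assumed names.
    host = parser.get("mail", "smtp_host", fallback=None)
    if host is None:
        raise RuntimeError("SMTP host is not configured")
    return host
```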
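For the bullet on shelling out, one possible approach is sketched below. Note that it swaps in two techniques the text does not name: it checks the parameter against an allow-list of permitted characters rather than scanning for suspect strings, and it starts the program from an argument list with no shell involved. The helper program here is a stand-in (a Python one-liner that echoes its argument); the pattern and limits are assumptions.

```python
import re
import subprocess
import sys

# Hypothetical allow-list: letters, digits, underscore, dot, and hyphen only,
# at most 64 characters, and no leading hyphen (which could look like an option).
SAFE_ARG = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,63}")


def run_helper(user_value):
    """Run a helper program with one checked argument and return its output."""
    if not SAFE_ARG.fullmatch(user_value):
        raise ValueError("parameter contains characters that are not allowed")
    # Stand-in helper: a real script would name its own external program here.
    command = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_value]
    # An argument list with no shell means pipe and redirection characters
    # are never interpreted, even if one slipped past the check above.
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=10, check=True
    )
    return result.stdout.strip()
```

The `timeout` argument also keeps the script from waiting indefinitely on a helper that hangs.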
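The locking guidelines above (lock only around the I/O, and never wait forever) could be sketched as follows. This assumes a UNIX system, since it relies on the standard `fcntl` module's advisory `flock` locks; the function name, the five-second default, and the retry interval are assumptions for illustration.

```python
import fcntl
import time


def append_record(path, line, wait_seconds=5.0):
    """Append one line under an exclusive lock, giving up after wait_seconds."""
    deadline = time.monotonic() + wait_seconds
    with open(path, "a") as handle:  # opened first, but not yet locked
        while True:
            try:
                # Non-blocking request, so the script never waits forever.
                fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    raise TimeoutError(f"could not lock {path}")
                time.sleep(0.05)
        try:
            handle.write(line + "\n")
            handle.flush()  # finish the write while the lock is still held
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Because these locks are advisory, they protect the file only if every program that touches it follows the same locking convention.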
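To illustrate the points about graceful exits and trapping all errors, here is one way a script's entry point might be wrapped. The function name `run_cgi` and the wording of the error message are assumptions; the sketch logs the full error for the administrator and sends the user a plain-language report instead of letting the script die silently.

```python
import logging
import sys


def run_cgi(handler, out=sys.stdout, log=logging.getLogger("cgi")):
    """Call handler(); on any error, log it and emit a plain error page."""
    try:
        body = handler()
        status = "200 OK"
    except Exception:
        # The traceback goes to the log for the system administrator.
        log.exception("request failed")
        status = "500 Internal Server Error"
        body = "Sorry, the request could not be completed. The problem has been logged."
    # The user always receives a complete, readable response.
    out.write(f"Status: {status}\r\nContent-Type: text/plain\r\n\r\n{body}\n")
    return status
```

Internal details such as the exception text stay in the log and are not echoed to the user.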
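A stub routine of the kind described above might look like the following. Both function names and the return-code convention (0 for success) are hypothetical, chosen only to show the calling code being written against the stub's interface before the real routine exists.

```python
def store_order(customer_id, items):
    """Stub for a hypothetical routine that will later save an order.

    It accepts the inputs the final routine will expect and returns a code
    in the same form the final routine would: 0 for success (an assumed
    convention). Nothing is actually stored.
    """
    return 0


def handle_request(customer_id, items):
    """Calling code that will not need to change when the stub is replaced."""
    if store_order(customer_id, items) != 0:
        return "Your order could not be saved."
    return "Thank you, your order was received."
```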
