
Platinum Edition Using HTML 4, XML, and Java 1.2
(Publisher: Macmillan Computer Publishing)
Author(s): Eric Ladd
ISBN: 078971759x
Publication Date: 11/01/98



A much safer course is to use the PATH_TRANSLATED environment variable. The server sets it by appending the contents of PATH_INFO to the root of its document tree, which means that any file specified by PATH_TRANSLATED is probably already accessible to browsers and, therefore, safe. If your document root is /usr/local/etc/htdocs, for example, and PATH_INFO is /etc/passwd, then PATH_TRANSLATED is /usr/local/etc/htdocs/etc/passwd.
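The mapping described above can be sketched in a few lines of Perl. This is only an illustration of the string the server hands to your script, not of how any particular server computes it; translate_path is a hypothetical helper name, and the fallback values are the ones from the example in the text.

```perl
use strict;
use warnings;

# Hypothetical helper: mirrors the mapping described in the text by
# appending PATH_INFO to the document root.
sub translate_path {
    my ($document_root, $path_info) = @_;
    return $document_root . $path_info;
}

# In a real CGI script the server supplies PATH_TRANSLATED for you.
# When the script is run by hand, fall back to the example from the text.
my $path = $ENV{PATH_TRANSLATED};
$path = translate_path('/usr/local/etc/htdocs', '/etc/passwd')
    unless defined $path;

print "$path\n";
```

Run from the command line with PATH_TRANSLATED unset, the script prints the translated path from the example, which lies inside the document tree rather than pointing at the system password file.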


NOTE:  In one case, however, files that may not be accessible through a browser can be accessed if PATH_TRANSLATED is used within a CGI script. The .htaccess file, which can exist in each subdirectory of a document tree, controls who has access to the particular files in that directory. It can be used, for example, to limit the visibility of a group of Web pages to company employees. Whereas the server knows how to interpret .htaccess and thus knows how to limit who can and who can’t see these pages, CGI scripts don’t. A program that uses PATH_TRANSLATED to access arbitrary files in the document tree may accidentally override the protection provided by the server.

Handling Filenames

Filenames, for example, are simple pieces of data that can be submitted to your CGI script and cause endless trouble if you’re not careful (see Figure 35.1).


FIGURE 35.1  Depending on how well the CGI script is written, the Webmaster for this site can get into big trouble.

Any time you try to open a file based on a name supplied by the user, you must rigorously screen that name for any number of tricks that can be played. If you ask the user for a filename and then open whatever was entered, several things can go wrong.

  For instance, what if the user enters a name that has path elements in it, such as directory slashes and double dots? Although you expect a simple filename such as file.txt, you can end up with /file.txt or ../../../file.txt. Depending on how your Web server is installed and what you do with the submitted filename, you can be exposing any file on your system to a clever hacker.
  Furthermore, what if the user enters the name of an existing file or one that’s important to the running of the system? What if the name entered is /etc/passwd or C:\WINNT\SYSTEM32\KERNEL32.DLL? Depending on what your CGI script does with these files, they may be sent out to the user or overwritten with garbage.
  Under Windows 95 and Windows NT, if you don’t screen for the backslash character (\), you might enable Web browsers to gain access to files that aren’t even on your Web server through Universal Naming Convention (UNC) filenames. If the script that’s about to run in Figure 35.2 doesn’t carefully screen the filename before opening it, it might give the Web browser access to any machine in the domain or workgroup.


FIGURE 35.2  Opening a UNC filename is one possible security hole that gives hackers access to your entire network.

  What might happen if the user puts an illegal character in a filename? Under UNIX, any file whose name begins with a period (.) is hidden from normal directory listings. Under Windows, both slashes (/ and \) are directory separators. A carelessly written Perl program that opens a filename beginning with the pipe character (|) can end up executing an external program when you thought you were only opening a file. Even control characters (the Escape key or the Return key, for instance) can be sent to you as part of filenames if the user knows how. (See “Where Bad Data Comes From,” earlier in this chapter.)
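The pipe problem comes from the two-argument form of Perl’s open(), which looks inside the filename for mode characters. The sketch below assumes a Perl recent enough to support the three-argument form of open() (version 5.6 or later, which is newer than this book); open_for_reading is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: open a user-named file for reading only.
# Because the mode '<' is passed as its own argument, Perl treats $name
# purely as a filename; a leading '|' is not taken as a request to run
# a command.
sub open_for_reading {
    my ($name) = @_;
    open(my $fh, '<', $name) or return;   # undef on failure
    return $fh;
}

my $fh = open_for_reading('|echo pwned');
print defined $fh ? "opened a file\n" : "refused: no file has that name\n";
```

This closes only the pipe hole. A name such as ../../../file.txt is still a perfectly good filename as far as open() is concerned, so the character screening described in this chapter is still needed.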


NOTE:  

Worse yet, in a shell script, the semicolon ends one command and starts another. If your script is designed to cat the file the user enters, a user might enter file.txt;rm -rf / as a filename, causing file.txt to be returned and then the entire hard disk to be erased, without confirmation.
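If the script is written in Perl rather than shell, one way to keep the semicolon from ever reaching a shell is the list form of system(). The sketch below assumes a UNIX system with a POSIX-style cat; cat_file is a hypothetical helper name, and a harmless echo stands in for the destructive command.

```perl
use strict;
use warnings;

# Hypothetical helper: run cat on one file without involving a shell.
# With a list of arguments, system() executes cat directly and hands it
# $file_Name as a single argument, so ';' has no special meaning.
# The '--' tells cat that what follows is a filename, not an option.
sub cat_file {
    my ($file_Name) = @_;
    return system('cat', '--', $file_Name);   # 0 means cat succeeded
}

my $status = cat_file('file.txt;echo INJECTED');
print $status == 0
    ? "cat succeeded\n"
    : "cat failed: it looked for a file with that literal name\n";
```

The echo command never runs; cat simply looks for a file whose name contains a semicolon. This complements, rather than replaces, screening the filename itself.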


Verifying Input Is Legitimate

To avoid all the dangers associated with bad input and close all the potential security holes they open, you should screen every filename the user enters. You must make sure that the input is only what you expect.

The best way to do this is to compare each character of the entered filename against a list of acceptable characters and return an error if any character isn’t on the list. This turns out to be much safer than trying to maintain a list of all the illegal characters and comparing against that; it’s too easy to accidentally let something slip through.

The code snippet below is an example of how to do this comparison in Perl. It allows any letter of the alphabet (upper or lowercase), any digit, the underscore, and the period. It also checks to make sure that the filename doesn’t start with a period. Thus, this fragment doesn’t allow slashes to change directories, semicolons to put multiple commands on one line, or pipes to play havoc with Perl’s open() call.

if (($file_Name =~ /[^\w\.]/) || ($file_Name =~ /^\./)){
      # File name contains an illegal character or starts with a period
}

When you have a commonly used test, such as the code above, it’s a good idea to make it into a subroutine so you can call it repeatedly. This way, you can change it in only one place in your program if you think of an improvement.
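As a rough sketch of that advice, the test can be wrapped in a subroutine like the one below. The name is_safe_filename is hypothetical, and the check for an empty or undefined name is an addition that the original fragment doesn’t make.

```perl
use strict;
use warnings;

# Hypothetical subroutine wrapping the test shown earlier.
# Returns 1 if the name is acceptable, 0 otherwise.
sub is_safe_filename {
    my ($file_Name) = @_;

    # Added assumption: reject undefined or empty names outright.
    return 0 unless defined($file_Name) && length($file_Name);

    return 0 if $file_Name =~ /[^\w\.]/;   # contains an illegal character
    return 0 if $file_Name =~ /^\./;       # starts with a period
    return 1;
}

for my $name ('file.txt', '../../../file.txt', '.htaccess',
              '|mail', 'file.txt;rm -rf /') {
    printf "%-20s %s\n", $name,
        is_safe_filename($name) ? 'accepted' : 'rejected';
}
```

Of the five sample names, only file.txt is accepted. Every script that needs the test now calls is_safe_filename(), so a later improvement to the rules has to be made in just one place.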

Continuing that thought, if the subroutine is used commonly among several programs, it’s a good idea to put it into a library so that any improvements can be instantly inherited by all your scripts.



CAUTION:  

Although the previous code snippet filters out most bad filenames, your operating system may have restrictions it doesn’t cover. Can a filename start with a digit, for instance? With an underscore? What if the filename has more than one period, or if the period is followed by more than three characters? Is the entire filename short enough to fit within the restrictions of the file system?

You must constantly ask yourself these kinds of questions. The most dangerous thing you can do when writing CGI scripts is to rely on the users to follow instructions. They won’t. It’s your job to make sure they don’t get away with it.
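The sketch below answers those questions in one particular way, purely as an illustration: the name must start with a letter, contain exactly one period, end in an extension of one to three characters, and fit within a length limit. All of these rules, the default limit of 64, and the name is_strict_filename are assumptions; your file system and application will dictate the real ones.

```perl
use strict;
use warnings;

# Hypothetical stricter check. Every rule here is an assumed policy,
# not a requirement of any particular operating system.
sub is_strict_filename {
    my ($name, $max_len) = @_;
    $max_len = 64 unless defined $max_len;   # assumed default limit

    return 0 unless defined $name;
    return 0 if length($name) > $max_len;

    # A letter, then letters/digits/underscores, one period, and an
    # extension of one to three letters or digits. \A and \z anchor the
    # match to the whole string, so a trailing newline is rejected too.
    return $name =~ /\A[A-Za-z][A-Za-z0-9_]*\.[A-Za-z0-9]{1,3}\z/ ? 1 : 0;
}

for my $name ('report_1.txt', '1file.txt', '_file.txt',
              'a.b.txt', 'file.html') {
    printf "%-14s %s\n", $name,
        is_strict_filename($name) ? 'accepted' : 'rejected';
}
```

Only report_1.txt passes here. The others fail because they start with a digit or an underscore, contain two periods, or have a four-character extension.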


Handling HTML

Another type of seemingly innocuous input that can cause you endless trouble is receiving HTML when you request text from the user. The code snippet below is a Perl fragment that customizes a greeting to whoever has entered a name in the $user_Name variable; for example, John Smith (see Figure 35.3).

print("<HTML><TITLE>Greetings!</TITLE><BODY>\n");
print("Hello, $user_Name!  It's good to see you!\n");
print("</BODY></HTML>\n");
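One common way to keep submitted markup from being interpreted by the browser is to escape the characters that HTML treats specially before printing them. The sketch below is a minimal illustration under that assumption; escape_html is a hypothetical helper name, and the sample input is invented.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the characters HTML treats specially
# with their entity equivalents. The ampersand must be handled first so
# that the entities produced by the later substitutions aren't escaped
# a second time.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Invented input: a "name" that is really an HTML tag plus a name.
my $user_Name = '<IMG SRC="huge.gif">John Smith';

print("Hello, ", escape_html($user_Name), "!  It's good to see you!\n");
```

With the escaping in place, the browser displays the tag as literal text next to the name instead of acting on it.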



Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.
All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited. Read EarthWeb's privacy statement.