Platinum Edition Using HTML 4, XML, and Java 1.2
(Publisher: Macmillan Computer Publishing)
Author(s): Eric Ladd
ISBN: 078971759x
Publication Date: 11/01/98
A much safer course is to use the PATH_TRANSLATED environment variable. The server builds it by appending the contents of PATH_INFO to the root of your server's document tree, which means that any file specified by PATH_TRANSLATED is probably already accessible to browsers and, therefore, safe. If your document root is /usr/local/etc/htdocs, for example, and PATH_INFO is /etc/passwd, then PATH_TRANSLATED is /usr/local/etc/htdocs/etc/passwd.
NOTE: In one case, however, files that may not be accessible through a browser can be accessed if PATH_TRANSLATED is used within a CGI script. The .htaccess file, which can exist in each subdirectory of a document tree, controls who has access to the particular files in that directory. It can be used, for example, to limit the visibility of a group of Web pages to company employees. Whereas the server knows how to interpret .htaccess and thus knows how to limit who can and who can't see these pages, CGI scripts don't. A program that uses PATH_TRANSLATED to access arbitrary files in the document tree may accidentally override the protection provided by the server.
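One conservative way to avoid overriding that protection is for the script to refuse any file that lives in a directory covered by a .htaccess file. The following is only a minimal sketch of that idea: the subroutine name is_htaccess_protected and the $doc_root argument are assumptions made for illustration, and the check looks only for the presence of .htaccess files. It does not interpret their contents, and it knows nothing about restrictions set in the server's central configuration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the book): returns true if $path should be
# refused because it lies outside $doc_root, names a .htaccess file or a
# ".." component, or sits in a directory that has a .htaccess file in it
# or in any parent directory up to and including $doc_root.
sub is_htaccess_protected {
    my ($path, $doc_root) = @_;
    return 1 unless defined $path && defined $doc_root;

    $doc_root =~ s{/+\z}{};    # drop any trailing slash from the root

    # Anything that does not start with the document root is refused.
    return 1 unless index($path, "$doc_root/") == 0;

    # Split the part below the document root into its components.
    my @parts = split m{/+}, substr($path, length($doc_root) + 1);
    return 1 if grep { $_ eq '..' || $_ eq '.htaccess' } @parts;

    pop @parts;                # discard the file name, keep the directories

    # Walk from the document root down to the file's own directory.
    my $dir = $doc_root;
    for my $part ('', @parts) {
        $dir .= "/$part" if length $part;
        return 1 if -e "$dir/.htaccess";
    }
    return 0;
}
```

A script would call is_htaccess_protected($ENV{PATH_TRANSLATED}, $doc_root) before opening the file and return an error page when the result is true. Refusing outright is stricter than the server itself, which would admit authorized users, but it errs on the safe side.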
Handling Filenames
Filenames, for example, are simple pieces of data that may be submitted to your CGI script and cause endless amounts of trouble if you're not careful (see Figure 35.1).
FIGURE 35.1 Depending on how well the CGI script is written, the Webmaster for this site can get into big trouble.
Any time you try to open a file based on a name supplied by the user, you must rigorously screen that name for any number of tricks that can be played. If you ask the user for a filename and then try to open whatever was entered, a problem may occur.
- For instance, what if the user enters a name that has path elements in it, such as directory slashes and double dots? Although you expect a simple filename (for example, File.txt), you can end up with /file.txt or ../../../file.txt. Depending on how your Web server is installed and what you do with the submitted filename, you can be exposing any file on your system to a clever hacker.
- Furthermore, what if the user enters the name of an existing file or one that's important to the running of the system? What if the name entered is /etc/passwd or C:\WINNT\SYSTEM32\KERNEL32.DLL? Depending on what your CGI script does with these files, they may be sent out to the user or overwritten with garbage.
- Under Windows 95 and Windows NT, if you don't screen for the backslash character (\), you might enable Web browsers to gain access to files that aren't even on your Web server through Universal Naming Convention (UNC) filenames. If the script that's about to run in Figure 35.2 doesn't carefully screen the filename before opening it, it might give the Web browser access to any machine in the domain or workgroup.
FIGURE 35.2 Opening a UNC filename is one possible security hole that gives hackers access to your entire network.
- What might happen if the user puts an illegal character in a filename? Under UNIX, any filename beginning with a period (.) is hidden from ordinary directory listings. Under Windows, both slashes (/ and \) are directory separators. If the filename begins with the pipe (|), a carelessly written Perl program can end up executing external programs when you thought you were only opening a file. Even control characters (the Escape key or the Return key, for instance) can be sent to you as part of filenames if the user knows how. (See "Where Bad Data Comes From," earlier in this chapter.)
NOTE:
Worse yet, in a shell script, the semicolon ends one command and starts another. If your script is designed to cat the file the user enters, a user might enter file.txt;rm -rf / as a filename, causing file.txt to be returned and then the entire hard disk to be erased, without confirmation.
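Both the pipe trick and the semicolon trick depend on the filename being interpreted, by Perl's open() in one case and by a shell in the other. The sketch below shows one way to keep the name literal. It assumes Perl 5.6 or later, where open() accepts the mode as a separate argument, and the subroutine names are hypothetical. It complements input screening rather than replacing it: a literal name such as /etc/passwd would still be opened.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file, treating $name purely as a name.
# With the three-argument form of open(), the mode ('<') is separate from
# the name, so a leading or trailing '|' in $name is just another character.
sub read_file_literally {
    my ($name) = @_;
    open(my $fh, '<', $name) or return;   # undef if the file cannot be opened
    local $/;                             # read the whole file in one go
    my $contents = <$fh>;
    close $fh;
    return $contents;
}

# Hypothetical helper: run cat on a file without involving a shell.
# When system() is given a list of more than one argument, the program is
# run directly, so ';' and other shell metacharacters in $name are passed
# to cat as part of a single argument. The '--' marks the end of options.
sub cat_file_without_shell {
    my ($name) = @_;
    return system('cat', '--', $name) == 0;   # true if cat exited successfully
}
```

With these helpers, a submitted name such as file.txt;rm -rf / is looked up as one oddly named file, which most likely does not exist, and no second command is run.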
Verifying Input Is Legitimate
To avoid the dangers associated with bad input and close the potential security holes it opens, you should screen every filename the user enters. You must make sure that the input is only what you expect.
The best way to do this is to compare each character of the entered filename against a list of acceptable characters and return an error if they don't match. This turns out to be much safer than trying to maintain a list of all the illegal characters and to compare against that; it's too easy to accidentally let something slip through.
The code snippet below is an example of how to do this comparison in Perl. It allows any letter of the alphabet (upper or lowercase), any number, the underscore, and the period. It also checks to make sure that the filename doesn't start with a period. Thus, this fragment doesn't allow slashes to change directories, semicolons to put multiple commands on one line, or pipes to play havoc with Perl's open() call.
if (($file_Name =~ /[^\w\.]/) || ($file_Name =~ /^\./)){
# File name contains an illegal character or starts with a period
}
When you have a commonly used test, such as the code above, it's a good idea to make it into a subroutine so you can call it repeatedly. This way, you can change it in only one place in your program if you think of an improvement.
Continuing that thought, if the subroutine is used commonly among several programs, it's a good idea to put it into a library so that any improvements can be instantly inherited by all your scripts.
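As a rough illustration, the filename test could be wrapped in a subroutine along these lines. The name is_safe_filename is an assumption, and the check for an empty or undefined name is an addition that the original fragment does not make.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the filename test from the text.
# Returns 1 if the name passes, 0 otherwise.
sub is_safe_filename {
    my ($file_Name) = @_;

    # Added assumption: an undefined or empty name is never acceptable.
    return 0 unless defined $file_Name && length $file_Name;

    # Reject anything other than letters, digits, underscores, and periods.
    return 0 if $file_Name =~ /[^\w\.]/;

    # Reject names that start with a period.
    return 0 if $file_Name =~ /^\./;

    return 1;
}
```

A script would then write something like die "Bad filename" unless is_safe_filename($file_Name); wherever a submitted name is about to be used. Note that on Unicode strings \w can match letters outside A to Z, so a script that wants plain ASCII names may prefer the explicit class [A-Za-z0-9_.].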
CAUTION:
Although the previous code snippet filters out most bad filenames, your operating system may have restrictions it doesn't cover. Can a filename start with a digit, for instance? With an underscore? What if the filename has more than one period, or if the period is followed by more than three characters? Is the entire filename short enough to fit within the restrictions of the file system?
You must constantly ask yourself these kinds of questions. The most dangerous thing you can do when writing CGI scripts is to rely on the users to follow instructions. They won't. It's your job to make sure they don't get away with it.
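Some of these questions can be turned into explicit checks. The sketch below is one possible shape for that, not a definitive rule set: the subroutine name, the policy keys, and the limits (255 characters, one period, a three-character extension) are all assumptions chosen for illustration and would need to match the file system actually in use.

```perl
use strict;
use warnings;

# Assumed policy values; adjust them to the target file system.
my %default_policy = (
    max_length     => 255,   # longest name allowed
    max_periods    => 1,     # how many periods a name may contain
    max_ext_length => 3,     # characters allowed after the last period
);

# Hypothetical checker: returns an empty string if the name passes every
# check, or a short description of the first rule it breaks.
sub check_filename_policy {
    my ($name, $policy) = @_;
    $policy ||= \%default_policy;

    return 'empty name'        unless defined $name && length $name;
    return 'illegal character' if $name =~ /[^A-Za-z0-9_.]/;
    return 'leading period'    if $name =~ /^\./;
    return 'name too long'     if length($name) > $policy->{max_length};

    my $periods = ($name =~ tr/.//);   # tr counts the periods without changing $name
    return 'too many periods'  if $periods > $policy->{max_periods};

    if ($name =~ /\.([^.]*)\z/) {      # text after the last period, if any
        return 'extension too long' if length($1) > $policy->{max_ext_length};
    }
    return '';
}
```

Returning a reason rather than a bare true or false makes it easier to log why a submitted name was rejected.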
Handling HTML
Another type of seemingly innocuous input that can cause you endless trouble is HTML received when you request plain text from the user. The code snippet below is a Perl fragment that customizes a greeting to whoever has entered a name in the $user_Name variable; for example, John Smith (see Figure 35.3).
print("<HTML><TITLE>Greetings!</TITLE><BODY>\n");
print("Hello, $user_Name! It's good to see you!\n");
print("</BODY></HTML>\n");