Chapter 9

Debugging


The process of writing good programs, whether in Perl or other languages, requires not only a strong understanding of the language itself, but also the ability to debug that language. It is inevitable that sooner or later you will run a Perl script and it won't work. In my case, it is always sooner than later, but this is par for the course in any programmer's work.

Effective debugging is a skill developed like any other, with practice over time, and once you are a strong debugger your scripting time will be significantly reduced. The wonderful thing about debugging is that there is always an ample supply of scripts on which to practice.

When debugging Perl there are several standard tools available, as well as some general debugging rules to keep in mind, all of which are explored in this chapter. You are using Perl with your Web site, so there are two main areas that can cause your script to fail: an error in the script itself or an error with the CGI (Common Gateway Interface). Either of these can interfere with full and proper execution of your script; both are discussed in this chapter.

NOTE
Remember that effective debugging requires more than just knowing what to look for. You also have to know the order in which to look for bugs. Developing and sticking to a standard game plan can solve your problems faster than randomly trying one debugging technique after another.

Quick Debugging Tips

These tips might not appear to be all that quick, but once you have been debugging for a while, each of these steps will become second nature to you.

Debugging standalone Perl scripts and debugging Perl scripts used with the CGI are two slightly different tasks. It is much easier to control your debugging environment when working with a solitary Perl script than with one used in conjunction with other scripts and the CGI. The main steps to follow when you start your debugging are as follows:

  1. Figure out whether the script is running or not. You can use a die statement that prints a line like, "I am not running."
  2. Test your scripts and take notes on any failures they have.
  3. If there are many scripts working together, always determine which script is causing the error first. Here, again, the die command can be very useful. You can also insert print statements at various phases of your script to flag a bug.
  4. Check the script's syntax carefully. Use the debugger that comes with Perl.
  5. Check the HTML output of the script for syntax errors.
  6. Find out if the form is returning the right data. Use a test HTML document for this.
  7. Find the exact location of the error in the script.

Some of these points are explained in greater detail below.
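Steps 1 and 3 can be sketched with a small checkpoint routine. This is only an illustration: the subroutine name checkpoint, the %order data, and the messages are made up for the example, and the die line is left commented out until you want the script to stop at that point.

```perl
use strict;
use warnings;

$| = 1;    # flush output immediately so the markers appear in order

# Print a numbered marker and return it; the last marker seen shows how far the script got.
my $checkpoint = 0;
sub checkpoint {
    my ($label) = @_;
    $checkpoint++;
    my $message = "checkpoint $checkpoint: $label";
    print "$message<BR>\n";
    return $message;
}

print "Content-type: text/html\n\n";
checkpoint("script is running");

my %order = (Payment => "Visa");    # hypothetical stand-in for the real work of the script
checkpoint("order data loaded");

# To stop at a known point while testing, uncomment the next line:
# die "Stopped after loading order data\n";
```

If the first marker never appears, the script is probably not being started at all, and the checklist in the next section applies.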

Is the Script Running?

Although it is usually quite obvious whether a script is running or not, with the CGI it can be a little tricky. The problem lies in the fact that usually you will get the same error message from a bad script as you will from a script that is not running at all. Run through this checklist to see:

  1. Does the program file exist? Is the wrong name given in the <form> tag? Is the wrong URL used?
  2. Are the file permissions set properly? The file must have execute permission set for all users.
  3. Is the program in the right directory?
  4. Is the file extension associated with the file? Neither .pl nor .cgi is a standard file extension with Windows NT.
  5. Is the Perl interpreter in the right directory? The Perl interpreter has to be in the same, or higher level, directory to find the Perl script that called it. Check the security announcement in Appendix E before you move it, though, or you could be in for big trouble.
  6. Is the syntax correct? Perl always checks a script's syntax before it runs a script, and will shut down the script at the error, if an error is found.
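The first few items on this checklist can be tested from Perl itself with its file test operators. The sketch below is a rough aid, not a complete check: the subroutine name check_script is invented for the example, and it does not test execute permission or file associations, which depend on how your system and server are set up.

```perl
use strict;
use warnings;

# Report the first problem found with a script file, or "ok" if none is found.
sub check_script {
    my ($path) = @_;
    return "file not found"    unless -e $path;   # does the file exist?
    return "file not readable" unless -r $path;   # can this user read it?
    return "file is empty"     unless -s $path;   # does it have any contents?
    return "ok";
}

# Check the script that is currently running; substitute the path to your own script.
print "$0: ", check_script($0), "\n";
```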

Whatever your bug hunt may entail, always remember to check Event Viewer to see if any specific data concerning the error is listed there. An example of an error in Event Viewer might look like Figure 9.1.

Figure 9.1 : An example Event Viewer error message.

Finding the Bad Script

If you are using several scripts together on a project, you need to first determine which one is causing you the problem. To do this, you must first isolate each script and run it individually. Once you find the bad script, you can add print statements to it at various spots throughout the program to help find where it breaks down. This technique is illustrated later with a full example.

Debugging Perl Scripts

Debugging your Perl scripts involves several steps, from the obvious syntax check to the less obvious checks of Perl's sometimes confusing punctuation and its different variable signifiers.

Correcting the Syntax

The most obvious errors in scripts are syntax errors. This could be an error in the language, format, or punctuation, causing the script to fail.

To check the syntax of your Perl script you can go to the Perl interpreter command line and type in

perl -c script_name.pl

where the -c switch tells the Perl interpreter to compile the script, report any syntax errors, and then exit without running it. Because none of the script's commands are executed, other scripts cannot interfere with the check. (This switch is not the same as the debugger's "c," or continue, command listed in Table 9.1.)

When Perl does find an error it will print out to the screen something like this:

syntax error at return.pl line 15, near "print"
return.pl had a compilation error.
Exit -1

which tells us both the line number of the problem (line 15 in this case) and the area in the line where the problem is (near the print statement).
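The same check can be run from inside another Perl script, which is handy if you want to test several scripts in one pass. The sketch below writes a deliberately broken two-line script to a temporary file and runs perl -c on it; the broken script is invented for the example, and the exact wording of the message varies between Perl versions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a deliberately broken script: the first print has no semicolon.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print $fh "print \"hello\"\nprint \"world\";\n";
close $fh;

# $^X holds the path of the perl that is running this script.
my $perl   = $^X;
my $report = `"$perl" -c "$path" 2>&1`;   # -c compiles the file without running it
my $ok     = ($? == 0);

print $ok ? "syntax OK\n" : "syntax check failed:\n$report";
```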

The Perl debugger does not print automatically to a file; however, you could accomplish this by copying the debugger, perldb.pl (found in the Perl library), and then modifying the copy to include an output to a file. To activate this debugger instead of the standard one, change the environmental variable PERLDB to a perl command like

require 'myperldb.pl';

so that this new debugger is now used to check scripts.

Sometimes you will receive more than one syntax error from Perl:

syntax error at long.pl line 43, near "rename"
syntax error at long.pl line 86, near "print"
syntax error at long.pl line 146, near "else"
long.pl had compilation errors.
Exit -1

This doesn't necessarily mean that you have three syntax errors to fix. Remember, the Perl interpreter runs through your script from the first line to the last, so check the first error first. It may be causing the others by creating false data. After you correct it, test the script again. If you still have an error, fix the next problem and test it again, and so on, down the line.

Environmental Variables

It is important to check which environmental variables are available to you, because a script that relies on variables your system does not supply will not behave as expected. To do this, type "env" at the command line. This will produce a list of all the environmental variables available to your scripts from the command line. Check this list against the environmental variables you are calling from your script.
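The comparison can also be done by the script itself through Perl's built-in %ENV associative array. In this sketch the subroutine name missing_vars and the list of required variables are assumptions chosen for illustration; substitute the variables your own script reads.

```perl
use strict;
use warnings;

# Return the names from @wanted that are not set in the given environment.
sub missing_vars {
    my ($env, @wanted) = @_;
    return grep { !defined $env->{$_} } @wanted;
}

my @needed  = qw(REQUEST_METHOD QUERY_STRING SERVER_NAME);   # hypothetical list
my @missing = missing_vars(\%ENV, @needed);

print @missing ? "Not set: @missing\n" : "All required variables are set\n";
```

Run from the command line, a script like this will usually report the CGI variables as not set, which is a reminder that they exist only when the server starts the script.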

The Perl Debugger

Perl comes equipped with its own debugger, perldb.pl, which is invoked by using the -d switch at the Perl command line when running perl.exe:

perl -d script_name.pl

When the Perl debugger is running, it goes through the script specified with script_name.pl line by line. After the debugger processes a line of the script, it will prompt you for input. You can proceed with the next line, use any Perl command, or use any number of debugger commands. The standard debugger commands appear in Table 9.1. These debugger commands control the way the debugger works on your script.

Table 9.1 Perl Debugger Commands

Category        Debug Command           Function
Line Controls   s                       Steps to the next line in the script
                n                       Goes to the next statement, skipping subroutines
                c line_number           Signifies a one-time breakpoint at "line_number" to which the debugger continues until it is reached
                <CR>                    Repeats the last occurrence of the "s" or "n" command
                /regular_expression/    Finds the pattern specified, searching forward in the script
                ?regular_expression?    Finds the pattern specified, searching backward in the script
                A                       Removes all line actions
Listing         h                       Lists the debug commands
                V package variables     Lists the specified variables for the specified package, defaulting to the current package
                X variables             Similar to the V command, but applied to the current package
                f filename              Makes a switch to the file name specified and begins listing it
                -                       Lists the previous window of script
                w line                  Lists the window around the line number specified
                l min-max               Lists the script lines in the range specified in min-max
                l                       Lists the next window of script
                l line                  Lists the script line specified
                l subroutine_name       Lists the script comprising the subroutine specified
                L                       Lists all the script lines that have breakpoints or actions
                S                       Lists the names of all script subroutines, prefixing them with the package in which they are declared
Declarations    b line_condition        Declares a breakpoint in the debug search
                b subroutine_condition  Declares a breakpoint at the beginning of a subroutine
                a line_command          Declares an action for the specified lines
                < command               Declares an action to occur before each debugger prompt
                > command               Declares an action to occur after each debugger prompt
Miscellaneous   ! number                Repeats a debugging command
                ! -number               Repeats the debugging command that occurred the specified number of debug commands previously
                H -number               Shows the last specified number of debug commands
                q or ^D                 Quits
                command                 Executes the specified command like a Perl statement, no semicolon necessary
                p                       Prints
                T                       For tracing stacks; lines starting with "$ =" are called in a scalar context, those starting with "@ =" are called in an array context
                r                       Returns you from the present subroutine and executes all statements until the end of that subroutine
                t                       Turns the trace mode on or off
                d line                  Deletes the specified breakpoint
                D                       Deletes all previous breakpoints

NOTE
Some of the programming terms used in Table 9.1, like package, need an explanation. A package is an area of code in a script that protects its privacy. Variables and subroutines in a package are local to it only, like a script within a script. A breakpoint is a place in the debugging process where the debugger stops, or breaks, from its debugging. A window is the actual space supplied by perl.exe to view script lines in the Perl window. An action is the event caused by a command.

Common Perl Errors

Each language has its own peculiarities, and this holds true with its bugs, too. At times, Perl uses some strange punctuation, so checking this carefully is very important. This and other typical errors are outlined below.

Command Syntax Errors:

Perl Punctuation Errors:

Variable Names Errors:

Strings and Numbers Errors:

These various errors are among the most common to appear in Perl scripts. The most frequently encountered errors are explained in detail below.

Command Syntax Errors

Perl uses many different operators, some of which can be used in any loop or subroutine, some of which cannot. Pay careful attention to those operators by understanding what Perl control structures can use which commands.

As an example, Perl has two operators that are easily confused: "=" and "==". The "=" symbol is the assignment operator, which is used to assign a value to a variable. The "==" symbol is the equality operator, which is used in an if statement to check the equality between two numbers. A simple typing mistake can make this error quite common in your scripts. The tricky element to this error is that, most of the time, it does not cause the script to fail, only to produce unusable data.

To check for this type of error, take note of what is going wrong with the script in question. If the script is always treating an if statement as being true, or if a variable's value changes unexpectedly after a comparison, then you probably have a switched assignment and/or equality operator.
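Both symptoms can be reproduced in a few lines. The variable names and values below are invented for the illustration; the second if statement contains the deliberate mistake.

```perl
use strict;
use warnings;

my $expected = 1;

# Correct: "==" compares the two numbers and leaves $status alone.
my $status = 3;
my $compared = "no";
if ($status == $expected) { $compared = "yes"; }

# Mistake: "=" copies $expected into $typo_status, and the copied value (1) is true.
my $typo_status = 3;
my $assigned = "no";
if ($typo_status = $expected) { $assigned = "yes"; }

print "with ==: matched=$compared, status is still $status\n";
print "with = : matched=$assigned, status has become $typo_status\n";
```

The second test reports a match even though 3 is not equal to 1, and the variable's value has changed as a side effect.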

Perl Punctuation Errors

The sometimes strange punctuation in Perl often causes mischief unintentionally. This is a list of Perl's punctuation and the typical problems concerning their use.

Variable Names Errors

Perl makes use of many types of symbols to designate the different kinds of variables, "$" for scalars, "@" for arrays, "%" for associative arrays, and so forth, so checking that these signifiers are used correctly is very important. Using brackets, "[]", with arrays, and curly braces, "{}", with associative arrays is also a must, but sometimes these are switched and cause problems.
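A short reminder of how the signifiers and brackets pair up may help here. The data is made up for the example; note that a single element of either an array or an associative array is a scalar, so it is written with "$".

```perl
use strict;
use warnings;

my @carriers = ("Air", "Ground", "Courier");     # array: "@"
my %standard = (Pay => "Visa", Del => "Air");    # associative array: "%"

my $first   = $carriers[0];       # one array element: "$" with square brackets
my $payment = $standard{Pay};     # one associative array value: "$" with curly braces
my $total   = scalar @carriers;   # the number of elements in the array

print "$first $payment $total\n";
```

Writing $carriers{0} or $standard[0] by mistake refers to a different variable altogether, which is why the switch is easy to miss.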

Strings and Numbers Errors

Perl uses different operators for strings and numbers, always taking its lead from these operators on how to handle the text; as a string or as a number. Check the string and numeric operators table in Appendix A to make sure the ones in your script are correct.
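The difference is easiest to see by applying both kinds of operator to the same values. In this sketch the subroutine name compare_both and the sample values are assumptions for illustration; the "numeric" warnings are switched off inside the subroutine only so the demonstration runs quietly.

```perl
use strict;
use warnings;

# Compare two values as numbers (==) and as strings (eq); return both results.
sub compare_both {
    my ($x, $y) = @_;
    no warnings 'numeric';    # text values would otherwise trigger "isn't numeric" warnings
    my $numeric_equal = ($x == $y) ? 1 : 0;
    my $string_equal  = ($x eq $y) ? 1 : 0;
    return ($numeric_equal, $string_equal);
}

for my $pair (["Air", "Ground"], ["10", "10.0"], ["abc", "abc"]) {
    my ($num, $str) = compare_both(@$pair);
    print "$pair->[0] vs $pair->[1]: == gives $num, eq gives $str\n";
}
```

Text that does not begin with a number is treated as 0 by the numeric operators, so "Air" and "Ground" compare as equal with "==" even though they are different strings.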

Debugging the CGI

The problem with your script may not be one that causes the script to fail to execute, but that when it executes it supplies the wrong data to the CGI. These problems can be even more frustrating because they involve several levels outside of the script, based on client/server dynamics, such as having the wrong MIME header, or HTML tags.

Checking HTML Output

The problem may not be with Perl at all, but with the HTML tags inside the Perl script. When a server returns an error like this:

This server has encountered an internal error which prevents it from fulfilling your request.

it might not be that your Perl script isn't running, which you should check first, but that the HTML is interfering. If your script is running fine, then look over your HTML document for misused or conflicting tags; missing "<" or ">" symbols; or even unsupported HTML tags.

Use a browser that displays incorrect HTML to find the problem areas. Some common HTML errors are as follows:

If you can't find a browser that will display HTML errors, then you can use the MIME header

Content-type: text/plain

to force the browser to display the HTML source as plain text instead of rendering it. If you are familiar with telnet you can also use this protocol to find out if HTML is your problem.

Using telnet, you can view the HTML document and not use the browser at all. The first step is to begin a telnet session with your HTTP service. You can do this by using the command

telnet your_site_name.com 80

where your_site_name is the name of the site with the HTML document to be checked and 80 is the HTTP port. The HTTP port may be different, but it is usually 80. Once a connection has been established, use the GET command to find the HTML document, like this:

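GET /perl/directory/the_file HTTP/1.0

where /perl/directory/ is the pathname and the_file is the Perl script containing the HTML document being tested. Press Enter twice after typing the command, because the server waits for a blank line to mark the end of the request.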

This command will cause the Perl script to execute and the HTML to print out line by line on the telnet terminal screen.

The MIME Header

As mentioned in the chapter on the CGI, MIME specifications are very important in determining the type of output. Having the right MIME header sent by your script can determine its success or failure, so close attention must be paid here. The MIME header should look like this:

Content-type: text/html

<HTML>
<HEAD>
the rest of your HTML document...

A common error is to omit the blank line between the MIME header, "Content-type:", and the first HTML tag, "<HTML>". Everything up to the first blank line is treated as header information, which is why the scripts in this chapter print the header as "Content-type: text/html\n\n". Remember to include at least one line of text, like the <HTML> tag, after the MIME header when you are testing it, because some browsers will read only MIME headers that are followed by text.
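One way to avoid header mistakes is to build the header and the page in a single place, as the scripts in this chapter do with print "Content-type: text/html\n\n". The subroutine name cgi_page and the page contents below are invented for this sketch.

```perl
use strict;
use warnings;

# Return a complete CGI response: the MIME header, a blank line, then the HTML.
sub cgi_page {
    my ($title, $body) = @_;
    return "Content-type: text/html\n\n"      # the second \n produces the blank line
         . "<HTML><HEAD><TITLE>$title</TITLE></HEAD>\n"
         . "<BODY>$body</BODY></HTML>\n";
}

print cgi_page("Header Test", "<H2>The header was accepted.</H2>");
```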

Problems with User Data

There may be a problem with the data being returned to the Perl script from the user, be it from an HTML form or a QUERY_STRING environmental variable. To check this data you can pass it through a simple Perl script like this:

#!/usr/bin/perl
# datarc.pl
MAIN: {
     print "Content-type: text/html\n\n";
     print "<HTML><HEAD><TITLE>Return Data Display</TITLE>";
     print "</HEAD><BODY>";
     # $key and $value are used here because Perl reserves $a and $b for sort
     while (($key,$value) = each %ENV) {
          print "$key=$value<BR>\n";
     }
     print "</BODY></HTML>";
     exit 0;
}

which will list all the properties of the CGI environment. A typical response from this script might include these environmental variables:

OS=Windows_NT
GATEWAY_INTERFACE=CGI/1.1
DOCUMENT_ROOT=http/lib/html
REMOTE_ADDR=134.56.103.1
SERVER_PROTOCOL=HTTP/1.0
REQUEST_METHOD=GET
SCRIPT_NAME=data_return_check.pl
SERVER_NAME=www.atlantis.com 

where all the environmental variables involved with the server, CGI, and script are returned (a list that is actually much longer than the example here).

Take special note of the following returns:

Each of these returns, as well as the other environmental variables returned, can provide debugging help if you know what to look for.

Checking Name/Value Pairs

Debugging also can be accomplished by splitting key/value pairs in an associative array. These data pairs, which are called name/value pairs in the CGI, are used to move information from the client to the server via an HTML form.

For a thorough check of user input you can examine the associative array into which this data is stored. By calling up the key/value pairs, or the name/value pairs, of the associative array you can check each element. You can call this information by splitting the CGI name/value pairs and then displaying the data. Please note that this is just an excerpt from a script. This should be used at the beginning of your CGI script, after print "Content-type: text/html\n\n"; but before the script does anything. Otherwise, insert the #!/usr/bin/perl line and all the HTML form tags at the beginning of this snippet. To split the name/value pairs you can use this script snippet:

foreach $key (keys(%ASSOC)) {
     print "$key = $ASSOC{$key}\n";
}

where the user data has been stored in the associative array %ASSOC, and each key of this associative array is put, in turn, into the scalar variable $key.

To display the name/value pairs in the array, another snippet of Perl (with the same conditions as the above snippet) is effective:

print "<html><head><title>\n";
print "Name/Value Listing\n";
print "</title></head><body><h2>\n";
while (($key,$value) = each %form_data) {
     print "$key = $value<BR>\n";
}
print "</h2></body></html>\n";

where each name/value pair will be presented as an equation with the key/name on the left side and the value on the right in an HTML document, as shown in Figure 9.2.

Figure 9.2 : A return of name/value pairs.

The Perl Command Line

By accessing your Perl script on the command line of perl.exe, you can test your script in the CGI environment without having to involve the HTTP server you are using.

How to Test without an HTTP Server

It is better to test your script without involving your HTTP server so that you can find the precise place where your script breaks down. You also will be able to see the error messages returned by the CGI in their entirety; view the whole of the script's HTML output even if it isn't written properly, or if it is missing the necessary MIME headers; and be able to sidestep any errors originating in your HTTP server.

You can test the request for data cycle that your script creates at the command line if you are using the GET method instead of the POST method on your HTML form. When you are using the GET method, then the environmental variable REQUEST_METHOD should have the value GET and QUERY_STRING should have the value of the user specified data.

The key to testing your GET method is simulating the way data used for the test is being sent to the HTTP server in QUERY_STRING. To do this, you place an "&" symbol between all variable/value pairs, and the "=" symbol between the variables and their values. This data is what is held in the QUERY_STRING for the test. To create a sample of this modified data, like that used in Figure 9.1, you would take the following form-entered data

First Name: Pierre
Last Name: Burton
Street Address: 32 Pitty-pat Lane
City: Hope
State: British Columbia
Telephone: 604 555 6467
Occupation: Living Legend

and add the "&" and "=" symbols to get these variable values:

REQUEST_METHOD = GET
QUERY_STRING = First+Name=Pierre&Last+Name=Burton&Street+Address=32+Pitty-pat+Lane&City=Hope&State=British+Columbia&Telephone=604+555+6467&Occupation=Living+Legend

where you will notice that each space in the form's reply is replaced by a "+" symbol in the QUERY_STRING.
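Building such a test string by hand is error-prone, so it can help to let Perl do the encoding. The sketch below is a simplified encoder written for this illustration (the names url_encode and build_query_string are not from any library): spaces become "+", and any other character outside a small safe set is written as "%" followed by its two-digit hexadecimal code.

```perl
use strict;
use warnings;

# Encode one name or value the way a form submission does (simplified).
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9_.\- ])/sprintf("%%%02X", ord($1))/ge;   # escape special characters
    $text =~ tr/ /+/;                                               # spaces become "+"
    return $text;
}

# Join a flat list of name, value, name, value, ... into a query string.
sub build_query_string {
    my @fields = @_;
    my @pairs;
    while (@fields) {
        my ($name, $value) = splice(@fields, 0, 2);
        push @pairs, url_encode($name) . "=" . url_encode($value);
    }
    return join("&", @pairs);
}

print build_query_string("First Name" => "Pierre", "Occupation" => "Living Legend"), "\n";
```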

To give QUERY_STRING the test value, use the setenv operator with double quotes, so the example would then look like this:

setenv REQUEST_METHOD GET
setenv QUERY_STRING "First+Name=Pierre&Last+Name=Burton&Street+Address=32+Pitty-pat+Lane&City=Hope&State=British+Columbia&Telephone=604+555+6467&Occupation=Living+Legend"

where you will notice the double quotes, which keep the command shell from treating the "&" symbol as a special character in this line.

To see if you have succeeded in adding these two environmental variables you can type "setenv" by itself on the command line. This will produce a full list of all the environmental variables currently available, including the two you just added.
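If setting variables in the shell is awkward on your system, the same simulation can be done inside a small Perl test script by assigning to %ENV before the parsing code runs. This is a sketch under that assumption; parse_query is a hypothetical helper written for the example, not part of Perl.

```perl
use strict;
use warnings;

# Split a query string into an associative array of decoded names and values.
sub parse_query {
    my ($query) = @_;
    my %form;
    for my $pair (split /&/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = "" unless defined $value;
        for ($name, $value) {
            tr/+/ /;                                 # "+" stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;     # %XX stands for one character
        }
        $form{$name} = $value;
    }
    return %form;
}

{
    # Simulate what the HTTP server would place in the environment for a GET request.
    local $ENV{REQUEST_METHOD} = "GET";
    local $ENV{QUERY_STRING}   = "First+Name=Pierre&Last+Name=Burton&City=Hope";

    my %form = parse_query($ENV{QUERY_STRING});
    print "$_ = $form{$_}\n" for sort keys %form;
}
```

Because the assignments are made with local, the simulated values disappear at the end of the block and do not affect the rest of the script.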

The Server Error Log

Checking your HTTP service's error log can provide another source of information regarding script failures with the CGI. This log records every error that occurs between client and service on your server. Not all of these will be CGI errors, so look for the listings that include the request method GET or POST; these are the CGI errors. Watch out for the ever-increasing size of error logs, such as the HTTP service error log. These can use up a lot of disk space very quickly if not kept under control, so always keep a close watch on their size. Depending on how much disk space you have available, logs should not take up more than five percent of your storage space. If you want to keep log data, you can store it in compressed form, like a .zip archive.

For checking the error logs in NT you use Event Viewer to view HTTP service logs. For example, the EMWAC HTTP service, or https, logs its events in the Application Event Log, which can be viewed using the Event Viewer. All the different errors that occur are listed here, from I/O failures to system calls that run out of resources. Any client errors are listed as Warning events.

There are four revealing pieces of information in the error messages. These are the date and time of the error; the name of the client that made the request that failed; the kind of error that occurred; and the request method, GET or POST, that was used with the error. Taken individually, these log listings are not really helpful. But if you have a large number of them you can look for patterns in the listings to see if certain client locations cause the errors, a certain time of day might have something to do with it, and so forth. All these are helpful clues in tracking down the source of your problems.
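Looking for patterns is a counting job that Perl handles well once the log entries are available as lines of text. The sketch below assumes a made-up layout of date, time, client, request method, and message separated by spaces; entries in your own service's log will be laid out differently, so the field positions are assumptions to adjust.

```perl
use strict;
use warnings;

# Hypothetical log lines: date, time, client, request method, message.
my @log_lines = (
    "03/04/96 09:15:02 client1.example.com GET script timed out",
    "03/04/96 09:15:40 client2.example.com POST malformed header",
    "03/04/96 14:02:11 client1.example.com GET script timed out",
    "03/05/96 09:20:05 client1.example.com GET script timed out",
);

# Count entries by one field (0 = date, 2 = client, 3 = request method).
sub count_by_field {
    my ($index, @lines) = @_;
    my %count;
    for my $line (@lines) {
        my @fields = split ' ', $line, 5;
        next if @fields < 5;              # skip lines that do not fit the assumed layout
        $count{ $fields[$index] }++;
    }
    return %count;
}

my %by_client = count_by_field(2, @log_lines);
for my $client (sort { $by_client{$b} <=> $by_client{$a} } keys %by_client) {
    print "$client had $by_client{$client} errors\n";
}
```

Counting by field 0 or field 3 in the same way shows whether the errors cluster on particular days or on one request method.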

Using the Print Operator

This method may appear to be slow, but can actually help you find the part of your script that is causing the rest to break down. This technique is especially useful in larger scripts where variable data changes are quite extensive. This method also allows you to test the script in the CGI environment without having to access the command line to do it.

When a script runs fine on your own computer, but fails when put through the CGI, you can insert "print" commands throughout the script in question. To illustrate we'll use a Perl script with an error in it. (The error is indicated in parentheses.)

if ($order{"Payment"} gt " ") {
     if ($order{"Payment"} ne $standard{"Pay"}) {
          $match = 0;
     } # this section matches type of payment
}
if ($order{"Product"} gt " ") {
     if ($order{"Product"} ne $standard{"Prod"}) {
          $match = 0;
     } # this matches the product ordered
}
if ($order{"Delivery"} gt " ") {
     if ($order{"Delivery"} (!=) $standard{"Del"}) {
          $match = 0;
     } # this matches the delivery type
}

This example script is matching the method of payment, product name/number, and delivery type of a customer order with specified parcel carriers held in the database accessed by $standard, from an associative array %standard.

To check what is causing this error, you would first make a copy of the script. Then, with this copy, insert a print command after each if block (remembering to add HTML tags, too) in the script, like this

if ($order{"Payment"} gt " ") {
     if ($order{"Payment"} ne $standard{"Pay"}) {
          $match = 0;
     } # this section matches type of payment
}
print "<HTML><H2>Payment Match is $match<BR>";
if ($order{"Product"} gt " ") {
     if ($order{"Product"} ne $standard{"Prod"}) {
          $match = 0;
     } # this matches the product ordered
}
print "Product Match is $match<BR>";
if ($order{"Delivery"} gt " ") {
     if ($order{"Delivery"} != $standard{"Del"}) { # the faulty operator
          $match = 0;
     } # this matches the delivery type
}
print "Delivery Match is $match<BR></H2></HTML>";

The value of $match will be printed after each block. A value of "1" means a true value, or a match, and a value of "0" means a false value, or no match. If you run the above script you will get this output:

Payment Match is 1
Product Match is 1
Delivery Match is 0

so you know the problem lies in the third block of this section. The "!=" numeric operator was used in the third block instead of the correct "ne" string operator. Because "!=" compares its operands as numbers, it cannot reliably tell two text values apart.
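Once the faulty operator has been found, the three checks can share one subroutine so that the same string comparison is used every time. This is a sketch only: the subroutine name field_matches and the sample order data are invented, and it reports each field separately instead of keeping a single running $match.

```perl
use strict;
use warnings;

# Return 1 when a field matches (or was left blank), 0 when it differs.
sub field_matches {
    my ($ordered, $expected) = @_;
    return 1 unless defined($ordered) && $ordered gt " ";   # blank fields are not checked
    return ($ordered ne $expected) ? 0 : 1;                  # "ne" compares the values as strings
}

# Sample data, invented for this illustration.
my %order    = (Payment => "Visa", Product => "A100", Delivery => "Ground");
my %standard = (Pay     => "Visa", Prod    => "A100", Del      => "Air");

my @checks = (["Payment", "Pay"], ["Product", "Prod"], ["Delivery", "Del"]);
for my $check (@checks) {
    my ($order_key, $standard_key) = @$check;
    my $match = field_matches($order{$order_key}, $standard{$standard_key});
    print "$order_key Match is $match<BR>\n";
}
```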

Perl Scripts for Debugging

To help you in your search for bugs, these Perl scripts will work through your script and present HTML documents with their results. The various tests they perform on your script are described with the script listing.

Finding the Environment

This script will print out the environmental variables used by a script to an HTML document.

#!/usr/bin/perl
# envtest.pl
MAIN: {
     print "Content-type: text/html\n\n";
     print "<HTML><HEAD><TITLE>List of Environmental Variables</TITLE></HEAD>";
     print "<BODY><H2>Environmental Variables Available to Your Script</H2>";
     print "<H3><UL>";
     while (($key,$value) = each %ENV) {
          print "<LI>$key = $value\n";
     }
     print "</UL><BR>";
     print "These are all the Environmental Variables available.</H3></BODY></HTML>";
     exit 0;
}

Finding the GET Values

To find and display all the variables sent by a form request using the GET method, use this script. The results are put into an HTML document.

#!/usr/bin/perl
# gettest.pl
MAIN: {
     print "Content-type: text/html\n\n";
     print "<HTML><HEAD><TITLE>List of GET Variables</TITLE></HEAD>";
     print "<BODY><H2>Variables Sent Using GET</H2>";
     print "<H3><UL>";
     $form_request = $ENV{'QUERY_STRING'};
     @fields = split(/[&=]/, $form_request);
     # splits name/value pairs into a flat list
     foreach (@fields) {
          tr/+/ /;
          s/%([0-9A-Fa-f]{2})/pack("C",hex($1))/ge;
     } # converts URL encoding to ASCII
     %pairs = @fields;
     # stores the decoded names and values as pairs
     while (($key,$value) = each %pairs) {
          print "<LI>$key = $value\n";
     }
     print "</UL><BR>";
     print "These are all the GET variables sent.</H3></BODY></HTML>";
     exit 0;
}

Finding the POST Values

To find and display all the variables sent by a form request using the POST method, use the following script. The results are put into an HTML document.

#!/usr/bin/perl
# posttest.pl
MAIN: {
     print "Content-type: text/html\n\n";
     print "<HTML><HEAD><TITLE>List of POST Variables</TITLE></HEAD>";
     print "<BODY><H2>Variables Sent Using POST</H2>";
     print "<H3><UL>";
     read(STDIN, $form_request, $ENV{'CONTENT_LENGTH'});
     # reads the POST data from STDIN; CONTENT_LENGTH defines how many bytes to read
     @fields = split(/[&=]/, $form_request);
     # splits name/value pairs into a flat list
     foreach (@fields) {
          tr/+/ /;
          s/%([0-9A-Fa-f]{2})/pack("C",hex($1))/ge;
     } # converts URL encoding to ASCII
     %pairs = @fields;
     # stores the decoded names and values as pairs
     while (($key,$value) = each %pairs) {
          print "<LI>$key = $value\n";
     }
     print "</UL><BR>";
     print "These are all the POST variables sent.</H3></BODY></HTML>";
     exit 0;
}

Finding the Debugging Information

Similar to the debugging method of inserting print statements into your script, this bug report subroutine, when inserted into the script you are debugging, lets you call for a report wherever you need one and switch all of the reports on or off in one place. Set the value of $debug to 1 while you are debugging.

When you have finished debugging your script, you just set the value of $debug to 0, effectively shutting it off.

sub bugreport {
     if ($debug == 1) {
          print "Debug Report: ";
          print @_; # prints whatever was passed to the subroutine
          print "<BR>\n";
     }
}
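A short usage sketch may make the idea clearer. This variant of the subroutine also returns the line it prints, which is an addition made here so the result can be checked; the messages and the $match variable are invented for the example.

```perl
use strict;
use warnings;

our $debug = 1;    # set to 0 to switch every report off

# Print a report line when debugging is on; return the line (or "" when off).
sub bugreport {
    return "" unless $debug;
    my $line = "Debug Report: " . join(" ", @_) . "<BR>\n";
    print $line;
    return $line;
}

my $match = 1;
bugreport("after payment check, match =", $match);

$debug = 0;
bugreport("this line is not printed");
```

Because every report goes through one subroutine, the calls can stay in the finished script at little cost once $debug is set to 0.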

Conclusion

Although there are many debugging techniques explored in this chapter, the best offense is always a good defense. If you develop good script writing habits to begin with, your troubles will be few and far between. This development includes making use of proper documentation in your scripts. If you are unsure whether or not a note needs to be made, make it. It will take very little time and space now, and may be invaluable in the future. If you never have to use these notes, then you are luckier than most. The programmers who come after you, who will have to work with your script, will also be appreciative of your extra work.

It is also important to give yourself some time if you can't find the bug. If you can, staying away from it for a couple of days is best, but even a few hours rest will give you a better, fresher perspective on the problem. To do this takes proper planning. Always plan ahead and reserve some of your programming time to debug. Just as writers have to incorporate editing time into their deadlines, programmers have to set aside enough time to debug. When making these plans, remember to involve any members of your team, or department, who are needed for the project, or will need to know your deadlines, as it may affect their deadlines. Don't be afraid to ask for assistance. If another programmer has already had a similar problem, then you can fix your problem that much faster.

Although at times it may seem that you are expected to know everything, you don't have to, because knowing where to get help is more valuable. If a problem persists, have someone else who knows Perl read through your script. Finally, if a section is really causing you problems, a final step might be just to delete it and write it over again. This time around you might eliminate the error, or you might make one easier to find.