Chapter 4

HTML and Perl


In this chapter we begin to apply what you are learning in Perl to HTML. With this we can start to integrate Perl into your Web server environment.

Using the basic guestbook script from the last chapter, we can begin to integrate Perl scripts with HTML. You will begin to understand how Perl works with the Common Gateway Interface (CGI). In this chapter you will also cover more details involved in Perl programming: how Perl processes a text file, and how it then manipulates it.

Before we get to these topics though, there are a few final areas of Perl programming that you should be introduced to: specifically, user functions (or subroutines), and more control structures, such as the last and next operators.

User Functions

When anything happens in Perl it is called an action. One of the actions that takes place in a script is when variables are modified by operators. These operators are part of the methods, or functions, which Perl provides. We have already touched on a few of Perl's functions (the operators print and chop) that are standard issue Perl system functions. A system function's action is defined by the Perl source code, but Perl also has other functions which do not have a predetermined action. These are called user functions. Unlike the print system function, which is defined by Perl, user functions are defined by you, the programmer. Another name for the Perl user function is a subroutine, or sub.

Subroutines

Here is an example of a general format for a user function:

sub subroutine_name {
statement_1;
statement_2;
statement_3;
}

where subroutine_name is the name of the user function, or subroutine, and the statements between the braces define what the user function does.

To create a subroutine called hey_now, you might write some code that looks like this:

sub hey_now {
print "Hey, now!\n";
}

where the print statement will be invoked when the subroutine hey_now is used.

This subroutine acts the same as using the print statement

print "Hey, now!\n";

and can be used, or invoked, in the active part of the script by using the "&" symbol, like so:

&hey_now;

I'll bet you are asking, "Why go to all the trouble of creating a subroutine, when you could just as easily give a variable, like $hey, the value of 'Hey, now!\n' and 'print $hey' where it is needed?" This next example illustrates the advantages.

&scan;
sub scan {
    foreach $file (<*>) {               # Get each file in the directory
        if (-d $file) {                 # check to see if it is a directory
            chdir($file);               # Change to it
            &scan;                      # Do the scan again
            chdir('..');                # Change back...
        }
        elsif ($file =~ /\.txt$/i) {    # Standard stuff to change each filename
            $newfile = $file;
            $newfile =~ s/\.txt$/.htm/i;    # only a trailing .txt is replaced
            rename($file, $newfile);
        }
    }
}

What does this do? Well, it uses a subroutine and recursion to change the foo.txt files in a directory (and all of its subdirectories) to foo.htm. It performs the scan() on the current directory, getting all the files. If the file turns out to be a directory, it will change the current directory to that one, execute scan() again, and change back. This will ensure that all the subdirectories are scanned. Every time you need to perform this scan all you need to do is to call up the subroutine.

While a variable can only hold a value, or series of values, a subroutine can take those variables and perform endless actions on them. You do not have to type out the entire action again. Proper use of subroutines is invaluable for this reason alone.

Subroutine Guidelines

It is important to know how Perl deals with subroutines in your scripts. The following sections outline how Perl handles subroutines:

Subroutines and Return Values

A subroutine in Perl is always invoked as part of some kind of expression. When a subroutine is invoked in an expression, it produces a value that is called a return value. The return value of a subroutine is the value of the last expression evaluated in the subroutine each time it is invoked. The following example is an illustration:

sub two_variables {
    $P + $Q;
}

where the subroutine two_variables is given the simple expression of adding two scalars. This value will be the return value. When used in a script it might look like this:

sub two_variables {
    $P + $Q;
}
$P = 1;
$Q = 2;
$S = &two_variables; # now $S has value of 3
$R = 3*&two_variables; # $R has value of 9

NOTE
While it is recommended that you put your subroutines at the end of your scripts, I am placing them at the beginning for tutorial demonstration purposes only.

When the last expression in a subroutine is a list and the subroutine is assigned to an array, you get a list of return values:

sub two_variables {
    ($P, $Q);
}
$P = 10;
$Q = 1;
@S = &two_variables;

where @S now has the value of (10,1).

It is important to remember that the last expression evaluated may not be the last expression in the subroutine, as in this example:

sub yes_or_no {
    if ($yes > 0) {
        print "You win!\n";
        $yes;
    } else {
        print "You lose!\n";
        $no;
    }
}

where "You win!" is printed if $yes holds any value higher than 0. The return value depends on whether $yes or $no is evaluated last, for this becomes the return value. If a print statement were the last expression evaluated instead, the return value would be 1, the value of a successful print function.
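To see the "last expression evaluated" rule at work, here is a minimal sketch. The names check_score and $score are made up for this illustration, and the second branch deliberately ends with a print so that the value of print becomes the return value.

```perl
# Hypothetical example: check_score and $score are made-up names.
sub check_score {
    if ($score > 0) {
        $score;                 # evaluated last when $score is positive
    } else {
        print "You lose!\n";    # evaluated last otherwise
    }
}

$score = 5;
$won = &check_score;            # $won gets the value of $score
$score = 0;
$lost = &check_score;           # $lost gets the value returned by print
print "won=$won lost=$lost\n";
```

The same subroutine hands back two quite different things depending on the path taken, which is why it pays to finish each branch with the value you actually want returned.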

What is more valuable in Perl than using global variables in our subroutines is using different values in each subroutine, each time they are invoked. To do this we use arguments.

Subroutine Arguments

The way in which subroutines use arguments demonstrates a clever use of the default variables in Perl because it avoids the pitfall of changing the values of already established variables used elsewhere in your script.

What if you took a subroutine and invoked it in front of a list, for example? This list would then be automatically placed into the "@_" variable for the entire time the subroutine is in operation. This gives your script greater flexibility, because the subroutine then can use this variable to find the number of arguments and their values. Doing this might look like this:

sub hey_now {
print "Hey, now! Here's $_[0]!\n";
}

where the first parameter will be the target.

NOTE
The odd scalar $_[0] is the first element in the @_ array. This is the first parameter of the array, and the script will print whatever value is then passed to @_. Both the special scalar variable $_ and the special element $_[0], the first element in the array @_, look alike, but they have no relation, so be careful not to confuse them.

We might pass a value to @_ like this:

&hey_now("Larry");

which prints

Hey, now! Here's Larry!

But we can use other ways to pass a value to @_, like

$a = "Hank";
&hey_now($a); # to create Hey, now! Here's Hank!

or

&hey_now("Artie")+&hey_now("Bob"); # to create
# Hey, now! Here's Artie! Hey, now! Here's Bob!

If you want to use more than one parameter in the @_ array, then you might write a script like this:

sub hey_now {
    print "$_[0] $_[1]\n";
}
&hey_now("Hey,", "now!"); # which would print
# Hey, now!

where only those parameters the subroutine refers to are examined; any extra parameters are simply ignored.

Using the @_ variable is different because it is local to the subroutine, unlike previous cases where other variables were global to the entire script. If @_ already has a global value, Perl saves it before the subroutine is invoked and restores it after the subroutine is completed. One result of this local variable feature is that a subroutine can pass arguments to another subroutine without losing its own arguments. Each subroutine keeps its own @_ variable.
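A short sketch may make this clearer. The subroutines inner and outer and the variables $result and $report are hypothetical names chosen for this example. One subroutine calls another with different arguments and still finds its own @_ untouched afterwards.

```perl
# Hypothetical subroutines showing that each call gets its own @_.
sub inner {
    "inner saw: @_";                    # built from inner's own @_
}

sub outer {
    $result = &inner("x", "y");         # inner receives a fresh @_ of ("x", "y")
    "outer saw: @_ / $result";          # outer's @_ is still (1, 2, 3)
}

$report = &outer(1, 2, 3);
print "$report\n";                      # outer saw: 1 2 3 / inner saw: x y
```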

If we update a previous example, we can add the @_ variable:

sub two_variables {
    $_[0] + $_[1];
}
print &two_variables(10,1); # which prints 11
$S = &two_variables(1,2); # which gives $S the value of 3

You can expand this example to manipulate many values at a time, like so:

sub plus {
    $add = 0; # start of the addition
    foreach $_ (@_) {
        $add += $_; # add each new element
    }
    $add; # the last expression evaluated, outside the loop
}
$P = &plus(1,2,3,4); # gives $P the value of 10
print &plus(6,3,5,2); # prints 16
print &plus(1..4); # prints 10

If we were already using a variable named $add somewhere else in the script, we might encounter a problem because after this subroutine the value for $add is completely changed. You can avoid accidentally "stepping on" variable values by understanding how local variables work with user functions.

User Functions and Local Variables

Just like the special @_ variable discussed with user functions, other variables (scalars, arrays, and associative arrays) can be given a value that exists only locally, inside a subroutine.

Perl provides the operator local() to designate, or instantiate, local versions of the variable in question. Using local() might look like

sub plus {
    local($add); # makes $add a local variable
    $add = 0; # starts the value
    foreach $_ (@_) {
        $add += $_; # add each element together
    }
    $add; # the total
}

where at the beginning of the subroutine the current value of the global variable $add is stored away, and a brand new variable $add is created. This variable starts out with the value undef. Once the subroutine finishes, the local variable $add dies, and the global variable $add is restored, with its global value intact.
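You can check this save-and-restore behavior for yourself with a small sketch like the one below. The global value "untouched" and the variable $total are assumptions made only for the demonstration.

```perl
sub plus {
    local($add);            # save the global $add and start with a fresh one
    $add = 0;
    foreach $_ (@_) {
        $add += $_;         # add each element together
    }
    $add;                   # the total is the return value
}

$add = "untouched";         # a global that happens to share the name
$total = &plus(1, 2, 3, 4);
print "total is $total, global \$add is still $add\n";
```

If you remove the local($add) line and run the script again, the global $add ends up holding 10 instead of its original value.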

WARNING!
The local() operator is a full, executable operator, and can wreak havoc on your variable values if not used carefully. Good Perl etiquette suggests that you place all your local() operators at the start of your subroutine definitions.

You can also use the local() operator on an assignable array list, which means you can use it like this:

local($P, @q) = @_;

that is, as the left side of an array assignment. Here the first element of @_ is assigned to the local scalar $P, and the remaining elements go into the local array @q. These variables only have these values locally, in the enclosing block statement or subroutine.
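As a rough illustration of local() on the left side of a list assignment, the sketch below names the first argument and collects the rest. The subroutine describe and the variables $first, @rest, and $line are hypothetical.

```perl
# Hypothetical subroutine: name the first argument and collect the rest.
sub describe {
    local($first, @rest) = @_;      # $first gets one value, @rest gets what is left
    "first is $first, then " . scalar(@rest) . " more: @rest";
}

$line = &describe("red", "green", "blue");
print "$line\n";                    # first is red, then 2 more: green blue
```

Because an array soaks up everything that remains in the list, it only makes sense to put one array in the list, and to put it last.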

More Control Structures

There are a few more Perl operators that act as control structures that you should know about. They work very well with subroutines, and other self-contained Perl script structures, like loops and block statements. These operators are last, next, and redo, and they, like the other operators, work both in and out of subroutines, modifying the action of the subroutine, or block statement.

The main difference between a subroutine and a block statement is that the subroutine, once defined, can be called into action anywhere in a script by simply prefacing the subroutine name with the "&" symbol. A block statement, being a self-contained construct in a script (like a while loop), only performs its action where it is placed in the script, and must be written out in whole again if you want to repeat its action. To avoid unnecessary scripting, subroutines are very popular in Perl.

The Last Operator

There may be a time when you need to get out of a loop before it has completed all of its cycles. You can use the last operator to do this. It will cause the loop to stop, and execution will continue with the first statement following the loop statement block. A format for using the last operator might look like this:

while (a_condition) {
    statement1;
    statement2;
    statement3;
    if (another_condition) {
        statement4;
        statement5;
        statement6;
        last;
    }
    statement7;
    statement8;
    statement9;
}
# last comes here once it has broken the loop

where if a_condition in the loop is true, then statements 1, 2, and 3 are executed, and then if another_condition is true, then statements 4, 5, and 6 are executed, and then the last operator breaks out of the loop and takes you to the end of the entire loop statement block.

The last operator works only on those statement blocks of the for, foreach, while, and naked kind. Naked blocks are those that belong to no other script construct.

To demonstrate a more practical use of the last operator, we can apply it to searching through an e-mail header. The header might look like this:

From: jhagey@sentex.net (Jonathan Hagey)
To: reader@my_server.com 
Date: 12-MAY-96 05:23:14 AM EST -0600
Subject: Good luck!

Writing good scripts takes practice, patience, and a good manual!

If you wanted to have a script scan through the e-mail to check and see if it went to the right person, and that there were no spelling mistakes in the recipient's e-mail address, you might try this:

while (<STDIN>) {
    if (/^To:/) {           # check to see if it starts with To:
        if (/reader/) {     # correct name
            print "It's going to the right person.\n";
        }
        last;               # have found To:, so exit
    }                       # end of if To: block
    if (/^$/) {             # in case of a blank line
        last;               # when blank line, stop checking
    }
}                           # end of while loop

This loop could be adapted to take a variable that held any recipient's name, to give the script more flexibility. The variable would replace the static "reader" in the regular expression, and its value could be read from your input with <STDIN> before the header itself is scanned.
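One way such an adaptation might look is sketched below. To keep the example self-contained it works on a list of header lines rather than on <STDIN>; the subroutine name goes_to, the variables $recipient, @lines, and $found, and the sample @header are all assumptions made for the illustration.

```perl
# Hypothetical helper: returns 1 if the To: line mentions $recipient,
# and 0 if it names someone else or the header ends first.
sub goes_to {
    local($recipient, @lines) = @_;
    local($found) = 0;
    foreach $_ (@lines) {
        if (/^To:/) {
            if (/\Q$recipient\E/) {     # \Q...\E matches the name literally
                $found = 1;
            }
            last;                       # have found To:, so exit
        }
        if (/^$/) {
            last;                       # blank line ends the header
        }
    }
    $found;
}

@header = (
    "From: jhagey\@sentex.net (Jonathan Hagey)\n",
    "To: reader\@my_server.com\n",
    "Subject: Good luck!\n",
    "\n",
);
if (&goes_to("reader", @header)) {
    print "It's going to the right person.\n";
}
```

The \Q and \E markers are there so that any characters in the name that mean something special to a regular expression, such as a period, are matched as plain text.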

The Next Operator

This operator is used inside a loop, and when it is used it causes the execution to skip the rest of the current statement block and jump ahead to the next cycle of the loop, without killing the loop it occupies.

The typical format for the next operator is this:

while (a_condition) {
    start_of_block1;
    block1;
    end_of_block1;
    if (next_condition) {
        start_of_block2;
        block2;
        end_of_block2;
        next; # skips block3
    }
    start_of_block3;
    block3;
    end_of_block3;
    # this is where next goes to
}

You might need this function to avoid running unnecessary checks (which take up valuable online time) on users who have already established themselves as members of your site, and don't need page after page of security scrutiny, or have already defined their tastes as far as sections of your site go, so they are directly moved by the script to that location.
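A minimal sketch of that idea follows. The %member table, the visitor names, and the @checked list are invented for the example, and the push stands in for whatever real checks your site would run.

```perl
# Hypothetical data: visitor name, then 1 if already a member.
%member = ("ann", 1, "bob", 0, "cy", 1);
@checked = ();

foreach $visitor ("ann", "bob", "cy") {
    if ($member{$visitor}) {
        next;                       # members skip the checks below
    }
    push(@checked, $visitor);       # stands in for the real security checks
}
print "Checked: @checked\n";        # Checked: bob
```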

The Redo Operator

This operator works like the next operator, only in reverse. If you need a loop to go back and run through the current statement block again, without testing the loop condition again, you can use the redo operator. The format is similar:

while (a_condition) {
    # redo returns here
    start_of_block1;
    block1;
    end_of_block1;
    if (next_condition) {
        start_of_block2;
        block2;
        end_of_block2;
        redo; # goes back up to the marked block
    }
    start_of_block3;
    block3;
    end_of_block3;
}

It's as straightforward as that.
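Here is a small sketch of redo in action, using made-up names (@words, $retries, @log). The loop body is repeated for the same element until a retry limit is reached, and only then does the loop move on.

```perl
@words = ("alpha", "beta", "gamma");
$retries = 0;
@log = ();

foreach $word (@words) {
    push(@log, $word);                  # record every pass through the block
    if ($word eq "beta" && $retries < 2) {
        $retries++;
        redo;                           # run the block again for "beta"
    }
}
print "@log\n";                         # alpha beta beta beta gamma
```

Notice the counter: without some condition that eventually becomes false, a redo will send the loop around the same block forever.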

Labeling Blocks

When dealing with statement blocks in Perl you can use the last, next, and redo operators to move around your script with great flexibility. If you want to get out of two nested blocks at the same time, you can use a label with the block statements. Labeling is a programming method very similar to subroutines, because once you define the action of the label, you can apply it as many times as you need to the actions in your script. Labeling is not as extensive as a subroutine though, as it is only able to define a specific function, like a particular loop condition.

Statement blocks are labeled when you give them a name, following this format:

LABELNAME: loop_operator(condition) {
    statement1;
    statement2;
    if (another_condition) {
        block_modifier LABELNAME;
    }
}

where the label's name is, by convention, written in uppercase letters to prevent confusing it with scalars, arrays, associative arrays, reserved words, and user functions.

A typical use for a labeled block might be similar to the following, where you have a nested loop that needs to be exited once a match has been found:

FIRST: for ($n = 1; $n <= 10; $n++) {
    SECOND: for ($x = 1; $x <= 10; $x++) {
        if ($n*$x == 48) {
            print "$n times $x makes 48!\n";
            last FIRST;
        }
        if ($x >= $n) {
            next FIRST; # ensures $n is always
                        # the larger of the two numbers
        }
    }
}

where a nested loop searches for two integers whose product is 48. The labeled block provides the way out of the loop once the two variables, $n and $x, satisfy the condition. In the previous example only one answer for the search is produced before the loop is exited.

NOTE
Block labeling is only effective to exit a loop. You cannot use block labels with the last, next, and redo operators to enter a statement block.

Before moving on, it's worthwhile to note that there is one other way that Perl allows you to place a condition on a statement block. You can use expression modifiers to accomplish this in much the same way that a loop works. In fact, you use the same loop modifiers.

Using an expression modifier is done by using the following format, where the if modifier is used:

an_expression if a_control_expression;

a_control_expression is the first expression to be evaluated; if that condition is met, then the expression an_expression is evaluated.
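The sketch below shows a few modifiers side by side. The sample @lines and the variables $count and $n are invented for the illustration. In each case the expression on the left only runs according to the condition written on the right.

```perl
@lines = ("law is a rule\n", "nothing here\n", "the law binds\n");
$count = 0;

foreach $_ (@lines) {
    $count++ if /law/;              # same as: if (/law/) { $count++; }
}
print "No matches.\n" unless $count;    # runs only when $count is 0

$n = 1;
$n *= 2 while $n < 100;             # keep doubling until $n is 100 or more
print "$count lines mention law; n is $n\n";
```

Even though the condition is written last, it is still tested first, exactly as it would be in the longer block form.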

Using this method you can greatly reduce your scripting time, and still retain a great deal of sense, or readability, in your Perl script. Now that you have all these functions and control structures straight, let's move on to text processing itself.

Processing Text

Perl is great at processing text, but remember that text is not limited only to letters, but also includes all the ASCII characters. In order to process text, you need to have some to work with, so for this chapter you will use the following, taken from the Summa Theologica by St. Thomas Aquinas:

Summa Theologica I-II, 90, 1

Whether law is something pertaining to reason?

Objection 1. It would seem that law is not something pertaining to reason. For the Apostle says (Rm. 7:23): "I see another law in my members," etc. But nothing pertaining to reason is in the members; since the reason does not make use of a bodily organ. Therefore law is not something pertaining to reason.

Objection 2. Further, in the reason there is nothing else but power, habit, and act. But law is not the power itself of reason. In like manner, neither is it a habit of reason: because the habits of reason are the intellectual virtues of which we have spoken above (57). Nor again is it an act of reason: because then law would cease, when the act of reason ceases, for instance, while we are asleep. Therefore law is nothing pertaining to reason.

Objection 3. Further, the law moves those who are subject to it to act aright. But it belongs properly to the will to move to act, as is evident from what has been said above (9, 1). Therefore law pertains, not to the reason, but to the will; according to the words of the Jurist (Lib. i, ff., De Const. Prin. leg. i): "Whatsoever pleaseth the sovereign, has force of law."

On the contrary, It belongs to the law to command and to forbid. But it belongs to reason to command, as stated above (17, 1). Therefore law is something pertaining to reason. I answer that, Law is a rule and measure of acts, whereby man is induced to act or is restrained from acting: for "lex" [law] is derived from "ligare" [to bind], because it binds one to act. Now the rule and measure of human acts is the reason, which is the first principle of human acts, as is evident from what has been stated above (1, 1, ad 3); since it belongs to the reason to direct to the end, which is the first principle in all matters of action, according to the Philosopher (Phys. ii). Now that which is the principle in any genus, is the rule and measure of that genus: for instance, unity in the genus of numbers, and the first movement in the genus of movements. Consequently it follows that law is something pertaining to reason.

NOTE
Any of you who find St. Thomas Aquinas interesting can check out this site for more information, and an online version of the entire Summa Theologica: http://www.epas.utoronto.ca:8080/~loughlin/index.html.

Searching for key words in the text is a useful application of <STDIN>. Since the section deals with Aquinas' views on law, I might want to know where the word "law" occurs in the section. To receive a list of these lines I could use a script like this:

while (<STDIN>) {
    print if /law/;
}

where I would receive this output.

Whether law is something pertaining to reason?
Objection 1. It would seem that law is not something
"I see another law in my members," etc. But nothing
law is not something pertaining to reason.
else but power, habit, and act. But law is not the
law would cease, when the act of reason ceases, for
instance, while we are asleep. Therefore law is nothing
Objection 3. Further, the law moves those who are
been said above (9, 1). Therefore law pertains, not to
"Whatsoever pleaseth the sovereign, has force of law."
On the contrary, It belongs to the law to command and
stated above (17, 1). Therefore law is something
restrained from acting: for "lex" [law] is derived from
Consequently it follows that law is something

This output can be sent to a file, which can then be stored, or sent to satisfy a client's request. You would do this by using filehandles.

Filehandles

In Perl we have already touched on three filehandles in our script examples from previous chapters: STDIN, STDOUT, and STDERR, and we didn't even realize we were doing so. This is because these filehandles are special in Perl, being the default files Perl moves data in and out of. A filehandle is a name given to the I/O connection between Perl and whatever it is dealing with. Other filehandles are defined in each script. Like labeled blocks, filehandles are written in code using all uppercase letters. Filehandles are opened with the open() operator in this format:

open (FILEHANDLE, "outside_file_name");

where FILEHANDLE is the name of the filehandle you've created and outside_file_name is the name of the file or device (since you could send something to a printer) that you are sending data to or receiving data from through FILEHANDLE. This particular statement opens FILEHANDLE for taking data, or reading, from it.

If you want to put data into, or write, to FILEHANDLE, then you would use this format:

open (FILEHANDLE, ">outside_file_name");

or you can append to the file by adding another ">" symbol, like so:

open (FILEHANDLE, ">>outside_file_name");

If the file you are trying to open is busy, not there, or otherwise inaccessible to your Perl script, open() will return a false value instead of a true one. You can have the script notify you that this has happened by using the die operator, explained shortly.

When your script is done with a filehandle, you use the operator close() to close it, like this:

close (FILEHANDLE);

If you need to reopen the filehandle later, you need not insert a close statement first, because Perl will close the last incarnation of the filehandle in question automatically and re-open it with the new open operator.

When a filehandle is opened, but the attempt fails, that filehandle can still appear later in your script and cause problems. Perl provides the die() operator to deal with this problem.

To use the die operator to report a failure in opening a filehandle you can use this format:

open (FILEHANDLE,">/dir/path/name") ||
    die "The script has failed.";

where the die operator will write its message to STDERR and then end the script. When the die operator is implemented by Perl, you will receive the program name and line number of the failed code in the message alerting you that your script has died. You can customize each returned "die" statement by including the filehandle name, or other signifier, so you will know which filehandle has failed. If you don't want that information returned with the alert message, add a \n to the die argument, like this:

open (FILEHANDLE,">/dir/path/name") ||
die "The script has failed.\n";

and that information will be left out.
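If you would like to compare the two messages without actually killing your script, the following sketch may help. It relies on Perl 5's eval block, which traps a die and leaves its message in the special variable $@; eval is not otherwise covered in this chapter, and the names $with_location and $without_location are made up for the example.

```perl
# eval { } traps die, so the messages can be inspected instead of ending the script.
eval { die "The script has failed." };
$with_location = $@;        # message plus " at <program> line <number>."

eval { die "The script has failed.\n" };
$without_location = $@;     # just the message, exactly as written

print $with_location;
print $without_location;
```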

You can use filehandles in text searches for all kinds of problems, such as the administration duty of finding passwords in an unencrypted password file, by doing this:

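open(PW,"/user/passwd");
while (<PW>) {
    chop;   # remove the trailing newline
    # $password is assumed to hold the password being checked
    print "This password is already taken.\n" if $_ eq $password;
}
close(PW);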

where the filehandle name PW is used just like <STDIN> and the other reserved filehandles.

To write to a file using filehandles, follow this format:

print FILEHANDLE "Text to be written\n";

where the data in the quotes is put into the file associated with FILEHANDLE.

Using all this you can create a script that takes data from one file and places it into another:

open(FIRST,$f) || die "$f will not open.";
open(SECOND,">$s") || die "$s cannot be created.";  # ">" opens $s for writing
while (<FIRST>) { # takes data from $f
    print SECOND $_; # write line to $s
}
close(FIRST);
close(SECOND);

where you have included the fail safe die operator as well.

There is a problem when we use filehandles. If we don't check whether there is already an existing file of the same name before we write something to, or create a new file using filehandles, then we will completely overwrite it, and lose its data. This is something that people in the UNIX world understand well, but those of us using Windows are used to being prompted before doing such a thing. To prevent this from happening, Perl has file tests.

File Tests

To prevent your script from overwriting an existing file, you can use the file test -e before the filehandle to test if it exists, like so:

-e FILEHANDLE

or before the scalar associated with a filehandle, like this:

-e $scalar_name   

and can be used in your script like this:

$name = "/user/passwd";
if (-e $name) {
    # the file exists, so carry on
} else {
    print "Sorry, login again.\n";
}

where the script checks whether the file named in $name exists. If the file is not there, then the user is given the print statement. If you are going to use this script in your security procedures, remember to include a block that exits the script for users with false passwords.
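Returning to the overwriting problem that file tests were introduced to solve, a minimal sketch of the -e test guarding a write might look like this. The subroutine safe_write, the filehandle OUT, and the file name under /tmp are assumptions made for the demonstration; $$ is Perl's process ID variable, used here only to keep the demo file name unique.

```perl
# Hypothetical file name; $$ (the process ID) keeps it unique for this run.
$out = "/tmp/guest_demo.$$";

# Hypothetical helper: writes $text to $file only if $file does not exist yet.
sub safe_write {
    local($file, $text) = @_;
    local($ok) = 0;
    if (! -e $file) {                   # only write when nothing is there
        open(OUT, ">$file") || die "Cannot create $file: $!\n";
        print OUT $text;
        close(OUT);
        $ok = 1;
    }
    $ok;                                # 1 if written, 0 if refused
}

$first  = &safe_write($out, "first version\n");     # file is created
$second = &safe_write($out, "second version\n");    # refused, file exists
print "first=$first second=$second\n";
unlink($out);                                       # tidy up the demo file
```

Keep in mind that another program could create the file in the instant between the test and the open, so this is a courtesy check rather than a guarantee.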

There are other file tests that can be used, all of which are listed in Table 4.1 in order of simplest test to the more involved or obscure.

Table 4.1 File Tests

File Test   Test Action
-r          File or directory is readable
-w          File or directory is writable
-x          File or directory is executable
-o          File or directory is owned by user
-R          File or directory is readable by real user, not effective user
-W          File or directory is writable by real user, not effective user
-X          File or directory is executable by real user, not effective user
-O          File or directory is owned by real user, not effective user
-e          File or directory exists
-z          File exists and has zero size
-s          File or directory exists and has nonzero size
-f          Entry is plain file
-d          Entry is a directory
-l          Entry is a symlink
-S          Entry is a socket
-p          Entry is a pipe
-b          Entry is a block-special file
-c          Entry is a character-special file
-u          File or directory is setuid
-g          File or directory is setgid
-k          File or directory has the sticky bit set
-t          Isatty() on the filehandle true
-T          File is text
-B          File is binary
-M          Modification age in days
-A          Access age in days
-C          Inode-modification age in days

where you can see that there are file tests for every occasion you might need. Each can be used to check whether the file in question is readable, writable, executable, and so forth. Remember that Perl lets you use these file tests on both filehandles and filenames.

If the information about a file is not covered by a file test, you can use the stat() operator to get any other data you may need. You can do this by putting the operator into a statement like this one:

($blksize) = (stat("/usr/bin/index.html"))[11];

where the block size of the file index.html is placed in the variable $blksize. For a full list of the file information you can receive (because there is a lot of data concerning each file, most of it not really useful to you), it is recommended that you check the stat(2) manpage. If you can't find this manpage on your own, you really don't need to know the information the stat operator can give you.
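As a rough sketch of picking several values out of the list that stat() returns, the example below asks about the current directory ("."), chosen only because it is sure to exist. The variable names are made up; elements 7, 9, and 11 of the list are the size in bytes, the last modification time, and the preferred block size.

```perl
# "." is the current directory, used here only because it always exists.
($size, $mtime, $blksize) = (stat("."))[7, 9, 11];

print "Size: $size bytes\n";
print "Last modified: ", scalar(localtime($mtime)), "\n";
print "Preferred block size: $blksize\n";
```

On some systems the block size element comes back empty, so treat that last line as informational only.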

HTML Tagging and Perl

Now that we have covered most of the main functions of Perl, the next step is to begin the interface with the Web browser. To finish this chapter we'll look at combining a Perl script with HTML tags. It is assumed that you know your way around HTML, but if you feel a little rusty, check out this URL to find an HTML tutorial that's suitable for you:

http://www.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Information_and_Documentation/Tutorials_Demos_Talks/

Combining Perl and HTML

Remember the guestbook you've been working on? Forms are an excellent way to retrieve data from a user, so let's add one to our guestbook. The HTML code might look like this:

<HTML>
<BODY>
<CENTER>
<H1>Welcome to the Guestbook!</H1><BR>
</CENTER>
<HR>
<P>
Please enter your first name, last name, and your favorite color, then click on submit.<P>
<FORM METHOD="POST" ACTION="http://www.yourdomain.com/cgi-bin/guest.pl">
<STRONG>
First Name: <INPUT TYPE="TEXT" NAME="firstname" SIZE="25"><BR>
Last Name : <INPUT TYPE="TEXT" NAME="lastname" SIZE="25"><BR>
Favorite Color: <SELECT NAME="color">
<OPTION>Red
<OPTION>Yellow
<OPTION>Blue
<OPTION>Green
<OPTION>Magenta
</SELECT>
<P>
<INPUT TYPE="SUBMIT" NAME="Submit">
</STRONG>
</FORM>
</BODY>
</HTML>

to give us a Web page that looks like Figure 4.1.

Figure 4.1 : The Guestbook page.

Now we have to modify our guestbook program to handle this input, but only slightly:

read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs=split(/&/, $buffer);
# This is the Name-Value pair splitter.
# Put into %FORM array
foreach $pair (@pairs) {
    ($name,$value)=split(/=/,$pair);
    $value=~tr/+/ /;    # turn each "+" back into a space
    $value=~s/%([a-fA-F0-9][a-fA-F0-9])/pack("C",hex($1))/eg;
    $FORM{$name}=$value;
}
$name=$FORM{firstname};
$lastname=$FORM{lastname};
$color=$FORM{color};
# This line tells the browser what type of data to
# expect
print "Content-type: text/html\n\n";
# Now print out the standard HTML header stuff....
print "<HTML>\n<BODY>\n<H3>\n\n";
$newline=$name.':'.$lastname.':'.$color."\n";
# make line delimited with colons
open (GUESTBOOK, "guest.pl");
while ($line=<GUESTBOOK>) {
    ($gbname, $gblastname, $gbcolor)=split(':', $line);
    if (($gbname=~/^$name/i) && ($gblastname=~/^$lastname/i)) {
        print "You are already in the guestbook, $name!\n";
        close (GUESTBOOK);
        if ($gbcolor!~/$color/i) {
            print "You have a different favorite color!\n";
            print "Your old favorite color is: $gbcolor\n";
            print "Your new favorite color is: $color\n";
        }
        print "</H3>\n</BODY>\n</HTML>\n";
        exit;
    }
}
close (GUESTBOOK);
open (GUESTBOOK, ">>guest.pl"); # Open file for append
print GUESTBOOK "$newline"; # Append the field line to the guestbook file
print "Thank you, $name!  Your name has been added to the Guestbook.\n";
print "</H3>\n</BODY>\n</HTML>\n";
close(GUESTBOOK);

to give us a browser output like that shown in Figure 4.2.

Figure 4.2 : User acknowledgement page.

Now, what are the differences? The first one we'll notice is the way we get information from the user. The input is done through a Web page form. We have three form fields, one for firstname, one for lastname, and one for color. These will be passed to the CGI by the POST method.

We read the passed information from STDIN, which is how the POST method passes its information. We then split the line into name-value pairs, which include the variable name, an "=" symbol, and the value. These pairs are separated by an "&" symbol.

Once that is done, we then populate an associative array called %FORM, which will hold the name of the variable, and the associated value. The input line for our form would look something like this:

firstname=Joe&lastname=Van+Horne&color=Red

Notice that the space between Van Horne was translated into a "+" symbol. Any special characters, like slashes and +, will be converted into hexadecimal escape codes, which the pack() function will turn back into readable characters.
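To watch the decoding happen outside of a Web server, you could try a sketch like the one below. The subroutine parse_form and the sample input string are hypothetical; in particular the color value Red%2FBlue is invented so that a hexadecimal escape code (%2F, a slash) shows up in the data.

```perl
# Hypothetical helper that turns one POST body into name-value pairs.
sub parse_form {
    local($buffer) = @_;
    local(%form, $pair, $name, $value);
    foreach $pair (split(/&/, $buffer)) {
        ($name, $value) = split(/=/, $pair, 2);
        $value =~ tr/+/ /;                                      # "+" back to a space
        $value =~ s/%([a-fA-F0-9]{2})/pack("C", hex($1))/eg;    # %XX back to a character
        $form{$name} = $value;
    }
    %form;                                                      # returned as a list of pairs
}

%FORM = &parse_form("firstname=Joe&lastname=Van+Horne&color=Red%2FBlue");
print "$FORM{lastname} likes $FORM{color}\n";    # Van Horne likes Red/Blue
```

The order of the two decoding steps matters: the "+" symbols are translated first, so that a genuine plus sign, which arrives as %2B, is not mistaken for a space.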

Another part to remember is to put the "\n" in between each new HTML tag. Browsers treat these newlines as ordinary whitespace, but the newlines organize the HTML tags in regular script order, instead of running them all together, which makes the generated page source much easier to read and to debug.

Once completed, we assign the $FORM{} variables from the associative array into the regular variables used by our program. Next we print the line:

Content-type: text/html\n\n

which tells the browser what type of data is coming next, so that it will format it correctly. Notice the two newlines. These are both required: the first ends the header line, and the second creates the blank line that marks the end of the headers.

Once we have done this, anything we print will automatically be printed to the browser window. Thus, in the next lines, we print HTML codes to format our output nicely. This really is all there is to integrating user input from a Web browser into a CGI program.

Conclusion

In this chapter, you have done the groundwork for a basic understanding of Perl and how it works. By adding user functions like subroutines, we can make a script that can be substantially shorter (as well as cutting down the time it takes to write the script). Subroutines cut down on processing time, too. Subroutines can be as small as a single action, or extremely long, including as many actions as the script needs.

Additional control structures were added in the form of more operators: redo, next, and last. These work very well with subroutines and the other self-contained blocks of script in Perl.

Labeling block statements provides a similar functionality to the one subroutines do, but in a more singular way, like focusing on a single loop modification.

Text processing was introduced, with a simple use of regular expressions. Filehandles and file tests are both ways of manipulating file contents, or text.

To get you on your way to writing CGI scripts in Perl, you also added HTML to your guestbook script from the previous chapter. Now you need to look into the ways in which Perl assists us in programming, which is covered in the next chapter.