Chapter 6

Simple Tasks for Perl



The groundwork has now been laid for you to progress quickly in your skills and understanding of Perl. This chapter covers several simple scripts that are often used with Web pages to do various tasks. Some of these scripts are for the user, and some are for the Web administrator, but all of them are very useful in getting the most out of a Web site.

This chapter does not cover the details of how the Common Gateway Interface (CGI) works or the protocols under which it operates. That very important information is covered in detail in later chapters, so don't worry about client/server relations and MIME specifications right now, even if you're not sure what they are; understanding them is not needed for this chapter. Any elements of the CGI that are important to this chapter will be explained where necessary.

The Counter

Counters are everywhere. They are very helpful to Web surfers looking for popular sites, and also for researchers looking for well-used online resources. Counters can be very sophisticated, with impressive graphics, but the base Perl script is still the same.

The script for the counter is as follows:

#!/usr/bin/perl
# counter.pl
open(COUNT,"count.tot"); # Open the counter file (a missing file counts as 0)
$total=<COUNT>; # Read in the first (and only) line
close(COUNT); # Close the file
$total++; # Increment the count by 1
# Open the file in write mode, and stop if that fails
open(COUNT,">count.tot") || die "Cannot write count.tot: $!";
print COUNT $total; # Store the new count in the file
close(COUNT); # Close the file
# Print the result
print "The total number of people to hit this page is: $total\n";

The output will resemble the Web page shown in Figure 6.1.

Figure 6.1 : The basic counter.

This script is best implemented by a Server Side Include (SSI). Unfortunately, there is no Web server for Windows NT that supports SSIs. The best way to implement this counter, then, would be to include a link that points to the URL of the script (that is, http://www.myserver.com/counter.pl) and have the Perl script print out the entire page, including the count. This is not very flexible, especially if you want to change the pages a lot, but it will work as a simple counter. More advanced counters can be created by linking graphics to the number values.
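Having the script print out the entire page, including the count, might look something like the following. This is only a sketch under a few assumptions: the subroutine names bump_count and counter_page, the page text, and the use of flock to keep two simultaneous hits from overwriting each other's count are all additions made for this example and are not part of counter.pl.

```perl
#!/usr/bin/perl
# counter_page.pl - a sketch: count the hit, then print the whole page
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);

# Read the count from $file, add one, store it, and return the new total.
sub bump_count {
    my ($file) = @_;
    # Open for reading and writing, creating the file if it is missing
    sysopen(my $fh, $file, O_RDWR | O_CREAT) or die "Cannot open $file: $!";
    flock($fh, LOCK_EX) or die "Cannot lock $file: $!";   # one hit at a time
    my $line  = <$fh>;                                     # the first (and only) line
    my $total = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;
    $total++;
    seek($fh, 0, 0);                                       # go back to the start
    truncate($fh, 0);                                      # throw away the old count
    print $fh $total;                                      # store the new count
    close($fh);                                            # closing releases the lock
    return $total;
}

# Build the complete response: the CGI header plus a small HTML page.
sub counter_page {
    my ($total) = @_;
    return "Content-type: text/html\n\n"
         . "<HTML><HEAD><TITLE>Welcome</TITLE></HEAD>\n"
         . "<BODY>\n<H1>Welcome</H1>\n"
         . "<P>The total number of people to hit this page is: $total\n"
         . "</BODY></HTML>\n";
}

# The Web server sets REQUEST_METHOD for CGI scripts, so the hit is only
# counted and the page only printed when the script is reached through its URL.
print counter_page(bump_count('count.tot')) if $ENV{REQUEST_METHOD};
```

Keeping the page text in its own subroutine means the counting code does not have to change when the page does, which eases the flexibility problem a little.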

The Hidden Counter

Some sites do not lend themselves to having a counter visible to the user's browser. If the site is an industrial commercial site that has a small audience, it may not be desirable to display a counter to users, so a hidden counter can be created.

The hidden counter will work the same as the script above, minus the print statement. If you want to load a page after the counting, you may consider the following:

#!/usr/bin/perl
# hidden_counter.pl
$htmlfile="chipper.htm";
open(COUNT,"count.tot"); # Open the counter file
$total=<COUNT>; # Read in the first (and only) line
close(COUNT); # Close the file
$total++; # Increment the count by 1
# Open the file in write mode, and stop if that fails
open(COUNT,">count.tot") || die "Cannot write count.tot: $!";
print COUNT $total; # Store the new count in the file
close(COUNT); # Close the file
open(HTML, $htmlfile); # Open the HTML file
print "Content-type: text/html\n\n"; # Prepare to print to the browser
while ($line=<HTML>) { # While not at the end of file
    print $line; # Print each line
}
close(HTML); # Close the file

This script will do the counting behind the scenes and then print the HTML file specified in the script to the browser that made the URL call to the script. The count itself is never shown to the user. It is a little more flexible than the previous script, but you still have to call the URL with the script name.

Time Clock

Web pages that deal with reporting news events, or other time-sensitive pieces of information, use a time clock to display the current date and time. The time clock can be set to the server's time zone or to the zone of the majority of the site's users. In either case you have control over which time zone it will be based on.

Displaying the local time on a Web page can be done in a manner similar to that of printing the counter script:

#!/usr/bin/perl
# time.pl
# This line gets the local time and populates a list of variables with it
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
# The following two lines change the day and month
# to strings rather than numbers
$day=("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")[$wday];
$month=("January","February","March","April","May","June","July",
"August","September","October","November","December")[$mon];
# $year holds the number of years since 1900, so add 1900 to get the full year
$year+=1900;
# Pad the hour, minute, and second to two digits each
$clock=sprintf("%02d:%02d:%02d",$hour,$min,$sec);
print "Local time is: $day, $month $mday, $year $clock\n\n";

The resulting page looks like that shown in Figure 6.2.

Figure 6.2 : The time stamp.

The only tricky part to this script is extracting the information you need out of the localtime construct. The $wday and $mon variables are stored as numbers, so you must convert them to the proper strings using array addressing. Also, this script should really be implemented using a Server Side Include, or SSI, to produce the fastest results. Besides the time zone concern, this script is bound by the same limitations as the previous two scripts: it has to be called through its own URL, and it has to print the whole page itself.
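If you would rather base the clock on your users' time zone than on the server's, one approach is to start from Greenwich time and add a fixed number of hours. The following sketch assumes a zone five hours behind Greenwich; the subroutine name time_stamp and the offset are made up for this example, and a fixed offset takes no account of daylight saving time.

```perl
#!/usr/bin/perl
# zone_time.pl - a sketch: show the time for a fixed offset from Greenwich
use strict;
use warnings;

my @DAYS   = qw(Sunday Monday Tuesday Wednesday Thursday Friday Saturday);
my @MONTHS = qw(January February March April May June July
                August September October November December);

# Format $epoch (seconds since 1970) as it would read on a clock that is
# $offset_hours away from Greenwich time.
sub time_stamp {
    my ($epoch, $offset_hours) = @_;
    my ($sec, $min, $hour, $mday, $mon, $year, $wday) =
        gmtime($epoch + $offset_hours * 3600);
    # $year counts from 1900, and the clock fields are padded to two digits
    return sprintf("%s, %s %d, %d %02d:%02d:%02d",
        $DAYS[$wday], $MONTHS[$mon], $mday, $year + 1900, $hour, $min, $sec);
}

print "Content-type: text/html\n\n";
print "The time for our users is: ", time_stamp(time, -5), "\n";
```

Because the offset is an ordinary argument, the same subroutine can serve the server's zone, the users' zone, or both on one page.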

Generating Lottery Numbers

For fun, this script generates random numbers based on the parameters of a specific lottery. This script can be adapted so the user can choose: (1) how many numbers are generated, and (2) the range in which they must fall by creating an HTML form interface. Also, this script can be adapted to generate random numbers for any other purpose.

The following script will pick six numbers between 1 and 49:

#!/usr/bin/perl
# lotto.pl
$count=0;
$again=0;
while ($count<=5) {
    $num=int(rand(49))+1; # Here we generate an integer between 1 and 49.
    foreach $n (@lotto) { # We check to see if there are repeats
        if ($n==$num) {
            $again=1;
        }
    }
    if ($again==0) { # If the new number is different
        @lotto=(@lotto,$num); # Add it to the array, and
        $count++; # Increment the counter
    }
    else { # Otherwise, reset the flag and loop again.
        $again=0;
    }
}
print "Your lotto numbers are:\n\n";
print "@lotto\n";

This program returns a page like that shown in Figure 6.3.

Figure 6.3 : Randomly generated lotto numbers.

Again, the same limitations apply here as to the previous scripts. The script would have to be called with an URL, and then printed as an HTML document.
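If you want to let the user choose how many numbers are generated and the range in which they fall, the loop can be wrapped in a subroutine that takes those two values. The following is a minimal sketch rather than the only way to do it; the name pick_numbers and its parameters are made up for this example, and a hash is used to spot repeats in place of the inner foreach loop.

```perl
#!/usr/bin/perl
# pick.pl - a sketch: draw any number of different values from 1 to $max
use strict;
use warnings;

# Return $how_many different random integers between 1 and $max.
sub pick_numbers {
    my ($how_many, $max) = @_;
    die "Cannot pick $how_many different numbers from 1 to $max\n"
        if $how_many > $max;
    my %seen;      # numbers that have already been drawn
    my @picked;    # the numbers in the order they were drawn
    while (@picked < $how_many) {
        my $num = int(rand($max)) + 1;   # an integer between 1 and $max
        next if $seen{$num}++;           # a repeat, so draw again
        push(@picked, $num);
    }
    return @picked;
}

my @lotto = pick_numbers(6, 49);
print "Your lotto numbers are:\n\n";
print "@lotto\n";
```

The check at the top of the subroutine matters: without it, a request for more different numbers than the range holds would loop forever.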

Keyword Search

Web sites that house a large amount of data for user access function better with some kind of search feature. The next example is a simple search procedure that examines the files in the current directory. To have this search an entire site, you must make sure that all the files available for searching are stored in the same directory. If you must keep some pages in another directory, for security or other reasons, you would have to call this search function from one of those pages, using the same HTML form, to be able to search them. To do a keyword search, you will have to use a form. Here is the HTML:

<HTML>
<HEAD>
<TITLE>Keyword Search</TITLE>
</HEAD>
<BODY>
<H1>Keyword Search</H1>
<HR>
<P>
<FORM METHOD="POST" ACTION="http://www.myserver.com/cgi-bin/search.pl">
Enter a word to search for in the box below:<BR>
<INPUT TYPE="TEXT" NAME="key" SIZE=30>
<P>
<INPUT TYPE="SUBMIT" VALUE="Search">
</FORM>
<BR>
</BODY>
</HTML>

This produces a page that looks like that shown in Figure 6.4.

Figure 6.4 : The keyword search form.

Now, the search.pl script:

#!/usr/bin/perl
# search.pl
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
@pairs=split(/&/, $buffer);
# This is the Name-Value pair splitter. Put the pairs into the %FORM array
foreach $pair (@pairs) {
    ($name,$value)=split(/=/,$pair);
    $value=~tr/+/ /;
    $value=~s/%([a-fA-F0-9][a-fA-F0-9])/pack("C",hex($1))/eg;
    $FORM{$name}=$value;
}
foreach $file (<*.htm>) { # We search all the .htm files in the current directory
    open(HTML, $file) || next; # Skip any file that cannot be opened
    undef $/;
    $body=<HTML>; # Read the entire file into a variable
    $/="\n";
    close(HTML);
    # \Q and \E make the key match literally, even if it contains
    # characters that are special in a regular expression
    if ($body=~/\Q$FORM{key}\E/i) { # Check to see if it contains the key
        push(@doclist, $file); # If so, put it in the array
    }
}
# Escape the key before printing it, so it cannot add HTML of its own to the page
$shown=$FORM{key};
$shown=~s/&/&amp;/g;
$shown=~s/</&lt;/g;
$shown=~s/>/&gt;/g;
# Print the results in HTML form to the browser.
print "Content-type: text/html\n\n";
print "<HTML>\n<HEAD>\n<TITLE>Search Results</TITLE>\n</HEAD>\n<BODY>\n";
print "<H1>Search Results</H1>\n<HR><P>\n";
print "The following documents contain the word: <B>$shown</B>\n<P>\n";
print "<UL>\n";
foreach $name (@doclist) {
    print "<LI><A HREF=\"$name\">$name</A>\n";
}
print "</UL>\n";
print "</BODY>\n</HTML>\n";

The results of the user's keyword search request are displayed on a page as shown in Figure 6.5.

Figure 6.5 : The results of a keyword search.

Because this script is started by a form, the user does not have to call the script's URL directly, as with the previous scripts. This gives it greater flexibility and security.

The CGI and Image Maps

Combining the linking power of HTML with a visual image instead of text has produced a distinctive way to use the Web. These Perl scripts allow you to add "clickable" graphics to your Web pages. The main caution with image maps is that some users' browsers, such as Lynx, cannot display graphics.

There are also the many users who search the Web with the graphics option on their browsers turned off to save time. It is important to remember this when designing a page with an image map so that you also include alternatives for these users. You can include this script, bi_modal.pl, which will screen the user's browser, and return either the image map page, or a nonimage alternative.

#!/usr/bin/perl
# bi_modal.pl
print "Content-type: text/html\n\n";
# decide if the browser is text or graphics
if ( $ENV{HTTP_USER_AGENT} =~ /Lynx|LineMode|W3/i ) {
print <<EOG; # this is the non-image page
<HTML><HEAD><TITLE>Page Title</TITLE></HEAD>
<BODY><H2>Choices - Text Version</H2>
<A HREF="http://my_server.com/choice1.htm">
Choice One</A>
<A HREF="http://my_server.com/choice2.htm">
Choice Two</A>
<A HREF="http://my_server.com/choice3.htm">
Choice Three</A>
<BR><HR></BODY></HTML>
EOG
}
else {
print <<EOI; # this is the image map page
<HTML><HEAD><TITLE>Page Title</TITLE></HEAD>
<BODY><H2>Choices - Image Map Version</H2>
<A HREF="http://my_server.com/choice.map">
<IMG SRC="http://my_server.com/choice.gif" ISMAP></A>
<BR><HR></BODY></HTML>
EOI
}
exit;

where the text choices are found in the files choice1.htm, choice2.htm, and choice3.htm. The file choice.map is the map file, and choice.gif is the graphics file that is being used as the image map. These different file types are explained in depth later in this chapter.

The actual Perl scripts that are used with image maps are too advanced for the scope of this book, but most HTTP servers come with a Perl script that handles image maps.

Normally, what happens in an image map is that you make a link like this:

<A HREF="pic-name.map"><IMG SRC="picture.gif" ISMAP></a>

and the server will check the "pic-name.map" file for the coordinates corresponding to each document, like so:

RECT     400,300 500,350    newpage.html
CIRCLE   200,100,50         anotherpage.html

There is usually space for polygons as well. The server would normally do all of this work. There is, however, another way. If you make a form using

<FORM METHOD=POST
ACTION="http://www.my_server.com/cgi-bin/script.pl">

you can enter a line in it like so:

<INPUT TYPE="IMAGE" NAME="picture.html"
SRC="http://www.yourserver.com/picture.gif" BORDER="0" HEIGHT=50 WIDTH=155>

If the picture is selected, the click will be passed as form data to the CGI program in the form picture.html.x=500&picture.html.y=200, where 500 and 200 are the clicked x and y coordinates, respectively. The CGI program then has to read the x and y values, check the map file, and work out which hot spot, if any, contains that point.
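The first part of that job, getting the x and y values out of the form data, can be sketched as follows. The subroutine names parse_form and click_point are made up for this example, and the form data is written into the script as a fixed string; in a real CGI program it would be read from STDIN, as in search.pl.

```perl
#!/usr/bin/perl
# click.pl - a sketch: pull the clicked x,y pair out of the form data
use strict;
use warnings;

# Split URL-encoded form data into a hash of names and decoded values.
sub parse_form {
    my ($buffer) = @_;
    my %form;
    foreach my $pair (split(/&/, $buffer)) {
        next if $pair eq '';                        # ignore empty pairs
        my ($name, $value) = split(/=/, $pair, 2);
        $value = '' unless defined $value;
        foreach ($name, $value) {                   # decode both halves
            tr/+/ /;
            s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        }
        $form{$name} = $value;
    }
    return %form;
}

# Return the (x, y) pair sent for the image input called $image_name.
sub click_point {
    my ($image_name, %form) = @_;
    return ($form{"$image_name.x"}, $form{"$image_name.y"});
}

# A fixed string stands in for the data the browser would send
my %form = parse_form('picture.html.x=500&picture.html.y=200');
my ($x, $y) = click_point('picture.html', %form);
print "The user clicked at $x,$y\n";
```

The browser builds the two names by adding .x and .y to the NAME of the image input, which is why click_point only needs that one name to find both values.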

The Three Steps to Defining an Image Map

There are three basic steps that each image map has to go through to be completely functional. The first is to create a graphic that will be used for the image map. The second is to write a file that will store the destinations of the "hot spots" specified in the image map. Finally, in the third step, you must make sure there is a reference to this file in your HTML document.

A quick overview of these steps will give you the gist of what you need to know. Then we can examine each area in greater depth.

How to Create an Image Map

The first step in defining an image map is to select an appropriate graphic to use as your image. The best graphics are those that are easy to understand, small in file size (meaning they will load quickly) and have well-defined areas from which the user can choose.

If the graphic you want to use is not divided in an obvious way, then you should alter it so that the borders are readily apparent to the user. Within each of these separate areas you will select the region that will carry the link, called a "hot spot." A hot spot is a group of pixels that has been assigned an URL, which is called in the same way that an ordinary HTML link calls an URL.

Writing a Map File

Once the image is designed and divided, you have to create a map file of the different destinations of the image map's hot spots. In this file, you designate the hot spots' locations and what shape these areas will have under the image map. The standard format for a map file is as follows:

default default-URL
rect URL UL-corner LR-corner
poly URL POINT1 POINT2 POINT3
circle URL CENTER EDGE-POINT

where the default-URL is the location of the file to use if the user selects an area of the active image map that does not have a specified destination. The keywords stand for the different shapes: rect means a rectangle, poly means a polygon, and circle means a circle. The points that follow each URL define the area of that shape. If two areas overlap, then the first area defined will be the one to which the image map link responds.
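To see how a map file in this format turns a click into a destination, consider the following sketch. It is not the script that any particular server uses; the subroutine names parse_map and find_url and the file names in the sample map are made up for this example, and polygons are left out to keep it short.

```perl
#!/usr/bin/perl
# hotspot.pl - a sketch: find the URL for a clicked point
use strict;
use warnings;

# Turn the lines of a map file into a default URL and a list of shapes.
sub parse_map {
    my (@lines) = @_;
    my ($default, @shapes);
    foreach my $line (@lines) {
        next if $line =~ /^\s*(#|$)/;                 # skip comments and blanks
        my ($shape, $url, @points) = split(' ', $line);
        if (lc($shape) eq 'default') { $default = $url; next; }
        push(@shapes, { shape  => lc($shape),
                        url    => $url,
                        points => [ map { [ split(/,/, $_) ] } @points ] });
    }
    return ($default, @shapes);
}

# Return the URL of the first shape that contains the point, else the default.
sub find_url {
    my ($x, $y, $default, @shapes) = @_;
    foreach my $s (@shapes) {
        my ($p1, $p2) = @{ $s->{points} };
        if ($s->{shape} eq 'rect') {
            # $p1 is the upper-left corner and $p2 the lower-right corner
            return $s->{url} if $x >= $p1->[0] && $x <= $p2->[0]
                             && $y >= $p1->[1] && $y <= $p2->[1];
        }
        elsif ($s->{shape} eq 'circle') {
            # $p1 is the center; the radius runs from it to the edge point $p2
            my $r2 = ($p2->[0] - $p1->[0])**2 + ($p2->[1] - $p1->[1])**2;
            return $s->{url}
                if ($x - $p1->[0])**2 + ($y - $p1->[1])**2 <= $r2;
        }
        # poly shapes are not handled in this sketch
    }
    return $default;
}

my @map_lines = (
    'default default.html',
    'rect first.html 10,10 100,100',
    'rect second.html 50,50 200,200',
    'circle round.html 300,300 300,350',
);
my ($default, @shapes) = parse_map(@map_lines);
foreach my $click ([60, 60], [150, 150], [300, 340], [400, 10]) {
    my ($x, $y) = @$click;
    print "$x,$y goes to ", find_url($x, $y, $default, @shapes), "\n";
}
```

The point 60,60 lies inside both rectangles, and the sketch sends it to first.html because that rectangle is defined first. The other three clicks go to second.html, round.html, and default.html.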

Each server has its own shape designations, so check your documentation carefully when you are writing your image map file.

Referencing the Image Map File

The final phase of image map design requires you to reference your image map in the HTML document displaying the image map. This is done by using an HTML tag that identifies the image map file location, and has this general format:

<A HREF="http://my_server.com/cgi-bin/image.map">
<IMG SRC="image_map.gif" ISMAP></A>

where an <A HREF> tag with the map file location is associated with an <IMG> tag of the image map itself. Different servers have different requirements for how this information should be sent, so be sure to read your HTTP server's documentation.

A Sample Image Map

To solidify a basic understanding of the image map process before you move into the related Perl scripts, this example will illustrate the steps you've just covered.

The image you are going to use is shown in Figure 6.6. It is a simple map of three crests.

Figure 6.6 : An image of three crests for an image map.

The shape underneath each of the crests is a rectangle, and the pixel locations for the upper-left and lower-right corners to define are 33,69/137,197 for the first crest, 175,69/279,197 for the second, and 309,69/419,197 for the third.

This HTML document is from a medieval Web game that uses these pixel coordinates to link the map definition to the image and its assigned links.

<HTML>
<HEAD>
<TITLE>
Choose Your Army
</TITLE>
</HEAD>
<BODY>
<H2>
Choose your army!
</H2> 
<A HREF="http://www.my_server.com/cgi-bin/game/crests.map">
<IMG SRC="http://www.my_server.com/game/crests.gif" ISMAP></A>
<P>
<HR>
</BODY>
</HTML>

where the <A HREF> address is the location of the map file, and the <IMG> address is the location of the image, crests.gif, that is used as the map.

There is a call in the <A HREF> tag to the file crests.map, which is the image map file defining the various links.

# crests.map
default http://www.my_server.com/cgi-bin/game/armies.html
rect http://www.my_server.com/cgi-bin/game/army_one.html 33,69 137,197
rect http://www.my_server.com/cgi-bin/game/army_two.html 175,69 279,197
rect http://www.my_server.com/cgi-bin/game/army_three.html 309,69 419,197

When this HTML file is called up with the associated map file, you get something like the page shown in Figure 6.7.

Figure 6.7 : A sample image map.

So, now that you've made a brief pass over the different facets of the image map, let's review them in greater depth.

Dividing an Image Map

To demonstrate dividing up an image we'll use the graphic in Figure 6.8. Using an actual map is a good example to work with, because the original borders create ready-made sections. This is a map of Canada, which is divided into the various regions, or provinces, that make up the country. With this map the user can select a province and get more information about that area.

Figure 6.8 : A map of Canada.

The CGI Image Map Methods

There are several ways to send the image map coordinates to the CGI. Each of these ways has its merits, but the method you choose will be based on what your HTTP server requires.

After the graphic, or image, is clearly divided, the hot spots must be defined. This is done by selecting the x and y coordinates for each hot spot. Once the hot spots are defined, you only have to test the image map before you add it to your Web page.

Once you have an appropriate graphic for your image map, you have to determine the size of your image. The important size when dealing with image maps is not the file size, but the size in pixels. This example is 560 pixels by 480 pixels. Pixel locations are always measured from the upper-left corner of the image toward the lower-right. The upper-left pixel has an x location of 0 and a y location of 0, which is written in pixel coordinate notation like this: 0,0. The x coordinate increases going from left to right on the image and the y coordinate increases going from top to bottom. A hot spot located 300 pixels in from the left edge and 250 pixels down from the top would have the x,y coordinates of 300,250.

An easy way to determine the location of a pixel for a hot spot is to use a graphics viewer. When you view the image you have selected for your image map, the viewer displays the x and y coordinates as the cursor passes over the hot spot. You also can use this Perl script to produce an HTML document that will display the x and y coordinates of the point where the image is clicked with the cursor.

#!/usr/bin/perl
# imap.pl
push (@INC, "your_directory_location/cgi-bin");
require("CGI-lib.pl");
&parse(*location);
print &header;
print "<HTML><HEAD><TITLE>\n";
print "Finding Hot Spots</TITLE></HEAD>\n";
print "<BODY><H2>\n";
print "Your X &amp; Y Coordinates Are:</H2>\n";
print "<P><HR><P>\n";
print "X Coordinate is $location{'xyhot.x'}<BR>\n";
print "Y Coordinate is $location{'xyhot.y'}<BR>\n";
print "</BODY></HTML>";

The result is something like that seen in Figure 6.9.

Figure 6.9 : The results of an x,y coordinate search.

It is important not to limit the size of your image maps the same way you limit the size of other images on your Web pages. The image map has to maintain a certain pixel size which can be hindered if you set an image size for it. If the user's browser cannot immediately display your entire image map, the user will be able to scroll around to view the entire map.

To send these coordinates, the user clicks on the hot spot and that location is sent to an image map script. Many servers have a standard image map script that you can use, so, again, be sure to check your server's documentation. You can also use one of the many freeware and shareware programs for creating image maps, like Map Edit and Web Hotspots, both of which are on the CD-ROM included with this book.

Conclusion

These programs are all basic to using Perl with the CGI, and use a basic call and response cycle. The user, either the Web browser or a human being, makes a call to the Perl script on your server through the CGI, and the Perl script sends a response, again through the CGI. Understanding this cycle is very important to the effective operation of your Perl scripts through the CGI. Although the scripts in this chapter operate on a single data request, the scripts in the next chapter often rely on multiple requests for data being passed from the user to the Perl script, and vice versa.