Chapter 5 Perl Scripts for Windows NT

CONTENTS

Sorting Procedures
Miscellaneous Procedures
- Processing All the Elements of an Associative Array
Conclusion

In the previous chapters you have gone through Perl's features and how they appear in a script. You have also touched on a few uses for these features, and even tried out combinations of them. What you need to do now is expand this list of specific Perl functions into an index of Perl code for regular programming duties, such as finding the difference between two arrays or finding the first array element that has a true value. Any of these sections of Perl code may be inserted into your larger scripts to perform the task specified. The Perl procedures in this chapter either reorganize or sort data, modify or process data, or perform some other miscellaneous duty.

Sorting Procedures

It is often necessary to change the order of information in a file, or to pick a random selection of data. You can also use Perl to search through data for information that matches a particular condition.

Searching an Array

You may need to search an array to find a specific element and its index number, such as to look for the location of someone's password or match an image to a location. To do this, try the format shown in the listing that follows.

undef $location; # will hold the index number of the
# first element that passes the test
for ($[ .. $#passwd) { # @passwd is the array of user passwords
$location = $_, last if ($passwd[$_]); # the test here is
# simply that the element has a true value; substitute
# any condition that identifies the element you want
}
print "$location\n" if defined($location);
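As a concrete illustration of the loop above, the following is a minimal sketch that searches a small made-up array for one particular value rather than for the first true value. The sample data and the names @passwd_list, $wanted, and $found are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Hypothetical sample data; any list of strings would do.
my @passwd_list = ('alpha', 'bravo', 'charlie', 'delta');
my $wanted      = 'charlie';   # the value being looked for

my $found;                     # stays undefined if nothing matches
for my $i (0 .. $#passwd_list) {
    if ($passwd_list[$i] eq $wanted) {
        $found = $i;
        last;                  # stop at the first match
    }
}

print defined($found) ? "Found at index $found\n" : "Not found\n";
```

Leaving $found undefined when nothing matches keeps a real index of 0 distinct from "no match".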

Printing All of an Array

This procedure is a straightforward output of an array's entire contents, separated by an arbitrary string. You could use this procedure to return a list of user form requests to confirm their order, separating them with HTML tags and new line commands. The format takes two forms, shown in the following two listings.

print join($separator, @array_name), $terminator;

{
local($,, $\) = ($separator, $terminator);
print @array_name;
}

In both cases the variable $separator holds the separator string and $terminator holds the terminator string. To print one element per line, use the newline character, "\n", as the separator and again as the terminator. The second method works best with long arrays; it gives temporary values to "$," and "$\", the variables that hold the output field separator and the output record separator, and because of local they return to their previous values at the end of the block. To have the array elements separated by a single space, interpolate the array in a double-quoted string, as shown in the next listing.

print "@array_name\n";
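The three ways of printing an array can be compared side by side. This is a minimal sketch; the array @requests and its contents are made up for the example.

```perl
use strict;
use warnings;

my @requests = ('mug', 'shirt', 'poster');   # hypothetical order items

# Method 1: build the whole line with join.
my $line = join(', ', @requests) . "\n";
print $line;

# Method 2: set the output separators for this block only.
{
    local ($,, $\) = (', ', "\n");   # output field and record separators
    print @requests;
}

# Method 3: interpolation joins the elements with $", a space by default.
my $spaced = "@requests";
print "$spaced\n";
```

The first two methods print the same line; the third separates the elements with single spaces.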

Printing All of a File to <STDOUT>

To take a file and place it into <STDOUT>, follow the format of this snippet:

open(F,$file_name) || die "Unable to open $file_name: $!";
while (<F>) {
print;
}
close (F);

where F is any filehandle and $file_name is a scalar variable that holds a file name as its literal value.

Reversing a Character Line

To reverse a line of characters you can use the reverse operator:

$reversed_string = reverse($frontward_string);

where $reversed_string is given the reversed value of $frontward_string. A generic loop for reversing every line that arrives on <STDIN> would look like the example below:

while (<STDIN>) {
chomp;
$reversed = reverse($_);
print $reversed, "\n";
}

where the contents of <STDIN> are reversed and put into $reversed.

Reversing Words in a String

Similar to the previous procedure, you can reverse the order of words in a string by executing the following procedure:

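while (<STDIN>) {
chomp;
$reversed = join("",reverse split(/(\S+)/, $_));
print $reversed, "\n";
}

where /(\S+)/ tells Perl to split the line on runs of non-whitespace characters (the words). Because the pattern is in parentheses, split returns the words as well as the white space between them, so the white space in the string is preserved.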
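To see the effect on a fixed string instead of standard input, the same expression can be wrapped in a subroutine. This is only a sketch; the name reverse_words and the sample text are assumptions made for the example.

```perl
use strict;
use warnings;

# Reverse the order of the words while keeping the whitespace between them.
sub reverse_words {
    my ($text) = @_;
    # The capturing parentheses make split return the words as well as
    # the whitespace that separates them.
    return join('', reverse split(/(\S+)/, $text));
}

my $result = reverse_words('one two  three');
print "$result\n";
```

The double space that sat between "two" and "three" now sits between "three" and "two", because the whitespace fields are reversed along with the words.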

Choosing an Element at Random from an Array

In Chapter 7 you will be creating a script that simulates a poker game on your Web site. In order to generate a random selection of cards, you have to be able to select an element at random, in this case from an array. To do this you can use the format as shown below.

$element = splice(@array_name, rand @array_name, 1);

where the element selected from @array_name is held in $element. To create a random selection of cards, follow the procedure shown below.

@poker_cards = ('H01'..'H13', 'D01'..'D13', 'S01'..'S13', 'C01'..'C13');
for $dealt (1..20) { # four hands of five cards each
$card = splice(@poker_cards, rand @poker_cards, 1);
print "$card ";
print "\n" if $dealt % 5 == 0;
}

where each suit is written out as its 13 cards; for example, hearts as 'H01'..'H13'. This deals out four poker hands of five cards each.

Shuffling an Array for Random Results

In conjunction with the previous procedure, this function will allow you to shuffle the elements in an array, something you might need to do in a card game program such as poker. The standard format looks like this:

local(@temp);
push(@temp, splice(@array_name, rand(@array_name), 1))
while @array_name;
@array_name = @temp;

where @temp is the temporary array that holds the shuffled elements of @array_name.

To use this in a poker script, shuffling an array might look like this:

@cards = 1..52;
{
local(@temp);
push(@temp, splice(@cards, rand(@cards), 1))
while @cards;
@cards = @temp;
}
print "@cards\n";

The 52 cards in the deck are kept in @cards. The block moves them one at a time, at random, into @temp, and then copies the shuffled deck back into @cards. Instead of using the print statement, you could pass the shuffled @cards on to the rest of a game.
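A different technique from the splice-based procedure above is the Fisher-Yates shuffle, which swaps elements in place and needs no temporary array. The following is a minimal sketch of that alternative; the subroutine name shuffle_in_place is an assumption made for the example.

```perl
use strict;
use warnings;

# Fisher-Yates shuffle: walk from the last element down, swapping each
# element with a randomly chosen element at or before its position.
sub shuffle_in_place {
    my ($deck) = @_;                 # reference to the array to shuffle
    for (my $i = $#$deck; $i > 0; $i--) {
        my $j = int(rand($i + 1));   # random index from 0 to $i
        @$deck[$i, $j] = @$deck[$j, $i];
    }
    return;
}

my @cards = (1 .. 52);
shuffle_in_place(\@cards);
print "@cards\n";
```

Because each step is a single swap, this approach avoids the repeated splicing of the original array, which may matter for very long lists.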

Sorting an Array Using the Numeric Values

To sort an array using numeric, not string, comparisons, give the sort operator a subroutine that does the comparison. The following Perl format can be used:

sub bynumber { $a <=> $b }
@order = sort bynumber @chaos;

where @chaos is sorted into @order.

If you wanted to take an array of jumbled numbers and sort them, the script would look like this:

sub bynumber { $a <=> $b }
@chaos = (23,4,7,456,324,3,98,45,2);
@order = sort bynumber @chaos;

The numbers in @chaos are sorted into numeric order in @order.
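The reason a comparison subroutine is needed shows up when the same numbers are sorted both ways. A short sketch, using an inline comparison block in place of the named bynumber subroutine:

```perl
use strict;
use warnings;

my @chaos = (23, 4, 7, 456, 324, 3, 98, 45, 2);

my @as_strings = sort @chaos;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } @chaos;   # numeric comparison

print "String order:  @as_strings\n";
print "Numeric order: @as_numbers\n";
```

With the default string comparison, 23 sorts before 3 and 456 before 7, because the values are compared character by character.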

Sorting an Array Using a Computable Field

When the elements of an array are strings that contain several fields, it can be handy to sort them on a value computed from each element. This takes a foreach loop to compute the keys, which makes it quite different from the previous example:

local(@keys);
foreach(@original_array) {
push(@keys, key_expression);
}
sub sort_function { key_comparison_expression; }
@sorted_array = @original_array[sort sort_function $[..$#original_array];

where @keys holds each element's key value; @original_array is the array to be sorted; key_expression is the expression applied to $_ to compute a key value; sort_function is a subroutine that defines the sort order; key_comparison_expression is the expression that compares two keys, $keys[$a] and $keys[$b]; and @sorted_array is the new array with the sorted elements.

This procedure may be used to sort the lines of a user name file into order of user ID, as shown below.

@user = split(/\n/, `cat /user/ids`);
local(@userkeys);
foreach(@user) {
push(@userkeys, (split(/:/))[2]);
}
sub byuserkeys { $userkeys[$a] <=> $userkeys[$b]; }
@sortuser = @user[sort byuserkeys $[..$#user];
print join("\n",@sortuser);

The lines of /user/ids are sorted on their third colon-separated field, taken here to be the numeric user ID, and put into @sortuser in ascending numeric order.
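The same idea can be tried without an external file. The sketch below assumes records in a made-up name:password:uid layout held in memory; it computes each key once and then sorts the index numbers, as the procedure above does.

```perl
use strict;
use warnings;

# Hypothetical records in name:password:uid form.
my @user = ('sue:x:204', 'bob:x:17', 'dan:x:1003');

# Compute each record's key once, then sort the indexes by key.
my @userkeys = map { (split(/:/, $_))[2] } @user;
my @order    = sort { $userkeys[$a] <=> $userkeys[$b] } 0 .. $#user;
my @sortuser = @user[@order];

print join("\n", @sortuser), "\n";
```

Sorting the indexes rather than the records means the key expression runs once per element instead of once per comparison.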

Processing Procedures

It is important to be able to modify the values your script works with, and several problems come up regularly. These are dealt with as processing procedures: procedures which modify a specified portion of the data.

Basic Procedures in Perl

These snippets of Perl code all accomplish a specific small task, or procedure, within a Perl program, but they occur so often that having them here for you will greatly reduce the time you spend scripting.

Finding the Difference between Two Arrays

You may have a special page at your site that you want accessed only once by each user, such as a quiz answer page. By comparing two arrays, a list of registered users of your site and a list of those who have accessed the page already, you can eliminate those users who have already called up the page. The script for comparing two arrays follows.

@list_of_called_users = ('user2');
@list_of_all_users = ('user1','user2','user3');
local(%check); # an associative array is set up
# to compare the two arrays
grep($check{$_}++,@list_of_called_users); # grep
# records every entry in @list_of_called_users
# as a key in %check
@users_with_access = grep(!$check{$_},@list_of_all_users);

where @users_with_access now contains a full list of those registered users who can access the page. This can be used to test who can view the page.

Finding the Intersection of Two Arrays

If you have a site that has pages you want accessed only by users registered to two of your sites, then you can create a new list of users for this with the format that follows.

@site_one_users = ('Bob','Sue','Dan','Doug');
@site_two_users = ('Don','Jack','Bob','Lisa');
local(%check);
grep($check{$_}++,@site_one_users);
@special_users = grep($check{$_},@site_two_users);

where @special_users now contains a full list of all those users who are registered to two of your sites, and who may now access the special pages.
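Difference and intersection both rest on the same lookup table, so they can be computed together. This is a minimal sketch using the user lists from the example above; the names %on_site_one, @both, and @only_two are assumptions made for the illustration.

```perl
use strict;
use warnings;

my @site_one_users = ('Bob', 'Sue', 'Dan', 'Doug');
my @site_two_users = ('Don', 'Jack', 'Bob', 'Lisa');

# Mark every site-one user in a lookup hash.
my %on_site_one = map { $_ => 1 } @site_one_users;

# Intersection: site-two users who are also on site one.
my @both = grep { $on_site_one{$_} } @site_two_users;

# Difference: site-two users who are not on site one.
my @only_two = grep { !$on_site_one{$_} } @site_two_users;

print "Both sites: @both\n";
print "Site two only: @only_two\n";
```

The only change between the two results is the negation inside grep.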

Running Block Statements for Each User

If you offer your users the ability to house their own Web pages, you want to have a file that contains a list of all your users with pages housed at your site. To do this you use the getpwent and associated functions, as in the next listing.

setpwent;
while (@user = getpwent) {
($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell) =
@user; # all the standard getpwent fields
if (-e "$dir/index.html") { # looking for the
# standard homepage filename index.html
print "$name has a Home Page\n";
}
}
endpwent;

where the users who have Web pages on your site are printed out using their user names. You could store these names in an array, or write them to a file, as well.

NOTE

The functions getpwent, setpwent, and endpwent all relate to each other. Their purpose is to go through the system's password file and return this user information for each entry: ($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell). Whether you have all the information for each user is not important, because any undefined variable will get the Perl undef value. Some of these variables may also have the same value on your system, such as a user's name and login ID, which are often one and the same.

An easy way to build a table that maps each user's name ($name) to the user ID ($uid) is to use this script:

while (($name,$passwd,$uid) = getpwent) {
$uid{$name} = $uid;
}

Interpreting a String for Variable References

You may need to be able to print the literal values of any variables found in a string. To do this, you can use the following script snippet:

while (<STDIN>) {
s/"/\\"/g;
print eval qq/"$_"/;
}

where the string you want expanded is read from <STDIN>. Each line is printed as if it had been written inside double quotes, so any variable references in it are replaced by their current values. Because eval runs the text as Perl code, use this only on input you trust.

Modifying All of an Array

If you want to change all of the elements of an array at the same time, you can use grep to do it. The resulting format could look like this:

grep(OPERATION, @array_name);

OPERATION may be a single expression that modifies $_, or a do block holding a series of statements, as in the following listing.

do {
statement1;
statement2;
statement3;
}

@array_name is the array having all its elements modified. If you were an unscrupulous Web administrator, you might double the entries to the array where your Web page hits are stored, like so:

grep($_ *= 2, @hits);

where Web page hits are stored in @hits.
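The grep form works because $_ is an alias for each array element in turn. The same aliasing applies to foreach, while map builds a new list instead. The sketch below contrasts the two; the hit counts and the name @tripled are made up for the example.

```perl
use strict;
use warnings;

my @hits = (10, 25, 3);          # hypothetical hit counts

# foreach aliases $_ to each element, so assigning to $_ changes the array.
$_ *= 2 foreach @hits;

# map builds a new list and leaves the original array untouched.
my @tripled = map { $_ * 3 } @hits;

print "@hits\n";
print "@tripled\n";
```

Using foreach makes it plain that the array is being changed in place, whereas grep is more usually read as a filter.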

Operating on a Series of Integers

If you need to pass a series of integers through the same operation, you can do this by using a foreach loop:

foreach $temp(low..high) {
OPERATION;
}

$temp is the scalar variable into which each integer is placed for the operation; low and high are the range of the integers; and OPERATION is the statement (or statements) that you need every integer to go through. In this procedure, the command for will work in place of foreach. A sample use of this procedure might look like this:

for $count (1..20) {
print "$count\n";
}

where a count from 1 to 20 will take place.

Processing All the Elements of an Array

This procedure is used to execute one or more commands on the entire list of elements in an array. The basic format is:

foreach (@array_name) {
statement1;
statement2;
}

where both statement1 and statement2 are applied to each element of @array_name in turn. A possible use for this procedure might be as shown in the following:

foreach (@accounts) {
print "Account balance is $_\n";
print "Account is empty.\n" if $_ == 0;
}

where you check @accounts to output the balance of each, and if it is a zero balance, the "Account is empty" string is printed as well.

Removing Multiple Consecutives of the Same Character

If you need to remove multiple, consecutive occurrences of characters in a string, you can use this procedure. It uses the "tr///" function, which normally transliterates each character of one list into the corresponding character of another, with the "s" option, which squeezes any run of the same character into a single character. Because the replacement list is left empty, "tr" reuses the search list, so no character is changed to a different one; the only effect is the squeezing.

To have the procedure check all the characters in a string, the following script applies:

while (<STDIN>) {
tr/\0-\377//s;
print;
}

where each line read from <STDIN> is processed and every run of the same character is reduced to a single occurrence.
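The effect is easiest to see on a fixed string. In this sketch the sample text and the variable names are assumptions; the second call narrows the search list so that only runs of spaces are squeezed.

```perl
use strict;
use warnings;

my $text = 'bookkeeper   aaa!!';

# Squeeze runs of any character in the range \0 to \377.
(my $squeezed_all = $text) =~ tr/\0-\377//s;

# Squeeze runs of spaces only, leaving doubled letters alone.
(my $squeezed_spaces = $text) =~ tr/ //s;

print "$squeezed_all\n";
print "$squeezed_spaces\n";
```

Squeezing every character is rarely what ordinary text needs, since it also collapses legitimate double letters; restricting the search list is usually the safer choice.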

Counting Items in an Array

Associative arrays may be used to count the number of times a particular string occurs in an array. This procedure makes use of two simple for loops with the following basic format:

for (@array_name) {
$count{$_}++;
}
for (sort keys %count) {
printf "%Xs %d\n", $_, $count{$_};
}

The array whose elements are being counted is @array_name; each element is placed into $_ in turn and counted in the associative array %count; X is the integer width, in columns, of the field that holds each element; and $count{$_} is the number of times that element occurs.

This procedure can be used to report on how many times each user's name occurs:

@users = grep(s/\s.*\n?$//,`user`);
for (@users) {
$count{$_}++;
}
for (sort keys %count) {
printf "%8s %d\n", $_, $count{$_};
}

where @users holds the first field of each line of output from the user command, taken here to be the users' names, and each name is printed right-justified in a field eight columns wide.
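The counting loop can be tried on data held in the script itself. This is a minimal sketch; the list @names and the variable $report are made up for the example.

```perl
use strict;
use warnings;

my @names = ('sue', 'bob', 'sue', 'dan', 'bob', 'sue');   # hypothetical

# Each distinct name becomes a key; its value is the number of occurrences.
my %count;
$count{$_}++ for @names;

# Build the report with each name right-justified in eight columns.
my $report = '';
for my $name (sort keys %count) {
    $report .= sprintf("%8s %d\n", $name, $count{$name});
}
print $report;
```

Collecting the lines with sprintf before printing makes the report easy to send somewhere other than the screen.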

Miscellaneous Procedures

These are the many procedures that don't easily fall into either of the previous categories.

Renaming a File

Put simply, this procedure renames a file:

rename($previous,$current) ||
die " Unable to make file name change from $previous to $current: $!";

$previous is the name of the file to be changed and $current is the new name for the file. You can use this procedure to take any ASCII files and make them useful to a browser by adding the .html file extension. In action this script looks like the following:

rename($html, "$html.html") ||
die "Unable to make file name change from $html to $html.html: $!";

where you take the file named in $html and add the extension ".html".

Using Subroutines to Return Errors

You may need to have a subroutine return a pass/fail indication when it returns a list. The way to do this is to include the pass/fail indicator as the first entry of the list. The basic format of this procedure is as shown here:

sub subroutine_name {
statement1;
statement2;
}
{
local(@return) = &subroutine_name(subroutine_parameters);
if (shift @return) {
success_statement;
} else {
fail_statement;
}
}

The subroutine_name is the subroutine created to return a list; statement1 and statement2 are where the values of a successful or failed return are defined; @return is the local array where the list returned by subroutine_name is stored; and the if/else statement runs either the success_statement or the fail_statement.

If you wanted to have a list of a Web site's pages returned, and also to be notified if this procedure has succeeded or failed, then you might use this procedure:

sub site_pages {
local(*directory);
opendir(directory, $_[0]) || return 0;
local(@pages) = sort grep(!/^\.\.?$/, readdir(directory));
closedir(directory);
return 1, @pages;
}
{
local(@return) = &site_pages($ENV{HOME} || $ENV{LOGDIR});
if (shift @return) {
print "The HTML documents which make up this site are @return\n";
}
}

where "directory" is the directory handle used to read the directory that houses the HTML documents you are listing.
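The status-first convention can also be written with lexical variables and a lexical directory handle. The following is a sketch under those assumptions; the subroutine name list_dir and the directory names passed to it are made up for the example.

```perl
use strict;
use warnings;

# Returns (1, sorted entries) on success, or (0) if the directory
# cannot be opened.
sub list_dir {
    my ($path) = @_;
    opendir(my $dh, $path) or return (0);
    my @entries = sort grep { !/^\.\.?$/ } readdir($dh);
    closedir($dh);
    return (1, @entries);
}

# The first value is the pass/fail flag; the rest is the list itself.
my ($ok, @pages) = list_dir('.');
if ($ok) {
    print "Entries: @pages\n";
} else {
    print "Could not read directory: $!\n";
}

# A path assumed not to exist, to show the failure case.
my ($missing_ok) = list_dir('no/such/directory/here');
print "Second call ", ($missing_ok ? "succeeded" : "failed"), "\n";
```

Putting the flag first lets the caller separate it from the list in a single assignment, even when the list itself is empty.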

Hiding Escaped Characters

There may be a time in your script when you'll need certain characters shielded, or hidden, from a global operation being applied to a string. If you want to hide backslash-escaped characters so that the operation does not treat them as ordinary characters, then use this format:

s/\\(.)/"unmatch1".ord($1)."unmatch2"/eg;
operation(s);
s/unmatch1(\d+)unmatch2/pack("C",$1)/eg;

Here, the string being processed is assumed to already be in $_. To hide the escaped characters, you use "unmatch1" and "unmatch2" as two characters not likely to occur in the string being processed. Good examples of these are \376 and \377.

If you need to replace every occurrence of the string "guest" with a new string "member," but you don't want to affect "\guest," which you are using for another purpose, then your script might look like the following (assuming the text you are changing is already in $_):

s/\\(.)/"\376".ord($1)."\377"/eg;
s/guest/member/g;
s/\376(\d+)\377/pack("C",$1)/eg;

where every occurrence of the word "guest" that is not escaped with a backslash is replaced with "member."
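Tracing the three substitutions on one line of text shows what is hidden and what is restored. This is a sketch only; the subroutine name swap_guest and the sample sentence are assumptions made for the example.

```perl
use strict;
use warnings;

sub swap_guest {
    my ($text) = @_;
    # Hide each backslash-escaped character as \376 <character code> \377.
    $text =~ s/\\(.)/"\376" . ord($1) . "\377"/eg;
    # The global operation no longer sees the hidden character.
    $text =~ s/guest/member/g;
    # Restore the hidden characters; the backslash itself is not put back.
    $text =~ s/\376(\d+)\377/pack("C", $1)/eg;
    return $text;
}

my $changed = swap_guest('welcome guest, see \guest page');
print "$changed\n";
```

Note that the escaped word comes back without its backslash, so the procedure also acts as an unescaping step.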

Processing All the Elements of an Associative Array

Just like processing all the elements of an array, you may also want to perform an operation on all the elements of an associative array. This is done by the following procedure:

foreach (sort keys %array_name) {
statement1;
statement2;
}

The keys of the associative array are sorted into ascending order using sort (if this operator is left out, the elements are processed in whatever order Perl happens to store them), and statement1 and statement2 are applied to each element in turn.

If you wanted to find out your environment variables, and have them output in order, you could use this script:

foreach (sort keys %ENV) {
print "$_ = $ENV{$_}\n";
}

where %ENV is the associative array in which Perl keeps your environment variables.

Prepending Block Labels to Consecutive Lines

This procedure takes consecutive lines of text and tacks on, or prepends, a label to them, until the next label is found. The format for the procedure looks like this:

$label_name = initial_value;
while (<READ>) {
if (/:$/) {
$label_name = $`;
} else {
print WRITE $label_name, SEPARATOR, $_;
}
}

where initial_value is a string expression with the initial value for $label_name. The consecutive lines are taken from the source specified by READ, the destination of these prepended lines is WRITE, and each label is separated from its line by the separator string SEPARATOR. The default label is the "." symbol.
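The procedure can be tried on lines held in an array instead of a filehandle. In this sketch the sample lines, the initial label 'none', and the tab used as the separator are all assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical input: label lines end in a colon.
my @lines = ('fruit:', 'apple', 'pear', 'tools:', 'hammer');

my $label = 'none';      # label used before any label line is seen
my @labelled;
for my $line (@lines) {
    if ($line =~ /^(.*):$/) {
        $label = $1;     # a line ending in ":" starts a new block
    } else {
        push @labelled, "$label\t$line";
    }
}

print "$_\n" for @labelled;
```

Each ordinary line comes out carrying the label of the block it belongs to, which makes the result easy to sort or filter one line at a time.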

A common problem when working in Windows is that the .html extension is often chopped to .htm, which will still work on some servers, but when embedding hypertext links in your Web pages it can cause problems. If your link says to go to index.html, but Windows chops the filename to .htm, your link will not work. You could take a list of your HTML files and pass them through this procedure, with a check to see if the file already has the full .html extension, as shown here:

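$ext = 'html';
while (<STDIN>) {
chomp;
if (/:$/) {
$ext = $`; # a line ending in ":" sets a new extension
} elsif (/\.$ext$/) {
print "$_\n"; # already has the full extension
} else {
s/\.htm$//; # drop a chopped extension, if there is one
print "$_.$ext\n";
}
}

The new names are printed with .html attached, unless they already have .html.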

Making Long Numbers the Proper Format

If you are presenting long numbers with your script, then you will want to have appropriately inserted commas every three digits. Perl does not automatically make breaks in any numbers, but runs them together if they are output to the screen.

You can make the necessary breaks by using a subroutine that matches any string ending with four digits, takes the last three and places them into $2, and puts the former string into $1, as shown in the following:

sub commas {
local($_) = @_;
1 while s/(.*\d)(\d\d\d)/$1,$2/;
$_;
}
statement1;
statement2;
$new_number = &commas(output_variable);

The statements after the subroutine (statement1 and statement2) define the numbers which are to be processed; output_variable is the variable expression that holds the value of each number to be modified; and $new_number holds the modified number.

As an example, you might want to insert commas into the results of a lottery draw. You could do this by modifying the code above to this one:

sub commas {
local($_) = @_;
1 while s/(.*\d)(\d\d\d)/$1,$2/;
$_;
}
for ($integer = 100; $integer <= 10e8; $integer *= 10) {
$prizes = &commas($integer);
print "$prizes\n";
}

All the prize amounts for the lottery, the powers of 10 from 100 up to 10e8 (1,000,000,000), are put into $prizes in turn and printed with commas.
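To watch the substitution work, it helps to follow a single number through the loop. The sketch below is a lexical-variable version of the subroutine, and it assumes whole numbers only; the sample values are made up.

```perl
use strict;
use warnings;

# Insert a comma before each group of three trailing digits.
# Assumes a whole number; digits after a decimal point are not handled.
sub commas {
    my ($number) = @_;
    # Each pass inserts one comma, working from the right:
    # 1234567 becomes 1234,567 and then 1,234,567.
    1 while $number =~ s/(.*\d)(\d\d\d)/$1,$2/;
    return $number;
}

my $small  = commas(100);
my $medium = commas(1000);
my $large  = commas(1234567);
print "$small $medium $large\n";
```

The loop stops on its own once no run of four digits is left, so short numbers pass through unchanged.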

Truncating a File

If you need to erase, or truncate, a file in your script for security or storage reasons, it is done simply by using the truncate function:

truncate('answers.txt',0);

where answers.txt is the file being truncated. Truncating a file to a length of zero effectively empties it.

Conclusion

The procedures covered here are only the most obvious examples of what you will be able to do with Perl in your own scripts. The methods of manipulating the different elements of your script (sorting, processing, and others) are not limited to the procedures listed here. There are many other ways to affect the data involved with your script. Operators other than the ones used here can be used to process variables.

Using what you've learned with these procedures, it is time to advance to full-length programs in Perl. You've covered how arrays and regular expressions are used to manipulate strings and files in small sections, but now you need to fit this into the bigger picture. In the next two chapters we will work with full-length Perl programs that are not only useful for running a Web server, but they can also teach you to solve more problems with Perl.