Platinum Edition Using HTML 4, XML, and Java 1.2
(Publisher: Macmillan Computer Publishing)
Author(s): Eric Ladd
ISBN: 078971759x
Publication Date: 11/01/98


Listing 31.7 Tfind.pl—Perl Script to Recursively Find Files in Subdirectories


#!/usr/local/bin/perl
# define the directory to start at
# you could prompt user for this
$BASEDIR = "/web/home/acn";

# print page preamble to STDOUT
print "Content-type: text/html\n\n";
print "<HTML><HEAD><TITLE>Test Find Capability</TITLE></HEAD>\n";
print "<BODY bgcolor=\"#FFFFFF\">\n";

# call subroutine to find files
&finddir($BASEDIR);
# close the page
print "</BODY></HTML>\n";

sub finddir
{
# lexical (my) variables give each recursive call its own copies
     my ($dir) = @_;

# open directory and load file names into array, skipping . and ..
     opendir(BASE, $dir) || die("Can't open directory $dir: $!");
     my @files = grep(!/^\.\.?$/, readdir(BASE));
     closedir(BASE);

# for every file in the array
     foreach my $file (@files)
     {
# check to see if it's a directory
          if (-d "$dir/$file")
          {
# if it is, recursively call the subroutine
               &finddir("$dir/$file");
          }
          else
          {
# if not a directory, you've got a hit
               print "<P>Found a file called $dir/$file\n";
          }
     }
}

When you run this Perl program, you see a display similar to that shown in Figure 31.2.


FIGURE 31.2  The basic file recursion script produces a listing line for each file found.

Note that files are found both in the base directory (/web/home/acn) and in its subdirectories (for example, /web/home/acn/press).
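
The recursion used in Listing 31.7 can also be sketched so that it returns the paths it finds instead of printing them, which makes the walk easy to try outside a CGI environment. This is only a minimal sketch: the subroutine name list_files and the throwaway directory tree are illustrative assumptions, not part of the original script, and the lexical directory handle assumes a reasonably recent Perl.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return every non-directory path under $dir.
sub list_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Can't open directory $dir: $!";
    # skip the . and .. entries; sort so the order is predictable
    my @entries = sort grep { !/^\.\.?$/ } readdir($dh);
    closedir($dh);

    my @found;
    foreach my $entry (@entries) {
        my $path = "$dir/$entry";
        if (-d $path) {
            push @found, list_files($path);   # recurse into subdirectory
        } else {
            push @found, $path;               # plain file: record it
        }
    }
    return @found;
}

# Build a small throwaway tree so the sketch runs anywhere.
my $base = tempdir(CLEANUP => 1);
mkdir "$base/press" or die "mkdir failed: $!";
foreach my $name ("index.htm", "press/news.htm") {
    open(my $fh, '>', "$base/$name") or die "Can't create $name: $!";
    close($fh);
}

print "$_\n" foreach list_files($base);
```

Because every variable inside list_files is declared with my, each level of the recursion works on its own file list, so a nested call cannot disturb the list that the caller is still iterating over.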

You can create your own directory-walking code, as in this example, or you can use Find.pl, part of the Perl distribution library (available at http://www.perl.com/). This Perl script steps through all files recursively and executes a subroutine that you define for each file found. For each file, Find sets the variable $name to the full path of the file and then calls a subroutine named wanted in your wrapper script. You can refer to the $name variable in the wanted subroutine to display the name of the file or to grep the file for a search string. It is easy to use Find.pl to develop a slightly more sophisticated find routine (see Listing 31.8).

Listing 31.8 Tsfind.pl—Using Find.pl to Recursively Search Directories


#!/usr/local/bin/perl

# requires find.pl
require("/public/local/lib/perl5/find.pl");
$BASEDIR = "/web/home/acn";

print "Content-type: text/html\n\n";
print "<HTML><HEAD><TITLE>Test Find Capability Using Find.pl</TITLE></HEAD>\n";
print "<BODY bgcolor=\"#FFFFFF\">\n";

&find($BASEDIR);

# close the page
print "</BODY></HTML>\n";

sub wanted
{
# if it's an HTML file (name ends in .htm)
     if ($name =~ /\.htm$/)
     {
# print its name; $name already holds the full path
          print "<P>Found a file called $name\n";
     }
}

Like the previous script, this script merely prints the name of each file it finds, in this case only files whose names end in .htm. It does not yet look inside the files. You can easily insert a call to a grepping routine in place of the code that prints out the name of the file.
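
The wanted-callback style shown with Find.pl is also offered by the File::Find module that ships with current Perl distributions. The following is a minimal sketch under that assumption. The helper name find_htm and the use of a command-line argument for the starting directory are illustrative choices, not part of the original listing.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Assumed starting directory: the first command-line argument, else "."
my $basedir = @ARGV ? $ARGV[0] : ".";

# Hypothetical helper: collect paths under $dir whose names end in .htm
sub find_htm {
    my ($dir) = @_;
    my @hits;
    find(sub {
        # $_ is the bare file name; $File::Find::name is the full path
        push @hits, $File::Find::name if -f $_ && /\.htm$/;
    }, $dir);
    my @sorted = sort @hits;
    return @sorted;
}

print "Found a file called $_\n" foreach find_htm($basedir);
```

As with Find.pl, the callback runs once per entry with the current directory changed to the one being scanned, which is why the file test can use the bare name in $_.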

The grepping routine needs to open the file and read through it to search for instances of the search string. Perl's ordinary pattern-matching operator (=~) works nicely for this. This approach is demonstrated in Listing 31.9.

Listing 31.9 Tsrch.pl—A Basic Search Script


#!/usr/local/bin/perl
# define the directory, file name, and search string
# you could prompt user for these, or find files
# using find.pl

$BASEDIR = "/web/home/acn";
$file = "acn.htm";
$term = "ACNielsen";

# print page preamble to STDOUT
print "Content-type: text/html\n\n";
print "<HTML><HEAD><TITLE>Test Find Search Engine</TITLE></HEAD>\n";
print "<BODY bgcolor=\"#FFFFFF\">\n";

# call subroutine to search the file
&findstr($BASEDIR);
print "</BODY></HTML>\n";

sub findstr
{
# get the directory as a parameter
      my ($dir) = @_;
# open the file; give up quietly if it cannot be read
      open(FILE, "$dir/$file") || return;
# read all lines into an array
      my @LINES = <FILE>;
      close(FILE);

# create one huge string to search
      my $string = join(' ', @LINES);
      $string =~ s/\n/ /g;
# if string is found (case-insensitive; \Q...\E matches $term literally)
      if ($string =~ /\Q$term\E/i)
      {
# include the file name
          print "<P>Found string in $dir/$file\n";
      }
# otherwise, don't include this file name
}

Now if you combine these two scripts, as in Listing 31.10, you will have a rudimentary search engine that still produces output similar to Figure 31.2.

Listing 31.10 Tstfind.pl—A Basic Recursive Search Engine


#!/usr/local/bin/perl

# requires find.pl
require("/public/local/lib/perl5/find.pl");
$BASEDIR = "/web/home/acn";
# hardcode the search term - you could prompt user
$term = "ACNielsen";

print "Content-type: text/html\n\n";
print "<HTML><HEAD><TITLE>A Basic Recursive Search Engine</TITLE></HEAD>\n";
print "<BODY bgcolor=\"#FFFFFF\">\n";

&find($BASEDIR);

# close the page
print "</BODY></HTML>\n";

sub wanted
{
# if it's an HTML file (name ends in .htm)
     if ($name =~ /\.htm$/)
     {
# search for string; $name already holds the full path
          &findstr($name, $term);
     }
} # sub wanted

sub findstr
{
# get the name of the file and search term as parameters
      my ($file, $term) = @_;
# open the file; skip it if it cannot be read
      open(FILE, $file) || return;
# read all lines into an array
      my @LINES = <FILE>;
      close(FILE);

# create one huge string to search
      my $string = join(' ', @LINES);
      $string =~ s/\n/ /g;
# if string is found (case-insensitive; \Q...\E matches $term literally)
      if ($string =~ /\Q$term\E/i)
      {
# include the file name
          print "<P>Found string $term in $file\n";
      }
# otherwise, don't include this file name
} # sub findstr
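
Listing 31.10 reads each file completely into memory before matching. As a variation, the search can be sketched so that it scans each file one line at a time and stops at the first hit. This is a sketch only, assuming the File::Find module is available. The helper names file_contains and search_tree, and the command-line usage, are hypothetical and do not appear in the original scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: true if $term occurs (case-insensitively) in $file.
sub file_contains {
    my ($file, $term) = @_;
    open(my $fh, '<', $file) or return 0;   # unreadable files count as no match
    while (my $line = <$fh>) {
        if ($line =~ /\Q$term\E/i) {        # \Q...\E treats $term literally
            close($fh);
            return 1;                       # stop reading at the first hit
        }
    }
    close($fh);
    return 0;
}

# Hypothetical helper: .htm files under $dir that contain $term.
sub search_tree {
    my ($dir, $term) = @_;
    my @hits;
    find(sub {
        # the callback runs inside the file's directory, so $_ can be opened
        push @hits, $File::Find::name
            if -f $_ && /\.htm$/ && file_contains($_, $term);
    }, $dir);
    my @sorted = sort @hits;
    return @sorted;
}

# Assumed usage: perl search.pl DIRECTORY TERM
if (@ARGV == 2) {
    print "Found string $ARGV[1] in $_\n" foreach search_tree(@ARGV);
}
```

The trade-off is that a line-by-line scan cannot match a term that is split across two lines, which the join-into-one-string approach of Listing 31.10 can.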


