Platinum Edition Using HTML 4, XML, and Java 1.2
Listing 31.7 Tfind.pl: Perl Script to Recursively Find Files in Subdirectories

    #!/usr/local/bin/perl
    # define the directory to start at
    # you could prompt user for this
    $BASEDIR = "/web/home/acn";
    # print page preamble to STDOUT
    print "Content-type: text/html\n\n";
    print "<HTML><HEAD><TITLE>Test Find Capability</TITLE></HEAD>\n";
    print "<BODY bgcolor=#FFFFFF>\n";
    # call subroutine to find files
    &finddir($BASEDIR);
    # close the page
    print "</BODY></HTML>\n";

    sub finddir {
        local ($BASEDIR) = @_;
        # open directory and load file names into array
        # (a lexical array, so recursive calls cannot overwrite it)
        opendir(BASE, $BASEDIR) || die("Can't open directory $BASEDIR");
        my @files = grep(!/^\.\.?$/, readdir(BASE));
        closedir(BASE);
        ITEM:
        # for every file in the array
        foreach my $file (@files) {
            # check to see if it's a directory
            if (-d "$BASEDIR/$file") {
                # if it is, recursively call the subroutine
                my $next = "$BASEDIR/$file";
                &finddir($next);
            # if not a directory, you've got a hit
            } else {
                print "<P>Found a file called $BASEDIR/$file\n";
                next ITEM;
            }
        }
    }

When you run this Perl program, you see a display similar to that shown in Figure 31.2.
Note that the HTML files are found both in the base directory (/web/home/acn) and in its subdirectories (for example, /web/home/acn/press). You can create your own directory-walking code, as in this example, or you can use Find.pl, part of the Perl distribution library (available at http://www.perl.com/). This Perl script steps through all files recursively and executes a subroutine that you define for each file found. For each file, Find.pl stores the file's full path in the variable $name and calls a subroutine named wanted that you define in your wrapper script. You can refer to the $name variable in the wanted subroutine to display the name of the file or grep for a search string. It is easy to use Find.pl to develop a slightly more sophisticated find routine (see Listing 31.8).

Listing 31.8 Tsfind.pl: Using Find.pl to Recursively Search Directories

    #!/usr/local/bin/perl
    # requires find.pl
    require("/public/local/lib/perl5/find.pl");
    $BASEDIR = "/web/home/acn";
    print "Content-type: text/html\n\n";
    print "<HTML><HEAD><TITLE>Test Find Capability Using Find.pl</TITLE></HEAD>\n";
    print "<BODY bgcolor=#FFFFFF>\n";
    &find($BASEDIR);
    # close the page
    print "</BODY></HTML>\n";

    sub wanted {
        # if it's an HTML file (.htm or .html)
        if ($name =~ /\.html?$/i) {
            # print its name; $name already holds the full path
            print "<P>Found a file called $name\n";
        }
    }

Like the previous script, this script merely prints the name of each file it finds, here restricted to HTML files. You can easily insert a call to a grepping routine in place of the code that prints out the name of the file. The grepping routine needs to open the file and read through it to search for instances of the search string. Perl's ordinary pattern-matching operator works nicely for this. This approach is demonstrated in Listing 31.9.
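Before turning to the search routine, one aside on the directory walk: on Perl installations where find.pl is not available, the File::Find module that ships with Perl 5 offers the same kind of recursive traversal. The following is a minimal sketch under stated assumptions, not a drop-in replacement for Listing 31.8: html_files is a hypothetical helper name, and taking the base directory from the command line is an assumption made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;    # core module; exports find()

# Return a sorted list of the .htm/.html files at or below $base_dir.
# html_files() is a hypothetical helper name, not part of any library.
sub html_files {
    my ($base_dir) = @_;
    my @hits;
    find(
        sub {
            # Inside the callback, $_ is the bare file name and
            # $File::Find::name is the full path to the file.
            push @hits, $File::Find::name if -f $_ && /\.html?$/i;
        },
        $base_dir
    );
    my @sorted = sort @hits;
    return @sorted;
}

# Example use: base directory given as the first command-line argument
# (an assumption for illustration; the listings hardcode $BASEDIR).
if (@ARGV) {
    print "Found a file called $_\n" for html_files($ARGV[0]);
}
```

Here $File::Find::name plays the role that $name plays in the Find.pl listing, and the anonymous subroutine takes the place of wanted.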
Listing 31.9 Tsrch.pl: A Basic Search Script

    #!/usr/local/bin/perl
    # define the directory, file name, and search string
    # you could prompt user for these, or find files
    # using find.pl
    $BASEDIR = "/web/home/acn";
    $file = "acn.htm";
    $term = "ACNielsen";
    # print page preamble to STDOUT
    print "Content-type: text/html\n\n";
    print "<HTML><HEAD><TITLE>Test Find Search Engine</TITLE></HEAD>\n";
    print "<BODY bgcolor=#FFFFFF>\n";
    # call subroutine to search the file
    &findstr($BASEDIR);
    print "</BODY></HTML>\n";

    sub findstr {
        # the directory is passed in; $file and $term are set above
        my ($dir) = @_;
        # open the file
        open(FILE, "$dir/$file") || die("Can't open file $dir/$file");
        # read all lines into an array
        @LINES = <FILE>;
        close(FILE);
        # create one huge string to search
        $string = join(" ", @LINES);
        $string =~ s/\n/ /g;
        if (!($string =~ /$term/i)) {
            # don't include this file name
            return;
        }
        # if string is found
        else {
            # include the file name
            print "<P>Found string in $dir/$file\n";
        }
    }

If you now combine these two scripts, as in Listing 31.10, you have a rudimentary search engine that produces output similar to Figure 31.2.

Listing 31.10 Tstfind.pl: A Basic Recursive Search Engine

    #!/usr/local/bin/perl
    # requires find.pl
    require("/public/local/lib/perl5/find.pl");
    $BASEDIR = "/web/home/acn";
    # hardcode the search term - you could prompt user
    $term = "ACNielsen";
    print "Content-type: text/html\n\n";
    print "<HTML><HEAD><TITLE>A Basic Recursive Search Engine</TITLE></HEAD>\n";
    print "<BODY bgcolor=#FFFFFF>\n";
    &find($BASEDIR);
    # close the page
    print "</BODY></HTML>\n";

    sub wanted {
        # if it's an HTML file (.htm or .html)
        if ($name =~ /\.html?$/i) {
            # search for string; $name already holds the full path
            &findstr($name, $term);
        }
    } # sub wanted

    sub findstr {
        # get the name of the file and search term as parameters
        my ($file, $term) = @_;
        # open the file; skip it if it cannot be read
        open(FILE, $file) || return;
        # read all lines into an array
        @LINES = <FILE>;
        close(FILE);
        # create one huge string to search
        $string = join(" ", @LINES);
        $string =~ s/\n//g;
        if (!($string =~ /$term/i)) {
            # don't include this file name
            return;
        }
        # if string is found
        else {
            # include the file name
            print "<P>Found string $term in $file\n";
        }
    } # sub findstr
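The search routine above interpolates $term directly into the pattern and prints it to the page. That is harmless for a hardcoded term, but once the term comes from a user, regular expression metacharacters such as ( or * change what the pattern means, and HTML characters in the term end up in the output. The following sketch shows one way to guard against both. It is an illustration under stated assumptions, not part of the original scripts: file_contains and html_escape are hypothetical helper names, and passing the file and term on the command line is assumed only for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return 1 if $term occurs anywhere in the file at $path, else 0.
# \Q...\E makes the term match literally; /i ignores case.
# file_contains() is a hypothetical helper name.
sub file_contains {
    my ($path, $term) = @_;
    open(my $fh, '<', $path) or return 0;    # unreadable file: no match
    while (my $line = <$fh>) {
        if ($line =~ /\Q$term\E/i) {
            close($fh);
            return 1;
        }
    }
    close($fh);
    return 0;
}

# Escape characters that are special in HTML before printing text.
# html_escape() is a hypothetical helper name.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must run first so later entities survive
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Example use: perl script.pl FILE TERM (assumed calling convention)
if (@ARGV == 2) {
    my ($path, $term) = @ARGV;
    if (file_contains($path, $term)) {
        print "<P>Found string ", html_escape($term),
              " in ", html_escape($path), "\n";
    }
}
```

Unlike Listing 31.10, this sketch reads the file line by line instead of joining it into one string. That keeps memory use low and stops at the first hit, but it would miss a term that is split across a line break.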