This chapter includes tables for two important areas of Perl programming. First, although regular expressions are explained in the "Perl Overview" chapter, it is useful to have a quick reference table for the various symbols and their meanings in regular expressions. Second, a list of the Perl 5 standard modules is included.
A regular expression is a way of specifying a pattern so that some strings match the pattern and some strings do not. Parts of the matching pattern can be marked for use in operations such as substitution. This is a powerful tool for processing text, especially when producing text-based reports. Many UNIX utilities, such as egrep, use a form of regular expressions as a pattern-matching mechanism, and Perl has adopted this concept, almost as its own.
Like arithmetic expressions, regular expressions are made up of a sequence of legal symbols linked with legal operators. Table 12 lists all of these operators and symbols in one place for easy reference. If you are new to regular expressions, you may find the description in the "Perl Overview" chapter informative.
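As a brief illustration of matching, pattern memory, and substitution, consider the following minimal sketch. The report line and the variable names are invented for this example and are not taken from the "Perl Overview" chapter.

```perl
use strict;
use warnings;

# Hypothetical report line; the format is invented for this sketch.
my $line = "Total: 42 items";

# A simple match test: does the line contain one or more digits?
my $has_number = ($line =~ m/\d+/) ? 1 : 0;

# Parentheses mark the part of the match to remember. In list context
# the match operator returns the remembered text.
my ($count) = $line =~ m/(\d+)/;

# A substitution can reuse the remembered text through $1.
# Copying into $report first leaves $line unchanged.
(my $report = $line) =~ s/(\d+) items/$1 widgets/;

print "$has_number\n";   # prints 1
print "$count\n";        # prints 42
print "$report\n";       # prints Total: 42 widgets
```

The same three steps (test, extract, substitute) cover most everyday uses of the symbols listed in the tables that follow.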
Symbol | Description |
^ | This meta-character, the caret, matches the beginning of a string or, if the /m option is used, the beginning of a line. It is one of two pattern anchors; the other anchor is the $. |
. | This meta-character matches any single character except the newline character, unless the /s option is specified. If the /s option is specified, the newline is also matched. |
$ | This meta-character matches the end of a string or, if the /m option is used, the end of a line. It is one of two pattern anchors; the other anchor is the ^. |
| | This meta-character, the vertical bar, is called alternation. It lets you specify two values, either of which can cause the match to succeed. For instance, m/a|b/ means that the $_ variable must contain the "a" or "b" character for the match to succeed. |
* | This meta-character indicates that the "thing" immediately to the left should be matched zero or more times in order to be evaluated as true (thus .* matches any number of characters). |
+ | This meta-character indicates that the "thing" immediately to the left should be matched one or more times in order to be evaluated as true. |
? | This meta-character indicates that the "thing" immediately to the left should be matched zero or one times to be evaluated as true. When it immediately follows the *, +, ?, or {n,m} quantifiers, it means that the regular expression should be non-greedy and match the smallest possible string. |
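The effect of the /m and /s options and of the non-greedy ? is easiest to see side by side. The following is a minimal sketch; the sample strings and variable names are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Invented two-line sample string.
my $text = "first line\nsecond line";

# Without /m, ^ anchors only at the start of the whole string.
my $plain = ($text =~ m/^second/)  ? 1 : 0;
# With /m, ^ also matches just after an embedded newline.
my $multi = ($text =~ m/^second/m) ? 1 : 0;

# The dot does not match a newline unless /s is given.
my $dot   = ($text =~ m/first.*second/)  ? 1 : 0;
my $dot_s = ($text =~ m/first.*second/s) ? 1 : 0;

# Alternation: succeed if either word is present.
my $either = ($text =~ m/third|second/) ? 1 : 0;

# Greedy versus non-greedy matching on an invented tagged string.
my $tagged   = "<b>bold</b>";
my ($greedy) = $tagged =~ m/<(.+)>/;    # + takes as much as it can
my ($lazy)   = $tagged =~ m/<(.+?)>/;   # +? takes as little as it can

print "$plain $multi\n";   # prints 0 1
print "$dot $dot_s\n";     # prints 0 1
print "$either\n";         # prints 1
print "$greedy\n";         # prints b>bold</b
print "$lazy\n";           # prints b
```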
Symbol | Description |
() | The parentheses let you affect the order of pattern evaluation and act as a form of pattern memory. See the "Special Variables" chapter for more details. |
(?...) | If a question mark immediately follows the left parenthesis, it indicates that an extended mode component is being specified; this is new to Perl 5. |
(?#comment) | Extension: comment is any text. |
(?:regx) | Extension: regx is any regular expression, but the () are not saved as a backreference. |
(?=regx) | Extension: Allows matching of zero-width positive lookahead characters (that is, the regular expression is matched but not returned as being matched). |
(?!regx) | Extension: Allows matching of zero-width negative lookahead characters (that is, the negated form of (?=regx)). |
(?options) | Extension: Applies the specified options to the pattern, bypassing the need for the option to be specified in the normal way. Valid options are: i (case insensitive), m (treat as multiple lines), s (treat as single line), and x (allow whitespace and comments). |
{} | Braces let you specify how many times the "thing" immediately to the left should be matched. {n} means that it should be matched exactly n times. {n,} means that it must be matched at least n times. {n,m} means that it must be matched at least n times but not more than m times. |
[] | Square brackets let you create a character class. For instance, m/[abc]/ evaluates to True if any of "a", "b", or "c" is contained in $_. The square brackets are a more readable alternative to the alternation meta-character. |
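The bracketing constructs can be combined freely. The sketch below is one possible illustration, using invented sample data (a date, a file name, and a short string); none of the names come from the original text.

```perl
use strict;
use warnings;

# Character classes with {n} quantifiers; each () remembers one field.
my $date = "1997-03-15";
my ($year, $month, $day) =
    $date =~ m/^([0-9]{4})-([0-9]{2})-([0-9]{2})$/;

# (?:...) groups the alternation without saving it, so the only
# remembered text is the extension in the second set of parentheses.
my ($ext) = "report.txt" =~ m/^(?:report|summary)\.([a-z]{1,4})$/;

# Lookahead tests what follows without making it part of the match,
# so only "foo" is replaced.
my $s = "foobar foobaz";
(my $pos_look = $s) =~ s/foo(?=bar)/X/g;   # foo followed by bar
(my $neg_look = $s) =~ s/foo(?!bar)/X/g;   # foo not followed by bar

# (?i) turns on case-insensitive matching from inside the pattern.
my $ci = ("PERL" =~ m/(?i)perl/) ? 1 : 0;

print "$year $month $day\n";   # prints 1997 03 15
print "$ext\n";                # prints txt
print "$pos_look\n";           # prints Xbar foobaz
print "$neg_look\n";           # prints foobar Xbaz
print "$ci\n";                 # prints 1
```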
Symbol | Description |
\ | This meta-character "escapes" the character which follows. This means that any special meaning normally attached to that character is ignored. For instance, if you need to include a dollar sign in a pattern, you must use \$ to avoid Perl's variable interpolation. Use \\ to specify the backslash character in your pattern. |
\nnn | Any octal byte where nnn represents the octal number; this allows any character to be specified by its octal number. |
\a | The alarm character; this is a special character which, when printed, produces a warning bell sound. |
\A | This meta-sequence represents the beginning of the string. Its meaning is not affected by the /m option. |
\b | This meta-sequence represents the backspace character inside a character class; otherwise, it represents a word boundary. A word boundary is the spot between word (\w) and non-word (\W) characters. For this purpose, Perl treats the imaginary characters just before the beginning and just after the end of the string as matching \W. |
\B | Match a non-word boundary. |
\cn | Any control character where n is the character (for example, \cY for Ctrl+Y). |
\d | Match a single digit character. |
\D | Match a single non-digit character. |
\e | The escape character. |
\E | Terminate the \L, \U, or \Q sequence. |
\f | The form feed character. |
\G | Match only where the previous m//g left off. |
\l | Change the next character to lowercase. |
\L | Change the following characters to lowercase until a \E sequence is encountered. |
\n | The newline character. |
\Q | Quote regular expression meta-characters literally until the \E sequence is encountered. |
\r | The carriage return character. |
\s | Match a single whitespace character. |
\S | Match a single non-whitespace character. |
\t | The tab character. |
\u | Change the next character to uppercase. |
\U | Change the following characters to uppercase until a \E sequence is encountered. |
\v | The vertical tab character. |
\w | Match a single word character. Word characters are the alphanumeric and underscore characters. |
\W | Match a single non-word character. |
\xnn | Any hexadecimal byte. |
\Z | This meta-sequence represents the end of the string. Its meaning is not affected by the /m option. |
\$ | The dollar character. |
\& | The ampersand character. |
\% | The percent character. |
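A few of these meta-sequences are shown in action below. This is only a sketch under assumed inputs: the sample sentence and all variable names are invented for the illustration.

```perl
use strict;
use warnings;

# Invented sample sentence.
my $sentence = "The cat scattered 3 catalogs.";

# \b restricts the match to "cat" as a whole word.
my @whole = $sentence =~ m/\bcat\b/g;   # only the standalone word
my @any   = $sentence =~ m/cat/g;       # also inside longer words

# \d matches a digit; \s+ splits on runs of whitespace.
my ($digits) = $sentence =~ m/(\d+)/;
my @words    = split /\s+/, $sentence;

# \U ... \E uppercases the interpolated text in the replacement.
(my $shout = $sentence) =~ s/\b(cat)\b/\U$1\E/;

# \Q ... \E quotes meta-characters, so the dot in $literal matches
# only a real dot instead of any character.
my $literal    = "a.b";
my $quoted_hit = ("axb" =~ m/\Q$literal\E/) ? 1 : 0;
my $raw_hit    = ("axb" =~ m/$literal/)     ? 1 : 0;

# \u uppercases just the next character of a double-quoted string.
my $name = "\uperl";

print scalar(@whole), " ", scalar(@any), "\n";   # prints 1 3
print "$digits ", scalar(@words), "\n";          # prints 3 5
print "$shout\n";      # prints The CAT scattered 3 catalogs.
print "$quoted_hit $raw_hit\n";                  # prints 0 1
print "$name\n";                                 # prints Perl
```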
This is a list of the standard modules that come with Perl 5, along with a brief description of each.
For a list of all current modules, including many extra non-standard modules other than those listed here, see the CPAN archive:
The contents of the Perl Module List at ftp://ftp.funet.fi/pub/languages/perl/CPAN/modules/00modlist.long.html
The modules of the Perl Module List sorted by authors at ftp://ftp.funet.fi/pub/languages/perl/CPAN/modules/by-authors
The modules of the Perl Module List sorted by category at ftp://ftp.funet.fi/pub/languages/perl/CPAN/modules/by-category
The modules of the Perl Module List sorted by module at ftp://ftp.funet.fi/pub/languages/perl/CPAN/modules/by-module
Module Name | Description |
AnyDBM_File | Accesses external databases. |
AutoLoader | Special way of loading subroutines on demand. |
AutoSplit | Special way to set up modules for use of AutoLoader. |
Benchmark | Time code for benchmarking. |
Carp | Reports errors across modules. |
Config | Reports compiler options used when Perl is installed. |
Cwd | Functions to manipulate current directory. |
DB_File | Accesses Berkeley DB files. |
Devel::SelfStubber | Allows correct inheritance of autoloaded methods. |
diagnostics | pragma; enables diagnostic warnings. |
DynaLoader | Used by modules which link to C libraries. |
English | pragma; allows the use of long special variable names. |
Env | Allows access to environment variables. |
Exporter | Standard way for modules to export subroutines. |
ExtUtils::Liblist | Examines C libraries. |
ExtUtils::MakeMaker | Creates Makefiles for extension modules. |
ExtUtils::Manifest | Helps maintain a MANIFEST file. |
ExtUtils::Miniperl | Used by Makefiles generated by ExtUtils::MakeMaker. |
ExtUtils::Mkbootstrap | Used by Makefiles generated by ExtUtils::MakeMaker. |
Fcntl | Accesses C's fcntl.h. |
File::Basename | Parses file names according to various operating system rules. |
File::CheckTree | Multiple file tests. |
File::Find | Finds files according to criteria. |
File::Path | Creates/deletes directories. |
FileHandle | Allows object syntax for file handles. |
Getopt::Long | Uses POSIX-style command-line options. |
Getopt::Std | Uses single-letter command-line options. |
I18N::Collate | Uses POSIX locale rules for sorting 8-bit strings. |
integer | pragma; uses integer arithmetic. |
IPC::Open2 | Inter-Process Communications (process with read/write). |
IPC::Open3 | Inter-Process Communications (process with read/write/error). |
less | pragma; unimplemented. |
Net::Ping | Tests network node. |
overload | Allows overloading of operators (that is, special behavior depending on object type). |
POSIX | Uses POSIX standard identifiers. |
Safe | Can evaluate Perl code in safe memory compartments. |
SelfLoader | Allows specification of code to be autoloaded in module (alternative to the AutoLoader procedure). |
sigtrap | pragma; initializes some signal handlers. |
Socket | Accesses C's socket.h. |
strict | pragma; restricts unsafe constructs. |
subs | pragma; predeclares specified subroutine names. |
Test::Harness | Runs the standard Perl tests. |
Text::Abbrev | Creates an abbreviation table. |