The basic interprocess communication facilities available via Perl are built on the old UNIX facilities: signals, named pipes, pipe opens, the Berkeley socket routines, and SysV IPC calls. I have already covered sockets, shared memory, semaphores, and message queues in the previous sections. I cover the use of signals and pipes in this chapter.
A signal is a message sent to a process to indicate that an event has occurred. The event can be something unexpected and, in general, causes the process to terminate. Types of such signal events include division by zero, a bus error, a segmentation fault, or sometimes even imminent power failure. Not all signals are bad news. UNIX kernels also use signals for timing, and users can send signals by pressing keys such as Ctrl+C, Break, or Delete.
The types of signals recognized by the kernel are listed in the
standard UNIX header file /usr/include/signal.h.
The names of these signals are listed in Table 14.1. Not all of
these signals may be implemented in your UNIX system, and even
fewer are usable in Perl. However, I will cover all the signals
that you definitely need to know.
Signal | Description
SIGHUP | On hangup
SIGINT | On interrupt
SIGQUIT | On Quit key
SIGILL | Illegal instruction
SIGTRAP | Trap instruction
SIGABRT | Abort message
SIGIOT | Input/output transfer
SIGBUS | Bus error
SIGFPE | Floating-point error
SIGKILL | Kill signal from system
SIGUSR1 | User defined
SIGSEGV | Segmentation violation
SIGUSR2 | User defined
SIGPIPE | Pipe fault (broken pipe)
SIGALRM | Alarm
SIGTERM | Termination
SIGSTKFLT | Stack fault
SIGCHLD | Signal from child
SIGCONT | Continuing a stopped process
SIGSTOP | Stopping a process
SIGTSTP | Stopping a process from terminal
SIGTTIN | Stopping a process reading from controlling terminal
SIGTTOU | Stopping a process writing to controlling terminal
SIGURG | Urgent condition
SIGXCPU | Excessive CPU limits reached
SIGXFSZ | Excessive file size limits reached
SIGVTALRM | Virtual interval timer expired
SIGPROF | Profiling interval timer expired
SIGWINCH | Window size changed by background process
SIGIO | Asynchronous I/O
SIGPWR | Power failure
SIGUNUSED | Unused
The names of the signals available to you on your system are the ones listed by kill -l on your system. You can also retrieve them from the Config module. The following example shows how to obtain a list of available signals on your system:
$ kill -l
1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL
5) SIGTRAP 6) SIGIOT 7) SIGBUS 8) SIGFPE
9) SIGKILL 10) SIGUSR1 11) SIGSEGV 12) SIGUSR2
13) SIGPIPE 14) SIGALRM 15) SIGTERM 17) SIGCHLD
18) SIGCONT 19) SIGSTOP 20) SIGTSTP 21) SIGTTIN
22) SIGTTOU 23) SIGURG 24) SIGXCPU 25) SIGXFSZ
26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 29) SIGIO
30) SIGPWR
Another way of getting a list of signals on your system is by using the supplied Config module, as shown in Listing 14.1. The output from this simple script is shown following the listing. The list of signal names in the Config module is accessed via the $Config{sig_name} variable. (The signal names are separated by white space.) By the way, the program shown in Listing 14.1 requires Perl 5 and will not work in Perl 4.
Listing 14.1. Using the Config module to get a list of signals.
1 #!/usr/bin/perl
2 #
3 # Use the Config module to get list of signals
4
5 use Config;
6
7 defined $Config{sig_name} || die "No Config Module?";
8
9 foreach $name (split(' ', $Config{sig_name})) {
10 $i++;
11 printf "%3d) %s \t", $i, $name;
12 if (($i % 5) == 0) { print "\n"; }
13 }
14 print "\n";
Line 5 uses the Config module to include its definitions. The existence of the Config signal names array is confirmed in line 7. Lines 9 through 14 print out a columnar output of all the signal names defined in the Config array. Here is the output generated by the code in Listing 14.1.
$ listsigs.pl
1) ZERO 2) HUP 3) INT 4) QUIT 5) ILL
6) TRAP 7) IOT 8) BUS 9) FPE 10) KILL
11) USR1 12) SEGV 13) USR2 14) PIPE 15) ALRM
16) TERM 17) STKFLT 18) CHLD 19) CONT 20) STOP
21) TSTP 22) TTIN 23) TTOU 24) URG 25) XCPU
26) XFSZ 27) VTALRM 28) PROF 29) WINCH 30) LOST
31) PWR 32) UNUSED
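The Config module also carries the signal numbers, in $Config{sig_num}, in the same order as the names. The following is a minimal sketch, not part of the original listing, that pairs the two lists into lookup tables; the names %signo and @signame are hypothetical and chosen only for illustration.

```perl
#!/usr/bin/perl
# Sketch: build name <-> number lookup tables from the Config module.
use strict;
use warnings;
use Config;

die "No signal names in Config\n" unless defined $Config{sig_name};

my @names = split ' ', $Config{sig_name};
my @nums  = split ' ', $Config{sig_num};

my (%signo, @signame);    # hypothetical table names
for my $i (0 .. $#names) {
    $signo{ $names[$i] } = $nums[$i];
    # Some numbers have aliases (for example IOT and ABRT); keep the first name.
    $signame[ $nums[$i] ] = $names[$i] unless defined $signame[ $nums[$i] ];
}

print "INT is signal $signo{INT}\n";
print "Signal 9 is SIG$signame[9]\n";
```

With tables like these, a script can accept either a name or a number from the user and translate between them.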
A process does not know when a signal might occur because signals are asynchronous events. Programmatically, you can either ignore almost all signal messages or handle them yourself with a signal handler. This process of handling the signal is referred to as trapping a signal. A handler is simply a function that is called when the signal arrives.
In Perl, a special hash called %SIG contains the names of, or references to, all the signal handlers. You have to install your own signal handler to trap a signal. If you do not install a handler, Perl uses the default handling mechanism for that signal. Signal handlers that are given by name are considered by the Perl interpreter to be in the main module unless otherwise specified. To specify a different module, you have to specify the package name. Therefore, you would use something like this:
$SIG{'QUIT'} = 'myModule::babysitter';
Signal handlers are basically just Perl functions. However, some restrictions apply while you are in the handler. First of all, signal handlers are called with an argument that is the name of the signal that triggered it. Therefore, your handler can be written to trap more than one signal if you want. Simply use the passed argument to determine what to do in the handler. Next, if your signal handler is being called when something out of the ordinary happens, you should not attempt to do lengthy operations while in the handler. For example, doing lengthy disk writes within a handler is not a good idea. Finally, try to use references instead of names because the code is cleaner and faster.
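As a small illustration of one handler serving several signals, here is a sketch in which the handler only records the name it was passed. The %seen hash and the multiHandler name are hypothetical, and the script sends the signals to itself simply to exercise the handler.

```perl
#!/usr/bin/perl
# Sketch: one handler trapping two signals, told apart by the passed name.
use strict;
use warnings;

my %seen;    # hypothetical counter, keyed by signal name

sub multiHandler {
    my $signame = shift;    # Perl passes the signal name, e.g. "USR1"
    $seen{$signame}++;      # keep the handler short: just record the event
}

$SIG{USR1} = \&multiHandler;
$SIG{USR2} = \&multiHandler;

# Send both signals to this very process to exercise the handler.
kill 'USR1', $$;
kill 'USR2', $$;

for my $name (sort keys %seen) {
    print "Caught SIG$name $seen{$name} time(s)\n";
}
```

The handler does no I/O of its own; the reporting happens later in the main flow, which is in keeping with the advice to avoid lengthy work inside a handler.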
Another important point to keep in mind is that there is only one %SIG hash in the Perl script you are running. Setting an entry in %SIG in one subroutine sets it for all objects in your program. There is no "my" %SIG hash with which you can set your own handler functions.
To set a signal handler, you use the following statement:
$SIG{INT} = \&meHandleINT;
In this statement, \&meHandleINT is a reference to a subroutine defined as something similar to this:
sub meHandleINT {
    my $signame = shift; # Grab signal name from passed input
    die "Caught Signal: SIG$signame";
}
You can use the signal-handling function's name instead of the reference; however, this is no longer the preferred, cool way of doing things because references give a faster lookup. Therefore, avoid setting handlers with statements like this one:
$SIG{INT} = 'meHandleINT';
You can use anonymous functions to install signal handlers. However, be warned that some systems will not allow you to handle the signal more than once, especially System V machines where the signal handler must be reinstalled every time it's hit. Use references to named subroutines instead of trying a statement like this one:
$SIG{INT} = sub { die "\nBye\n" }; # Not portable
By changing the values stored in %SIG entries, you can add or remove signal handlers. Listing 14.2 presents an example. While the program is asleep, the signal handler vocalHandler is called if the Ctrl+C key combination is pressed. After the sleep session is over, the default value of $SIG{INT} is restored to allow the script to be killed via the default Ctrl+C action. Notice also how the hangup signal is set to be ignored while the program is running with the $SIG{HUP}='IGNORE' assignment.
Listing 14.2. Signal handler usage.
#!/usr/bin/perl

$| = 1;    # flush the progress messages immediately

sub vocalHandler {
    my ($sig) = @_;
    print "Hey! Stop that! SIG$sig hurts! \n";
    exit(1);
}

$SIG{INT} = \&vocalHandler;

$SIG{HUP} = 'IGNORE';

print "\n I am about to sleep";

sleep (10);

print "\n Restoring default signal handler. ";

$SIG{INT} = 'DEFAULT';
The string "Hey! Stop that!" is printed if you hit the Ctrl+C key combination during the 10-second interval that your Perl script is asleep. Because the handler calls exit(1), the script then terminates.
Two special handler values are reserved: IGNORE and DEFAULT. When a signal handler is set to IGNORE, Perl tries to discard the signal. When a signal handler is set to DEFAULT, Perl lets the system take its default action for that signal. Some signals cannot be trapped or ignored (for example, the KILL and STOP signals).
All modules and inherited objects share the one %SIG hash. However, entries of the %SIG hash can be localized in a function with local to mask their current values temporarily. The original values are saved and, because of the dynamic scoping rules in Perl, are restored when the function returns.
The %SIG hash is not local to each module or function unless you localize it. Localizing is used to disable signals temporarily within certain function calls. Here's a sample fragment of code:
sub wrapper {
    local $SIG{INT} = 'IGNORE'; # localize the INT entry of %SIG.
    localFunction();            # Then call this function.
}
sub localFunction {
    # interrupts are still ignored in this function.
}
Upon entry into the wrapper function, the INT entry of the %SIG hash is localized and the INT interrupt is set to be ignored. The localFunction is then undisturbed by SIGINT. On exiting from the wrapper function, the previous value of $SIG{INT} is restored automatically. Because local is dynamically scoped, the temporary value is seen by every function called from within wrapper.
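The scoping behavior can be observed directly by reading $SIG{INT} before, during, and after a localized assignment. This is a minimal sketch; the function names inner and wrapper and the three variables are chosen only for illustration, and the handler is first set to DEFAULT explicitly so that the starting value is known.

```perl
#!/usr/bin/perl
# Sketch: a localized %SIG entry is visible in called functions
# and is restored when the localizing function returns.
use strict;
use warnings;

$SIG{INT} = 'DEFAULT';    # start from a known value

sub inner {
    return $SIG{INT};     # sees whatever the caller localized
}

sub wrapper {
    local $SIG{INT} = 'IGNORE';    # temporary; undone when wrapper returns
    return inner();
}

my $before = $SIG{INT};
my $during = wrapper();
my $after  = $SIG{INT};

print "before=$before during=$during after=$after\n";
```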
The kill() function is used to send a signal to a process or a set of processes; despite its name, the signal does not have to be KILL. The first parameter of the kill() function is the signal number or name, followed by a list of process IDs. The usual UNIX permissions apply. Your script must own the process it's about to blow out of the water. That is, unless the sender is the superuser, the real or effective UID (user ID) of the process sending the signal must match the real or saved UID of the process receiving the signal. Here's the syntax for the kill() function:
kill signalNumber,processID
You can specify more than one process on the line by the command:
kill signalNumber,processID, processID, ....
The kill function can be used to terminate a lot of processes at once. A process can signal every process in a process group by passing the negative of the process group ID in place of a process ID. Such measures are taken by daemons that terminate all child processes. The use of the blanket kill keeps you from keeping a table of process IDs of all the child processes for an application program. Be aware that the process making the call is a member of its own process group and so receives the signal as well; to survive, it must ignore or handle the signal beforehand, which is not possible for KILL. The following call signals the whole group led by the current process, which assumes that $$ is also the process group ID:
kill(9, -$$);
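To see whether a particular task is alive, you can use signal 0 with the kill function. No signal is actually sent. A non-zero return value tells you that the task is alive and that you are allowed to signal it; a zero value tells you that the task does not exist or that you lack permission to signal it.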
if (kill(0,$id)) {
print "Task $id is alive";
} else {
print "Task $id is dead";
}
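To make the liveness test concrete, the following sketch forks a child, probes it with signal 0, terminates it, and probes again. The variable names are illustrative only, and the sketch assumes the process ID is not reused by the system in the instant between the waitpid and the second probe.

```perl
#!/usr/bin/perl
# Sketch: probing a child with signal 0 before and after it is terminated.
use strict;
use warnings;

my $pid = fork();
die "Cannot fork: $!" unless defined $pid;

if ($pid == 0) {    # child: idle until signalled
    sleep 30;
    exit 0;
}

my $alive_before = kill(0, $pid);    # true while the child exists
kill 'TERM', $pid;                   # ask the child to stop
waitpid($pid, 0);                    # reap it so the process is really gone
my $alive_after = kill(0, $pid);     # false once the child has been reaped

print "before: $alive_before, after: $alive_after\n";
```

The waitpid call matters here: a child that has died but has not been reaped still exists as a zombie and still answers the signal 0 probe.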
Sometimes in Perl scripts you have to warn the user about certain conditions in the system by writing to the console. Normally, you use the die() function to print the cause of death of a Perl script just before bailing out. If you do not want to stop right at the cause of the error but rather limp along with an error message only, you use the warn function instead. Like die, warn writes its message to STDERR, which may be redirected to some file. The warn function in Perl is the same as the die function, except that the program keeps going with the warn function instead of stopping as it would with die.
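The difference between the two functions shows up in a few lines. In this sketch the path /no/such/file is a hypothetical name that is assumed not to exist, and the die call is wrapped in eval only so that the example can report what happened instead of terminating.

```perl
#!/usr/bin/perl
# Sketch: warn reports and continues, die reports and stops.
use strict;
use warnings;

my $file = "/no/such/file";    # hypothetical path, assumed not to exist

my $opened = open(my $fh, '<', $file);
warn "Cannot open $file: $!\n" unless $opened;    # message goes to STDERR

my $survived = 1;    # reached only because warn did not stop the script
print "Still running after warn\n";

# die raises an exception; eval traps it so this sketch can carry on.
my $ok = eval { die "Fatal problem\n"; 1 };
my $die_msg = $ok ? '' : $@;
print "die stopped the block with: $die_msg";
```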
You can use signals to handle timeouts in UNIX. The alarm function comes in handy for these types of signals. Here's the syntax for the alarm() function:
alarm numberOfseconds;
For example, the following stub of code sets up SIGALRM for a program:
sub clk {
    my $sig = shift;
    print "something";
}
$SIG{ALRM} = \&clk;
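A common way to turn SIGALRM into a timeout is to have the handler call die inside an eval block. The sketch below follows that idiom under some assumptions: the one-second limit is arbitrary, and a four-argument select stands in for whatever slow operation is being guarded (select is used rather than sleep because sleep and alarm can interfere with each other on some systems).

```perl
#!/usr/bin/perl
# Sketch: abandoning a slow operation after one second using alarm.
use strict;
use warnings;

my $timed_out = 0;

eval {
    local $SIG{ALRM} = sub { die "timeout\n" };    # handler aborts the eval
    alarm 1;                                       # request SIGALRM in 1 second
    select(undef, undef, undef, 10);               # stand-in for a slow operation
    alarm 0;                                       # cancel if we finished in time
};
if ($@) {
    die $@ unless $@ eq "timeout\n";    # re-raise unrelated errors
    $timed_out = 1;
}

print $timed_out ? "Operation timed out\n" : "Operation finished\n";
```

Localizing $SIG{ALRM} inside the eval means that the previous handler, if any, is back in place as soon as the block is left.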
You can wait for a child process, too, with the wait function. The wait function returns after a child process dies and returns the process ID of the dead child process. Calling the wait function may cause you to hang forever if the child runs away and never dies. If there are no children, this function returns -1. To wait explicitly for a particular process, you can specify its process ID in the call to waitpid. The syntax for the call is waitpid($pid, 0). The second argument holds flags; a zero asks for an ordinary blocking wait, and the argument is required.
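Here is a short sketch of both calls. The child exits at once with an arbitrary status of 3, chosen only so that there is something to report; the exit status is recovered from the high byte of the $? variable.

```perl
#!/usr/bin/perl
# Sketch: reaping a specific child with waitpid, then calling wait
# when no children remain.
use strict;
use warnings;

my $pid = fork();
die "Cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    exit 3;    # child: finish at once with an arbitrary status
}

my $reaped = waitpid($pid, 0);    # block until that particular child ends
my $status = $? >> 8;             # the exit status is in the high byte of $?
print "Child $reaped exited with status $status\n";

my $none = wait();                # no children are left at this point
print "wait with no children returned $none\n";
```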
A UNIX system call to pipe() can be used to create a communications channel between two processes. Generally you call this function just before a fork call and then use the pipe to communicate between a parent process and its child process. Contrast this usage of pipes with message queues, which can be used to communicate between unrelated processes. Here's the syntax for this call:
pipe ( READHANDLE, WRITEHANDLE);
Listing 14.3 illustrates a sample run of a pipe communication.
Listing 14.3. A sample run when using pipes.
1 #!/usr/bin/perl
2 $|=1;
3 #
4 # Create a pipe with the system call
5 #
6 $smoke = pipe(PIPE_R, PIPE_W);
7 #
8 # Fork off (almost) immediately.
9 #
10 if ($pid = fork) {
11 printf "\n Parent:";
12 close PIPE_R; # <-- the handle you do not need.
13 #
14 # now write to the child.
15 #
16 select PIPE_W; $| = 1; # make PIPE_W the default handle and unbuffer it
17 for ($i=0;$i<5;$i++) {
18 sleep 1;
19 printf PIPE_W "[Sending: %d]\n", $i;
20 }
21 }
22 else
23 {
24 printf "\n Child:";
25 close PIPE_W; # <- won't be writing to it.
26 #
27 # now read from parent
28 #
29 for ($i=0;$i<5;$i++) {
30 $buf = <PIPE_R>;
31 # For fixed length records you would use:
32 # read PIPE_R,$buf,20;
33 print "Child: Received $buf \n";
34 }
35 }
Line 2 of this listing sets standard output to be flushed as soon as it is written to. Line 6 creates the pipe, making PIPE_R and PIPE_W valid handles. We immediately do a fork in line 10.
After the fork, there are actually four handles: two PIPE_R handles and two PIPE_W handles. Because the parent does not want to read its own echo on PIPE_R, it closes PIPE_R. Similarly, the child has no reason to write to itself, so it closes PIPE_W. Then the parent writes to the PIPE_W handle while the child reads from the PIPE_R handle.
The following lines are the output from this program. Note how the output from the child and parent process is intermixed. This is because both processes are running at the same priority and accessing the output device (stdout) at the same time.
hmm.. okParent:
Child:Child: Received [Sending: 0]
Child: Received [Sending: 1]
Child: Received [Sending: 2]
Child: Received [Sending: 3]
Child: Received [Sending: 4]
Pipes are great when used between related processes. However, if you want to communicate between two unrelated processes, you have to use FIFOs. A FIFO is also referred to as a named pipe. The FIFO appears in the file system like a normal filename.
In Listing 14.4, line 6 names the FIFO pathname as it would appear in the output from an ls command. In line 7, the FIFO is created with a system call to the command mkfifo using the -m option and the pathname from line 6. The unless clause checks to see whether such a FIFO already exists before attempting to recreate it. The code in line 8 actually opens the FIFO after it is created. Then, in lines 9 through 11, you create a string with the current date and send it to the FIFO. In line 12, you close the FIFO. The FIFO is not destroyed when it is closed at line 12. You have to unlink the pathname (that is, remove the FIFO by name) to get rid of it, which is shown in line 14.
You can use an ls command on the FIFO name to see whether it exists. To test whether a filename is a FIFO from within a Perl script, use the -p file test operator on the filename.
Listing 14.4. Using FIFOs.
1 #!/usr/bin/perl
2 #
3 # Create a fifo and then return the date
4 # back to the caller when FIFO is read.
5 #
6 $path = "./ch14_fifo";
7 unless (-p $path) { system("mkfifo -m 0666 $path"); }
8 open(FIFO,"> $path") || die "Cannot open $! \n";
9 $date = `date`;
10 chomp($date);
11 print FIFO "[$date]";
12 close FIFO;
13 # Remove when done.
14 unlink $path;
The FIFO is opened in line 8. Now the program blocks until there's something on the other end trying to read from it. Using a command like cat ch14_fifo triggers line 9. The program then gets the system date and prints it out to the FIFO. Line 14 cleans up after itself.
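Both ends of a FIFO can be seen at work in one script if a forked child plays the writer. This is only a sketch under stated assumptions: the FIFO name is hypothetical and placed in the system's temporary directory, and the FIFO is created with mkfifo from the POSIX module rather than with the mkfifo command that Listing 14.4 uses.

```perl
#!/usr/bin/perl
# Sketch: a writer and a reader meeting at a FIFO.
use strict;
use warnings;
use POSIX qw(mkfifo);
use File::Spec;

# Hypothetical FIFO name in the temporary directory.
my $path = File::Spec->catfile(File::Spec->tmpdir, "sketch_fifo.$$");
mkfifo($path, 0600) or die "mkfifo $path failed: $!";
my $is_fifo = (-p $path) ? 1 : 0;    # the -p test confirms it is a FIFO

my $pid = fork();
die "Cannot fork: $!" unless defined $pid;

if ($pid == 0) {    # child plays the writer
    open(my $out, '>', $path) or die "Cannot open $path: $!";
    print $out "hello through the FIFO\n";
    close $out;
    exit 0;
}

# Parent plays the reader; this open blocks until the writer opens its end.
open(my $in, '<', $path) or die "Cannot open $path: $!";
my $line = <$in>;
close $in;

waitpid($pid, 0);
unlink $path;    # the FIFO persists until it is removed by name
print "Read: $line";
```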
You would probably want to have a signal handler to clean up if a terminating signal arrives before input is read from the FIFO, because the program would otherwise exit before destroying the FIFO. A FIFO left over from such a run is the reason for the unless clause, which checks for any existing FIFO before creating a new one. The signal to catch is SIGPIPE for broken pipes. To trap SIGPIPE, you have to add this segment to your code:
sub pipeHandler {
    my $sig = shift @_;
    print " Caught SIGPIPE: $sig $! \n";
    exit(1);
}
$SIG{PIPE} = \&pipeHandler;
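SIGPIPE is raised when a process writes to a pipe that no longer has a reader. The following sketch provokes that condition deliberately with an ordinary pipe whose read end is closed; the variable names are illustrative, and the handler only records the signal name instead of exiting so that the result can be inspected.

```perl
#!/usr/bin/perl
# Sketch: provoking and trapping SIGPIPE by writing to a pipe with no reader.
use strict;
use warnings;

my $caught = '';
$SIG{PIPE} = sub { $caught = shift };    # record the signal name only

pipe(my $reader, my $writer) or die "pipe failed: $!";
close $reader;                           # no reader remains

my $ok = syswrite($writer, "data\n");    # the write fails and raises SIGPIPE
my $err = $ok ? '' : "$!";

print "Caught SIG$caught; write failed with: $err\n" if $caught;
```

Without a handler, or with the default action in place, the same write would terminate the script silently.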
The open() statement can be used to start communication pipes. The catch is that these pipes are unidirectional. To read the results from a process, put | at the end of the command being executed. To write your results to the standard input of a process, put | at the start of the command.
For example, here's how to write your results to the sendmail mailing program:
open(LOG, "| /usr/apps/formatData | /usr/bin/sendmail ");
To read the results back from a process, use a line like this:
open(READINGS, "/usr/apps/commProgram |");
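A reading pipe is processed like any other file handle. The sketch below uses echo as a stand-in for the command whose output is wanted, and uses the list form of open with the -| mode, which passes the arguments to the command without involving the shell; that form is an alternative to the trailing | shown above, not a replacement the text prescribes.

```perl
#!/usr/bin/perl
# Sketch: reading a command's output one line at a time through a pipe.
use strict;
use warnings;

# echo stands in for a real data-producing command.
open(my $readings, '-|', 'echo', 'one two three')
    or die "Cannot start echo: $!";

my @words;
while (my $line = <$readings>) {
    chomp $line;
    push @words, split ' ', $line;
}

# close waits for the command and leaves its exit status in $?.
close($readings) or warn "command exited with status ", $? >> 8, "\n";

print scalar(@words), " words read\n";
```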
As an example, consider the program in Listing 14.5, which prints the names of the .pl files that contain the string passed as the command-line argument. It collects the names of the matching files in the associative array %fname and then lists them.
Listing 14.5. Storing results of a command in an array.
#!/usr/bin/perl
# Storing results from a program in an array.
my %fname = ();
# Search all files with names ending in .pl for the first argument
open(INCOMING, "grep '$ARGV[0]' *.pl |") || die "Cannot run grep: $! \n";
while (<INCOMING>) {
    ($name, @line) = split(':', $_);
    if (!$fname{$name}) {
        $fname{$name} = $name;
    }
}
close INCOMING;

while (($key, $value) = each(%fname)) {
    print "File: " . $key . "\n";
}
It's probably tempting to use the back ticks to run a program for you and collect the results. For example, $a = `who` returns the entire results of the who command in variable $a. It's impractical to use this method to collect results from a verbose command. Using a command like $a=`ls -lr` is a lot slower than actually opening a file handle to the output of this command and then processing it one line at a time. The problem is that the entire result of the command is stored in variable $a, chewing up memory and time while $a is appended to. It's easier and far more efficient to simply read from the output in manageable chunks.
Use two pipes to do bidirectional communications. Do not use a statement with a | at both the start and the end of the command. Perl does not allow commands using this syntax:
open(HANDLE, "| sort |");
Perl gives an error message about not being able to perform bidirectional pipes.
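Two pipes and a fork give the effect of a bidirectional channel. In this minimal sketch the child simply returns each line in upper case; the handle names are hypothetical, and autoflush is turned on for both writing ends so that neither side waits on a buffered line.

```perl
#!/usr/bin/perl
# Sketch: bidirectional communication with a child over two pipes.
use strict;
use warnings;
use IO::Handle;    # provides the autoflush method

pipe(my $to_child_r,  my $to_child_w)  or die "pipe failed: $!";
pipe(my $to_parent_r, my $to_parent_w) or die "pipe failed: $!";

my $pid = fork();
die "Cannot fork: $!" unless defined $pid;

if ($pid == 0) {    # child: answer each line in upper case
    close $to_child_w;
    close $to_parent_r;
    $to_parent_w->autoflush(1);
    while (my $line = <$to_child_r>) {
        print {$to_parent_w} uc($line);
    }
    exit 0;
}

# Parent: close the ends it does not use, then talk to the child.
close $to_child_r;
close $to_parent_w;
$to_child_w->autoflush(1);

print {$to_child_w} "hello\n";
my $reply = <$to_parent_r>;

close $to_child_w;    # end of input tells the child to finish
waitpid($pid, 0);
print "Parent got: $reply";
```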
The open() function can accept a file argument of either -| or |-. Either of these parameters forks a child connected to the file handle you've just opened. The child is then running the same program as the parent. The return value from the call to open() is the process ID of the child in the parent, and zero in the child. The function returns undef if it cannot fork. With |-, whatever the parent prints to the handle arrives on the child's STDIN; with -|, whatever the child prints to STDOUT can be read from the handle by the parent. Here's a sample usage of |- to create a receiving child:
$| = 1; # Always do unbuffered IO here.
$pid = open(MYHANDLE, "|-");
die "Cannot fork: $! \n" unless defined $pid;
if ($pid) {
    # parent acts as server
    print MYHANDLE " something\n";
    close MYHANDLE;
} else {
    # child acts as receiver.
    $getSomething = <STDIN>;
    exit(0);
}
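The opposite direction uses -|, where the child's standard output feeds the parent's handle. The following sketch mirrors the fragment above with the roles reversed; the handle name and the message text are illustrative only.

```perl
#!/usr/bin/perl
# Sketch: a forking open with -|, where the parent reads what the child prints.
use strict;
use warnings;

my $pid = open(my $from_child, '-|');
die "Cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    # child: STDOUT is connected to the parent's handle
    print "message from child $$\n";
    exit 0;
}

# parent: read the child's output, then close (which also waits for the child)
my $line = <$from_child>;
close $from_child;
print "Parent read: $line";
```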
A forked child with pipes in both directions is possible with the open2() function. With this call you start a command as a child process, with one handle to read its output and another to write to its input. The child process ID is returned to the caller. Errors cause the function to bail out. The IPC::Open2 module is required for this to work. Also, you have to make sure that you read and write in step with the child, because either side can block when a pipe buffer fills. Be careful to run only programs for which you know the input and output sequences.
use IPC::Open2;
$| = 1;
$pid = open2(\*BI_R, \*BI_W, "myPerlScript");
# parent acts as server: write to the command's standard input
print BI_W " something\n";
close BI_W;
# then receive what the command wrote to its standard output
$getSomething = <BI_R>;
waitpid($pid, 0);
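For a version that can be run as is, the sketch below uses the standard sort utility as the child command, which is an assumption made only for illustration. Because sort produces nothing until it has seen the end of its input, the writing handle is closed before any reading is attempted, which sidesteps the blocking problem just described.

```perl
#!/usr/bin/perl
# Sketch: a two-way conversation with the sort command using open2.
use strict;
use warnings;
use IPC::Open2;

# First handle reads the command's output, second writes to its input.
my $pid = open2(my $from_sort, my $to_sort, 'sort');

print {$to_sort} "pear\napple\nfig\n";
close $to_sort;    # sort emits nothing until it sees end of input

my @sorted = <$from_sort>;
close $from_sort;
waitpid($pid, 0);    # open2 does not reap the child for you

chomp @sorted;
print join(", ", @sorted), "\n";
```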
There is an open3() function call, too, in the IPC::Open3 module, that adds STDERR to the list of opened files with which you may work.
This is a very quick introduction to using UNIX pipes and signals with Perl. Signals are a way of asynchronously telling a process about an event. Generally, a signal is caught by a subroutine called a handler, which typically displays an error message about the type of signal. The types of signals vary depending on the type of UNIX system you use.
Using pipes is a method of communication between two processes that is old, yet still heavily used in UNIX. If a pipe (|) is placed at the beginning of a filename in an open() call, you'll be writing to a pipe with a UNIX command, file, or device at the other end. If command | is placed at the end of a filename to the open() call, you'll be reading the output of the command. Bi-directional and even tridirectional pipes are possible using the open2() and open3() calls.