-->
break causes the current (innermost) loop to be exited. It behaves as if the conditional test was performed immediately with a false result. None of the remaining code in the loop (after the break statement) executes, and the loop ends. This is useful when you need to handle some error or early end condition.
continue causes the current loop to return to the conditional test. None of the remaining code in the loop (after the continue statement) is executed, and the test is evaluated immediately (in a for loop, the modification step runs first). This is most useful when there is code within the loop that you want to skip for the current iteration. continue differs from break in that the loop is not forced to end.
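As a minimal sketch of the difference, the following loops over the fields of one record, skipping zero fields with continue and stopping at the first negative field with break. The sample input and the use of zero and negative values as markers are assumptions made for illustration only.

```shell
# Sample record (made up for this sketch): 3 0 5 -1 7
out=$(printf '3 0 5 -1 7\n' | awk '{
    for (i = 1; i <= NF; i++) {
        if ($i < 0)  break      # early end condition: leave the loop entirely
        if ($i == 0) continue   # skip this field and move on to the next one
        print $i
    }
}')
echo "$out"
```

The field 7 is never reached because break ends the loop at -1, while the 0 is merely skipped.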
The for statement provides a looping construct that modifies values within the loop. It is good for counting through a specific number of items.
The for statement has two general forms:
for (loop = 0; loop < 10; loop++) statement
and
for (subscript in array) statement
The first form initializes the variable (loop = 0), performs the test (loop < 10), and then performs the loop contents (statement). Then it modifies the variable (loop++) and tests again. As long as the test is true, statement will execute.
In the second form, statement is executed with subscript being set to each of the subscripts in array. This enables you to loop through an array even if you don't know the values of the subscripts. The order in which the subscripts are visited is not specified. This form works well for multidimensional arrays.
statement can be one statement or multiple statements enclosed in curly braces. The condition (loop < 10) is any valid test like those used with the if statement or the pattern used to trigger actions.
In general, you don't want to change the loop control variable (loop or subscript) within the loop body. Let the for statement do that for you, or you might get behavior that is difficult to debug.
For the first form, the modification of the variable can be any valid operation (including calls to functions). In most cases, it is an increment or decrement.
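Both forms can be sketched in a single BEGIN action. The array name count and its contents are hypothetical. Because the order of subscripts in the second form is not specified, the sketch totals the values rather than printing them in sequence.

```shell
out=$(awk 'BEGIN {
    # First form: initialize, test, run the body, modify, test again.
    for (loop = 0; loop < 3; loop++)
        print "loop", loop

    # Second form: visit every subscript of the array, whatever it is.
    count["apple"] = 2
    count["pear"]  = 5
    for (subscript in count)
        total += count[subscript]
    print "total", total
}')
echo "$out"
```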
TIP |
This example showed the postfix increment. It doesn't matter whether you use the postfix (loop++) or prefix (++loop) increment; the results will be the same. Just be consistent. |
The for loop is a good method of looping through data of an unknown size:
for (i=1; i<=NF; i++) print $i
Each field on the current record will be printed on its own line. As a programmer, I don't know how many fields are on a particular record when I write the code. The variable NF lets me know as the program runs.
The final loop structure is the while loop. It is the most general because it executes while the condition is true. The general form is as follows:
while(condition) statement
statement can be one statement or multiple statements enclosed in curly braces. condition is any valid test like those used with the if statement or the pattern used to trigger actions.
If the condition is false before the while is encountered, the contents of the loop will not be executed. This is different from do, which always executes the loop contents at least once.
In general, you must change the value of the variable in the condition within the loop. If you don't, you will have an infinite loop because the test result (condition) never changes and so never becomes false.
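A small sketch of a while loop, using made-up starting values: the first loop changes n in its body so the condition eventually fails, and the second loop never runs because its condition is false on entry.

```shell
out=$(awk 'BEGIN {
    n = 100
    steps = 0
    # n changes inside the loop, so the condition eventually becomes false.
    while (n >= 10) {
        n = int(n / 2)
        steps++
    }
    print n, steps

    # The condition is already false here, so the body is skipped.
    while (n > 1000)
        print "never printed"
}')
echo "$out"
```

Starting from 100, the value passes through 50, 25, 12, and 6, so the loop body runs four times.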
In addition to the simple input and output facilities provided by awk, there are a number of advanced features you can take advantage of for more complicated processing.
By default, awk automatically reads each input record and runs through your program for it; you can alter this behavior. You can force input to come from a different file, cause the loop to recycle early (read the next record without performing any more actions), or read the next record explicitly without restarting the program. You can even get data from the output of other commands.
On the output side, you can format the output and send it to a file (other than the standard output device) or as input to another command.
You don't have to program the normal input loop process in awk. It reads a record and then searches for pattern matches and the corresponding actions to execute. If there are multiple files specified on the command line, they are processed in order. It is only if you want to change this behavior that you have to do any special programming.
The next command causes awk to read the next record immediately and start pattern matching again from the first pattern in the program. Normally, awk executes the code in every action whose pattern matches the record. next causes any additional matching patterns to be ignored for the current record.
The exit command in any action except for END behaves as if the end of file was reached. Code execution in all pattern/action pairs stops, and the actions within the END pattern are executed. exit appearing in the END pattern is a special case: it causes the program to end.
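The following sketch shows next and exit together. The input lines, the use of # for comment lines, and the STOP marker are all assumptions made for illustration.

```shell
out=$(printf '# comment\nalpha\nSTOP\nbeta\n' | awk '
/^#/     { next }                   # skip comment lines; later rules are not tried
/^STOP$/ { exit }                   # behave as if end of file was reached
         { print "data:", $0; n++ }
END      { print "records:", n }    # still runs after exit
')
echo "$out"
```

The record beta is never read because exit stops the input loop at STOP, but the END action still reports the count.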
The getline statement is used to explicitly read a record. This is especially useful if you have a data record that looks like two physical records. It performs the normal field splitting (setting $0, the field variables, FNR, NF, and NR). It returns the value 1 if the read was successful and zero if it failed (end of file was reached). If you want to explicitly read through a file, you can code something like the following:
{
    while (getline == 1) {
        # process the inputted fields
    }
}
You can also have getline store the input data in a variable instead of taking advantage of the normal field processing by using the form getline variable. When used this way, $0, the field variables, and NF are left unchanged, and FNR and NR are incremented.
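As a sketch of handling one logical record spread over two physical lines, the action below takes a name from the current record and reads the following line into a variable with getline. The sample data and the variable names name and phone are hypothetical.

```shell
out=$(printf 'Smith\n555-1234\nJones\n555-9876\n' | awk '{
    name = $0                        # first physical line: the name
    if ((getline phone) == 1)        # second physical line goes into a variable
        print name ": " phone " (NR=" NR ")"
}')
echo "$out"
```

NR counts both physical lines, so it reads 2 and 4 when the two lines are printed.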
Input from a File
You can use getline to input data from a specific file instead of the ones listed on the command line. The general form is getline < "filename". When coded this way, getline performs the normal field splitting (setting $0, the field variables, and NF). It returns 1 if a record was read, 0 at end of file, and -1 if the file doesn't exist or cannot be read.
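You can also read the data from the specified file into a variable by using the form getline variable < "filename". In place of a quoted filename, you can use "-" (which most awk implementations treat as standard input) or a variable that contains the filename.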
NOTE |
If you use getline < "filename" to read data into your program, neither FNR nor NR is changed. |
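A sketch of reading a side file while processing normal input follows. The temporary file created with mktemp, its contents, and the path /no/such/file (assumed not to exist) are all hypothetical.

```shell
lookup=$(mktemp)                        # hypothetical side file for this sketch
printf 'red\ngreen\n' > "$lookup"

out=$(printf 'one\n' | awk -v file="$lookup" '{
    # Returns 1 for each record read, 0 at end of file, -1 on error.
    while ((getline line < file) > 0)
        n++
    close(file)

    status = (getline line < "/no/such/file")   # path assumed to be absent

    print $0, "lines=" n, "NR=" NR, "status=" status
}')
rm -f "$lookup"
echo "$out"
```

Reading the side file leaves NR at 1, and because the variable form is used, $0 still holds the record from the main input.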
Input from a Command
Another way of using the getline statement is to accept input from a UNIX command. If you want to perform some processing for each person signed on to the system (send him or her a message, for instance), you can code something like the following: