-->
exp(expr) | The exponential function. |
int(expr) | Truncates to integer. |
log(expr) | The natural logarithm function. |
rand() | Returns a random number between 0 and 1. |
sin(expr) | Returns the sine of expr, which is in radians. |
sqrt(expr) | The square root function. |
srand(expr) | Use expr as a new seed for the random number generator. If no expr is provided, the time of day will be used. The return value is the previous seed for the random number generator. |
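A short sketch of a few of the numeric functions follows. The input values and the seed 42 are arbitrary choices made for illustration. It shows that int() truncates toward zero and that reseeding with the same value makes rand() repeat its sequence.

```shell
# int() truncates toward zero; sqrt() and exp() behave as in C.
nums=$(awk 'BEGIN { print int(3.9), int(-3.9), sqrt(16), exp(0) }')

# Seeding twice with the same value should give the same first random number,
# and that number should lie in the range 0 <= x < 1.
same=$(awk 'BEGIN {
    srand(42); a = rand()
    srand(42); b = rand()
    equal   = (a == b)
    inrange = (a >= 0 && a < 1)
    print equal, inrange
}')

echo "$nums"
echo "$same"
```

Calling srand() with no argument instead seeds from the time of day, so successive runs will normally differ.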
STRING FUNCTIONS
awk has the following predefined string functions:
gsub(r, s, t) | For each substring matching the regular expression r in the string t, substitute the string s, and return the number of substitutions. If t is not supplied, use $0. |
index(s, t) | Returns the index of the string t in the string s, or 0 if t is not present. |
length(s) | Returns the length of the string s, or the length of $0 if s is not supplied. |
match(s, r) | Returns the position in s where the regular expression r occurs, or 0 if r is not present, and sets the values of RSTART and RLENGTH. |
split(s, a, r) | Splits the string s into the array a on the regular expression r, and returns the number of fields. If r is omitted, FS is used instead. The array a is cleared first. |
sprintf(fmt, expr-list) | Formats expr-list according to fmt, and returns the resulting string. Nothing is printed. |
sub(r, s, t) | Just like gsub(), but only the first matching substring is replaced. |
substr(s, i, n) | Returns the n-character substring of s starting at i. If n is omitted, the rest of s is used. |
tolower(str) | Returns a copy of the string str, with all the uppercase characters in str translated to their corresponding lowercase counterparts. Nonalphabetic characters are left unchanged. |
toupper(str) | Returns a copy of the string str, with all the lowercase characters in str translated to their corresponding uppercase counterparts. Nonalphabetic characters are left unchanged. |
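The string functions are easiest to follow with concrete values. The sketch below uses made-up sample strings; the comments note what each call is expected to return under the descriptions above.

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    n = gsub(/o/, "0", s)                 # replaces every "o" in s, returns the count
    print n, s

    print index("banana", "nan")          # position of the first "nan"

    pos = match("foo123bar", /[0-9]+/)    # also sets RSTART and RLENGTH
    print pos, RSTART, RLENGTH

    k = split("a:b:c", parts, ":")        # parts[1] .. parts[k]
    print k, parts[1], parts[3]

    print substr("abcdef", 3, 2)          # two characters starting at position 3
    print toupper("MiXed"), tolower("MiXed")
    print sprintf("%05.1f", 3.14159)      # formatted string, zero padded to width 5
}')
echo "$out"
```

Note that gsub() and sub() modify their third argument in place, so it must be a variable or field, not a string constant.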
TIME FUNCTIONS
Since one of the primary uses of awk programs is processing log files that contain time stamp information, gawk provides the following two functions for obtaining time stamps and formatting them.
systime() | Returns the current time of day as the number of seconds since the Epoch (Midnight UTC, January 1, 1970 on POSIX systems). |
strftime(format, timestamp) | Formats timestamp according to the specification in format. The timestamp should be of the same form as returned by systime(). If timestamp is missing, the current time of day is used. See the specification for the strftime() function in C for the format conversions that are guaranteed to be available. A public-domain version of strftime(3) and a man page for it are shipped with gawk; if that version was used to build gawk, then all of the conversions described in that man page are available to gawk. |
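As an illustration of the two time functions, the sketch below formats timestamp 0 (the Epoch) and checks that systime() returns a positive number. It assumes the awk on your system is gawk, or another awk that provides these extensions. TZ is set to UTC so that the result does not depend on the local time zone, and the format string is only an example.

```shell
# Format the Epoch itself; with TZ=UTC this is midnight, January 1, 1970.
stamp=$(TZ=UTC awk 'BEGIN { print strftime("%Y-%m-%d %H:%M:%S", 0) }')

# systime() returns seconds since the Epoch, so it should be positive.
recent=$(awk 'BEGIN { now = systime(); ok = (now > 0); print ok }')

echo "$stamp"
echo "$recent"
```

To stamp log output with the current time, pass systime() as the second argument, or omit the second argument entirely.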
STRING CONSTANTS
String constants in awk are sequences of characters enclosed between double quotes ("). Within strings, certain escape sequences are recognized, as in C. These are
\\ | A literal backslash. |
\a | The "alert" character; usually the ASCII BEL character. |
\b | Backspace. |
\f | Formfeed. |
\n | Newline. |
\r | Carriage return. |
\t | Horizontal tab. |
\v | Vertical tab. |
\xhex digits | The character represented by the string of hexadecimal digits following the \x. As in C, all following hexadecimal digits are considered part of the escape sequence. (This feature should tell us something about language design by committee.) For example, "\x1B" is the ASCII ESC (escape) character. |
\ddd | The character represented by the 1-, 2-, or 3-digit sequence of octal digits. For example, "\033" is the ASCII ESC (escape) character. |
\c | The literal character c. |
The escape sequences may also be used inside constant regular expressions (for example, /[ \t\f\n\r\v]/ matches whitespace characters).
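A minimal sketch of escape sequences in string constants and in a constant regular expression follows. The sample strings are invented for illustration. The awk program is wrapped in single quotes so that the shell passes the backslashes through unchanged.

```shell
out=$(awk 'BEGIN {
    esc = "\033"                       # octal escape for ASCII ESC
    print length(esc)                  # a single character

    ok = ("\101" == "A")               # octal 101 is the letter A
    print ok

    s = "a \t b"
    gsub(/[ \t\f\n\r\v]+/, "_", s)     # collapse each run of whitespace
    print s
}')
echo "$out"
```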
FUNCTIONS
Functions in awk are defined as follows:
function name(parameter list) { statements }
Functions are executed when called from within the action parts of regular pattern-action statements. Actual parameters supplied in the function call are used to instantiate the formal parameters declared in the function. Arrays are passed by reference, other variables are passed by value.
Functions were not originally part of the awk language, so the provision for local variables is rather clumsy: they are declared as extra parameters in the parameter list. The convention is to separate local variables from real parameters by extra spaces in the parameter list. For example:
function f(p, q,     a, b) {    # a & b are local
    .....
}

/abc/    { ... ; f(1, 2) ; ... }
The left parenthesis in a function call is required to immediately follow the function name, without any intervening whitespace. This is to avoid a syntactic ambiguity with the concatenation operator. This restriction does not apply to the built-in functions listed earlier.
Functions may call each other and may be recursive. Function parameters used as local variables are initialized to the null string and the number zero upon function invocation.
The word func may be used in place of function.
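The rules above can be sketched in one small program. The function names fact, sum and scale, and the sample data, are hypothetical and chosen only for illustration. The sketch shows recursion, local variables declared as extra parameters, and an array being modified through a by-reference parameter.

```shell
out=$(awk '
# fact(n): recursive factorial.
function fact(n) {
    return (n <= 1) ? 1 : n * fact(n - 1)
}

# sum(arr, n): i and total are locals, declared as extra parameters.
function sum(arr, n,    i, total) {
    for (i = 1; i <= n; i++)
        total += arr[i]
    return total
}

# scale(arr, n, factor): arrays are passed by reference,
# so the caller sees the modified elements.
function scale(arr, n, factor,    i) {
    for (i = 1; i <= n; i++)
        arr[i] *= factor
}

BEGIN {
    i = "untouched"                 # global i; the local i in sum() must not clobber it
    n = split("10 20 30", a, " ")
    scale(a, n, 3)
    print fact(5), sum(a, n), i
}')
echo "$out"
```

Without the extra i parameter, the loops in sum() and scale() would overwrite the global i.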
EXAMPLES
Print and sort the login names of all users:
BEGIN { FS = ":" } { print $1 | "sort" }
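To try this without reading the real password file, the same program can be fed a few made-up, colon-separated records. The names root, daemon and alice are illustrative only.

```shell
# Three fake passwd-style records; only the first field (the login name) is used.
out=$(printf 'root:x:0:0\ndaemon:x:1:1\nalice:x:1000:1000\n' |
      awk 'BEGIN { FS = ":" } { print $1 | "sort" }')
echo "$out"
```

The output of sort appears only once awk closes the pipe, which happens when the program exits.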
Count lines in a file:
{ nlines++ } END { print nlines }
Precede each line by its number in the file:
{ print FNR, $0 }
Concatenate and line number (a variation on a theme):
{ print NR, $0 }
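The difference between the last two examples shows up only when more than one input file is given: FNR restarts at 1 for each file, while NR keeps counting across files. A small sketch with two temporary files follows; the file names and their contents are invented for illustration.

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one"
printf 'c\n'    > "$dir/two"

# FNR: the line number within the current file.
fnr_out=$(awk '{ print FNR, $0 }' "$dir/one" "$dir/two")

# NR: the line number across all input read so far.
nr_out=$(awk '{ print NR, $0 }' "$dir/one" "$dir/two")

printf '%s\n' "$fnr_out" "---" "$nr_out"
rm -rf "$dir"
```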