A configuration file is composed of a series of rulesets, which are somewhat like subroutines in a program. Rulesets are used to detect bad addresses, to rewrite addresses into forms that remote mailers can understand, and to route mail to one of sendmail's internal mailers. (See the section "The M Operator: Mailer Definitions" earlier in this chapter.)
sendmail passes addresses to rulesets according to a built-in order. Rulesets also can call other rulesets not in the built-in order. The built-in order varies depending on whether the address being handled is a sender or receiver address, and what mailer has been chosen to deliver the letter.
Rulesets are announced by the S command, which is followed by a number that identifies the ruleset. sendmail collects the subsequent R (rule) lines into that ruleset until it finds another S command or reaches the end of the configuration file. The following example defines ruleset 11:
# Ruleset 11
S11
R$+	$: $>22 $1	call ruleset 22
This ruleset doesn't do much that is useful. The important point to note is that sendmail collects ruleset number 11, which is composed of a single rule.
sendmail uses a three-track approach to processing addresses: one to choose a delivery agent, another to process sender addresses, and one for receiver addresses.
All addresses are first sent through ruleset 3 for preprocessing into a canonical form that makes them easy for other rulesets to handle. Regardless of the complexity of the address, ruleset 3's job is to decide the next host to which a letter should be sent. Ruleset 3 tries to locate that host in the address and mark it within angle brackets. In the simplest case, an address like joe@gonzo.gov becomes joe<@gonzo.gov>.
Ruleset 0 then determines the correct delivery agent (mailer) to use for each recipient. For example, a letter from betty@whizzer.com to joe@gonzo.gov (an Internet site) and pinhead!zippy (an old-style UUCP site) requires two different mailers: an SMTP mailer for gonzo.gov and an old-style UUCP mailer for pinhead. Mailer selection determines later processing of sender and recipient addresses because the rulesets given in the S= and R= mailer flags vary from mailer to mailer.
Addresses sent through ruleset 0 must resolve to a mailer. This means that when an address matches the lhs, the rhs gives a triple of mailer, user, and host. The following line shows the syntax for a rule that resolves to a mailer:
Rlhs $#mailer $@host $:user your comment here...
The mailer is the name of one of the mailers you've defined in an M command (for example, smtp). The host and user are usually positional macros taken from the lhs match. (See "The Right-Hand Side (rhs) of Rules" later in the chapter.)
After sendmail selects a mailer in ruleset 0, it processes sender addresses through ruleset 1 (often empty) and then sends them to the ruleset given in the S= flag for that mailer.
Similarly, it sends recipient addresses through ruleset 2 (also often empty) and then to the ruleset mentioned in the R= mailer flag.
Finally, sendmail post-processes all addresses in ruleset 4, which among other things removes the angle brackets inserted by ruleset 3.
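The built-in order described above can be summarized in a short Python sketch. This is only an illustration of the sequence given in the text, not sendmail's own code; the function name, the dictionary layout, and the S=11 and R=21 values are assumptions chosen for the example.

```python
# Hypothetical mailer definition: only the S= and R= ruleset numbers matter here.
example_mailer = {"name": "smtp", "S": 11, "R": 21}

# Every recipient address goes through ruleset 3 and then ruleset 0,
# which picks the delivery agent.
MAILER_SELECTION = [3, 0]


def ruleset_sequence(kind, mailer):
    """Return the rulesets an address visits once a mailer has been selected."""
    if kind == "sender":
        # Canonicalize, generic sender rules, mailer-specific S= rules, cleanup.
        return [3, 1, mailer["S"], 4]
    if kind == "recipient":
        # Canonicalize, generic recipient rules, mailer-specific R= rules, cleanup.
        return [3, 2, mailer["R"], 4]
    raise ValueError("kind must be 'sender' or 'recipient'")


sender_path = ruleset_sequence("sender", example_mailer)        # [3, 1, 11, 4]
recipient_path = ruleset_sequence("recipient", example_mailer)  # [3, 2, 21, 4]
```

Because the third entry comes from the mailer definition, the same sender address can take a different path for each mailer chosen in ruleset 0.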
Why do mailers have different S= and R= flags? Consider the previous example of the letter sent to joe@gonzo.gov and pinhead!zippy. If betty@whizzer.com sends the mail, her address must appear in a different form to each recipient. For Joe, it should be a domain address, betty@whizzer.com. For Zippy, because pinhead expects old-style UUCP addresses (and assuming whizzer.com has a UUCP link to pinhead and its UUCP hostname is whizzer), the return address should be whizzer!betty. Joe's address must also be rewritten for the pinhead UUCP mailer, and Joe's copy must include an address for Zippy that his mailer can handle.
sendmail passes an address to a ruleset and then processes it through each rule line by line. If the lhs of a rule matches the address, it is rewritten by the rhs. If it doesn't match, sendmail continues to the next rule until it reaches the end of the ruleset. At the end of the ruleset, sendmail returns the rewritten address to the calling ruleset or to the next ruleset in its built-in execution sequence.
If an address matches the lhs and is rewritten by the rhs, the rule is tried again, forming an implicit loop (but see the "$: and $@: Altering a Ruleset's Evaluation" section for exceptions).
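The rule-by-rule processing and the implicit loop can be sketched in Python as follows. This is a simplified model under stated assumptions: each rule is represented as a hypothetical pair of functions (one standing in for the lhs test, one for the rhs rewrite), and the pass limit is a safety guard added for the sketch, not a description of sendmail's internals.

```python
def apply_ruleset(rules, tokens, max_passes=100):
    """Run tokens through each rule in order, retrying a rule while it matches."""
    for matches, rewrite in rules:
        passes = 0
        # Implicit loop: a rule that matched and rewrote is tried again.
        while matches(tokens):
            tokens = rewrite(tokens)
            passes += 1
            if passes >= max_passes:
                raise RuntimeError("rule appears to loop forever")
    # End of ruleset: hand the rewritten address back to the caller.
    return tokens


# Hypothetical rule: remove one trailing "." token per application.
strip_trailing_dot = (
    lambda toks: len(toks) > 1 and toks[-1] == ".",
    lambda toks: toks[:-1],
)

result = apply_ruleset([strip_trailing_dot], ["gonzo", ".", "gov", ".", "."])
# result is ["gonzo", ".", "gov"]: the rule fired twice, then stopped matching.
```

A rule whose rewritten output still matches its own lhs would never leave the loop, which is why a careless rule can hang address processing.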
As shown in Table 7.1, each rewriting rule is introduced by the R command and has three fields: the left-hand side (lhs, or matching side), the right-hand side (rhs, or rewriting side), and an optional comment. The fields must be separated from one another by tab characters:
Rlhs rhs comment
sendmail parses addresses and the lhs of rules into tokens and then matches the address and the lhs, token by token. The macro $o contains the characters that sendmail uses to separate an address into tokens. It's often defined like this:
# address delimiter characters
Do.:%@!^/[]
All the characters in $o are both token separators and tokens. sendmail takes an address such as rae@rainbow.org and breaks it into tokens according to the characters in the o macro, like this:
"rae" "@" "rainbow" "." "org"
sendmail also parses the lhs of rewriting rules into tokens so they can be compared one by one with the input address to see whether they match. For example, the lhs $-@rainbow.org gets parsed as follows:
"$-" "@" "rainbow" "." "org"
(Don't worry about the $- just yet. It's a pattern-matching operator, similar to shell wildcards, that matches any single token and is covered later in the section "The Left-Hand Side [lhs] of Rules.") Now you can put the two together to show how sendmail decides whether an address matches the lhs of a rule:
"rae" "@" "rainbow" "." "org"
"$-"  "@" "rainbow" "." "org"
In this case, each token from the address matches a constant string (for example, rainbow) or a pattern-matching operator ($-), so the address matches and sendmail would use the rhs to rewrite the address.
Consider the effect (usually bad!) of changing the value of $o. As shown previously, sendmail breaks the address rae@rainbow.org into five tokens. However, if the @ character were not in $o, the address would be parsed quite differently, into only three tokens:
"rae@rainbow" "." "org"
You can see that changing $o has a drastic effect on sendmail's address parsing, and you should leave it alone until you really know what you're doing. Even then, you probably won't want to change it because the V8 sendmail configuration files already have it correctly defined for standard RFC 822 and RFC 976 address interpretation.
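The effect of $o on tokenizing can be reproduced with a small Python sketch. It is a deliberately simplified model: it treats every character in the delimiter string as both a separator and a token, as described above, and ignores quoting, comments, and whitespace handling. The function name and its parameter are assumptions made for the illustration.

```python
def tokenize(address, delimiters=".:%@!^/[]"):
    """Split an address into tokens; each delimiter is also a token itself."""
    tokens = []
    current = ""
    for ch in address:
        if ch in delimiters:
            if current:
                tokens.append(current)  # close the token collected so far
                current = ""
            tokens.append(ch)           # the delimiter is a token too
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


standard = tokenize("rae@rainbow.org")
# ["rae", "@", "rainbow", ".", "org"]

without_at = tokenize("rae@rainbow.org", delimiters=".:%!^/[]")
# ["rae@rainbow", ".", "org"]
```

With @ missing from the delimiter set, no lhs that expects a separate "@" token can match the address any longer.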
The lhs is a pattern against which sendmail matches the input address. The lhs can contain ordinary text or any of the pattern-matching operators shown in Table 7.2.
Table 7.2. lhs pattern-matching operators.
Operator | Description |
$- | Matches exactly one token |
$+ | Matches one or more tokens |
$* | Matches zero or more tokens |
$@ | Matches the null input (used to call the error mailer) |
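The operators in Table 7.2 can be modeled with a short backtracking matcher in Python. This is a sketch of the matching idea only, not sendmail's algorithm: the function name is hypothetical, literal tokens are compared exactly (a simplification), and the returned groups merely stand in for the positional values that the rhs can refer to.

```python
def match(lhs, tokens):
    """Return the token groups bound to each operator, or None if no match."""
    if not lhs:
        # The pattern is used up: succeed only if the input is used up too.
        return [] if not tokens else None
    head, rest = lhs[0], lhs[1:]
    if head == "$@":
        # Matches only the null input and binds nothing.
        return match(rest, tokens) if not tokens else None
    if head == "$-":
        lengths = [1]                          # exactly one token
    elif head == "$+":
        lengths = range(1, len(tokens) + 1)    # one or more tokens
    elif head == "$*":
        lengths = range(0, len(tokens) + 1)    # zero or more tokens
    else:
        # Ordinary text must equal the next input token.
        if tokens and tokens[0] == head:
            return match(rest, tokens[1:])
        return None
    for n in lengths:
        if n > len(tokens):
            break
        tail = match(rest, tokens[n:])
        if tail is not None:
            return [tokens[:n]] + tail
    return None


address = ["rae", "@", "rainbow", ".", "org"]

one_user = match(["$-", "@", "rainbow", ".", "org"], address)
# [["rae"]]

user_and_host = match(["$+", "@", "$+"], address)
# [["rae"], ["rainbow", ".", "org"]]

too_short = match(["$-"], address)
# None: $- cannot absorb five tokens
```

The sketch tries the shortest candidate first for $+ and $*, then lengthens it until the rest of the pattern fits.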