

As mentioned, root words in any dictionary may be extended by flags. Each flag is a single alphabetic character that represents a prefix or suffix that may be added to the root to form a new word. For example, in an English dictionary the D flag can be added to bathe to make bathed. Because each flag is stored as a single bit in the hashed dictionary, this results in significant space savings. The munchlist script reduces an existing raw dictionary by adding flags wherever possible.
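To make the space argument concrete, here is a minimal Python sketch of storing flags as bits. It is not ispell's actual data layout: the helper names are hypothetical, the flag letters are assumed to be uppercase A through Z, and the root/FLAGS entry syntax follows the UNIX/M style of raw dictionary entry.

```python
# Hypothetical model: one bit per flag letter, so any combination of the
# 26 assumed flags fits in a single small integer next to the root word.

def flag_bit(flag: str) -> int:
    """Map a flag letter 'A'..'Z' to its own bit."""
    if len(flag) != 1 or not ("A" <= flag <= "Z"):
        raise ValueError(f"not a flag letter: {flag!r}")
    return 1 << (ord(flag) - ord("A"))


def parse_entry(entry: str) -> tuple[str, int]:
    """Split a 'root/FLAGS' entry into the root and a bitmask of its flags."""
    root, _, flags = entry.partition("/")
    mask = 0
    for flag in flags:
        mask |= flag_bit(flag)
    return root, mask


def has_flag(mask: int, flag: str) -> bool:
    """Report whether the bit for the given flag is set in the mask."""
    return bool(mask & flag_bit(flag))


root, mask = parse_entry("bathe/D")
print(root, has_flag(mask, "D"), has_flag(mask, "M"))
```

Under this model, one stored root plus a few bits stands in for several separate dictionary words, which is the saving the paragraph above describes.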

When a word is extended with an affix, the affix will be accepted only if it appears in the same case as the initial (prefix) or final (suffix) letter of the word. Thus, for example, the entry UNIX/M in the main dictionary (M means add an apostrophe and an s to make a possessive) would accept UNIX'S but would reject UNIX's. If UNIX's is legal, it must appear as a separate dictionary entry, and it will not be combined by munchlist. (In general, you don't need to worry about these things; munchlist guarantees that its output dictionary will accept the same set of words as its input, so all you have to do is add words to the dictionary and occasionally run munchlist to reduce its size.)
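The case rule can be sketched as a small Python check. This is a simplified reading of the rule as stated above, not ispell's real capitalization logic. The function name is hypothetical, and comparing only the alphabetic characters of the affix is an assumption.

```python
def affix_case_ok(root: str, affix: str, is_prefix: bool) -> bool:
    """Accept an affix only if its letters match the case of the root's
    initial letter (for a prefix) or final letter (for a suffix)."""
    anchor = root[0] if is_prefix else root[-1]
    # Ignore non-letters such as the apostrophe in a possessive suffix.
    letters = [c for c in affix if c.isalpha()]
    if anchor.isupper():
        return all(c.isupper() for c in letters)
    return all(c.islower() for c in letters)


# The UNIX/M example: the possessive suffix must follow the final X.
print(affix_case_ok("UNIX", "'S", is_prefix=False))
print(affix_case_ok("UNIX", "'s", is_prefix=False))
```

With the entry UNIX/M, the first call models UNIX'S and the second models UNIX's, which is why the latter needs its own dictionary entry.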

As mentioned, the affix definition file describes the affixes associated with particular flags. It also describes the character set used by the language.

Although the affix-definition grammar is designed for a line-oriented layout, it is actually a free-format grammar and can be laid out weirdly if you want. Comments are started by a pound (sharp) sign (#), and continue to the end of the line. Backslashes are supported in the usual fashion (\nnn, plus specials \n, \r, \t, \v, \f, \b, and the new hex format \xnn). Any character with special meaning to the parser can be changed to an uninterpreted token by backslashing it; for example, you can declare a flag named asterisk or colon with flag \*: or flag \::.
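The backslash conventions just listed can be illustrated with a short Python decoder. It is a sketch of the described behavior rather than ispell's own lexer. The function name is hypothetical, and the sketch assumes that \nnn means one to three octal digits and that \xnn means exactly two hex digits.

```python
SPECIALS = {"n": "\n", "r": "\r", "t": "\t", "v": "\v", "f": "\f", "b": "\b"}
OCTAL_DIGITS = "01234567"
HEX_DIGITS = "0123456789abcdefABCDEF"


def decode_escapes(text: str) -> str:
    """Decode backslash sequences; any other backslashed character is
    passed through literally, which is how it loses its special meaning."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch != "\\" or i + 1 == len(text):
            out.append(ch)  # ordinary character, or a lone trailing backslash
            i += 1
            continue
        nxt = text[i + 1]
        hex_pair = text[i + 2:i + 4]
        if nxt in SPECIALS:
            out.append(SPECIALS[nxt])
            i += 2
        elif nxt == "x" and len(hex_pair) == 2 and all(c in HEX_DIGITS for c in hex_pair):
            out.append(chr(int(hex_pair, 16)))
            i += 4
        elif nxt in OCTAL_DIGITS:
            j = i + 1
            while j < len(text) and j < i + 4 and text[j] in OCTAL_DIGITS:
                j += 1
            out.append(chr(int(text[i + 1:j], 8)))
            i = j
        else:
            out.append(nxt)  # for example \* or \: becomes a plain character
            i += 2
    return "".join(out)


print(decode_escapes(r"\*"), decode_escapes(r"\x41"), decode_escapes(r"\101"))
```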

The grammar will be presented in a top-down fashion, with discussion of each element. An affix-definition file must contain exactly one table:


table : [headers] [prefixes] [suffixes]

At least one of prefixes and suffixes is required. They can appear in either order.


headers : [options] char-sets

The headers describe options global to this dictionary and language. These include the character sets to be used and the formatter, and the defaults for certain ispell flags.


options : { fmtr-stmt | opt-stmt | flag-stmt | num-stmt }

The options statements define the defaults for certain ispell flags and for the character sets used by the formatters.


fmtr-stmt : { nroff-stmt | tex-stmt }

A fmtr-stmt statement describes characters that have special meaning to a formatter. Normally, this statement is not necessary, but some languages may have preempted the usual defaults for use as language-specific characters. In this case, these statements may be used to redefine the special characters expected by the formatter.


nroff-stmt : { nroffchars | troffchars } string

The nroffchars statement allows redefinition of certain nroff control characters. The string given must be exactly five characters long, and must list substitutions for the left and right parentheses, the period, the backslash, and the asterisk. (The right parenthesis is not currently used, but is included for completeness.) For example, the statement:


nroffchars {}.\\*

would replace the left and right parentheses with left and right curly braces for purposes of parsing nroff/troff strings, with no effect on the others (admittedly a contrived example). Note that the backslash is escaped with a backslash.
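As a rough check of the five-character rule, the following Python sketch maps an nroffchars string onto the roles listed above. The function and role names are hypothetical, and the escape handling is deliberately minimal: a backslash simply makes the next character literal, which is enough for this example.

```python
NROFF_ROLES = ("left paren", "right paren", "period", "backslash", "asterisk")


def parse_nroffchars(raw: str) -> dict[str, str]:
    """Unescape the string, check its length, and pair each character
    with the nroff control character it substitutes for."""
    chars = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            chars.append(raw[i + 1])  # escaped character taken literally
            i += 2
        else:
            chars.append(raw[i])
            i += 1
    if len(chars) != len(NROFF_ROLES):
        raise ValueError(
            f"nroffchars needs exactly {len(NROFF_ROLES)} characters, got {len(chars)}"
        )
    return dict(zip(NROFF_ROLES, chars))


# The example statement from the text: braces replace the parentheses.
print(parse_nroffchars(r"{}.\\*"))
```

The doubled backslash in the example counts as a single character after unescaping, which is why the six typed characters satisfy the five-character requirement.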


tex-stmt : { TeXchars | texchars } string

The TeXchars statement allows redefinition of certain TeX/LaTeX control characters. The string given must be exactly thirteen characters long, and must list substitutions for the left and right parentheses, the left and right square brackets, the left and right curly braces, the left and right angle brackets, the backslash, the dollar sign, the asterisk, the period or dot, and the percent sign. For example, the statement:


texchars ()\[]<\><\>\\$*.%
