Hour 14, "Text Processing," discussed text editing tools and programs used to create, edit, and save documents. This hour introduces you to programs and utilities used to format and print your text documents. It starts off with a discussion of different text formatting programs, and shows you how to easily format your documents for printing without using complicated formatting commands. It then shows you how to use more complex programs by using simple commands.
The hour concludes with a discussion of the basics of document printing under Linux, as well as detailed information about how to configure your printer to get the best possible output.
While you can do basic formatting of text documents using a text editor or word processor, most of these programs for Linux lack the necessary features to add page numbering, boldfacing, font changes, indenting, or other fancy text layout.
Producing nicely typeset documents by using the programs included with your Linux distribution is usually a three-step process. First, you create the document with a text editor. You intersperse your text with typesetting commands to create a certain effect when you filter your document through a formatting program. The second step is to process your document through the typesetting program to produce formatted output. The third step is to either check your formatting by previewing the document, or if you're confident about your formatting, to print the document.
This section shows you how to use typesetting programs to produce nicely typeset documents. It also discusses the basic syntax, or commands, that these programs recognize.
In Hour 14 you were introduced to some text filters to change the output of different programs or the contents of a document. These filters can help you format your documents if you don't want to learn a complex formatting program or use complicated typesetting commands, and are useful for quickly building formatted documents with headers, footers, margins, and page numbers.
Sending directory listings or short text files directly to your printer without formatting may be okay, but printing larger text files usually requires nicer formatting. One command you can use is the pr command, found under the /usr/bin directory. The pr command has 19 different command-line options to format documents. Look at the following example:
# pr +4 -h "Draft Number 1" -o 8 <mydocument.txt >output.txt
This command line formats the file mydocument.txt, starting at page 4, with a header containing the date, time, page number, and the words "Draft Number 1," with a left margin of 8 spaces. Using the > redirection operator, the formatted output is then saved to a file called output.txt, which you can print at a later time.
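To try the command without touching a real document, you can generate a throwaway file first. The following is a minimal sketch, assuming a scratch directory and a 300-line sample file (both invented for illustration) so that a fourth page actually exists:

```shell
#!/bin/sh
# Hypothetical scratch directory and sample data, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 300-line document so that a fourth page exists
# (pr's default page is 66 lines long, including header and trailer).
i=1
while [ "$i" -le 300 ]; do
    echo "line $i"
    i=$((i + 1))
done > mydocument.txt

# Start at page 4, use a custom header, and indent by 8 spaces.
pr +4 -h "Draft Number 1" -o 8 < mydocument.txt > output.txt

# Show the first header and the first few body lines.
head -n 8 output.txt
```

Because the output starts at page 4, the first lines of mydocument.txt should not appear in output.txt at all.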
The pr program also formats selected streams of text. One handy use is to create formatted columns. For example, if you have a paragraph containing a comma-separated list, you can quickly create columns of text by combining several filter programs and then formatting the text with the -COLUMN command-line option:
# cat cities.txt
Cinncinati, San Francisco, Philadelphia, Austin, Washington D.C., Albany, Indianapolis, Raleigh, Chicago, Miami, Norfolk, Savannah, Seattle, Pittsburgh, St. Louis, Phoenix, Nashville, Las Vegas, Atlantic City
# cat cities.txt | sed 's/, /#/g' | tr "#" '\n' | sort | pr -l 1 -3
Albany                  Atlantic City           Chicago
Indianapolis            Las Vegas               Miami
Nashville               Norfolk                 Phoenix
Pittsburgh              Raleigh                 Savannah
Seattle                 St. Louis               Washington
The preceding example starts with a list of cities, separated by commas. By using several filters, including the pr command, the text has been formatted into a readable list, sorted alphabetically. This command line works by first using the sed stream editor (discussed in Hour 14) to change each comma and space to a pound (#) sign. The tr command (also discussed in Hour 14) then translates each pound sign to a newline, and the text, now a list of names, one per line, is fed into the sort command. The sorted text is then fed into the pr command to produce three columns of text (the -l, or page length, option with a value of 1 has been used to inhibit the page header).
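To see what each stage contributes, it can help to capture the intermediate result before it reaches pr. The following is a minimal sketch, assuming a short made-up list of six cities held in a shell variable instead of the cities.txt file:

```shell
#!/bin/sh
# Hypothetical sample data; any comma-separated list would do.
cities='Miami, Albany, Seattle, Chicago, Austin, Raleigh'

# Steps 1-3: mark each ", " with "#", turn "#" into a newline, then sort.
sorted=$(printf '%s\n' "$cities" | sed 's/, /#/g' | tr '#' '\n' | sort)
printf '%s\n' "$sorted"

# Step 4: arrange the sorted list in three columns.
# A page length of 1 suppresses the page header.
columns=$(printf '%s\n' "$sorted" | pr -l 1 -3)
printf '%s\n' "$columns"
```

Printing the sorted variable on its own shows the one-name-per-line list that pr receives, which makes it easier to spot a problem in the sed or tr step.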
You also can use the fmt command with the pr command, to change the word wrap, or width of your text documents. The following example is part of a public-domain software license (found under the /usr/doc/shadow-utils directory).
# cat LICENSE
...
1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers.
2. You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. A Package modified in such a way shall still be considered the Standard Version.
...
While this text may look okay when displayed on the screen, or when formatted with the pr command's default settings, the lines overrun if you use a left-hand margin of 10 spaces. Look at the following example:
# pr -o 10 <LICENSE
...
1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers.
2. You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. A Packag
e modified in such a way shall still be considered the Standard Version.
...
The preceding certainly won't look nice when printed! To fix it, use the fmt command to format the text into a smaller line width before formatting with the pr command:
# cat LICENSE | fmt -w 60 | pr -o 10
...
1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers.
2. You may apply bug fixes, portability fixes and other modifications derived from the Public Domain or from the Copyright Holder. A Package modified in such a way shall still be considered the Standard Version.
...
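The arithmetic behind this fix is simple: 60 columns of text plus a 10-space margin gives body lines no wider than 70 characters. The following is a minimal sketch that checks the width, assuming a scratch copy of one license paragraph (the scratch directory and file names are invented for illustration):

```shell
#!/bin/sh
# Hypothetical scratch file holding one long line of sample text.
workdir=$(mktemp -d)
printf '%s\n' "1. You may make and give away verbatim copies of the source form of the Standard Version of this Package without restriction, provided that you duplicate all of the original copyright notices and associated disclaimers." > "$workdir/LICENSE"

# Reflow to at most 60 columns, then report the longest resulting line.
widest=$(fmt -w 60 < "$workdir/LICENSE" |
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }')
echo "widest line after fmt: $widest"

# With a 10-space margin, each body line is at most 60 + 10 = 70 characters.
fmt -w 60 < "$workdir/LICENSE" | pr -o 10 > "$workdir/formatted.txt"
```

If the reported width ever exceeds the value given to -w, the input probably contains a single word longer than the requested width, which fmt cannot split.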
As you can see, the text now fits nicely on your page when printed. But what if you want to save paper and would like to see at least two pages of output on a single sheet? In this case, use the mpage (multiple page) command to print several pages of output on a single sheet of paper. Look at the following example:
# mpage -2 myfile.txt >myfile.ps
This command line uses the mpage command, found under the /usr/bin directory, to create a PostScript file you can later print, which contains the contents of the myfile.txt document as two side-by-side pages on each sheet of paper.
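Not every installation includes mpage, so a script that relies on it may want to check first. The following is a minimal sketch, assuming a scratch directory and a two-line sample file (both invented for illustration); it runs the same mpage -2 command only when the program is found:

```shell
#!/bin/sh
# Hypothetical scratch file; mpage may not be installed on every system.
workdir=$(mktemp -d)
printf 'first line\nsecond line\n' > "$workdir/myfile.txt"

if command -v mpage >/dev/null 2>&1; then
    # Two logical pages per sheet, written out as PostScript.
    mpage -2 "$workdir/myfile.txt" > "$workdir/myfile.ps"
    result=created
else
    result=skipped
    echo "mpage not found; install it before running this example" >&2
fi
echo "mpage step: $result"
```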
JUST A MINUTE
You can use the mpage command's -O, -E, or -R options to print full-duplex, or back-to-back documents. This can be handy for producing small booklets using letter-size paper with a printer that's not capable of printing on both sides of each sheet. You also can set different margins, fonts, and order of printed pages. See the mpage manual page for details.
By using the fmt, pr, or mpage commands, along with other text filters, you can perform quick-and-dirty text formatting. If you want to try more complex formatting, use a text formatting program.
If you've used Linux for the past several hours and have read some of its manual pages, you should be familiar with at least one complex formatting program: groff. When you read a manual page with
# man ls
it is equivalent to the following:
# nroff -man /usr/man/man1/ls.1 | less
Notice that the nroff command is used on the command line instead of groff. This is because the nroff command installed on your system is a shell script, written to use the GNU groff formatting program to emulate the nroff command. All Linux manual pages are written using a special set of nroff commands called man macros. You also can tell the