by Peter MacKinnon
The GNU project, administered by the Free Software Foundation (FSF), seeks to provide software (in the form of source code) that is freely available to anyone who wants to use it. The project has a lengthy manifesto that explains the motivation behind this libertarian undertaking (for which we should all be thankful, because GNU has some of the best software around!). One of the key ideas within this manifesto is that high-quality software is an intrinsic human right, just like the air that we breathe. Although GNU software is freely distributed, it is not in the public domain; it is protected by the GNU General Public License. The main purpose of the license is to keep GNU software free.
For more information on the FSF, you can write to them at
Free Software Foundation
675 Massachusetts Avenue
Cambridge, MA 02139
You can also request copies by sending e-mail to gnu@prep.ai.mit.edu.
The distribution of Linux on this book's CD-ROM comes with virtually all of the GNU programs that are currently available. They are archived using the tar program and compressed using the GNU gzip utility. gzip tends to compress better than the standard UNIX compression utility, compress. Files compressed with gzip end with a .gz suffix, whereas compress files end in .Z. However, gzip can uncompress compress files as well as its own.
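The archive-then-compress arrangement described above can be sketched in a few lines of Python using the standard tarfile module. This is only an illustration of the idea, not the GNU tools themselves; the package name hello-1.0 and the helper names make_tar_gz and read_tar_gz are invented for the example.

```python
import io
import tarfile


def make_tar_gz(files):
    """Pack a {name: bytes} mapping into an in-memory gzip-compressed tar archive."""
    buffer = io.BytesIO()
    # Mode "w:gz" writes a tar stream and compresses it with the gzip format.
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def read_tar_gz(blob):
    """Unpack a gzip-compressed tar archive back into a {name: bytes} mapping."""
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }


# Hypothetical package contents, used only for this demonstration.
sample = {"hello-1.0/README": b"hypothetical package\n" * 50}
blob = make_tar_gz(sample)
```

Every gzip stream starts with the two bytes 0x1f 0x8b, which is how tools can recognize the format regardless of the .gz suffix.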
Each of these compressed files has a version number included in its filename so that you can determine what version is most current. Once you decompress and un-tar the GNU file, the program can be compiled and installed on your system. Most of the files come with their own makefile. Most of the programs are refinements of standard Linux utilities such as make and bc.
The Free Software Foundation (which develops the GNU products) has developed or made available so much software that each program cannot be described in detail here. The following sections give brief descriptions of the GNU utilities and programs that are included with this distribution of Linux. They are summaries based on the descriptions of the programs as supplied by GNU.
autoconf generates shell scripts that can automatically configure source code packages (such as those for GNU). autoconf creates a script for a software package from a file which lists the operating system features that the package can utilize. autoconf requires GNU m4 to generate the required macro calls for its operation.
The shell called bash is an enhancement of the Bourne shell (thus the name, which stands for Bourne Again SHell). It offers many of the extensions found in csh and ksh. The bash shell also has job control, csh-style command history, and command-line editing with Emacs and vi modes built in. See Chapter 10, "bash."
bc is an algebraic language that can be used interactively from a shell command line, or with input files. GNU bc has a C-like syntax with several extensions including multicharacter variable names, an else statement, and full Boolean expressions. Unlike standard bc, GNU bc does not require the separate dc program, which is another GNU calculator utility.
The BFD library allows a program that operates on object files (such as ld or gdb) to support many different formats efficiently. BFD provides a portable interface, so that only BFD needs to know the details of a particular format. One result is that all programs using BFD will support formats such as a.out (default C executable) and COFF.
binutils includes a collection of development programs, including ar, as, c++filt, gasp, gprof, encaps, ld, nm, objcopy, objdump, ranlib, strings, strings-gnu, strip, ld86, and as86.
binutils Version 2.6.0.14 is written to use the BFD library. The GNU linker ld emits source-line numbered error messages for multiply-defined symbols and undefined references. The objdump program can display data such as symbols from any file format understood by BFD.
bison is an upwardly compatible replacement for the parser generator yacc. bison takes a description of tokens in the form of a grammar and generates a parser in the form of a C program.
Version 2.7 of the GNU C Compiler (gcc) supports three languages: C, C++, and Objective-C. The language selected depends on the source file suffix or a compiler option. The runtime support required by Objective-C programs is now distributed with gcc. The GNU C Compiler is a portable optimizing compiler that supports full ANSI C, traditional C, and GNU C extensions. GNU C has been extended to support features such as nested functions and nonlocal goto statements. Also, gcc can generate object files and debugging information in a variety of formats. See Chapter 27, "Programming in C," for more detailed information about C language support.
The GNU C library supports ANSI C and adds some extensions of its own. For example, the GNU stdio library lets you define new kinds of streams and your own printf formats.
The GNU C++ library (libg++) is an extensive collection of C++ classes, a new iostream library for input/output routines, and support tools for use with g++. Among the classes supported are multiple-precision integers and rational numbers, complex numbers, and arbitrary-length strings. There are also prototype files for generating common container classes.
calc is a desk calculator and mathematical tool that is used within GNU Emacs. calc can be used as a basic calculator, but it provides additional features including choice of algebraic or Reverse Polish Notation (RPN), logarithmic functions, trigonometric and financial functions, complex numbers, vectors, matrices, dates, times, infinities, sets, algebraic simplification, differentiation, and integration.
GNU Chess pits you against the computer in a full game of chess. It has regular-terminal, curses (a full-screen interface library for C), and X-terminal interfaces. GNU Chess implements many specialized features, including sophisticated heuristics that will challenge your best Bobby Fischer moves.
CLISP is an implementation of Common Lisp, the list-processing language that is widely used in artificial-intelligence applications. CLISP includes an interpreter and a byte compiler and has user interfaces in English and German that can be chosen at compile time.
GNU Common Lisp (gcl) has a compiler and interpreter for Common Lisp. It is highly portable, extremely efficient, and has a source-level LISP debugger for interpreted code. gcl also has profiling tools and an Xlib interface.
cpio is a program that copies file archives to and from tape or disk. It can also be used to copy files into a larger archive file or to other directories.
The Concurrent Version System (CVS) manages software revision and release control in a multideveloper, multidirectory, multigroup environment. It works in conjunction with RCS, another source code control program.
dc is an RPN calculator that can be used interactively or with input files.
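To see what Reverse Polish Notation means in practice, here is a minimal stack-based evaluator in Python. It is a sketch of the RPN idea only and does not reproduce dc's actual command set; the function name eval_rpn and the choice of four operators are assumptions made for the example.

```python
import operator

# Binary operators supported by this toy evaluator.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression):
    """Evaluate a whitespace-separated RPN expression and return the result."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            # An operator consumes the two most recent operands.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]
```

With this sketch, eval_rpn("3 4 + 2 *") pushes 3 and 4, replaces them with 7, pushes 2, and finally leaves 14.0 on the stack. No parentheses are needed because the order of the tokens fixes the order of evaluation.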
DejaGnu is a framework for writing scripts to test any program. It includes the embeddable scripting language Tcl and its derivative expect, which runs scripts that can simulate user input.
The Diffutils package contains the file-comparison programs diff, diff3, sdiff, and cmp. GNU diff compares files showing line-by-line changes in several formats and is more efficient than its traditional version.
ecc is an error-correction checking program that uses the Reed-Solomon algorithm. It can correct a total of three byte errors in a block of 255 bytes and can detect more severe errors.
ed is the standard line-based text editor.
elib is a small library of Emacs LISP functions, including routines for using doubly linked lists.
GNU Emacs is the second implementation of this highly popular editor developed by Richard Stallman. It integrates LISP for writing extensions and provides an interface to X. In addition to its own powerful command set, Emacs has extensions that emulate other popular editors such as vi and EDT (DEC's VMS editor). For more information on Emacs, refer to Chapter 16, "Text Editors."
Emacs 19.31.1 is a richer version of the Emacs editor with extensive support for the X Window system. It includes an interface to the X resource manager, has X toolkit support, has good RCS support, and includes many updated libraries. Emacs works as well on character-based terminals as it does under X.
es is a shell based on rc that has an exception system and supports functions that return values other than just numbers. It works well interactively or in scripts, particularly because its quoting rules are simpler than those of the C and Bourne shells.
fileutils is a GNU collection of standard (and not-so-standard) Linux file utilities, including chgrp, chmod, chown, cp, dd, df, dir, du, install, ln, ls, mkdir, mkfifo, mknod, mv, mvdir, rm, rmdir, touch, and vdir.
find is a program that can be used both interactively and in shell scripts to find files given certain criteria and then execute operations (such as rm) on them. This program includes xargs, which applies a command to a list of files.
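The find-then-act pattern described above can be approximated in Python with os.walk. This sketch is not how GNU find is implemented; the helper names find_files and remove_matching are invented, and the demonstration runs inside a throwaway temporary directory so that nothing real is deleted.

```python
import fnmatch
import os
import tempfile


def find_files(root, pattern):
    """Yield paths under root whose file name matches a shell-style pattern."""
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(directory, name)


def remove_matching(root, pattern):
    """Delete every matching file and return the sorted list of removed paths."""
    removed = []
    # Collect the matches first so the tree is not modified while it is walked.
    for path in list(find_files(root, pattern)):
        os.remove(path)
        removed.append(path)
    return sorted(removed)


# Demonstration in a temporary directory (hypothetical file names).
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src"))
    for name in ("a.c", "a.o", os.path.join("src", "b.o")):
        open(os.path.join(root, name), "w").close()
    removed = [os.path.relpath(p, root) for p in remove_matching(root, "*.o")]
    remaining = [os.path.relpath(p, root) for p in find_files(root, "*")]
```

Here the object files are removed and only a.c remains, which is the same effect you would get from combining a name test with an rm action.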
finger displays information about one or more Linux users. GNU finger supports a single host that can act as the finger server host in sites that have multiple hosts. This host collects information about who is logged into other hosts at that site. Thus, a query to any machine at another site will return complete information about any user at that site.
flex is a replacement for the lex scanner generator. The flex program generates more efficient scanners than does lex. The flex program also has the advantage that it generates C code. Scanners are used to identify tokens from input.
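As a rough illustration of what a scanner does, the following Python sketch splits input into tokens using regular expressions. It is a hand-written stand-in for the kind of scanner flex would generate, not flex output; the token names and the tokenize function are assumptions chosen for the example.

```python
import re

# Each token class is a (name, regular expression) pair, tried in this order.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[-+*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs, ignoring whitespace."""
    tokens = []
    position = 0
    while position < len(text):
        match = MASTER.match(text, position)
        if match is None:
            raise SyntaxError(f"unexpected character {text[position]!r} at {position}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        position = match.end()
    return tokens
```

Calling tokenize("x = 3 + y2") yields an identifier, an operator, a number, another operator, and a second identifier. A parser, such as one produced by bison, would then consume that token stream.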
The fontutils package creates fonts for use with Ghostscript or TeX. It also contains general conversion programs and other utilities. Some of the programs in fontutils include bpltobzr, bzrto, charspace, fontconvert, gsrenderfont, imageto, imgrotate, limn, and xbfe.
gas is the GNU assembler that converts assembly code into object files. Native assembly works for many systems, including Linux.
gawk is upwardly compatible with the awk program, which uses pattern-matching to modify files. It also provides several useful extensions not found in other awk implementations (awk, nawk), such as functions to convert the case of a matched string. For more detailed information, see Chapter 26, "gawk."
gdb is a debugger with a command-line user interface. Object files and symbol tables are read using the BFD library, which allows a single copy of gdb to debug programs of multiple object file formats. Other new features include command-language improvements, remote debugging over serial lines or TCP/IP, and watchpoints (breakpoints triggered when the value of an expression changes). An X version of gdb, called xxgdb, is also available.
The gdbm library is the GNU replacement for the traditional dbm and ndbm database libraries. It implements a database using lookup by hash tables.
Ghostscript is GNU's PostScript-compatible graphics language. It accepts commands in PostScript and executes them by writing directly to a printer, drawing in an X window, or writing to a file that you can print later (or to a bitmap file that you can edit with other graphics programs).
Ghostscript includes a graphics library that can be called from C. This allows client programs to use Ghostscript's features without having to know the PostScript language. For more information, consult Chapter 25, "Ghostscript."
Ghostview is an X-based previewer for multipage files that are interpreted by Ghostscript.
GNU mp (gmp) is an extensive library for arbitrary precision arithmetic on signed integers and rational numbers.
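If you have not worked with arbitrary-precision arithmetic before, Python's built-in integers and its fractions module behave in a comparable way and make the idea easy to try. The sketch below does not use gmp itself (gmp is a C library); it only shows the kind of exact results such a library provides. The factorial helper is written out for the example.

```python
from fractions import Fraction


def factorial(n):
    """Compute n! exactly; the result grows far beyond 64 bits without overflow."""
    result = 1
    for factor in range(2, n + 1):
        result *= factor
    return result


# Integers are not limited to the machine word size.
big = 2 ** 200

# Rational numbers are kept as exact numerator/denominator pairs.
half = Fraction(1, 3) + Fraction(1, 6)
```

Because no rounding takes place, one third plus one sixth is exactly one half, and factorial(25) keeps every digit even though it no longer fits in a 64-bit integer.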
GNats: GNU's A Tracking System is a problem-reporting system. It uses the model of a central site or organization that receives problem reports and administers their resolution by electronic mail. Although it is used primarily as a software bug-tracking system, it could also be used for handling system-administration issues, project management, and a variety of other applications.
gnuplot is an interactive program for plotting mathematical expressions and data. It handles both curves (two-dimensional) and surfaces (three-dimensional).
gperf is a utility to generate "perfect" hash tables. There are implementations of gperf for C and C++ that generate hash functions for both languages.
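A "perfect" hash function maps every key in a fixed set to a distinct slot, so a lookup never has to resolve a collision. The Python sketch below finds one by brute force: it tries seeds for an FNV-1a-style hash until no two keywords collide. This is not the algorithm gperf uses; the keyword list, the seed search, and all function names are assumptions made for illustration.

```python
KEYWORDS = ["if", "else", "while", "for", "return", "break", "continue"]


def fnv1a(word, seed):
    """FNV-1a-style 32-bit hash whose starting value is the given seed."""
    h = seed
    for byte in word.encode("utf-8"):
        h = ((h ^ byte) * 16777619) & 0xFFFFFFFF
    return h


def find_perfect_hash(words, max_seed=10000):
    """Search for a (seed, table_size) pair that gives every word its own slot."""
    size = len(words)
    while True:
        for seed in range(1, max_seed):
            slots = {fnv1a(word, seed) % size for word in words}
            if len(slots) == len(words):
                return seed, size
        # No seed worked for this size, so allow a slightly larger table.
        size += 1


SEED, SIZE = find_perfect_hash(KEYWORDS)
TABLE = [None] * SIZE
for word in KEYWORDS:
    TABLE[fnv1a(word, SEED) % SIZE] = word


def is_keyword(word):
    """One hash and one comparison decide membership; no collision handling needed."""
    return TABLE[fnv1a(word, SEED) % SIZE] == word
```

The payoff is in is_keyword: because the table was built for a known set of keys, each lookup costs a single hash computation and a single string comparison.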
GNU Graphics is a set of programs that produces plots from ASCII or binary data. It supports output to PostScript and the X Window system, has shell-script examples using graph and plot, and features a statistics toolkit.
This package contains GNU grep, egrep, and fgrep. These utilities, which search files for regular expressions, execute much faster than do their traditional counterparts.
groff is a document-formatting system that includes drivers for PostScript and TeX dvi format, as well as implementations of eqn, nroff, pic, refer, tbl, troff, and the man, ms, and mm macros. Written in C++, these programs can be compiled with GNU C++ Version 2.5 or later.
gzip can expand LZW-compressed files but uses a different algorithm for compression that generally produces better results than the traditional compress program. It also uncompresses files compressed with the pack program.
GNU hp2xx reads HPGL files, decomposes all drawing commands into elementary vectors, and converts them into a variety of output formats: vector formats (including encapsulated PostScript, Metafont, various special TeX-related formats, and simplified HPGL) and raster formats (including PBM, PCX, and HP-PCL).
GNU indent formats C source code according to the GNU coding standards but, optionally, can also use the BSD default, K&R, and other formats. It is also possible to define your own format. indent can handle C++ comments.
ispell is an interactive spell checker that suggests other words with similar spelling as replacements for unrecognized words. ispell can use system and personal dictionaries, and standalone and GNU Emacs interfaces are also available.
GNU m4 is an implementation of the traditional UNIX macro processor. Its extensions include handling more than nine positional parameters to macros, and it has built-in functions for including files, running shell commands, and performing arithmetic.
GNU make adds extensions to the traditional program that is used to manage dependencies between related files. GNU extensions include long options, parallel compilation, flexible implicit pattern rules, conditional execution, and powerful text-manipulation functions. Recent versions have improved error reporting and added support for the popular += syntax to append more text to a variable's definition. For further information about make, see Chapter 55, "Source Code Control."
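The core idea behind make, rebuilding a target only when one of the files it depends on is newer, can be sketched in Python. This is a simplified model and not GNU make's implementation; the rule table, file names, and timestamps are hypothetical, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical rules: each target maps to the files it depends on.
RULES = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
}


def build_order(rules, goal):
    """Return the targets needed for goal, with prerequisites before dependents."""
    needed = {}
    pending = [goal]
    while pending:
        target = pending.pop()
        if target in needed:
            continue
        needed[target] = rules.get(target, [])
        pending.extend(needed[target])
    order = TopologicalSorter(needed).static_order()
    # Source files have no rule, so only real targets are returned.
    return [target for target in order if target in rules]


def stale_targets(rules, goal, mtimes):
    """List targets that are missing, out of date, or depend on a rebuilt target."""
    stale = []
    for target in build_order(rules, goal):
        deps = rules[target]
        if (
            target not in mtimes
            or any(dep in stale for dep in deps)
            or any(mtimes.get(dep, 0) > mtimes[target] for dep in deps)
        ):
            stale.append(target)
    return stale
```

For example, if util.c has a newer timestamp than util.o, this sketch reports util.o and then app as needing a rebuild, while main.o is left alone.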
mtools is a set of public-domain programs that allow Linux systems to read, write, and manipulate files on an MS-DOS file system (usually a diskette).
MULE is a MULtilingual Enhancement to GNU Emacs 18. It can handle many character sets at once including Japanese, Chinese, Korean, Vietnamese, Thai, Greek, the ISO Latin-1 through Latin-5 character sets, Ukrainian, Russian, and other Cyrillic alphabets. A text buffer in MULE can contain a mixture of characters from these languages. To input any of these characters, you can use various input methods provided by MULE itself.
NetFax is a freely available fax-spooling system that provides Group 3 fax transmission and reception services for a networked Linux system. It requires a fax modem that accepts Class 2 fax commands.
NetHack is a display-oriented adventure game that supports both ASCII and X displays.
The NIH Class Library is a portable collection of C++ classes, similar to those in Smalltalk-80, that has been developed by Keith Gorlen of the National Institutes of Health (NIH) using the C++ programming language.
nvi is a free implementation of the vi text editor. It has enhancements over vi including split screens with multiple buffers, the capability to handle 8-bit data, infinite file and line lengths, tag stacks, infinite undo, and extended regular expressions.
octave is a high-level language primarily intended for numerical computations. It provides a convenient command-line interface for solving linear and nonlinear problems numerically.
octave does arithmetic for real and complex scalars and matrices, solves sets of nonlinear algebraic equations, integrates functions over finite and infinite intervals, and integrates systems of ordinary differential and differential-algebraic equations.
oleo is a spreadsheet program that supports X displays and character-based terminals. It can output encapsulated PostScript renditions of spreadsheets and uses Emacs-like configurable keybindings. Under X and in PostScript output, oleo supports variable-width fonts.
p2c translates from Pascal code to C. It recognizes many Pascal variants including Turbo, HP, VAX, and ISO, and produces entirely usable C source code.
patch is a program that takes the output from diff and applies the resulting differences to the original file in order to generate the modified version. It would be useful for developing a source code control system, if one were so inclined.
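The relationship between diff and patch can be modeled in Python with the standard difflib module. The sketch below records only the changed regions and then applies them to the original text. It uses its own simple in-memory patch representation rather than the diff output formats that the real patch program reads; make_patch and apply_patch are names invented for the example.

```python
import difflib


def make_patch(old_lines, new_lines):
    """Record each changed region as (tag, start, end, replacement_lines)."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    return [
        (tag, i1, i2, new_lines[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def apply_patch(old_lines, patch):
    """Apply the recorded changes to the original lines and return the new version."""
    result = list(old_lines)
    # Work from the end of the file backward so earlier indexes stay valid.
    for _tag, start, end, replacement in reversed(patch):
        result[start:end] = replacement
    return result


original = ["alpha\n", "beta\n", "gamma\n"]
modified = ["alpha\n", "beta (revised)\n", "gamma\n", "delta\n"]
patch = make_patch(original, modified)
```

Applying the patch to the original list reproduces the modified list. Only the differences have to be stored or transmitted, which is why this pairing is so useful for distributing source code updates.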
PCL is a free implementation of a large subset of CLOS, the Common Lisp Object System. It runs under CLISP, mentioned earlier.
perl is a programming language developed by Larry Wall that combines the features and capabilities of sed, awk, shell programming, and C, as well as interfaces to system calls and many C library routines. It has become wildly popular for sophisticated applications that are not dependent on complex data structures. A "perl" mode for editing perl code comes with GNU Emacs included on this distribution.
GNU ptx is the GNU version of the traditional permuted index generator. It can handle multiple input files at once, produce TeX-compatible output, and produce readable KWIC (KeyWords In Context) indexes without needing to use the nroff program.
rc is a shell that features C-like syntax (even more so than csh) and better quoting rules than those of the C and Bourne shells. It can be used interactively or in scripts.
The Revision Control System (RCS) is used for version control and management of software projects. When used with GNU diff, RCS can handle binary files such as executables and object files. For more information on RCS, refer to Chapter 55.
GNU recode converts files between character sets and usages. When exact transformations are not possible, it may get rid of any offending characters or revert to approximations. This program recognizes or produces nearly 150 different character sets and is able to transform files between almost any pair.
regex is the GNU regular expression library whose routines have been used within many GNU programs. Now it is finally available by itself. A faster version of this library comes with the sed editor.
Scheme is a language that is related to LISP. The chief difference is that Scheme can pass functions as arguments to another function, it can return a function as the result of a function call, and functions can be the value of an expression without being defined under a particular name.
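The three properties just listed (functions as arguments, functions as return values, and functions without names) are easiest to appreciate in code. The following sketch uses Python rather than Scheme, purely to illustrate the concepts; the function names are invented for the example.

```python
def apply_twice(function, value):
    """Take a function as an argument and call it two times in a row."""
    return function(function(value))


def make_adder(amount):
    """Return a new function as the result of a function call."""
    def add(value):
        return value + amount
    return add


add_five = make_adder(5)

# A named function passed as an argument: (1 + 5) + 5.
result_named = apply_twice(add_five, 1)

# An anonymous function used directly as a value: (2 * 3) * 3.
result_anonymous = apply_twice(lambda n: n * 3, 2)
```

In Scheme the same ideas are expressed with lambda expressions, and they are central to the way programs in the language are usually written.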
screen is a terminal multiplexer that runs several separate "screens" (ttys) on a single physical character-based terminal. Each virtual terminal emulates a DEC VT100 plus additional functions. screen sessions can be idled and resumed later on a different terminal type.
sed is a non-interactive, stream-oriented version of ed. It is used frequently in shell scripts and is extremely useful for applying repetitive edits to a collection of files or to create conversion programs. GNU sed comes with the rx library, which is a faster version of regex.
shellutils can be used interactively or in shell scripts and includes the following programs: basename, date, dirname, echo, env, expr, false, groups, id, nice, nohup, printenv, printf, sleep, stty, su, tee, test, true, tty, uname, who, whoami, and yes.
Shogi is a Japanese game similar to chess, with the exception that captured pieces can be returned to play. GNU Shogi is based on the implementation of GNU Chess: it implements the same features and uses similar heuristics. As a new feature, sequences of partial board patterns can be introduced in order to help the program play a good order of moves toward specific opening patterns. There are both character- and X-display interfaces.
GNU Smalltalk is an interpreted object-oriented programming language system written in C. Smalltalk itself has become extremely popular among programmers recently and tends to be regarded as a "pure" object-oriented implementation language.
The features of GNU Smalltalk include a binary image save capability, the ability to invoke user-written C code and pass parameters to it, a GNU Emacs editing mode, a version of the X protocol that can be called from within Smalltalk, and automatically loaded per-user initialization files. It implements all of the classes and protocol in Smalltalk-80, except for the graphic user interface (GUI) related classes.
superopt is a function sequence generator that uses a repetitive generate-and-test approach to find the shortest instruction sequence for a given function. The interface is simple: you provide the GNU superoptimizer, gso, a function, a CPU to generate code for, and how many instructions you can accept.
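The generate-and-test approach can be demonstrated on a toy scale in Python. The sketch below enumerates instruction sequences in order of increasing length and returns the first one that matches the goal function on a set of sample inputs. The five-instruction set is invented for the example and is not the instruction set of any CPU that the GNU superoptimizer supports; checking against samples is also weaker than a real proof of equivalence.

```python
from itertools import product

# A hypothetical single-register instruction set.
INSTRUCTIONS = {
    "neg": lambda x: -x,
    "not": lambda x: ~x,
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "shl1": lambda x: x << 1,
}


def run(sequence, value):
    """Execute the named instructions in order on a single register value."""
    for name in sequence:
        value = INSTRUCTIONS[name](value)
    return value


def superoptimize(goal, max_length, samples=range(-8, 9)):
    """Return the shortest sequence that agrees with goal on every sample input."""
    for length in range(1, max_length + 1):
        # Generate every sequence of this length, then test each candidate.
        for sequence in product(INSTRUCTIONS, repeat=length):
            if all(run(sequence, x) == goal(x) for x in samples):
                return sequence
    return None


best = superoptimize(lambda x: 2 * x + 2, max_length=3)
```

For the goal 2x + 2, no single instruction works, so the search moves on to pairs and finds that incrementing first and then shifting left does the job in two instructions. The number of candidates grows exponentially with the sequence length, which is why you tell the superoptimizer how many instructions you are willing to accept.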
GNU tar is a file-archiving program that includes multivolume support, automatic archive compression/decompression, remote archives, and special features that allow tar to be used for incremental and full backups.
The GNU termcap library is a replacement for the libtermcap.a library. It does not place an arbitrary limit on the size of termcap entries, unlike most other termcap libraries.
TeX is a document-formatting system that handles complicated typesetting, including mathematics. It is GNU's standard text formatter. For more information on TeX, please refer to Chapter 19, "TeX."
texinfo is a set of utilities that generate both printed manuals and online hypertext-style documentation (called "Info"). There are also programs for reading online Info documents. Version 3 has both GNU Emacs LISP and standalone programs written in C or shell script. The texinfo mode for GNU Emacs enables easy editing and updating of texinfo files. Programs provided include makeinfo, info, texi2dvi, texindex, tex2patch, and fixfonts.
The textutils programs manipulate textual data and include the following traditional programs: cat, cksum, comm, csplit, cut, expand, fold, head, join, nl, od, paste, pr, sort, split, sum, tac, tail, tr, unexpand, uniq, and wc.
Tile Forth is a 32-bit implementation of the Forth-83 standard written in C. Traditionally, Forth implementations are written in assembler to use the underlying hardware as optimally as possible, but this also makes them less portable.
time is used to report statistics (usually from a shell) about the amount of user, system, and real time used by a process.
tput is a portable way for shell scripts to use special terminal capabilities. GNU tput uses the termcap database, instead of terminfo as many others do.
This version of UUCP (UNIX-to-UNIX copy) supports the f, g, v (in all window and packet sizes), G, t, e, Zmodem, and two new bi-directional (i and j) protocols. If you have a Berkeley sockets library, it can make TCP connections. If you have TLI libraries, it can make TLI connections.
uuencode and uudecode are used to transmit binary files over transmission media that support only simple ASCII data.
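To get a feel for the encoding, you can experiment with Python's binascii module, which implements the same line-level transformation. This sketch encodes and decodes raw lines only and omits the begin and end header lines that the real uuencode program writes; the helper names uu_lines and uu_join are invented for the example.

```python
import binascii


def uu_lines(data):
    """Encode binary data as uuencoded lines of at most 45 input bytes each."""
    return [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]


def uu_join(lines):
    """Decode uuencoded lines back into the original binary data."""
    return b"".join(binascii.a2b_uu(line) for line in lines)


# Every possible byte value, to show that arbitrary binary data survives.
payload = bytes(range(256))
encoded = uu_lines(payload)
```

Each encoded line begins with a character that gives the number of data bytes on that line, followed by four printable characters for every three bytes of input. The three bytes of "Cat", for instance, become the line #0V%T.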
wdiff is another interface to the GNU diff program. It compares two files, finding which words have been deleted or added to the first in order to create the second. It has many output formats and interacts well with terminals and programs such as more. wdiff is especially useful when two texts differ only by a few words and paragraphs have been refilled.
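A word-by-word comparison can be sketched in Python with difflib. The version below marks deleted words with [- -] and inserted words with {+ +}, in the style of wdiff's default output, but it is an independent illustration and not the way wdiff itself is implemented (wdiff drives the diff program). The function name word_diff is an assumption.

```python
import difflib


def word_diff(old_text, new_text):
    """Return new_text's words with deletions and insertions marked inline."""
    # Splitting on any whitespace makes line breaks irrelevant to the comparison.
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)
```

Comparing "the quick brown fox" with "the slow brown fox" gives "the [-quick-] {+slow+} brown fox". Because the text is split into words before it is compared, a paragraph that has merely been refilled shows no differences at all.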
The GNU project provides UNIX-like software freely to everyone, with the provision that it remains free if distributed to others. GNU software can be compiled for many different types of systems, including Linux. Many GNU utilities are improvements of existing Linux counterparts and include new implementations of shells, the C compiler, and a code debugger. Other types of GNU software include games, text editors, calculators, and communication utilities. Each utility can be uncompressed, un-tarred, and compiled separately.