Linux
by Tim Parker
IN THIS CHAPTER
- What Is CGI?
- CGI and HTML
- CGI and Perl
If you do any work with the World Wide Web, you will come across the term CGI,
or Common Gateway Interface. Although we can't hope to cover all you need to know
about CGI in a chapter, we can look at what CGI is and does, why you want to use
it, and how to go about using it.
If you get involved in doing more than simple Web page design (we look at HTML
and Java in the next couple of chapters), you will eventually end up using CGI in
some manner, usually to provide extra functionality to your Web pages. For that reason,
and so that you will know just what the term means, we need to look at CGI in a bit
of detail.
You now know what CGI stands for--Common Gateway Interface--but that doesn't help
you a lot when it comes to understanding what CGI does. The name is a little misleading.
Essentially, CGI is the mechanism a Web server uses to start an external program in
response to a request from a Web page. For example, you might have a button on your
Web page that launches a program to display statistics about how many people have
visited your Web site. When the button is clicked, the browser sends a request to the
Web server, which starts the program that performs the calculation for you. CGI defines
the interface between the Web server and that application, and it allows you to send
information back and forth between the Web page and other applications that aren't
necessarily part of the Web page.
CGI does more than that, but it is usually involved in applications that interface
between a Web page and a non-Web program. CGI programs don't have to be started from
a Web page, but they often are because a CGI program has a special set of environment
conditions that involve interactions between components that are otherwise hard to
simulate.
What does that mean? When a Web server starts a CGI program, it sets up a number of
environment variables that describe the request. These environment variables are how
the server passes information to the program. When a person clicks a button on your
Web page to launch an external application, the server uses those variables to pass
parameters to the program (such as who is making the request or which method was
used). When the application sends information back, it writes its output (a short
header followed by the content) to standard output, and the Web server relays that
output to the browser.
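The exchange just described can be sketched in a few lines. The example below is
written in Python purely for illustration (it is not taken from this chapter), and
it assumes the server sets the standard CGI variables REQUEST_METHOD and
SERVER_SOFTWARE; the function name build_response is made up for the sketch.

```python
#!/usr/bin/env python3
"""Minimal CGI sketch: request details come in through the environment,
and the reply goes out on standard output."""
import os
import sys


def build_response(environ):
    """Return the whole CGI reply (header, blank line, body) as text."""
    # The Web server sets these variables before it starts the program.
    method = environ.get("REQUEST_METHOD", "unknown")
    server = environ.get("SERVER_SOFTWARE", "unknown")
    body = f"Request method: {method}\nServer software: {server}\n"
    # A blank line separates the header from the content that follows it.
    return "Content-Type: text/plain\n\n" + body


if __name__ == "__main__":
    # Under a Web server, os.environ holds the CGI variables.
    sys.stdout.write(build_response(os.environ))
```

Run from a shell rather than a Web server, the program simply reports "unknown" for
both values, which is one reason CGI programs are hard to test outside a server.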
So when we talk about CGI programming, we really mean writing programs that involve
an interface between HTML and some other program. CGI deals with the interface between
the Web server and the application (hence the "interface" in the name).
What's so exciting about this? In reality, the number of behaviors you can code
on a Web page in HTML is somewhat limited. CGI lets you push past those barriers
to code just about anything you want, and have it interact properly with the Web
page. So if you need to run custom statistics on your Web page based on a client's
data, you can do it through CGI. CGI can pass the information to the numbers-crunching
application and then pass the results back to HTML for display on the Web page, to
take a simple example. In fact, there's a whole mess of things you can do on even
the simplest Web page when you start using CGI, and that is why it is so popular.
CGI support is usually built into the Web server, although it's not required to exist
in all Web servers. Luckily, almost every server on the market (except the very early
servers and a few stripped-down ones) contains the CGI code. The latest versions of
the Web servers from NCSA, Netscape, CERN, Apache, and many others all have CGI built
in.
To run a CGI application from a Web page, the browser makes a request to the Web server
to run the CGI application. This request is made through a particular HTTP method.
(A method is the type of request, which tells the server what the client wants done.)
Several methods are built into HTTP (HyperText Transfer Protocol, the protocol used
by the World Wide Web); the method used to call the CGI application depends on the
type of information you want to transfer. We'll come back to methods in a moment,
after we look at how the call to the CGI program is embedded in the HTML for the Web
page.
As you will see in the next chapter, HTML is built from tags.
To call a CGI program, a tag is used that gives the name of the program, as well
as the text that will appear on the Web page when the browser displays the HTML. For
example, the HTML tag
<a href="crunch_numbers"> Click here to display statistics </a>
displays the message Click here to display statistics on the Web page.
When the user clicks there, the browser asks the server for crunch_numbers,
and the server runs the program of that name.
(The <a> and </a> HTML tags are "anchor"
tags that indicate a link to something else. Where the tag is positioned in the
rest of the HTML code determines where the link appears on the page in a Web browser.)
As you will see when we look at HTML in the next chapter, you can even use hyperlinks
to call a program on another machine by supplying the domain name. For example, the
HTML tag
<a href="http://www.tpci.com/stats.cgi"> Display Statistics </a>
displays the message Display Statistics on whatever Web page the code
appears on. When it is selected by the user, the program stats.cgi on the Web
server www.tpci.com is located and run. (The http:// prefix matters: without
it, the browser treats the reference as a name relative to the current page.) This
server could be across the country--it doesn't matter to either HTML or CGI, as long
as the reference can be resolved.
Three kinds of methods are normally used to call a CGI application: the GET,
HEAD, and POST methods (all are part of HTTP). They differ slightly
in when you use them. We will look at each method briefly so that you know what each
does and when it is used.
A GET method is used when the CGI application is to receive data in an
environment variable called QUERY_STRING. The application reads this variable
and decodes it, interpreting what it needs in order to perform its actions. The GET
method is usually used when the CGI application has to take information but doesn't
change anything.
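As a rough sketch of the decoding step, the Python fragment below reads QUERY_STRING
and splits it into named fields using the standard library's parse_qs function.
The function name decode_query and the sample field names are made up for
illustration.

```python
import os
from urllib.parse import parse_qs


def decode_query(environ):
    """Decode QUERY_STRING into a dict mapping each field name to a list of values."""
    raw = environ.get("QUERY_STRING", "")
    # parse_qs undoes URL encoding: "+" becomes a space, "%41" becomes "A".
    return parse_qs(raw, keep_blank_values=True)


if __name__ == "__main__":
    fields = decode_query(os.environ)
    print("Content-Type: text/plain")
    print()  # blank line ends the header
    for name, values in fields.items():
        print(f"{name} = {', '.join(values)}")
```

A request for stats.cgi?site=main+page&year=1998 would reach the program with
QUERY_STRING set to site=main+page&year=1998, which decodes to the fields site
("main page") and year ("1998").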
The HEAD method is much the same as the GET method, except that
the server transmits only the HTTP headers to the client and leaves out the body
of the response. This method can be useful when the client needs to know only
something about a resource, such as whether it exists or what type of data it holds.
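One way a CGI program might honor HEAD itself is to check REQUEST_METHOD and skip
the body, as in the Python sketch below. This is only an illustration: many Web
servers discard the body of a HEAD reply on their own, and the respond function
and the sample text are invented for the example.

```python
import os
import sys


def respond(environ, body):
    """Return the reply text, leaving out the body for a HEAD request."""
    header = "Content-Type: text/plain\n\n"
    if environ.get("REQUEST_METHOD", "GET").upper() == "HEAD":
        return header  # HEAD: send the header only
    return header + body


if __name__ == "__main__":
    # The body text here is a placeholder, not real data.
    sys.stdout.write(respond(os.environ, "Statistics would appear here.\n"))
```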
The POST method is much more flexible and uses stdin (standard input)
to receive data. An environment variable called CONTENT_LENGTH tells the application
how many bytes of data to read from standard input so that it knows
when all the data has arrived. The POST method was developed to allow changes
to the server, but many programmers use POST for almost every task to avoid
the truncation of long URLs that can occur with GET.
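The reading step can be sketched as follows in Python. The sketch assumes the data
arrives in the usual URL-encoded form (the same name=value format used with GET);
the function name read_post_fields is made up for illustration.

```python
import os
import sys
from urllib.parse import parse_qs


def read_post_fields(environ, stream):
    """Read exactly CONTENT_LENGTH bytes from stream and decode the form fields."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0  # treat a malformed length as "no data"
    # Read only the promised number of bytes; never wait for end-of-file.
    raw = stream.read(max(length, 0))
    return parse_qs(raw.decode("utf-8"), keep_blank_values=True)


if __name__ == "__main__":
    # sys.stdin.buffer is the byte-oriented view of standard input.
    fields = read_post_fields(os.environ, sys.stdin.buffer)
    print("Content-Type: text/plain")
    print()
    for name, values in fields.items():
        print(f"{name} = {', '.join(values)}")
```

Reading a fixed number of bytes matters because the server is not required to close
standard input when the data ends, so a program that reads until end-of-file may
wait forever.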
Various environment variables are used by CGI, most of which are covered in much
more detail in any CGI programming book. Describing all the variables here without
showing you how to use them would be a bit of a waste.
If you do get into CGI programming, you will probably find that most of it is
done in the Perl programming language (which we looked at in Chapter 29, "Perl").
CGI programming can be done in any language (and many Web page designers like C,
C++, or Visual Basic because they are more familiar with those languages), but Perl
seems to have become a favorite among UNIX Web programmers. Shell scripts are also
popular under UNIX (and hence Linux), but they are not portable to other operating
systems.
Perl's popularity is easy to understand when you know the language: It's powerful,
simple, and easy to work with. Perl is also portable, which means you can develop
CGI programs on one machine and move them without change to another platform.
Many Perl CGI scripts can be found on the Web. A quick look with a search engine
such as AltaVista will usually reveal hundreds of examples that can be downloaded
and studied. For example, one of the most commonly used Perl scripts is called GuestBook.
Its role is to allow users of your Web site to sign into a guest book and leave a
comment about your Web pages. Usually, the guest book records the user's name and
e-mail address, her location (normally a city and state or province), and any comments
she wants to make. Guest books are a good way to get feedback on your Web pages,
and they also make those pages a little more friendly.
When run, the GuestBook CGI program displays a form that the user can fill in,
and it then updates your server's database for you. Various versions of GuestBook
can be found around the Web; Figure 52.1 shows a browser displaying one such
GuestBook Perl CGI script.
Each GuestBook Perl script looks slightly different, but the one shown in Figure
52.1 is typical. The information entered by the user is stored in the server's database
for the administrator there to read.
FIGURE 52.1. A sample GuestBook Perl script requesting information about the user.
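The storage half of a guest book can be sketched quite briefly. The example below
is not the GuestBook script shown in the figure; it is a Python illustration that
assumes the form is submitted with POST, that its fields are called name, email,
location, and comments, and that the "database" is simply a text file named
guestbook.txt with one tab-separated entry per line. All of those names are
assumptions made for the sketch.

```python
import os
import sys
from urllib.parse import parse_qs

# Hypothetical form field names; a real guest book form may use others.
FIELDS = ("name", "email", "location", "comments")


def format_entry(form):
    """Turn decoded form fields into one tab-separated line of text."""
    values = []
    for field in FIELDS:
        text = form.get(field, [""])[0]
        # Collapse tabs and newlines so each entry stays on a single line.
        values.append(" ".join(text.split()))
    return "\t".join(values) + "\n"


def save_entry(form, path):
    """Append one guest book entry to the file at path."""
    with open(path, "a", encoding="utf-8") as book:
        book.write(format_entry(form))


if __name__ == "__main__" and os.environ.get("REQUEST_METHOD") == "POST":
    length = int(os.environ.get("CONTENT_LENGTH") or 0)
    form = parse_qs(sys.stdin.buffer.read(length).decode("utf-8"))
    save_entry(form, "guestbook.txt")  # hypothetical file name
    print("Content-Type: text/plain")
    print()
    print("Thank you for signing the guest book.")
```

A production script would also need to lock the file while writing and check the
submitted text before storing it; both are left out here to keep the sketch short.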
Figure 52.2 shows another Web page with several sample CGI programs launched from
a menu. Selecting the domain-name lookup shown in Figure 52.2 causes the CGI
application to make several standard HTTP requests to the server and client and
display the results shown in Figure 52.3. As you can see, the output shown in
Figure 52.3 is in standard font and character size, and no real attempt has been
made to produce fancy formatting. This is often adequate for simple CGI applications.
FIGURE 52.2. A Web page with some sample CGI applications, a mix of Perl and C, with the domain-name CGI sample ready to launch.
The Perl CGI scripts are not complicated. The top example (Who Are You?) in the demonstration
page shown in Figure 52.2 looks up your information through an HTTP request. The
Perl code for this is shown in Figure 52.4, displayed through Netscape. As you can
see, only a few lines of code are involved. Any Perl programmer can write this type
of CGI application quickly.
FIGURE 52.3. The domain-name lookup Perl CGI script results in this screen for the author's machine.
FIGURE 52.4. The Perl source code for the Who Are You? application shown in Figure 52.2.
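The Perl source in Figure 52.4 is not reproduced here, but a program of the same
general kind can be sketched in Python. The version below is a guess at the idea,
not a translation of the figure: it assumes the server supplies the standard CGI
variables REMOTE_HOST, REMOTE_ADDR, and HTTP_USER_AGENT, and the function name
who_are_you is invented for the example.

```python
import html
import os
import sys


def who_are_you(environ):
    """Build an HTML page reporting what the server knows about the client."""
    items = []
    for label, var in (("Host name", "REMOTE_HOST"),
                       ("IP address", "REMOTE_ADDR"),
                       ("Browser", "HTTP_USER_AGENT")):
        # Escape the value so characters such as "<" display as text.
        value = html.escape(environ.get(var, "(not supplied)"))
        items.append(f"<li>{label}: {value}</li>")
    return ("Content-Type: text/html\n\n"
            "<html><body><h1>Who Are You?</h1>\n<ul>\n"
            + "\n".join(items)
            + "\n</ul>\n</body></html>\n")


if __name__ == "__main__":
    sys.stdout.write(who_are_you(os.environ))
```

Servers often leave REMOTE_HOST unset unless host-name lookups are enabled, which
is why the sketch falls back to a placeholder instead of failing.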
CGI programming is easy to do, especially with Perl, and adds a great deal of
flexibility to your applications. When you feel comfortable writing HTML code and
developing your own Web pages (which we look at in the next chapter), you should
try your hand at CGI programming and really put some zing into your Web site.
Copyright 1998
EarthWeb Inc., All rights reserved.
Copyright 1998 Macmillan Computer Publishing. All rights reserved.