Chapter 48

Serving the Net with Jeeves

by Mike Fletcher


This chapter discusses Jeeves, Sun's Java-based Web server. The chapter starts with an introduction to Web servers for those not familiar with them and continues with a discussion of how Jeeves differs from other servers. An overview of some of Jeeves's features follows, including a sample "servlet" (a Java class that runs in the server to dynamically create content) and an introduction to the servlet API.

How the Other Half Lives: Web Servers

Web servers are the complement to Web browsers. When you type a URL or click a link, your browser contacts the HTTP server residing on the host from which you want to retrieve content. Using the HTTP protocol, the browser indicates what resource it wants to obtain, and the server sends back the requested data (or an error if the request fails).
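To make the exchange concrete, here is a minimal sketch of the text a browser sends and of how the first line of a server's reply can be read. The class and method names (HttpExchangeSketch, buildGetRequest, parseStatusCode) are made up for illustration, and the request uses the simple HTTP/1.0 form with an optional Host header.

```java
// Illustrative only: builds a bare HTTP/1.0 GET request and reads the
// status code out of a server's reply line.
class HttpExchangeSketch {

    // Build the request text a browser would send for the given path.
    static String buildGetRequest(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "\r\n";               // a blank line ends the request headers
    }

    // Extract the numeric status code from a status line such as
    // "HTTP/1.0 404 Not Found". Returns -1 if the line is malformed.
    static int parseStatusCode(String statusLine) {
        String[] parts = statusLine.trim().split(" ", 3);
        if (parts.length < 2) {
            return -1;
        }
        try {
            return Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.print(buildGetRequest("myhost.com", "/index.html"));
        System.out.println(parseStatusCode("HTTP/1.0 200 OK"));
    }
}
```

A status code in the 200 range means the requested data follows; codes in the 400 and 500 ranges report that the request failed.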

You may hear an HTTP server sometimes referred to as an HTTP daemon. No, it doesn't mean you need to have your PC exorcised. This usage comes from UNIX terminology for a process that provides system services. (In mythology, a daemon is a helpful spirit.) A typical UNIX system has several daemon processes that provide services such as FTP, Telnet, and e-mail. Daemons can start running at system boot time, or they can be started by a process called inetd. The inetd daemon determines what service is requested by the port on which the request comes in. For performance reasons, an HTTP daemon is usually started at boot time rather than from inetd to avoid the overhead of starting up a new process every time a Web page is requested.

Originally, Web servers were limited to returning HTML content from files located on the server's file systems. The only interaction between a client and a server was a simple search facility: a page would be marked as ISINDEX, which indicated to the client that an argument could be appended to the URL for the page.

To provide richer interaction between a client browsing the Web and a server returning content, forms were added to the HTML standard. With the capabilities provided by forms, the Web started moving toward the much more interactive medium it is today. Web servers also changed to support the new interaction. The Common Gateway Interface (CGI) is a standard interface that allows an HTTP server to interact with an external application. The CGI specification defines things such as how an external application is given command-line options and which environment variables carry information about the request.
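As a rough illustration of the environment-variable side of CGI, the sketch below reads two standard CGI variables, REQUEST_METHOD and QUERY_STRING, and splits the query string into name/value pairs. The class name CgiQuerySketch and its helper methods are hypothetical; a real CGI program would usually be written in whatever language the server's administrator prefers.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative CGI-style program: the server passes request details in
// environment variables, and the program writes its reply to standard output.
class CgiQuerySketch {

    // Split "a=1&b=two+words" into decoded name/value pairs.
    static Map<String, String> parseQuery(String queryString) {
        Map<String, String> params = new LinkedHashMap<>();
        if (queryString == null || queryString.isEmpty()) {
            return params;
        }
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(key), decode(value));
        }
        return params;
    }

    // URL-decode one component ('+' becomes a space, %XX becomes a byte).
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        String method = System.getenv("REQUEST_METHOD");
        String query = System.getenv("QUERY_STRING");
        // A CGI reply starts with headers, then a blank line, then the body.
        System.out.println("Content-Type: text/plain");
        System.out.println();
        System.out.println("method=" + method);
        System.out.println("params=" + parseQuery(query));
    }
}
```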

What Makes Jeeves Different from Other HTTP Servers?

Although it provides a needed functionality in today's interactive Web, CGI has some problems. One of the worst is that of security. When users access a URL provided by a CGI program, they are running a program on your Web server. If a CGI program is not carefully written, it could allow a malicious user to gain access to your server or destroy data. This problem can be avoided, but it is something to be aware of.

Another problem with CGI programs is that a separate external program must be started each time someone requests a URL provided by a CGI program. This overhead may go unnoticed on a lightly loaded Web server, but it adds up as traffic grows. Several alternatives to CGI exist to address this problem, such as the FastCGI specification from Open Market (which uses persistent external processes to handle requests) or Netscape's server API (which is a C interface that allows code that handles requests to be dynamically linked into Netscape's Web servers).

Jeeves addresses both these concerns. In addition to providing the usual CGI interface, Jeeves supports "servlets." A servlet is a Java object that is run by Jeeves to handle a request from a client. Servlets can be loaded from the local machine on which the Web server is running, or they can be loaded over the network. Untrusted code loaded over the network is treated similarly to classes loaded by a Web browser and is limited in what it can access.

Jeeves also takes advantage of Java's threading capabilities. In addition to using multiple threads to dispatch incoming requests, servlets can be run in their own thread. This reduces the overhead necessary to dispatch requests for dynamic documents, especially on multiprocessor machines.
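The general idea of handling each request in its own thread can be sketched in a few lines. This is not Jeeves's dispatcher; the class ThreadPerRequestSketch and its handleAll() method are invented here, and the "request" is just a counter increment standing in for real work.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: start one thread per simulated request and wait
// for all of them to finish.
class ThreadPerRequestSketch {

    // Handles requestCount simulated requests concurrently and returns
    // how many were handled.
    static int handleAll(int requestCount) {
        AtomicInteger handled = new AtomicInteger();
        Thread[] workers = new Thread[requestCount];

        for (int i = 0; i < requestCount; i++) {
            // Each worker stands in for one servlet invocation.
            workers[i] = new Thread(() -> handled.incrementAndGet());
            workers[i].start();
        }

        for (Thread worker : workers) {
            try {
                worker.join();   // wait for this request to complete
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return handled.get();
    }

    public static void main(String[] args) {
        System.out.println("handled " + handleAll(8) + " requests");
    }
}
```

Starting a thread inside a running server is much cheaper than launching a new external process, which is the cost CGI pays on every request.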

HTTP Server Administration Made Easy

Most HTTP servers, especially the freely available UNIX versions, must be set up by modifying a set of configuration files with an editor. Unless you are intimately familiar with your server, it is easy to make mistakes. Jeeves's configuration is handled by means of an interactive applet accessed through your browser (see Figure 48.1). Although the information is stored in files that can be edited, Jeeves's fill-in-the-blank configuration is much easier than searching through manual pages for the exact syntax to turn off this or that feature.

Figure 48.1: The Jeeves configuration applet.

To access the server configuration screens from the configuration applet, you must authenticate yourself to the server with a user name and password. The list box on the left of the screen shows the different sections of settings you can configure. To the right are the various fields, buttons, and whatnot that let you do the actual configuring of the server. Table 48.1 explains what the different sections control.

Table 48.1. Jeeves configuration groups.

Section | Explanation
HTTP Configuration | Sets HTTP protocol parameters.
Log Configuration | Defines names used for log files that track server accesses and errors.
File Aliasing | Controls the mapping of URLs to files and directories.
Servlet Aliasing | Controls the mapping of URLs to servlets.
Servlet Loading | Defines where the code for servlets is loaded from and any parameters passed to the servlet.
MIME Section | Sets up mappings from file extensions to the MIME type returned.
Users, Groups, and ACLs | These three sections control the creation, deletion, and modification of access control-related settings.
Resource Protection | This section allows you to grant privileges using the information entered in the preceding sections. More information on this is provided in "Access Control," later in this chapter.
Reauthenticate | This entry lets you authenticate yourself to the server with your user name and password. It allows you to log in as a different user, or to reconnect to the server if you have restarted the server while leaving the admin applet running in your browser.

HTTP Configuration

The HTTP Configuration section of the Jeeves configuration applet allows control of the settings related to the HTTP protocol. The port on which the server listens is set with this section. You can define the maximum number of connections the server will accept. Related to this are the minimum and maximum numbers of threads required to start handling requests.

Jeeves provides support for the HTTP "keep alive" directive. This extension to the HTTP protocol allows a client to ask that a connection be kept open and used to retrieve multiple URLs. The HTTP Configuration section has fields for setting the maximum number of requests for each keep-alive session and a timeout value (to prevent a client from monopolizing a connection).

Log Configuration

The Log Configuration section of the Jeeves configuration applet lets you define the names of the log files to which Jeeves will write information. The level of information written to each log can be specified as a number from 0 to 3 (0 turns the log off, 3 provides the most detail).

In addition to the access log (which tracks the resources being accessed by clients) and the error log (which notes errors such as when nonexistent files are requested), Jeeves provides an event log for servlets. Servlets can use the log() method of the java.servlet.Servlet class to write a String to the event log.

File Aliasing

Although the primary purpose of an HTTP server is to return files in response to client requests, you don't want to give out access to your complete file system to just anyone. The File Aliasing section of the Jeeves configuration applet allows you to map URLs to particular directories or files. This facility also lets you give an easy-to-remember URL to a particular resource. For example, if you have a page with a feature that changes monthly, you can set the URL http://myhost.com/features/current to point to the current month's page.
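The idea behind aliasing can be sketched as a small lookup table that maps URL prefixes to locations on disk. This is only an illustration, not how Jeeves implements it: the class FileAliasSketch, its methods, and the directory and file names are all hypothetical, and the prefix match is deliberately naive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: resolve a URL path to a file path using the
// longest matching alias prefix.
class FileAliasSketch {
    private final Map<String, String> aliases = new LinkedHashMap<>();

    // Register a mapping from a URL prefix to a file or directory.
    void addAlias(String urlPrefix, String target) {
        aliases.put(urlPrefix, target);
    }

    // Returns the file path for a URL path, or null if nothing matches,
    // so requests outside the aliased areas are never served.
    String resolve(String urlPath) {
        String bestPrefix = null;
        for (String prefix : aliases.keySet()) {
            if (urlPath.startsWith(prefix)
                    && (bestPrefix == null || prefix.length() > bestPrefix.length())) {
                bestPrefix = prefix;
            }
        }
        if (bestPrefix == null) {
            return null;
        }
        // Append whatever follows the matched prefix to the target.
        return aliases.get(bestPrefix) + urlPath.substring(bestPrefix.length());
    }

    public static void main(String[] args) {
        FileAliasSketch table = new FileAliasSketch();
        table.addAlias("/", "public_html/");
        table.addAlias("/features/current", "public_html/features/december.html");
        System.out.println(table.resolve("/features/current"));
        System.out.println(table.resolve("/index.html"));
    }
}
```

When the month changes, only the alias target needs updating; the URL that visitors have bookmarked stays the same.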

Servlet Aliasing and Servlet Loading

The Servlet Aliasing and Servlet Loading sections of the Jeeves configuration applet provide control over a server's Java servlets. Servlet Aliasing is the servlet equivalent of the File Aliasing section (just described). It defines the mappings between URLs and servlet names. The Servlet Loading section sets mappings from servlet names to the Java class for the servlet. The location from which the servlet code is loaded (local disk or from the network) and any parameters to be passed to the servlet can be defined here as well.

MIME Section

The MIME section of the Jeeves configuration applet controls the mappings between file extensions and MIME content types. This lets a browser know what type of resource it is retrieving so that it can handle it properly. For example, if you have several MS Word documents, you can use this configuration to tell Jeeves to return a content type of application/msword for all files that have the file extension .doc. Assuming that the client's browser is properly configured, it would automatically launch the application to view the file.
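A mapping like this amounts to a table keyed by file extension. The sketch below is an assumption-laden illustration rather than Jeeves's own code: the class MimeTableSketch and the fallback type application/octet-stream for unknown extensions are choices made for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative only: look up a MIME content type from a file extension.
class MimeTableSketch {
    private static final String DEFAULT_TYPE = "application/octet-stream";
    private final Map<String, String> types = new HashMap<>();

    MimeTableSketch() {
        types.put("html", "text/html");
        types.put("txt", "text/plain");
        types.put("gif", "image/gif");
        types.put("doc", "application/msword");
    }

    // Returns the content type for a file name, ignoring extension case.
    String contentTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return DEFAULT_TYPE;     // no extension at all
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        String type = types.get(ext);
        return type != null ? type : DEFAULT_TYPE;
    }

    public static void main(String[] args) {
        MimeTableSketch table = new MimeTableSketch();
        System.out.println(table.contentTypeFor("report.doc"));
        System.out.println(table.contentTypeFor("README"));
    }
}
```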

Users, Groups, ACLs, and Resource Protection

The Users, Groups, ACLs, and Resource Protection sections of the Jeeves configuration applet allow you to control who can access resources on your server. Refer to the following section, "Access Control," for a detailed description of how Jeeves provides resource controls.

Access Control

Jeeves provides a very flexible system for controlling access to Web pages and servlets. Privileges can be granted to users (referred to as principals in this context), groups of users, or network hosts.

Each user has an account name (which must be unique) and a password. The Users section of the configuration applet allows you to create and delete user accounts as well as modify a user's password. Likewise, the Groups section allows the creation and deletion of groups of users. Individual user accounts can be assigned to or removed from groups.

Access Control Lists (ACLs) are the basis for resource controls with Jeeves. An ACL can be made up of any combination of users, groups, and network hosts. Membership in an ACL can be either positive (the entity is in the list) or negative (the entity is not in the list). In addition to which entities are in the ACL, the ACL defines which HTTP request (that is, GET or POST) can be sent. Once you have created an ACL, the Resource Protection section of the configuration applet allows you to assign lists to a particular URL. Figure 48.2 shows what the ACL Configuration screen looks like.

Figure 48.2: The Access Control List Configuration section.

The practical upshot of this arrangement is that Jeeves gives you very flexible control over deciding who can get what from your server. For example, you can create an ACL that contains the hostnames for each department's machines. Engineering documents can be specified as available to the developer ACL and the quality assurance ACL. This information is available only to those two departments; the marketing department's machines are not allowed to retrieve it. If the marketing people want to track who is accessing a particular resource, individual user accounts (or a group) can be placed in an ACL and that ACL can be assigned to the URL in question.
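One way to picture an ACL is as two sets of entities (positive and negative members) plus the set of HTTP methods the list permits. The sketch below is a simplified model, not Jeeves's implementation: the class AclSketch, its method names, the example hostnames, and the rule that a negative entry always wins are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative only: a tiny access control list holding users, groups,
// or hostnames as plain strings.
class AclSketch {
    private final Set<String> positive = new HashSet<>(); // entities in the list
    private final Set<String> negative = new HashSet<>(); // entities excluded
    private final Set<String> methods = new HashSet<>();  // e.g. GET, POST

    AclSketch allow(String entity) { positive.add(entity); return this; }
    AclSketch deny(String entity) { negative.add(entity); return this; }
    AclSketch method(String httpMethod) { methods.add(httpMethod); return this; }

    // True if the entity may send the given HTTP request type.
    // Assumption: a negative entry overrides any positive one.
    boolean permits(String entity, String httpMethod) {
        if (negative.contains(entity)) {
            return false;
        }
        return positive.contains(entity) && methods.contains(httpMethod);
    }

    public static void main(String[] args) {
        AclSketch engineering = new AclSketch()
                .allow("dev1.myhost.com")
                .allow("qa1.myhost.com")
                .deny("mkt1.myhost.com")
                .method("GET");
        System.out.println(engineering.permits("dev1.myhost.com", "GET"));
        System.out.println(engineering.permits("mkt1.myhost.com", "GET"));
    }
}
```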

Servlet API

The API for servlets is similar to that for applets. Servlets must extend the java.servlet.Servlet class. Like the java.applet.Applet class, the Servlet class provides methods to retrieve parameters (getInitParameter()). The getServletContext() method obtains a ServletContext object, which provides references to other servlets; the method also provides a way to find out what server the servlet is running on.

If a servlet needs special initialization, an init() method can be provided. This method is called when the servlet class is loaded. A servlet can contain a getServletInfo() method to return information about what the servlet does and who its author is. Jeeves displays this information in the configuration applet.

The most important method for a servlet is the service() method. This method is invoked by the server whenever a request is received. Each time a client requests a URL corresponding to a servlet, Jeeves calls the service() method with two parameters: an object implementing the ServletRequest interface and an object implementing the ServletResponse interface.

The ServletRequest interface provides information similar to that passed to a CGI program in other servers. The interface defines methods to retrieve information such as the URL for the request, the remote host and port the request was received from, and any user authentication information.

The ServletResponse interface contains methods that allow a servlet to communicate the results of a request back to the client that asked for it. A getOutputStream() method provides an OutputStream object that writes to the client. Several methods are provided to set HTTP information such as the response code, the MIME content-type of the reply, and the status message returned.

In addition to the servlet API, Jeeves provides several other APIs such as classes that assist in creating HTML (the sun.server.html package).

Example Servlet: A Simple Phone Database

We finish off this chapter with an example servlet. This servlet reads in a text file containing names, titles, and phone numbers for people and stores this information into a java.util.Hashtable. When a client requests the URL corresponding to the servlet, the servlet returns a form with a text field. The user can enter a name in the field and click the Submit button. The servlet searches its hash table and returns the corresponding phone number if it exists.

First off are the import statements and class definition. Listing 48.1 also defines the Hashtable phoneList and initializes it to null.


Listing 48.1. The phoneSearch servlet.

import java.io.*;
import java.util.*;
import java.servlet.*;

public class phoneSearch extends Servlet {
  // Hashtable to hold phone list information.  Loaded by init()
  Hashtable phoneList = null;

Next we define the init() method. As with an applet, this method is called automatically after the class is loaded. The init() code takes care of opening the phone database file and then calls the readDatabase() method to load the information into the hash table. It uses the Servlet.log() method to provide status updates that are written into the server's event log (see Listing 48.2).


Listing 48.2. The phoneSearch.init() method.

  // Initialize servlet.  Reads in phone database from file.
  public void init( ) throws Exception
  {
    FileInputStream infile = null; // For reading data file
    String phonefile = null;       // Filename of database

    // Log when we start up.
    log( "phoneSearch Servlet Started." );

    // If a filename is given in our parameters use it
    if( (phonefile = getInitParameter( "file" )) == null ) {
      phonefile = "phone.txt";    // otherwise default to phone.txt
    }
    
    // Log what phone database file we're using
    log( "Using phone database file '" + phonefile + "'." );

    // Try and open the phone database file
    try {
      infile = new FileInputStream( phonefile );
    } catch( FileNotFoundException e ) {
      log( "Database file '" + phonefile + "' does not exist." );
      log( "Error was: " + e.getMessage() );
      throw e;    // Rethrow exception
    } catch( IOException e ) {
      log( "I/O Error opening database file '" + phonefile + "'." );
      log( "Error was: " + e.getMessage() );
      throw e;    // Rethrow exception
    }

    // Read the database into our hashtable
    readDatabase( new DataInputStream( infile ) );

    // Log that we're ready for business.
    log( "Read database.  phoneSearch ready." );

  }

The readDatabase() method takes a DataInputStream from which it reads the phone information (see Listing 48.3). (Lines starting with a # character are ignored.) Each line should have three fields: the person's name, title, and phone number. The format of each line is the three fields separated by pipe (|) characters. If the line does not contain a | character, we note the line number to the event log and go on to the next line of the file. Correctly formatted lines are split into two parts: the name field, and the title and phone fields. The title and phone string is inserted into the hash table with the name (converted to all lowercase letters) as the key.


Listing 48.3. The phoneSearch.readDatabase() method.

  public void readDatabase( DataInputStream in )
    throws Exception
  {
    int pos, oldpos, lineNumber;
    String name, info;

    // Get an empty hashtable
    phoneList = new Hashtable( );

    lineNumber = 0;    // Initialize line numbers

    try {
      try {
        // Read in first line
        String line = in.readLine( );

        // While there are lines to read . . .
        while( line != null ) {
          lineNumber++;    // Increment line count

          // Allow comment lines starting with an octothorpe
          if( line.startsWith( "#" ) ) {    // safe on empty lines
            line = in.readLine( ); // Read next line && loop
            continue;
          }

          // If the line doesn't have a | character log it and go on
          if( (pos = line.indexOf( '|' )) < 0 ) {
            log( "Malformed phone database line at line " + lineNumber );
            line = in.readLine( ); // Read next line && loop
            continue;
          }

          // Copy name from line
          name = line.substring( 0, pos ).toLowerCase();
          // Leave title and # with | separator as one item
          info = line.substring( pos + 1, line.length() );

          // Place info into hashtable with name as key
          phoneList.put( name, info );

          line = in.readLine( ); // Read next line
        }
      } catch( EOFException e ) {
       ;
      }
    } catch( Exception e ) {
      log( "Error while reading database: " + e.getMessage( ) );
      throw e;
    }

    log( "phoneSearch read " + lineNumber + " lines." );
    log( "phoneSearch hashtable has " + phoneList.size() + " items." );

    return;    // Done reading database 
  }
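To see the file format in isolation, the following standalone sketch applies the same rules (skip # comments, skip lines without a | character, use the lowercase name as the key) to a small in-memory sample. The class PhoneFileSketch and the sample names and numbers are invented for illustration; it uses BufferedReader rather than the servlet's DataInputStream so that it can run outside the server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Hashtable;
import java.util.Locale;

// Illustrative only: parse "name|title|phone" lines into a Hashtable
// keyed by the lowercase name.
class PhoneFileSketch {

    // Made-up sample data in the format the servlet expects.
    static final String SAMPLE =
          "# name|title|phone\n"
        + "Jane Doe|Editor|555-0100\n"
        + "this line has no separator\n"
        + "John Roe|Engineer|555-0101\n";

    static Hashtable<String, String> parse(String text) {
        Hashtable<String, String> table = new Hashtable<>();
        try (BufferedReader in = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("#")) {
                    continue;               // comment line
                }
                int pos = line.indexOf('|');
                if (pos < 0) {
                    continue;               // malformed line, skipped
                }
                String name = line.substring(0, pos).toLowerCase(Locale.ROOT);
                String info = line.substring(pos + 1); // "title|phone"
                table.put(name, info);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, String> table = parse(SAMPLE);
        System.out.println(table.size() + " entries");
        System.out.println(table.get("jane doe"));
    }
}
```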

The service() method is the heart of any servlet. Whenever Jeeves receives a request for a URL that maps to a servlet, it calls that servlet's service() method. Two parameters are passed with this call: one representing the request (a ServletRequest) and one the servlet's reply (a ServletResponse). The servlet can use information from the ServletRequest to determine how it was called and who called it. The ServletResponse allows the servlet to generate the HTTP headers for its reply, as well as providing an OutputStream on which to write the reply. Listing 48.4 shows the service() method for the phoneSearch database.


Listing 48.4. The phoneSearch.service() method.

  public void service( ServletRequest req, ServletResponse res )
    throws IOException
  {
    PrintStream out = new PrintStream( res.getOutputStream() );

    // Set our content type and indicate that the output shouldn't be cached
    res.setContentType( "text/html" );
    res.setHeader( "Pragma", "no-cache" );
    res.writeHeaders( );    // write out HTTP headers

    // Write out HTML for our search form    
    out.println( "<html>" );
    out.println( "<head><title>Phone List Example Servlet</title></head>" );
    out.println( "<body bgcolor=\"#ffffff\">" );

    // Start of form
    out.println( "<form method=\"GET\">" );
    out.println( "<h1>Phone List Servlet</h1><hr>" );

    // Print some instructions for our user
    out.println( "Enter the name of the person you want to search for." );
    out.println( "Names are recorded as all lowercase.  Search terms" );
    out.println( "are converted to all lowercase." );

    // Create a text field for search term
    out.print( "<p><input TYPE=\"text\" NAME=\"search\" VALUE=\"" );

    // See if we were given a query parameter (i.e. someone's called
    // us already).
    String search = req.getQueryParameter( "search" );

    // If we have search will be non-null and we will use that
    // as the default value in our input box
    if( search != null )
      out.print( search );
    out.println( "\">" );    // Finish tag for search INPUT

    // Create a submit button
    out.println( "<input TYPE=\"submit\" NAME=\".submit\"><p><hr>" );

    // If we were given a search parameter and its length is non-zero
    if( search != null && search.length() != 0 ) {
      // Make the search item all lowercase
      search = search.toLowerCase();

      // See if search term is a key in hashtable
      String info = (String) phoneList.get( search );
      if( info != null ) {
        // Find separator in info 
        int pos;
        pos = info.indexOf( '|' );

        // Format the data in a spiffy table
        out.println( "<table width=\"75%\" border=\"2\">" );
        out.println( "<tr><th>Name</th><th>Title</th><th>Phone</th></tr>" );

        // Print out a row with the data from the query and hashtable
        out.println( "<tr><td>" + search + "</td>" );
        out.println( "<td>" + info.substring( 0, pos ) + "</td>" );
        out.println( "<td>" + info.substring( pos + 1, info.length() ) + "</td></tr>" );

        out.println( "</table>" ); // Mark the end of our table
      } else {
        // Search term wasn't in hashtable, so let them know that
        out.println( "No one by the name '" + search + "' was found." );
      }
    }

    // Close out form, body, and html tags
    out.println( "<p></form>" );
    out.println( "</body></html>" );

    return;    // We're done
  }
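Note that the listing writes the user's search term back into the page exactly as it was received. Given the earlier point about carelessly written CGI programs, a cautious servlet might escape HTML special characters before echoing input. The helper below is a minimal sketch of that idea; the class HtmlEscapeSketch and its escape() method are not part of the servlet API or of the original listing.

```java
// Illustrative only: replace characters that have special meaning in
// HTML so that user input is displayed as text rather than markup.
class HtmlEscapeSketch {

    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Input that would otherwise close the VALUE attribute early.
        System.out.println(escape("\"><b>x</b>"));
    }
}
```

In the service() method, such a helper would be applied wherever the search string is printed, for example out.print( HtmlEscapeSketch.escape( search ) ).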

Last are the getServletInfo() and destroy() methods (see Listing 48.5). The getServletInfo() method should return a String with information such as what the servlet does, who the author is, or version information. The destroy() method for this servlet just logs the fact that it was called to the event log. If you have a servlet that has, for example, connections to a database or information that has to be written out to disk, the destroy() method can handle those tasks.


Listing 48.5. The phoneSearch.getServletInfo() and phoneSearch.destroy() methods.

  // Provide a little information about what servlet does
  public String getServletInfo( )
  {
    return "Simple Phone Database";
  }

  public void destroy( )
  {
    // Log when we're destroyed
    log( "phoneSearch Servlet destroy() called." );
  }
}

Using the phoneSearch Servlet

The simplest way to use the servlet is to compile the code (phoneSearch.java, located on the CD-ROM that accompanies this book) and place the class file into the servlet directory under the main Jeeves directory. The phone database file should be placed in the Jeeves root directory and be named phone.txt. If you name the file differently or place it in another directory (in the servlet directory, for example), you must specify the location with the file parameter in the Servlet Loading section of the configuration applet. Figure 48.3 shows the phoneSearch servlet in action.

Figure 48.3: The phoneSearch servlet in action.

Summary

After reading this chapter, you should have an understanding of Jeeves's capabilities and what sets it apart from other HTTP servers. You should have an idea of how to configure Jeeves to map URLs to different files or servlets, as well as how to write your own servlets to provide dynamic content.