Java 1.2 Unleashed



- 33 -

Content and Protocol Handlers


In this chapter you'll learn how to write Java content handlers and protocol handlers. Content handlers support the retrieval of objects by Web browsers. They use the Multipurpose Internet Mail Extensions (MIME) to identify the type of objects that are provided by Web servers. Protocol handlers enable browsers to work with protocols other than HTTP. Both content and protocol handlers allow you to expand the capabilities of your browser. In this chapter you'll develop examples of both types of handlers.

Using Content Handlers

If you use a Web browser regularly, you have probably encountered external viewers or plug-ins that supplement the browser's built-in capabilities. These external viewers display and process files that browsers do not normally support.

Java supports additional internal or external viewers through the content handler mechanism. Content handlers are used to retrieve objects via an URLConnection object.

Content handlers are implemented as subclasses of the ContentHandler class. A content handler is only required to implement a single method, the getContent() method, which overrides the method provided by the ContentHandler class. This method takes an URLConnection object as a parameter and returns an object that represents content of a specific MIME type. You'll learn about MIME types in the following section of this chapter.

The purpose of a content handler is to extract an object of a given MIME type from an URLConnection object's input stream. Content handlers are not directly instantiated or accessed. The getContent() methods of the URL and URLConnection classes cause content handlers to be created and invoked to perform their processing.

A content handler is associated with a specific MIME type through the use of the ContentHandlerFactory interface. A class that implements the ContentHandlerFactory interface must implement the createContentHandler() method. This method returns a ContentHandler object to be used for a specific MIME type. A ContentHandlerFactory object is installed using the static setContentHandlerFactory() method of the URLConnection class.

Multipurpose Internet Mail Extensions (MIME)

Content handlers are associated with specific MIME types. Many Internet programs, including email clients, Web browsers, and Web servers, use MIME to associate an object type with a file. These object types include text, multimedia files, and application-specific files. MIME types consist of a type and a subtype. Examples are text/html, text/plain, image/gif, and image/jpeg, where text and image are the types and html, plain, gif, and jpeg are the subtypes. The URL classes provided by Java support the processing of each of these types. However, the number of MIME type/subtype combinations is large and growing. Content handlers are used to support MIME type processing.

Web servers map MIME types to the files they serve using the files' extensions. For example, files with the .htm and .html extensions are mapped to the text/html MIME type/subtype. Files with the .gif and .jpg extensions are mapped to image/gif and image/jpeg. The MIME type of a file is sent to Web browsers by Web servers when the servers send the designated files to the browsers in response to browser requests.
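The type/subtype structure and the extension-based mapping can be sketched on the client side as well. The following example is only an illustration: MimeSketch and splitMimeType() are hypothetical names, and URLConnection.guessContentTypeFromName() consults the Java runtime's own extension table, not the table of any Web server, so it may return null for extensions such as .cg.

```java
import java.net.URLConnection;

public class MimeSketch {
  // Splits "type/subtype" into its two parts. Any parameters that follow
  // a semicolon (for example "; charset=UTF-8") are dropped.
  public static String[] splitMimeType(String mimeType) {
    int semi = mimeType.indexOf(';');
    String bare = (semi >= 0 ? mimeType.substring(0, semi) : mimeType).trim();
    int slash = bare.indexOf('/');
    if (slash <= 0 || slash == bare.length() - 1)
      throw new IllegalArgumentException("Not a MIME type: " + mimeType);
    return new String[] { bare.substring(0, slash), bare.substring(slash + 1) };
  }

  public static void main(String[] args) {
    String[] names = { "index.html", "logo.gif", "chargrid.cg" };
    for (String name : names) {
      // A guess based on the file extension only; null means "unknown".
      String guess = URLConnection.guessContentTypeFromName(name);
      System.out.println(name + " -> " + guess);
    }
    String[] parts = splitMimeType("text/html");
    System.out.println("type=" + parts[0] + ", subtype=" + parts[1]);
  }
}
```

The server-side mapping described above works the same way in principle, but it is configured in the server's own MIME type file.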

Developing Content Handlers

The first step in implementing a content handler is to define the class of the object to be extracted by the content handler. The content handler is then defined as a subclass of the ContentHandler class. The getContent() method of the content handler performs the extraction of objects of a specific MIME type from the input stream associated with an URLConnection object.

A content handler is associated with a specific MIME type through the use of a ContentHandlerFactory object. The createContentHandler() method of the ContentHandlerFactory interface is used to return a content handler for a specific MIME type.

Finally, the setContentHandlerFactory() method of the URLConnection class is used to set a ContentHandlerFactory as the default ContentHandlerFactory to be used with all MIME types.
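One detail is worth keeping in mind when installing a factory: according to the Java API documentation, setContentHandlerFactory() can be called at most once by an application and throws an Error if a factory has already been defined. The following minimal sketch wraps the call so that a second attempt is reported instead of terminating the program. FactoryInstallSketch and tryInstall() are hypothetical names used only for illustration.

```java
import java.net.ContentHandler;
import java.net.ContentHandlerFactory;
import java.net.URLConnection;

public class FactoryInstallSketch {
  // Returns true if the factory was installed, false if one was already set.
  // Catching Error is broad; it is done here only because the API signals
  // "factory already defined" with a plain Error.
  public static boolean tryInstall(ContentHandlerFactory factory) {
    try {
      URLConnection.setContentHandlerFactory(factory);
      return true;
    } catch (Error alreadyDefined) {
      return false;
    }
  }

  public static void main(String[] args) {
    // A factory that always returns null defers to the built-in handlers.
    ContentHandlerFactory deferToDefaults = new ContentHandlerFactory() {
      public ContentHandler createContentHandler(String mimeType) {
        return null;
      }
    };
    System.out.println("first attempt installed: " + tryInstall(deferToDefaults));
    System.out.println("second attempt installed: " + tryInstall(deferToDefaults));
  }
}
```

Because of this restriction, a single factory normally has to decide for every MIME type whether to return a custom handler or null.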

A Content Handler Example

This section presents an example of implementing a simple content handler. A bogus MIME type, text/cg, is created to implement objects of the character grid type. A character grid type is a two-dimensional grid made up of a single character. An example follows:

O   O
 O O
  O
 O O
O   O

This example is a character grid object that is five characters wide and five characters high. It uses the O character to draw the grid. The grid is specified by a boolean array that identifies how the drawing character is to be displayed.

This particular character grid is represented using the following text string:

55O1000101010001000101010001

The first character (5) represents the grid's height. The second character (also 5) represents the grid's width. The third character is the grid's drawing character. The remaining characters specify whether the draw character should be displayed at a particular grid position. A one (1) signifies that the draw character should be displayed and a zero (0) signifies that it should not be displayed. The array is arranged in row order beginning at the top of the grid.
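The encoding just described can be sketched in a few lines of plain Java, independent of any network classes. GridTextSketch, decode(), and render() are hypothetical names, and the length check is an added assumption rather than part of the format. Because height and width are single characters, the format as described only covers grids up to 9 by 9.

```java
public class GridTextSketch {
  // Decodes the text form: height digit, width digit, drawing character,
  // then height*width cells in row order ('0' = blank, anything else = drawn).
  public static boolean[][] decode(String text) {
    int height = text.charAt(0) - '0';
    int width = text.charAt(1) - '0';
    if (text.length() != 3 + height * width)
      throw new IllegalArgumentException(
          "Expected " + (3 + height * width) + " characters, got " + text.length());
    boolean[][] cells = new boolean[height][width];
    for (int i = 0; i < height; i++)
      for (int j = 0; j < width; j++)
        cells[i][j] = text.charAt(3 + i * width + j) != '0';
    return cells;
  }

  // Renders the grid as lines of text using the drawing character.
  public static String render(String text) {
    boolean[][] cells = decode(text);
    char ch = text.charAt(2);
    StringBuilder out = new StringBuilder();
    for (boolean[] row : cells) {
      for (boolean on : row) out.append(on ? ch : ' ');
      out.append('\n');
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Prints the X-shaped grid shown above.
    System.out.print(render("55O1000101010001000101010001"));
  }
}
```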

The definition of the CharGrid class is shown in Listing 33.1.

LISTING 33.1. THE SOURCE CODE FOR THE CharGrid CLASS.

public class CharGrid {
 public int height;
 public int width;
 public char ch;
 public boolean values[][];
 public CharGrid(int h,int w,char c,boolean vals[][]) {
  height = h;
  width = w;
  ch = c;
  values = vals;
 }
}

The GridContentHandler Class

The GridContentHandler class is used to extract CharGrid objects from an URLConnection. Its source code is shown in Listing 33.2.

LISTING 33.2. THE SOURCE CODE FOR THE GridContentHandler CLASS.

import java.net.*;
import java.io.*;
public class GridContentHandler extends ContentHandler {
 // Reads a text/cg stream and returns the CharGrid it describes.
 public Object getContent(URLConnection urlc) throws IOException {
  DataInputStream in = new DataInputStream(urlc.getInputStream());
  // Height and width are single ASCII digits; 48 is the code for '0'.
  int height = (int) in.readByte() - 48;
  int width = (int) in.readByte() - 48;
  char ch = (char) in.readByte();
  boolean values[][] = new boolean[height][width];
  // The cells follow in row order: '0' means blank, anything else means drawn.
  for(int i=0;i<height;++i) {
   for(int j=0;j<width;++j) {
    byte b = in.readByte();
    if(b == 48) values[i][j] = false;
    else values[i][j] = true;
   }
  }
  in.close();
  return new CharGrid(height,width,ch,values);
 }
}

The GridContentHandler class extends the ContentHandler class and provides a single method. The getContent() method takes an URLConnection object as a parameter and returns an object of the Object class. It also throws the IOException exception.

The getContent() method creates an object of class DataInputStream and assigns it to the in variable. It uses the getInputStream() method of the URLConnection class to access the input stream associated with an URL connection.

The height, width, and draw character of the CharGrid object are read one byte at a time from the input stream. The values array is read and converted to a boolean representation. A CharGrid object is then created from the extracted values and returned.
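A content handler of this kind can be exercised without a Web server by handing it a stub URLConnection whose input stream is backed by bytes in memory. The sketch below illustrates that idea under stated assumptions: OfflineHandlerSketch, BytesConnection, and CellsHandler are hypothetical names, CellsHandler is a compact stand-in that returns only the boolean cells, and the URL given to the stub is never contacted.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.ContentHandler;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class OfflineHandlerSketch {
  // Stub connection: serves a fixed byte array instead of using the network.
  static class BytesConnection extends URLConnection {
    private final byte[] data;
    BytesConnection(URL url, byte[] data) { super(url); this.data = data; }
    public void connect() { connected = true; }
    public InputStream getInputStream() { return new ByteArrayInputStream(data); }
  }

  // Stand-in handler that reads the same layout as the text/cg format.
  static class CellsHandler extends ContentHandler {
    public Object getContent(URLConnection urlc) throws IOException {
      DataInputStream in = new DataInputStream(urlc.getInputStream());
      try {
        int height = in.readByte() - '0';
        int width = in.readByte() - '0';
        in.readByte(); // drawing character, not needed for the cells
        boolean[][] cells = new boolean[height][width];
        for (int i = 0; i < height; i++)
          for (int j = 0; j < width; j++)
            cells[i][j] = in.readByte() != '0';
        return cells;
      } finally {
        in.close();
      }
    }
  }

  // Runs the handler directly against in-memory data.
  public static boolean[][] fetch(String gridText) throws IOException {
    URL url = URI.create("http://localhost/chargrid.cg").toURL();
    URLConnection conn =
        new BytesConnection(url, gridText.getBytes(StandardCharsets.US_ASCII));
    return (boolean[][]) new CellsHandler().getContent(conn);
  }

  public static void main(String[] args) throws IOException {
    for (boolean[] row : fetch("55O1000101010001000101010001")) {
      StringBuilder line = new StringBuilder();
      for (boolean on : row) line.append(on ? 'O' : ' ');
      System.out.println(line);
    }
  }
}
```

Calling getContent() on the handler directly bypasses the factory lookup, which keeps the test independent of any factory installed elsewhere in the program.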

The GetGridApp Program

The GetGridApp program illustrates the use of content handlers. It retrieves an object of the CharGrid type from my Web server. I use the NCSA HTTPD server on a Linux system. I've set up the server's MIME type file to recognize files with the .cg extension as text/cg.

The source code of the GetGridApp program is shown in Listing 33.3.

LISTING 33.3. THE SOURCE CODE FOR THE GetGridApp PROGRAM.

import java.net.*;
import java.io.*;
public class GetGridApp {
 public static void main(String args[]){
  try{
   GridFactory gridFactory = new GridFactory();
   URLConnection.setContentHandlerFactory(gridFactory);
   if(args.length!=1) error("Usage: java GetGridApp URL");
   System.out.println("Fetching URL: "+args[0]);
   URL url = new URL(args[0]);
   CharGrid cg = (CharGrid) url.getContent();
   System.out.println("height: "+cg.height);
   System.out.println("width: "+cg.width);
   System.out.println("char: "+cg.ch);
   for(int i=0;i<cg.height;++i) {
    for(int j=0;j<cg.width;++j) {
     if(cg.values[i][j]) System.out.print(cg.ch);
     else System.out.print(" ");
    }
    System.out.println();
   }
  }catch (MalformedURLException ex){
   error("Bad URL");
  }catch (IOException ex){
   error("IOException occurred.");
  }
 }
 public static void error(String s){
  System.out.println(s);
  System.exit(1);
 }
}
class GridFactory implements ContentHandlerFactory {
 public GridFactory() {
 }
 public ContentHandler createContentHandler(String mimeType) {
  if(mimeType.equals("text/cg")) {
   System.out.println("Requested mime type: "+mimeType);
   return new GridContentHandler();
  }
  return null;
 }
}

Compile CharGrid.java and GridContentHandler.java before compiling GetGridApp.java. When you invoke the GetGridApp program, provide it with the http://www.jaworski.com/java/chargrid.cg URL as a parameter.

The GetGridApp program's output is as follows:

java GetGridApp http://www.jaworski.com/java/chargrid.cg
Fetching URL: http://www.jaworski.com/java/chargrid.cg
Requested mime type: text/cg
height: 5
width: 5
char: j
jjjjj
  j
  j
j j
 jj

This connects to my Web server, retrieves the chargrid.cg file, extracts the CharGrid object contained in the file, and displays it in the console window. The character grid object displays a grid of j characters.

The main() method creates an object of the GridFactory class, which implements the ContentHandlerFactory interface. It then installs the object as the default content handler factory. An URL object is created using the URL string passed as the program's parameter. The getContent() method of the URL class is then used to extract the CharGrid object from the URL. The getContent() method causes the GridFactory object assigned to the gridFactory variable to be invoked to retrieve an appropriate content handler. An object of class GridContentHandler is returned and its getContent() method is invoked to extract the CharGrid object. This is performed behind the scenes as the result of invoking the URL class's getContent() method. The CharGrid object is then displayed.

The GetGridApp program defines the GridFactory class as a ContentHandlerFactory. It implements the createContentHandler() method and checks to see if the MIME type passed to it is text/cg. If it is not, the null value is returned to signal that the Java-supplied content handler should be used. If the MIME type is text/cg, the requested MIME type is displayed and a GridContentHandler object is returned.


TIP: Check your Web server's documentation if you want to learn how to set up your Web server to work with a new MIME type. Almost all Web servers provide the capability to define new MIME types. However, there is no common approach to doing this that works across all Web servers.

Using Protocol Handlers

Most popular Web browsers support protocols other than HTTP. These other protocols include FTP, gopher, email, and application-specific protocols. Support for these protocols is usually built into the browser, causing the browsers to become larger and slower to load.

Java supports additional protocols through the use of protocol handlers, also referred to as stream handlers. These protocol handlers are used to retrieve Web objects using application-specific protocols, which are specified in the URL referencing the object.

Protocol handlers are implemented as subclasses of the URLStreamHandler class. The URLStreamHandler class defines four access methods that can be overridden by its subclasses, but only the openConnection() method is required to be overridden.

The openConnection() method takes an URL with its assigned protocol as a parameter and returns an object of class URLConnection. The URLConnection object can then be used to create input and output streams and to access the resource addressed by the URL.

The parseURL() and setURL() methods are used to implement custom URL syntax parsing. The toExternalForm() method is used to convert an URL of the protocol type to a String object.

The purpose of a protocol handler is to implement a custom protocol needed to access Web objects identified by URLs that require the custom protocol. Protocol handlers, like content handlers, are not directly instantiated or accessed. The methods of the URLConnection object that is returned by a protocol handler are invoked to access the resource referenced by the protocol.

A protocol is identified beginning with the first character of the URL and continuing to the first colon (:) contained in the URL. For example, the protocol of the URL http://www.jaworski.com is http, and the protocol of the URL fortune://jaworski.com is fortune.
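The rule for identifying the protocol can be sketched with plain string handling. ProtocolSketch and protocolOf() are hypothetical names; the URL class performs its own parsing internally, so this is only an illustration of the rule, not a replacement for it.

```java
public class ProtocolSketch {
  // Returns everything before the first colon, which names the protocol.
  public static String protocolOf(String url) {
    int colon = url.indexOf(':');
    if (colon <= 0)
      throw new IllegalArgumentException("No protocol in: " + url);
    return url.substring(0, colon);
  }

  public static void main(String[] args) {
    System.out.println(protocolOf("http://www.jaworski.com"));  // prints "http"
    System.out.println(protocolOf("fortune://jaworski.com"));   // prints "fortune"
  }
}
```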

A protocol handler is associated with a specific protocol through the use of the URLStreamHandlerFactory interface. A class that implements the URLStreamHandlerFactory interface must implement the createURLStreamHandler() method. This method returns an URLStreamHandler object to be used for a specific protocol. An URLStreamHandlerFactory object is installed using the static setURLStreamHandlerFactory() method of the URL class.

Developing Protocol Handlers

The first step in implementing a protocol handler is to define it as a subclass of the URLStreamHandler class. The openConnection() method of the protocol handler creates an URLConnection object that can be used to access an URL designating the specified protocol.

A protocol handler is associated with a specific protocol type through the use of an URLStreamHandlerFactory object. The createURLStreamHandler() method of the URLStreamHandlerFactory interface is used to return a protocol handler for a specific protocol type.

The setURLStreamHandlerFactory() method of the URL class is used to set an URLStreamHandlerFactory as the default URLStreamHandlerFactory to be used with all protocol types.
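The Java API documentation notes that setURLStreamHandlerFactory() can be called at most once in a given Java Virtual Machine. Where a global factory is not wanted, a handler can instead be supplied for a single URL through the URL(URL context, String spec, URLStreamHandler handler) constructor. The following sketch illustrates that alternative with a handler that answers from memory rather than from the network. PerUrlHandlerSketch, CannedHandler, and firstLine() are hypothetical names, and recent JDK releases mark the URL constructors as deprecated, so treat this as an illustration rather than a recommended pattern.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.nio.charset.StandardCharsets;

public class PerUrlHandlerSketch {
  // Handler that replies with a fixed line instead of opening a socket.
  static class CannedHandler extends URLStreamHandler {
    protected URLConnection openConnection(URL url) {
      final byte[] reply = ("canned reply for host " + url.getHost() + "\n")
          .getBytes(StandardCharsets.US_ASCII);
      return new URLConnection(url) {
        public void connect() { connected = true; }
        public InputStream getInputStream() {
          return new ByteArrayInputStream(reply);
        }
      };
    }
  }

  // Opens the URL through the supplied handler and returns the first line.
  public static String firstLine(String spec) throws IOException {
    URL url = new URL(null, spec, new CannedHandler());
    BufferedReader in = new BufferedReader(
        new InputStreamReader(url.openStream(), StandardCharsets.US_ASCII));
    try {
      return in.readLine();
    } finally {
      in.close();
    }
  }

  public static void main(String[] args) throws IOException {
    // Prints "canned reply for host jaworski.com"
    System.out.println(firstLine("fortune://jaworski.com"));
  }
}
```

The handler is attached to that one URL object only; other URLs continue to use whatever handlers the runtime or an installed factory provides.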

A Protocol Handler Example

This section presents an example of implementing a simple protocol handler. My Web server comes with a CGI program, named fortune, that returns a fortune cookie-type message when the program's URL is accessed. This section will define the fortune protocol to access the fortune program on my Web server. The fortune protocol is not a real Internet protocol; I contrived it to illustrate the use of protocol handlers. The URL for the fortune protocol consists of fortune:// followed by the host name. For example, fortune://jaworski.com accesses the fortune protocol on my Web server.

The definition of the URLFortuneHandler class is shown in Listing 33.4.

LISTING 33.4. THE SOURCE CODE FOR THE URLFortuneHandler CLASS.

import java.net.*;
import java.io.*;
public class URLFortuneHandler extends URLStreamHandler {
 public URLConnection openConnection(URL url) throws IOException {
  String host=url.getHost();
  URL newURL = new URL("http://"+host+"/cgi-bin/fortune");
  return newURL.openConnection();
 }
}

The URLFortuneHandler class extends the URLStreamHandler class and provides a single method. The openConnection() method takes an URL object as a parameter and returns an object of the URLConnection class. It also throws the IOException exception.

The openConnection() method uses the getHost() method of the URL class to extract the host name contained in the URL. It then creates a new HTTP URL by concatenating http:// with the host name and the location of the fortune CGI program, /cgi-bin/fortune. The openConnection() method of the URL class is used to return the URLConnection object associated with the new URL.

The URLFortuneHandler class wraps the fortune CGI program using the fortune protocol. This protocol is implemented through an HTTP connection to the CGI program.

The GetFortuneApp Program

The GetFortuneApp program illustrates the use of protocol handlers. It accesses the fortune CGI program on my Web server using the fortune protocol. The source code of the GetFortuneApp program is shown in Listing 33.5. Be sure to compile URLFortuneHandler.java before compiling GetFortuneApp.java.

LISTING 33.5. THE SOURCE CODE FOR THE GetFortuneApp PROGRAM.

import java.net.*;
import java.io.*;
public class GetFortuneApp {
 public static void main(String args[]){
  try{
   FortuneFactory fortuneFactory = new FortuneFactory();
   URL.setURLStreamHandlerFactory(fortuneFactory);
   if(args.length!=1) error("Usage: java GetFortuneApp FortuneURL");
   System.out.println("Fetching URL: "+args[0]);
   URL url = new URL(args[0]);
   BufferedReader inStream = new BufferedReader(
    new InputStreamReader(url.openStream()));
   String line = "";
   while((line = inStream.readLine()) != null)
    System.out.println(line);
  }catch (MalformedURLException ex){
   error("Bad URL");
  }catch (IOException ex){
   error("IOException occurred.");
  }
 }
 public static void error(String s){
  System.out.println(s);
  System.exit(1);
 }
}
class FortuneFactory implements URLStreamHandlerFactory {
 public FortuneFactory() {
 }
 public URLStreamHandler createURLStreamHandler(String protocol) {
  if(protocol.equals("fortune")){
   System.out.println("Requested protocol: "+protocol);
   return new URLFortuneHandler();
  }
  return null;
 }
}

When you invoke the GetFortuneApp program, provide it with the fortune://jaworski.com URL as a parameter. The GetFortuneApp program's output is as follows (you will get a different fortune each time you execute the program):

java GetFortuneApp fortune://jaworski.com
Fetching URL: fortune://jaworski.com
Requested protocol: fortune
                     JACK AND THE BEANSTACK
                          by Mark Isaak
        Long ago, in a finite state far away, there lived a JOVIAL
character named Jack.  Jack and his relations were poor.  Often their
hash table was bare.  One day Jack's parent said to him, "Our matrices
are sparse.  You must go to the market to exchange our RAM for some
BASICs."  She compiled a linked list of items to retrieve and passed it
to him.
        So Jack set out.  But as he was walking along a Hamilton path,
he met the traveling salesman.
        "Whither dost thy flow chart take thou?" prompted the salesman
in high-level language.
        "I'm going to the market to exchange this RAM for some chips
and Apples," commented Jack.
        "I have a much better algorithm.  You needn't join a queue
there; I will swap your RAM for these magic kernels now."
        Jack made the trade, then backtracked to his house.  But when
he told his busy-waiting parent of the deal, she became so angry she
started thrashing.
        "Don't you even have any artificial intelligence?  All these
kernels together hardly make up one byte," and she popped them out the
window ...

GetFortuneApp connects to my Web server, invokes the fortune CGI program, and then displays the program's results. The source code of the fortune CGI program is available at http://www.jaworski.com/jdg/.

The main() method creates an object of the FortuneFactory class that implements the URLStreamHandlerFactory interface. It then installs the object as the default protocol handler factory. An URL object is created using the URL string passed as the program's parameter. The openStream() method of the URL class is then used to open an input stream to extract the information generated by accessing the URL via the fortune protocol. Resolving the fortune protocol causes the FortuneFactory object assigned to the fortuneFactory variable to be invoked to retrieve an appropriate protocol handler. An object of class URLFortuneHandler is returned and its openConnection() method is invoked to obtain the URLConnection object. This is performed behind the scenes as the result of creating the URL object and invoking its openStream() method. The information returned from accessing the URL is then displayed.

The GetFortuneApp program defines the FortuneFactory class as implementing the URLStreamHandlerFactory interface. It implements the createURLStreamHandler() method and checks to see if the protocol type passed to it is fortune. If it is not, the null value is returned to signal that the Java-supplied protocol handler should be used. If the protocol type is fortune, the requested protocol is displayed and an URLFortuneHandler object is returned.

Summary

In this chapter you learned how to write content handlers to support the retrieval of objects by Web browsers. You learned about the Multipurpose Internet Mail Extensions and how they are used to identify the type of objects that are provided by Web servers. You developed the GridContentHandler class and integrated it with the GetGridApp program.

You also learned how to write protocol handlers to access URLs via custom protocols. You developed the URLFortuneHandler and integrated it with the GetFortuneApp program. In the next chapter you will learn how to use the JavaMail API to develop electronic mail applications.



© Copyright 1998, Macmillan Publishing. All rights reserved.