Chapter 17 Network-Extensible Applications with Factory Objects

How Factories Work
Factory Support in the Java Library
Factory Object Implementation Considerations
Supporting a New Kind of Factory
Security Considerations
Summary

One of the most important extensibility features of the Java library is its use of factory objects. Factory objects permit the actual type of new objects to be determined at runtime, based on data or circumstances of the moment. Java did not originate the concept; other languages have factory objects or similar mechanisms. However, the Java library makes effective use of factories to make some of the core facilities flexible and extensible. Factory objects are crucial building blocks for making applications that can be extended dynamically using code from the network.

This chapter explains factory objects and how they work. You learn about the existing factories in the Java library and how to extend them. Because factories and the kinds of objects they create are often closely related, you learn about those objects and how to write new, specialized versions that the factories can use for special situations. You also learn how to build support for new factory objects into your own class libraries and applications, and how to recognize situations where it would be a good idea to use factories.

How Factories Work

While an application is running, it is constantly creating new objects of various types in order to accomplish its function. The number and size of the objects are often determined at runtime, in response to changing conditions and input data, but the type of each object is usually fixed when the program is written. When you write code to allocate a new object with a new operation, you choose a particular type for the object. A new operation does what it's told; it does not allocate whatever subclass seems most appropriate in a particular situation.

Sometimes, though, that's exactly what you need. Your program needs to adapt not only to the size and number of data items it's asked to deal with; it needs to adapt to the kind of data. For example, when asked to open a file, a URL, or a mail message, the program might need to vary its behavior depending on the particular type of data found in each of those entities. It's no fun to have to write a big multiway switch to choose what sort of object to allocate, and it's not a particularly good idea, either. It would be better if that knowledge were located in one place so that applications could share it. It would also be good if knowledge about new types and data formats could be added without having to rebuild all the applications that need the new support.

Fortunately, factory objects provide just the mechanism needed to handle such situations. In some situations, instead of allocating a particular type of object explicitly with new, you request that another object allocate the object for you. That other object, the factory object, looks at the current situation and decides on a specific class of object that fits the bill. The factory allocates the new object (after loading the class, if necessary) and returns it to you.

Of course, each kind of object that the factory can return must be a valid subtype of the nominal type that the factory object returns (that is, they must all have a common superclass or implement a common interface). Somewhere in the code for the factory object, there probably will be a large multiway switch that chooses the appropriate class for each situation. The factory might also read a configuration file or fetch configuration information from a URL so that the initial list of classes can be extended without modifying any of the factory's code. The knowledge and messiness is encapsulated in one place, however, and the benefits (flexibility and extensibility) are enjoyed by all the code that uses the factory.
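The idea described above can be sketched in a few lines. This is only an illustration, not code from the Java library: DocumentReader, TextReader, HtmlReader, and DocumentReaderFactory are hypothetical names, and the choice is made from a file name simply because that is easy to show.

```java
import java.util.Locale;

/** Common supertype: every object the factory returns implements this. */
interface DocumentReader {
    String describe();
}

class TextReader implements DocumentReader {
    public String describe() { return "plain text reader"; }
}

class HtmlReader implements DocumentReader {
    public String describe() { return "HTML reader"; }
}

/** Hypothetical factory: picks a DocumentReader subtype from a file name. */
class DocumentReaderFactory {
    DocumentReader createReader(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        // The multiway choice lives here, in one place.
        if (name.endsWith(".html") || name.endsWith(".htm")) {
            return new HtmlReader();
        }
        if (name.endsWith(".txt")) {
            return new TextReader();
        }
        return null; // no suitable reader is known
    }
}
```

Callers see only the DocumentReader type; adding support for a new format means changing the factory, not every caller.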

An Example from the Java Library

There are a few other complications, though. To get a clearer idea of how a factory object really works, let's take a look at how a factory object is used in a common Java operation: creating a URL object.

You start by allocating a URL object pointing to a particular Web page:

URL doc = new URL("http://www.utdallas.edu/~glv/");

The first thing to notice is that you really do just allocate the object with a new expression. The application code doesn't call the factory object directly; usually that's done in a library class. The URL class actually calls the factory for you. This helps keep the library interface simple: It's best if application programmers can always use new to allocate objects, rather than having to remember the cases where factory objects need to be involved. Keeping the interface simple also helps promote consistency. If application programmers have to call the factory object themselves when they need a URL object, some will surely forget, and their applications won't have the easy extensibility that most Java applications should.

The URL constructor takes the URL you give it and parses it into its various parts: protocol, host, and so on. At that point, there's little more that it can do. All the processing that is common to most different kinds of URLs has been done, and the rest depends on the protocol involved. This is where the factory comes in. The URL class contains the following instance variable:

URLStreamHandler handler;

The handler variable refers to the object that does most of the real work involved with the URL, and that's the one that gets allocated by the factory. The URL constructor initializes it this way:

if ((handler = getURLStreamHandler(protocol)) == null) {
    throw new MalformedURLException("unknown protocol: " + protocol);
}

The getURLStreamHandler method is a static method in the URL class, and it contains the call to the factory object:

URLStreamHandler handler;
if (factory != null) {
    handler = factory.createURLStreamHandler(protocol);
}

The factory is stored in a class variable, factory, which is set by the application. The factory decides what kind of handler is needed for the protocol, allocates the appropriate object, and returns it. What does getURLStreamHandler do if factory is null? Here's what happens next:

// Try java protocol handler
if (handler == null) {
    try {
        String clname = "sun.net.www.protocol." + protocol + ".Handler";
        handler = (URLStreamHandler) Class.forName(clname).newInstance();
    } catch (Exception e) {
    }
}

Notice that getURLStreamHandler does a part of the factory object's job itself. In fact, it implements a sort of fallback factory: If the application has not supplied a factory for URLStreamHandler objects, or if the factory cannot supply an appropriate object, this method within the URL class can do a minimal job. It looks for a library class with a conventional name, and if it finds one that matches, that class is assumed to be the handler. This might fail, but it should fail in a reasonable way. If the class doesn't exist, the Class.forName(clname) call throws an exception. If the class isn't a valid subtype of URLStreamHandler, the cast will fail.

The URL class does this because it's intended as a general-purpose library class. Policies about how to discover and locate protocol handlers are left to the factory, which is a part of an individual application, so that the application authors can make the important decisions about how to configure and extend the application's capabilities. Probably a few good URLStreamHandlerFactory implementations will appear and will be shared by most applications, but a full-fledged, configurable factory implementation really isn't appropriate for the standard Java library. On the other hand, some minimal level of functionality is essential for the library, and the getURLStreamHandler fallback code provides it. (The fallback could have been implemented as a default factory object that could be replaced by the application, but the security model gets in the way. See the section "Security Considerations" later in this chapter for more information.)

There's one other complication that should be mentioned now. Because the type of handler is entirely dependent on the protocol, the URL class maintains a cache of handlers for different protocols and reuses them to avoid having to call the factory for each new URL. The URLStreamHandler class, as well as the interface between the handlers and the URL class, has been carefully designed so that a single handler instance can handle multiple URLs simultaneously. Alternative strategies might have involved calling the clone method (or getClass().newInstance()) on an existing handler of the appropriate type or simply calling the factory again for a new instance.

Here's the complete code for the getURLStreamHandler method, so you can see the entire picture:

/**
 * A table of protocol handlers.
 */
static Hashtable handlers = new Hashtable();

/**
 * Gets the Stream Handler.
 * @param protocol the protocol to use
 */
static synchronized URLStreamHandler getURLStreamHandler(String protocol) {
    URLStreamHandler handler = (URLStreamHandler) handlers.get(protocol);
    if (handler == null) {
        // Use the factory (if any)
        if (factory != null) {
            handler = factory.createURLStreamHandler(protocol);
        }

        // Try java protocol handler
        if (handler == null) {
            try {
                String clname = "sun.net.www.protocol." + protocol + ".Handler";
                handler = (URLStreamHandler) Class.forName(clname).newInstance();
            } catch (Exception e) {
            }
        }
        if (handler != null) {
            handlers.put(protocol, handler);
        }
    }
    return handler;
}

Factory Support in the Java Library

The example of the URLStreamHandler is typical of the support for other types of factories in the Java library: The support is there to use the factories if they are supplied by the application, and a simple fallback is implemented within the library code; but no real factory objects are supplied. Java supports extensibility wherever it can, but it leaves specific policies (such as where to search for extensions and how to find the right one) up to the applications. This section explains the details of the factory support found in the Java library and how to implement specialized factories that the library can use. As you will see, factories and the objects they return (handlers or implementations) are closely related, so this section provides insight into how to write both the factories and the handler objects. Each factory and handler combination works a little differently.

The Java library knows about three kinds of factories:

SocketImplFactory
URLStreamHandlerFactory
ContentHandlerFactory

The Socket Implementation Factory

Socket implementations are used internally to the Socket class to provide the basic socket functionality. In general, instances of Socket and ServerSocket just pass operations on to their internal SocketImpl object, which does all the work. There is only one supplied socket implementation, PlainSocketImpl, that does conventional socket handling. Additional socket implementations can be written to handle firewalls and other situations in which sockets must use a proxy server to access certain machines.

There are hooks for two separate SocketImplFactory objects: One is in the Socket class (set with the setSocketImplFactory method), and the other is in ServerSocket (set with setSocketFactory). Typically, both factories are instances of the same class, or even the same object, but they don't have to be. Both Socket and ServerSocket call their factory's createSocketImpl method in their constructors. That method takes no parameters, so it cannot choose among socket implementation types based on the parameters of the socket. Instead, it returns the same kind for every socket, based on application configuration information. In keeping with this, the fallback code (for the case where no factory has been installed) is also simple: It always creates a new PlainSocketImpl.

When writing a SocketImplFactory object, you don't need to build a lot of intelligence into the factory. It is probably best simply to allow users to provide configuration information that specifies what kind of socket implementation should be used. You can supply a couple of socket implementations to handle common cases and provide a configuration dialog or some other mechanism for selecting one of those. Because all users within a particular site will probably require the same configuration, this also is a good case for permitting site administrators to supply a global configuration. That way, individual users don't have to be bothered with knowing what kind of firewall they have, and you don't have to be bothered with configuring your application on an individual basis. Finally, don't restrict users to just the socket implementations you supply. Allow them to specify an arbitrary class. That way, sites with unusual firewall policies (or vendors of new firewall software) will be able to write their own SocketImpl classes, and they won't be excluded from using your application.

The URL Stream Handler Factory

The first part of a URL, up to the colon, is called the URL's scheme or protocol. Common URL protocols include HTTP, FTP, and news. Each protocol is implemented by a subclass of URLStreamHandler. The URL class chooses which handler to use for each URL by calling the URL stream handler factory.

There is a close relationship between URL stream handlers and several other classes. Internally, a URL stream handler uses a specialized version of the URLConnection class, usually implemented to accompany a particular implementation of URLStreamHandler. The URLConnection class, in turn, contains the third factory object supported by the library: the content handler factory. URLConnection objects are responsible for determining the type of the URL's data and passing that content type to the content handler factory.

The division of responsibility between all these classes seems complicated at first, but it is really quite easy to grasp. The URL object provides a handle and an abstract representation of the URL so that an application doesn't have to keep track of all of the various pieces all the time. Internally, it uses a subclass of URLStreamHandler to hold the protocol-specific information about the URL. The stream handler knows how to parse the rest of the URL: many URL schemes use the same syntax for the part after the colon, but some have specialized syntaxes, and the stream handler is responsible for understanding that part of the URL syntax. Beyond the URL syntax, however, the URL stream handler doesn't actually know much. It knows about a companion class (a subclass of URLConnection), which is responsible for implementing the protocol. The associated URLConnection object knows how to open a connection based on the URL information, handle the actual protocol operations to retrieve or deliver a document (or several), and close the connection.

After that is done, the URLConnection object has gained access to two things: the document of interest and some information about that document. After learning what kind of document it has (the media type), the URLConnection can either provide an I/O stream to the raw document data or create a ContentHandler object to convert the document into an object representation.

This seems complicated, but there are a lot of separate issues to deal with, and the Java library designers were right not to get them confused. Each separate concept is represented by one class or family of classes:

Basic URL abstraction	`URL`
Protocol-specific URL syntax	`URLStreamHandler`
Protocol handling	`URLConnection`
Data format handling	`ContentHandler`

The first of these four concepts is independent of protocol or data format, and so it is represented by one class. The second and third are both tied to particular protocols, so appropriate classes are chosen by a factory object (the factory chooses the URLStreamHandler, which chooses the matching URLConnection). The last is tied to specific data formats, but is independent of protocol, so classes to fit data formats are chosen by a different factory object.

What this means for writers of specialized URLStreamHandler subclasses is that, in general, each URLStreamHandler needs to have a companion URLConnection class. For example, Sun's classes, which are used to implement the applet viewer, include the following two classes:

sun.net.www.protocol.http.Handler
sun.net.www.protocol.http.HttpURLConnection

Make sure your stream handler and connection classes maintain the correct division of responsibility. The stream handler should understand the URL syntax and choose the connection class, with the mechanics of the protocol being left to the connection class.

Besides the workings of the protocol, the connection class has one other important responsibility: calling the content handler factory when necessary. To do that, the connection object must first determine the type of the document being retrieved. That's a protocol-specific matter, but some protocols, such as FTP, don't really provide that information, so the connection object must guess. Fortunately, the abstract URLConnection class provides two utility methods that can be useful for guessing a file's type from its name or content. The guessContentTypeFromName method takes the file's name as a parameter and attempts to intuit the type from the file's extension using a built-in table. Given a file with the extension "jpg," for example, it would assume that the file was a JPEG image file. The other method, guessContentTypeFromStream, takes the data input stream as a parameter and inspects the first few bytes for characteristic patterns that identify different file types.
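As a rough illustration of the two guessing methods just described, a connection class might try the file name first and fall back to the leading bytes. ContentTypeGuesser and its guess method are hypothetical names introduced for this sketch; the two URLConnection calls are the library methods named above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;

/** Hypothetical helper a connection class might use when the protocol reports no type. */
class ContentTypeGuesser {
    static String guess(String fileName, byte[] firstBytes) {
        // First try the file extension against the library's built-in table.
        String type = URLConnection.guessContentTypeFromName(fileName);
        if (type != null) {
            return type;
        }
        // Then inspect the leading bytes; the stream must support mark/reset,
        // which ByteArrayInputStream does.
        try (InputStream in = new ByteArrayInputStream(firstBytes)) {
            return URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Either method can return null when it recognizes nothing, so a real connection class still needs a default for that case.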

All this might be useful information for writing URL stream handlers and connection classes, but what about the factory object itself? That might be one of the easiest parts. You certainly want to give it built-in knowledge of the stream handlers you supply with your application and the URL scheme identifiers that match them. Permitting users to configure the factory and add new handlers is also a good idea. Finally, it might be a good idea to write your URLStreamHandlerFactory to consult a network-based registry and load stream handlers from the network under certain conditions. Most of these issues aren't peculiar to URL stream handlers and are discussed in the section "Factory Object Implementation Considerations," later in this chapter.

Once you have written your factory, install it in the URL class using the URL.setURLStreamHandlerFactory method.
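The pieces discussed in this section can be sketched together as follows. Everything here is an assumption made for illustration: the "echo" scheme is invented, and EchoURLConnection, EchoHandler, EchoHandlerFactory, and EchoDemo are hypothetical classes. The toy protocol involves no network at all; it simply returns the URL's path as the document.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;
import java.nio.charset.StandardCharsets;

/** Connection class: carries out the (toy) protocol. */
class EchoURLConnection extends URLConnection {
    EchoURLConnection(URL url) { super(url); }

    @Override public void connect() { connected = true; }

    @Override public InputStream getInputStream() throws IOException {
        connect();
        // The "document" is just the path part of the URL.
        return new ByteArrayInputStream(url.getPath().getBytes(StandardCharsets.UTF_8));
    }
}

/** Stream handler: keeps the default URL parsing and chooses the connection class. */
class EchoHandler extends URLStreamHandler {
    @Override protected URLConnection openConnection(URL u) {
        return new EchoURLConnection(u);
    }
}

/** Factory with built-in knowledge of one scheme; null means "not one of mine". */
class EchoHandlerFactory implements URLStreamHandlerFactory {
    @Override public URLStreamHandler createURLStreamHandler(String protocol) {
        return "echo".equals(protocol) ? new EchoHandler() : null;
    }
}

class EchoDemo {
    /** Installs the factory; the URL class accepts a factory only once per JVM. */
    static void install() {
        URL.setURLStreamHandlerFactory(new EchoHandlerFactory());
    }

    /** Opens the URL through whatever handler the URL class selects and reads it fully. */
    static String read(String spec) {
        try (InputStream in = URI.create(spec).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

After EchoDemo.install(), application code needs no knowledge of the factory: it builds an ordinary URL for an echo address and reads from it. For schemes the factory does not recognize it returns null, which lets the library's own fallback take over.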

The Content Handler Factory

Content handlers are associated with URL connections, and they are responsible for interpreting different data formats. Data formats are identified by MIME media type names. Common formats include the following:

text/plain	A plain text document
text/html	An HTML document
image/gif	A GIF image
image/jpeg	A JPEG image
audio/basic	A µ-law (.au) format audio file
audio/wav	A WAV audio file
video/quicktime	A QuickTime video clip
video/mpeg	An MPEG video clip
multipart/mixed	A container with multiple subparts
application/pdf	An Adobe PDF file
application/postscript	A PostScript document
application/vrml	A VRML scene description

A content handler must be able to understand its format and generate an appropriate Java object representing that format. For example, a content handler designed for image/gif might build an instance of the java.awt.Image class. ContentHandler is an abstract class, with one method: getContent. With a URLConnection object as a parameter, the method reads the data from the connection, builds the appropriate object representation, and returns it. When writing a content handler, it might be helpful to read Chapter 5, "Building Special-Purpose I/O Classes." It contains a discussion about building classes that provide an object representation of structured data.

The content handler factory is found within the URLConnection class and can be set using the URLConnection.setContentHandlerFactory method. URLConnection has its own getContent method, just like ContentHandler. When URLConnection.getContent() is called, the URLConnection object queries the content handler factory for the right handler and calls the handler's getContent method, with this as the parameter. If no content handler factory is installed, the fallback code looks for a built-in handler class using the content type. For example, if the content type is text/html, the default URLConnection implementation looks for the class sun.net.www.content.text.html. If it doesn't find a class with that name, it will simply return null. The JDK doesn't come with such a class, but the HotJava application does, and any application is free to provide content handler classes of its own.
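A content handler and its factory can be quite small. The following is a sketch under stated assumptions rather than a real implementation: PlainTextHandler, SimpleContentHandlerFactory, InMemoryConnection, and ContentDemo are hypothetical, and the in-memory connection exists only so that the handler can be exercised without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.ContentHandler;
import java.net.ContentHandlerFactory;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

/** Handler: turns a text/plain document into a String. */
class PlainTextHandler extends ContentHandler {
    @Override public Object getContent(URLConnection connection) throws IOException {
        try (InputStream in = connection.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}

/** Factory: maps a MIME type to a handler, or returns null when it has none. */
class SimpleContentHandlerFactory implements ContentHandlerFactory {
    @Override public ContentHandler createContentHandler(String mimeType) {
        return "text/plain".equals(mimeType) ? new PlainTextHandler() : null;
    }
}

/** Stand-in connection holding its document in memory (no real URL is needed here). */
class InMemoryConnection extends URLConnection {
    private final String type;
    private final byte[] data;

    InMemoryConnection(String type, String text) {
        super(null);
        this.type = type;
        this.data = text.getBytes(StandardCharsets.UTF_8);
    }

    @Override public void connect() { connected = true; }
    @Override public String getContentType() { return type; }
    @Override public InputStream getInputStream() { return new ByteArrayInputStream(data); }
}

class ContentDemo {
    /** Mimics what a connection does: ask the factory, then ask the handler. */
    static Object contentOf(ContentHandlerFactory factory, String type, String text) {
        URLConnection conn = new InMemoryConnection(type, text);
        ContentHandler handler = factory.createContentHandler(conn.getContentType());
        if (handler == null) {
            return null;
        }
        try {
            return handler.getContent(conn);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In an application, the factory would be installed once with URLConnection.setContentHandlerFactory, and callers would simply invoke getContent on a connection; ContentDemo only spells out the two steps that happen behind that call.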

Factory Object Implementation Considerations

Unlike the handler or implementation objects that they generate, factory objects have a lot in common. They may be very simple, like SocketImplFactory, or more complex, like ContentHandlerFactory, but they are all essentially the same: Based on the current data or circumstances, factory objects figure out which specialized type of object is best equipped to deal with the situation. Factories either have a built-in knowledge of the alternatives or know where to go to find out about them. The only difficult issues involved in writing a factory object are the strategies used to discover the right answer and how much trouble the factory goes to before it gives up.

The simplest strategy for building a factory object is to hard-code knowledge of several classes into the factory. At runtime, a simple table lookup will suffice to determine whether the situation is one that the application understands. If it is, the factory can allocate and return an instance of the appropriate class. Only slightly more complex is the strategy used by the built-in fallback code in the Java library: searching for a class with a conventional name via the CLASSPATH. The first approach might suffice in some very simple situations, but if there is no possibility of extending or configuring the factory's knowledge, it's hardly worth having a factory at all. The second approach at least enables users and site administrators to configure and extend the system in useful ways. It's still a very limited approach, however, because it requires all the handlers to be in the same package.

Another approach, which is easier and more flexible for users, is to provide a configuration file (or, better yet, a configuration dialog) so that new handlers or implementations can be used without having to be placed in the same package as the built-in handlers. They still have to conform to the appropriate Java type rules and so on, so this won't violate any language assumptions. It just makes things flexible for the people using the application.
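One way to picture the configuration-file approach is a factory driven by key = class-name pairs. This is a minimal sketch: ConfiguredFactory is a hypothetical class, and the use of java.util.Properties as the configuration format is an assumption, not something the Java library prescribes.

```java
import java.util.Properties;

/** Hypothetical factory configured from key = class-name pairs. */
class ConfiguredFactory<T> {
    private final Class<T> baseType;
    private final Properties config;

    ConfiguredFactory(Class<T> baseType, Properties config) {
        this.baseType = baseType;
        this.config = config;
    }

    /** Returns a new instance for the key, or null if none is configured or usable. */
    T create(String key) {
        String className = config.getProperty(key);
        if (className == null) {
            return null;
        }
        try {
            // asSubclass enforces the type rule before anything is instantiated.
            return Class.forName(className)
                        .asSubclass(baseType)
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            return null; // missing class, wrong type, or no usable constructor
        }
    }
}
```

A handler named in the configuration can live in any package on the class path, which is the flexibility the paragraph above asks for. A configured class of the wrong type is rejected rather than instantiated.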

The problem with all of those approaches is that they assume that the users or site administrators will be looking for new handlers, fetching them, and taking the trouble to install them correctly. It would be much better to take advantage of Java's networking and security capabilities to do all that work within the factory object. One way of doing that would be to rely on a central online registry that could be queried over the network, returning a URL which could be used to retrieve the appropriate class. Your company could maintain such a registry, collecting information about appropriate handlers as they became available. In the case of handlers written by you or others in your organization, your customers can access the extra functionality immediately, without having to install any additional software. If users or other vendors write handlers to support formats or protocols that they are interested in, you can make sure that other users also get the advantage of those classes. You and your users can save a lot of trouble and expense.

I anticipate that public, general-use registries will appear for classes such as URLStreamHandler and ContentHandler. Your applications can simply make use of those, and you will need only to maintain registries for classes that are tied closely to your own application.

There's one more approach to finding applicable classes. It's a little trickier than the others, but it can yield huge benefits in flexibility and usefulness. If your application is one that works with data from the network (as Web browsers do, for example), you can allow data providers to supply handlers themselves. You simply need to document where to put classes in relation to the data and how to name them, and then look for them according to those rules when other strategies fail.

Two examples might help to illustrate this idea. First, imagine an application fetching a document via an HTTP URL and finding that it has content type image/spiffy. After trying other strategies to find a handler, the content handler factory might return to the site from which the unrecognized document was fetched to try to find a handler there. It might look for a class called ContentHandler_image_spiffy, first in the same directory where the document was found, and then perhaps in another directory at that site with a conventional name, perhaps classes/content_handlers.

As a second example, imagine a Web browser that has just fetched an HTML document from www.foo.com. One section of that HTML document contains the following link:

<a href="mirror://www.bar.com/foomobile.gif">A terrific picture of a Foomobile!</a>

It sounds good, but what about this "mirror" protocol? As humans, we might guess from the name that it's a new protocol designed to enable a document to be mirrored at several places around the Internet, but that doesn't help the program to know what to do with it (and we might be wrong, in any case). The URL stream handler factory could pursue a strategy similar to the one in the previous example to find a handler for the mirror protocol, looking first in the same directory where the document containing the link was found and then in a special directory at the same site.
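The search order in the first example can be sketched as a small helper that only computes where to look. HandlerLocator is a hypothetical class, the ".class" suffix and the exact character mapping are assumptions, and nothing is downloaded or loaded here; fetching and trusting such a class is a separate matter.

```java
import java.net.URI;
import java.util.List;

/** Hypothetical helper: computes where a site-supplied handler class might be found. */
class HandlerLocator {
    /** Builds the conventional class name for a MIME type such as image/spiffy. */
    static String contentHandlerClassName(String mimeType) {
        // Characters that cannot appear in a class name become underscores.
        return "ContentHandler_" + mimeType.replaceAll("[^A-Za-z0-9]", "_");
    }

    /** Candidate locations: the document's own directory, then a site-wide directory. */
    static List<URI> candidates(URI document, String className) {
        String file = className + ".class";
        return List.of(
                document.resolve(file),
                document.resolve("/classes/content_handlers/" + file));
    }
}
```

A stream handler factory looking for the mirror protocol could reuse the candidates method with its own naming rule, resolving against the URL of the document that contained the link.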

If you write a factory object that fetches new classes from the network, it's important to bear two things in mind. It is helpful to cache those downloaded classes locally so that they don't have to be fetched anew for each use. You can then treat the cache as a private, local registry of the type described previously. The other thing to remember is that, for such features to be useful to your users and the people who provide useful data on the network, they have to be well documented and reasonably well known.

There are also security considerations involved when writing factory objects and handlers, no matter what strategy you implement for finding applicable handlers. They are addressed in the section "Security Considerations," later in this chapter.

Supporting a New Kind of Factory

The classes that use factory objects are, in some ways, more interesting and important than factory objects themselves. The Java library includes four such classes (Socket, ServerSocket, URL, and URLConnection), but that doesn't mean you won't find a reason to build another, whether you are building a complete application or a reusable library. How do you know when a new kind of factory is what you want, and how do you add the support for it?

When Are Factories Useful?

In general, you need a factory object when you want to provide an implementation abstraction: an abstract interface to some functionality that hides the details of shifting implementations that might be required.

More specifically, you should use a factory object when three conditions apply:

You want to provide a uniform, high-level interface to a concept, which can have multiple underlying implementations.
The choice of a particular implementation depends on data, environment, or other runtime circumstances.
You cannot know in advance the entire set of implementations and the circumstances that might call for them.

There are many concepts that might be good candidates for implementation abstractions. Here are just a few examples:

Data types in a spreadsheet or database
Specialized notations (for example, mathematical equations or chemical diagrams) in a technical word processor
Alert handlers in a network management system
Graphical user interface widgets, such as the elements used to provide fill-out forms in HTML documents
Special-effect "plug-ins" in an image processing program
Reusable tool components, such as spelling checkers, searching engines, printer drivers, and toolbars

There are probably many other situations that could be added.

Each of the three kinds of factory objects supported by the Java library meets these three conditions. There are high-level classes that let application code ignore the details of multiple underlying implementations. The choice of implementation depends on the runtime situation: either data (in the case of URL stream handlers and content handlers) or environment (in the case of socket implementations). There is no way to predict what new kinds of protocols, data types, and firewall proxy interfaces might be developed.

Tips for Building Implementation Abstractions

There are a few rules of thumb you should follow when building implementation abstractions and the factory objects and handlers that make them work:

Hide the use of the factory in a library class
Don't try to force the handlers to be completely uniform
Supply an example factory, but put fallback code in the library class, in case there's no factory at all
Keep a cache of handlers
Remember security issues

Make sure that the application-level class (the class that provides the implementation abstraction itself) takes care of the factory behind the scenes. Don't leave it to the application to call the factory when necessary; make the abstraction cleaner by hiding that detail.

On the other hand, don't take the abstraction too far. Sometimes it's not possible to hide every detail. In the Java library, socket implementations represent one extreme where the abstraction is nearly perfect, and the application using the code doesn't really have to know anything about underlying mechanisms. URL stream handlers also can hide nearly every detail. Content handlers, however, can hide things only up to a point. Then, they must return an object that provides a runtime representation of the document they were asked to interpret. That object might be a String containing a plain text document, an Image, an AudioClip, or an instance of a class that can format and display a structured document (such as an HTML page). The application must know what to do with those objects, and a new content handler might return a type of object that the application can't handle. In spite of those complications, it's still a worthwhile abstraction and hides a lot of messy details.

(Perhaps you think that, with a little care, the content handler abstraction could have been more thorough, to the point of providing a unified interface to data objects so that an application could handle them all in a uniform way. If so, that's terrific! I can't wait to see the result.)

If you have a fairly good idea of how to go about finding the appropriate handler class for the situation, go ahead and supply a sample implementation of the factory so that applications can use it if possible. Follow the example of the Java library implementors, however, and provide some fallback behavior in the primary class, which is used if no factory is installed. It can be very simple, but it helps to make your classes robust. It will work for the most common cases in the event that the application writers forget to install a factory, or in circumstances where the factory cannot do its job.

Keep a cache of handlers. If you're loading handlers from the network, this is vitally important, but it's useful even if all the handlers are available locally. If you implement your abstraction so that the same handler instance can handle multiple cases of the abstraction, you can cache not only the handler classes, but the handler instances themselves. The Java library does this for URL stream handlers and content handlers. Given the simple fallback strategy that the library classes use in the absence of real factory objects, the efficiency benefits of the approach might not be obvious to you at first, but a real factory object might have to do a lot of work to find the right handler, including several disk and network accesses. A simple cache of previously used handler instances can avoid all that work in most cases. Implementing the cache in the abstraction class also means that it only has to be done once, and all factory object implementations gain the benefit automatically. The source code for the URL.getURLStreamHandler method listed near the beginning of this chapter is a good example of how to implement a simple and effective handler cache.
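Several of these tips (hiding the factory, providing a fallback, and caching handler instances) fit together in one small abstraction class. The sketch below is modeled loosely on the getURLStreamHandler listing, but the names Alert, AlertHandler, and AlertHandlerFactory are hypothetical, and the factory setter is deliberately left unguarded to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

/** Handler supertype for a hypothetical "alert" abstraction. */
interface AlertHandler {
    String handle(String message);
}

/** Factory interface that an application may implement and install. */
interface AlertHandlerFactory {
    AlertHandler createAlertHandler(String kind);
}

/** The application-level class: callers simply write new Alert(...). */
class Alert {
    private static AlertHandlerFactory factory; // optional; may stay null
    private static final Map<String, AlertHandler> cache = new HashMap<>();

    private final AlertHandler handler;
    private final String message;

    Alert(String kind, String message) {
        this.handler = handlerFor(kind); // the factory is consulted behind the scenes
        this.message = message;
    }

    String deliver() {
        return handler.handle(message);
    }

    static synchronized void setFactory(AlertHandlerFactory f) {
        factory = f;
    }

    private static synchronized AlertHandler handlerFor(String kind) {
        AlertHandler h = cache.get(kind);
        if (h == null) {
            if (factory != null) {
                h = factory.createAlertHandler(kind); // ask the factory first
            }
            if (h == null) {
                h = msg -> "[" + kind + "] " + msg;   // minimal built-in fallback
            }
            cache.put(kind, h);                       // reuse the instance afterwards
        }
        return h;
    }
}
```

Because one handler instance serves every Alert of the same kind, handlers written for this design must not keep per-alert state. Note also that a fallback handler, once cached, stays in use for that kind even if a factory is installed later.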

Most importantly, pay attention to the security issues surrounding your factory. There are security implications even if you only load handlers from trusted sources.

Security Considerations

When writing a factory object or an implementation abstraction that uses a factory, you need to be aware of the security implications.

First, you need to protect the factory itself. If untrusted code (possibly an applet in an unrelated part of an application) can install an untrusted class as a factory object, it can garble data, transparently substitute data from a completely different source, or even steal outgoing data. The user of the application will never be the wiser. The way you protect against this is by consulting the application security manager when installing a factory object. The primary object of your implementation abstraction will probably provide a method for installing a factory, much as the URL class provides setURLStreamHandlerFactory. To protect against sabotage, that method includes this code:

    if (factory != null) {
        throw new Error("factory already defined");
    }
    SecurityManager security = System.getSecurityManager();
    if (security != null) {
        security.checkSetFactory();
    }

The first part provides a little additional security by permitting the factory to be set only once. You may choose to forgo that precaution, but you really should only need one factory object anyway (at least after you've finished debugging the factory). The second part is the important part. If a security manager is installed, the call to security.checkSetFactory gives the security manager the chance to see whether the new factory object is being supplied by trusted code. If not, the security manager will throw a SecurityException, and the statement that follows this code in the method (the one that actually installs the factory object) will never be reached.

Additionally, you need to ensure that the factory cannot be set directly, bypassing the method that performs the security checks. The variable that holds the factory object and the method used to set it should be static so that there is only one factory and so that the method cannot be overridden in a subclass. The variable should also be private so that a subclass cannot access it directly. Depending on the circumstances, you may also wish to make the class final so that there can be no subclasses (that will not always be practical, however).
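Putting those pieces together, a protected factory holder could be sketched as follows. CodecFactory and Codecs are hypothetical names used only for illustration. Note also that SecurityManager has been deprecated for removal in recent Java releases (since Java 17), so the check compiles with warnings there and has an effect only when a security manager is actually installed.

```java
/** Hypothetical factory interface, invented for this sketch. */
interface CodecFactory {
    Object createCodec(String name);
}

/** Final class: no subclass can get near the factory variable. */
final class Codecs {
    // Private and static: exactly one factory, reachable only through the setter.
    private static CodecFactory factory;

    private Codecs() {}

    @SuppressWarnings({"removal", "deprecation"})
    public static synchronized void setCodecFactory(CodecFactory fac) {
        if (factory != null) {
            throw new Error("factory already defined");   // settable only once
        }
        SecurityManager security = System.getSecurityManager();
        if (security != null) {
            security.checkSetFactory();   // throws SecurityException for untrusted callers
        }
        factory = fac;                    // reached only if both checks pass
    }

    static synchronized boolean hasFactory() {
        return factory != null;
    }
}
```

The synchronized modifier is an addition of this sketch; it keeps two threads from both passing the "already defined" test at the same moment.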

If the abstraction you are providing is a means for accessing resources that could be abused by malicious code, you need to ensure that those resources are protected. You have a choice: Should the security checks be done in the high-level class or in the handlers?

One thing to consider is that you might, at some point, be loading handlers from untrustworthy sources. It's actually rather unlikely in this case, because code that provides access to security-sensitive resources usually requires using native methods, which you shouldn't be loading from the network. Even if all your handlers will be trusted, it's still a good idea to put the security checks in the high-level class. Security issues are complicated, and it's best to think the issues through once, carefully, and build the necessary checks in from the start. Otherwise, it would be too easy for a new handler to open a security hole accidentally because the programmer forgot an important security check. If every programmer writing a handler has to think about all those issues, such a security hole is actually pretty likely. If security is dealt with once, from the start, in the high-level class, the handlers don't have to worry about it, and the chance of an accidental security hole is much smaller.

Even if you build all the necessary security checks into the high-level class, there's still one security issue that the handlers need to deal with: Untrusted code might be able to bypass the class with the security checks and go straight to the handler. To avoid this, you need to use the Java language protection mechanisms. For example, consider the Socket and SocketImpl classes. Sockets need to be protected from abuse, so the Socket class makes several calls to the security manager to ensure that operations are permitted. All the real functionality of sockets is handled by some subclass of SocketImpl; however, if untrusted code could create an instance of an appropriate subclass of SocketImpl, no security checks would be made. The Java library designers avoid this danger by not making the handler class they supply, PlainSocketImpl, a public class. The class can be used only by other classes within the same java.net package. When that restriction is combined with a security policy that prohibits untrusted code from defining new classes in that package, the socket abstraction is secure. Everyone who needs to build a new SocketImpl class must be careful not to make it a public class and to put it into a package that is protected by the application security manager.
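The same arrangement can be sketched in miniature. The classes below (LinkImpl, PlainLinkImpl, Link) are hypothetical stand-ins for SocketImpl, PlainSocketImpl, and Socket, not the library's actual code. So that the sketch fits in a single source file, Link is shown without the public modifier; in a real library it would be the one public class, in its own file, and the others would stay package-private.

```java
/** Handler base class, playing the role SocketImpl plays. Hypothetical. */
abstract class LinkImpl {
    // Handlers perform no security checks of their own.
    abstract String open(String host, int port);
}

/** Deliberately not public: only code in the same package can create one. */
final class PlainLinkImpl extends LinkImpl {
    PlainLinkImpl() {}

    String open(String host, int port) {
        return "opened " + host + ":" + port;
    }
}

/** High-level class: the only intended entry point, and the only place checks are made. */
final class Link {
    private final LinkImpl impl = new PlainLinkImpl();

    @SuppressWarnings({"removal", "deprecation"})
    String open(String host, int port) {
        SecurityManager security = System.getSecurityManager();
        if (security != null) {
            security.checkConnect(host, port);   // may throw SecurityException
        }
        return impl.open(host, port);            // delegate only after the check
    }
}
```

The protection depends on both halves: the handler class is invisible outside its package, and untrusted code is prevented from defining classes inside that package.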

The final security issue that must be addressed by factory objects is the most obvious: what to do about handlers that cannot be trusted. Fortunately, there aren't many decisions to make about that, at least from the point of view of the author of the factory object. The security manager and the Java library automatically restrict the untrusted handler from most of the dangerous things it could do. The only real issue to consider is whether a handler found at one site should be allowed to deal with data fetched from another site. The handler may have been planted by one company on its own site so that it could intercept data fetched from a competitor's site, altering it or substituting other data to deceive the user.

That scenario seems a little farfetched, I know, but it could happen. For now, the best policy is to keep track of which site each handler comes from, and to reuse a handler only on data from that same site. When there are common, shared registries for certain types of handler classes, there should be some way to determine that handlers don't do anything underhanded before they are listed in the registry. Then at least the handlers found through the registry can be given enough trust to handle data from any site. No such guarantees exist today.
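One way to enforce the same-site policy is to make the originating host part of the cache key, as in this sketch. LoadedHandler and OriginScopedCache are hypothetical names, and the sketch assumes that a host name is an adequate way to identify a site.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical record of a handler and the host it was loaded from. */
final class LoadedHandler {
    final String originHost;
    final String contentType;

    LoadedHandler(String originHost, String contentType) {
        this.originHost = originHost;
        this.contentType = contentType;
    }
}

/** Cache that never hands a handler from one site to data from another. */
final class OriginScopedCache {
    private final Map<String, LoadedHandler> cache = new HashMap<>();

    // The host is part of the key, so lookups for other hosts simply miss.
    private static String key(String host, String contentType) {
        return host.toLowerCase(Locale.ROOT) + "|" + contentType;
    }

    synchronized void remember(LoadedHandler handler) {
        cache.put(key(handler.originHost, handler.contentType), handler);
    }

    /** Returns a cached handler only if it came from the same host as the data. */
    synchronized LoadedHandler lookup(String dataHost, String contentType) {
        return cache.get(key(dataHost, contentType));
    }
}
```

A miss for a different host means the caller has to go back to the factory, which can then look for a handler on the data's own site.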

Summary

Factory objects represent a useful technique for making a Java class library flexible and extensible. Unlike the new operation (which must be told an exact class to instantiate at compile time), a factory object creates a new object of some general-purpose class but chooses the specific subclass at runtime, based on the current data or the runtime environment.

The Java library uses factories to create URL protocol handlers, content handlers for URL connections, and socket implementation objects. These factories make it easy to adapt the library to support new protocols, content types, and socket behavior (such as socket proxy support).

Your own class libraries and applications can also use factory objects to their advantage. Factories are especially useful for building applications that can be dynamically extended with code loaded from the network.
