Java 1.2 Unleashed

Contents


- 37 -

Distributed Application Architecture


This chapter provides background information on Java's distributed programming capabilities. It discusses the various approaches to designing and implementing distributed applications and shows how Java's distributed object model compares to these approaches. It describes how Java's RMI capabilities are implemented using a three-tiered protocol set, and finishes with a discussion of the security issues involved with using RMI. When you finish this chapter, you'll have the background information you need to understand how RMI works and how it is used to develop distributed applications.

Distributed Application Design Approaches

A distributed application is an application whose processing is distributed across multiple networked computers. Distributed applications are able to concurrently serve multiple users and, depending on their design, make better use of processing resources.

Distributed applications are typically implemented as client/server systems that are organized according to the user interface, information processing, and information storage layers, as shown in Figure 37.1.

FIGURE 37.1. The organization of distributed systems.


The user interface layer is implemented by an application client. Email programs and Web browsers are examples of the user-interface component of distributed applications.

The information processing layer is implemented by an application client, an application server, or an application support server. For example, a database application may utilize a database client to convert user selections into SQL statements, a database access server may be used to support communication between the client and a database server, and the database server may use reporting software to process the information requested by a client.

The information storage layer is implemented by database servers, Web servers, FTP servers, file servers, and any other servers whose purpose is to store and retrieve information.

Distributed Applications on the Internet

The popularity of the Internet and the Web has resulted in an almost fully networked world. Computers on opposite ends of the world are directly accessible to each other via the TCP/IP protocol suite. This worldwide connectivity has given rise to distributed applications that run within the Internet's client/server framework. These first-generation applications support client/server communication using application-specific protocols such as HTTP, FTP, and SQL*NET. Figure 37.2 illustrates a typical Internet application.

FIGURE 37.2. A distributed Internet application.

Typically, a client program is executed on multiple host computers. The client uses TCP to connect to a server that listens on a well-known port. The client makes one or more requests of the server. The server processes the client's requests, possibly using gateway programs or back-end servers, and forwards the response to the client.
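The request/response exchange described above can be sketched in a few lines of Java. This is only a minimal illustration, written for a current JDK rather than JDK 1.2. The class name RequestResponseSketch, the one-line text protocol, and the use of an arbitrary free port in place of a well-known port are all assumptions made so that the sketch can run on a single machine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class RequestResponseSketch {

    // Server side: accept one connection, read one request line, send one response line.
    static void serveOnce(ServerSocket listener) {
        try (Socket connection = listener.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(connection.getInputStream()));
             PrintWriter out = new PrintWriter(connection.getOutputStream(), true)) {
            String request = in.readLine();
            out.println("RESPONSE to " + request);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Client side: connect over TCP, send one request, and wait for the response.
    static String request(InetAddress host, int port, String message) throws IOException {
        try (Socket socket = new Socket(host, port);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()))) {
            out.println(message);
            return in.readLine();
        }
    }

    // Runs the server on a background thread and the client on the calling thread.
    static String roundTrip(String message) throws IOException, InterruptedException {
        // Port 0 asks the operating system for any free port (a stand-in for a well-known port).
        try (ServerSocket listener = new ServerSocket(0)) {
            Thread server = new Thread(() -> serveOnce(listener));
            server.start();
            String reply = request(
                    InetAddress.getLoopbackAddress(), listener.getLocalPort(), message);
            server.join();
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("GET status"));
    }
}
```

A real server would loop on accept() and hand each connection to its own thread so that it can serve multiple clients concurrently.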


NOTE: Chapter 30, "Network Programming with the java.net Package," describes the basics of TCP/IP client-server computing.

Applets on an Intranet

In an intranet environment, corporate information systems support services that are tailored to the organizational needs of the company. These services consist of applications that support business areas such as management, accounting, marketing, manufacturing, customer support, vendor interface, shipping and receiving, and so on. These intranet services can be implemented using client/server services, such as a company-internal Web. Java applets provide the capability to run the client interface layer and part of the information processing layer of business applications within the context of a Web browser. Figure 37.3 shows an approach to implementing corporate information services using the applet paradigm. Applets are represented by the small filled-in squares within browsers.

FIGURE 37.3. Implementing intranet services using applets.


The approach shown in Figure 37.3 is essentially the Internet client/server approach shown in Figure 37.2 but applied to an intranet, using Java applets to program client information system interfaces. This approach is popular for developing distributed intranet applications, and can also be used with Internet applications. It allows business applications to be distributed among browsers, Web servers, and other back-end servers.

The Distributed Computing Environment

The Distributed Computing Environment (DCE) is another approach to building distributed applications. DCE was developed by the Open Software Foundation, now referred to as the Open Group. DCE integrates a variety of fundamental services and technologies to build distributed applications. Distributed systems are organized into cells, which are groups of processing resources, services, and users that support a common function and share a common set of DCE services. For example, cells can be organized according to company functions. In this case, you may have separate cells for your finance, manufacturing, and marketing departments.

The DCE services of a cell are used to implement distributed applications that serve the users of the cell and interface with the applications implemented by other cells. The services and technologies used within a DCE cell consist of the following:

DCE is referred to as middleware because it is not a standalone product, but rather a bundle of services that are integrated into an operating system or operating environment. These services are used as an alternative approach to constructing distributed applications. They are used to build the same kinds of applications as the Web-based example covered in the previous section, but they go about it in a different manner.


NOTE: The DCE FAQ, located at http://www.camb.opengroup.org/dce/info/faq-mauney.html, provides a good introduction to the DCE services identified in this section.

The Distributed Component Object Model

The Distributed Component Object Model, or DCOM, is Microsoft's approach to developing distributed systems. DCOM is based on COM, which is the heart of Microsoft's object-oriented development strategy. Because DCOM is essentially a distributed system extension to COM, understanding COM is essential to understanding DCOM.

Understanding COM

COM is an outgrowth of Microsoft's Object Linking and Embedding technology, or OLE. OLE was used in early versions of Windows to support compound documents, or documents that are the product of multiple applications. COM was a solution to early problems in OLE, and like most great solutions, it solved a much more fundamental problem--how general objects should interact with and provide services to each other.

COM objects are instances of classes and are organized into interfaces. Interfaces are simply collections of methods. COM objects can only be accessed via their methods, and every COM object is implemented inside a server. A server may be implemented as a dynamic-link library, an independent process, or an operating system service. COM abstracts away the implementation details and presents a single uniform interface to all objects, no matter how each object is implemented.

The COM library is key to implementing this common interface between objects. It is present on any system that supports COM and provides a directory to all classes that are available on that system. The COM library maintains information about available classes in the system registry. When one COM object accesses another, it first invokes functions in the COM library. These functions can be used to create a COM object from its class or obtain a pointer to its interfaces. The COM runtime is a process that supports the COM library in implementing its functions. It is supported by the Service Control Manager. The invoking object uses interface pointers to invoke the methods of the object that it accesses through the COM library. The pointers used by COM objects can be used by objects written in any programming language.

The interface definition language used to define COM interfaces and methods is borrowed from DCE. COM also defines a binary interface standard. This standard helps to promote language-independence.


NOTE: COM differs from other object-oriented systems in its support of inheritance. COM classes do not inherit the implementation of methods from their superclasses. They only inherit the definition of those interfaces. This means that all methods must be reimplemented every time a subclass is declared. COM provides a workaround to this problem called aggregation. Using aggregation, a class may inherit an entire interface by copying the interface of its superclass. However, the inheriting class may not override individual methods in the inherited interface.

From COM to DCOM

DCOM is essentially COM distributed over multiple computers. DCOM allows COM objects executing on one computer to create COM objects on other computers and access their methods. The location of the remote object is transparent. Using DCOM, remote objects are accessed in exactly the same manner as local objects.

In order for an object on a local system to access the methods of an object on a remote system, the local system must have the remote object's class registered in its local registry. The local object, oblivious of the location of the object that it is accessing, creates the remote object and/or obtains a pointer to its methods by invoking the functions of its local COM library. The COM library processes the function calls using its local COM runtime. The COM runtime checks the system registry for the class of the object being accessed. If the registry indicates that the class is defined in the registry of a remote machine, the local COM runtime contacts the COM runtime on the remote machine and requests that it perform the creation of the remote object or invocation of its methods. The remote COM runtime carries out the request if the request is allowed by the system's security policy. This policy typically defaults to the Windows NT security policy, but may be tailored and made more restrictive for a particular application. Figure 37.4 summarizes DCOM's operation.

The COM runtime processes on separate machines communicate with each other using an RPC mechanism referred to as Object RPC, or ORPC. ORPC is based on Microsoft RPC (which is essentially DCE RPC). ORPC may be configured to use a number of transport protocols, but works best with UDP. Refer to Chapter 30 for a description of UDP. Because most firewalls block UDP, it is necessary to use TCP with ORPC to build distributed applications that work over the Internet.

Although DCOM is a Microsoft product, it is an open standard and has been ported to other platforms, such as UNIX. Microsoft intends DCOM to be a cross-platform solution for distributed application development. So far it has received a high level of acceptance by Windows users but mediocre success in cross-platform applications.

One of the prominent features of DCOM is its security support. DCOM security integrates with and extends the Windows NT security model. It allows access control decisions to be made with a fine level of granularity. For example, it is possible to specify whether one object is allowed to create or invoke the methods of another. DCOM also provides strong and flexible communication security. A variety of encryption mechanisms may be used to protect information as it is transmitted from one COM object to another. Windows NT 5.0 extends these encryption capabilities to Kerberos-based authentication, encryption, and access control. Kerberos is a very strong security protection mechanism developed at the Massachusetts Institute of Technology. Information on Kerberos may be found at http://www.ov.com/misc/krb-faq.html.

FIGURE 37.4. How DCOM works.

The Microsoft Java Software Development Kit includes a JVM and API that provides an interface to COM and DCOM. Chapter 54, "Dirty Java," shows how to use the capabilities provided by the Microsoft JVM and API.

The Common Object Request Broker Architecture (CORBA)

The Common Object Request Broker Architecture (CORBA) provides another approach to building distributed systems. CORBA, like DCOM but unlike DCE, is object-oriented. It allows objects on one computer to invoke the methods of objects on other computers. CORBA, unlike DCOM, is an open standards solution and is not tied to any particular operating system vendor. Because of this, CORBA is a great choice for building distributed object-oriented applications.

CORBA makes use of objects that are accessible via Object Request Brokers (ORBs). ORBs are used to connect objects to one another across a network. An object on one computer (client object) invokes the methods of an object on another computer (server object) via an ORB.

The client's interface to the ORB is a stub that is written in the Interface Definition Language (IDL). The stub is a local proxy for a remote object. The IDL provides a programming language-independent mechanism for describing the methods of an object.

The ORB's interface to the server is through an IDL skeleton. The skeleton provides the ORB with a language-independent mechanism for accessing the remote object.

Remote method invocation under CORBA takes place as follows: The client object invokes the methods of the IDL stub corresponding to a remote object. The IDL stub communicates the method invocations to the ORB. The ORB invokes the corresponding methods of the IDL skeleton. The IDL skeleton invokes the methods of the remote server object implementation. The server object returns the result of the method invocation via the IDL skeleton, which passes the result back to the ORB. The ORB passes the result back to the IDL stub, and the IDL stub returns the result back to the client object. Figure 37.5 summarizes this process.

FIGURE 37.5. How CORBA works.

Figure 37.5 shows the ORB as being a single layer across the client and server hosts. This is the standard way in which the ORB is viewed, but a number of ORB implementations are possible. For example, peer ORBs could be implemented on the client and server hosts, or a central system ORB could be implemented on a local server.

Now that you know how CORBA works, you may be wondering how it is used to develop distributed applications. The answer is that CORBA provides a flexible approach to distributed application development. It provides a finer level of granularity in the implementation of client/server systems. Instead of relying on monolithic clients and servers (as is the case of the browsers and servers of the Web), both clients and servers can be distributed over several hosts.

The advantages of CORBA over other distributed application integration approaches are significant:

We'll cover CORBA more in Chapter 41, "JavaIDL and ORBs," where you'll learn how to use Java objects with CORBA.

Java Remote Method Invocation

Given the various approaches to distributed application development discussed in the previous sections, you may be wondering why Java doesn't just pick the best approach and go with it instead of using RMI. There are a number of reasons for this:

Chapter 38, "Building Distributed Applications with the java.rmi Packages," provides an introduction to Java RMI and covers the RMI API. It also shows you how to develop a simple distributed application using RMI. In the next section, I'll describe the Java distributed object model and explain why it is a natural extension of the Java object model used within a single JVM.

The Java Distributed Object Model

The distributed object model used by Java allows objects that execute in one JVM to invoke the methods of objects that execute in other JVMs. These other JVMs may execute as a separate process on the same computer or on other remote computers. The object making the method invocation is referred to as the client object. The object whose methods are being invoked is referred to as the server object. The client object is also referred to as the local object and is said to execute locally. The server object is also referred to as the remote object and is said to execute remotely.

In the Java distributed object model, a client object never references a remote object directly. Instead, it references a remote interface that is implemented by the remote object. The use of remote interfaces allows server objects to differentiate between their local and remote interfaces. For example, an object could provide methods to objects that execute within the same JVM that are in addition to those that it provides via its remote interface. The use of remote interfaces also allows server objects to present different remote access modes. For example, a server object can provide both a remote administration interface and a remote user interface. Finally, the use of remote interfaces allows the server object's position within its class hierarchy to be abstracted away from the manner in which it is used. This allows client objects to be compiled using the remote interface alone, eliminating the need for server class files to be locally present during the compilation process.
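The role of remote interfaces can be illustrated with a small sketch. The names used here (RemoteInterfaceSketch, UserAccess, AdminAccess, CounterServer) are hypothetical and chosen only for illustration. The sketch shows one server object that offers two separate remote interfaces plus a method that is visible only to objects in its own JVM. Nothing is exported over the network; the point is only which methods each interface exposes.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

final class RemoteInterfaceSketch {

    // Remote user interface: the methods ordinary remote clients may call.
    public interface UserAccess extends Remote {
        int getCount() throws RemoteException;
    }

    // Remote administration interface: a second, separate remote access mode.
    public interface AdminAccess extends Remote {
        void reset() throws RemoteException;
    }

    // The server object implements both remote interfaces.
    static final class CounterServer implements UserAccess, AdminAccess {
        private int count;

        @Override
        public synchronized int getCount() {
            return count;
        }

        @Override
        public synchronized void reset() {
            count = 0;
        }

        // Local-only method: it is in neither remote interface, so a client
        // that holds only a UserAccess or AdminAccess reference cannot call it.
        synchronized void increment() {
            count++;
        }
    }

    public static void main(String[] args) throws RemoteException {
        CounterServer server = new CounterServer();
        server.increment();

        // A client is compiled against the interface alone, not the server class.
        UserAccess userView = server;
        System.out.println(userView.getCount()); // prints 1
    }
}
```

Because client code is written against UserAccess or AdminAccess, it compiles without CounterServer being present, which is the benefit the preceding paragraph describes.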

The Three-Tiered Layering of the Java RMI

In addition to remote interfaces, the model makes use of stub and skeleton classes in much the same way as CORBA. Stub classes serve as local proxies for the remote objects. Skeleton classes act as remote proxies. Both stub and skeleton classes implement the remote interface of the server object. The client object invokes the methods of the local stub object. The local stub communicates these method invocations to the remote skeleton, and the remote skeleton invokes the methods of the server object. The server object returns a value to the skeleton object. The skeleton object returns the value to the stub object, and the stub object returns the value to the client. Figure 37.6 summarizes the use of stubs and skeletons.

FIGURE 37.6. The use of stubs and skeletons in the Java distributed object model.
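The call path from client to stub to skeleton to server object can be mimicked inside a single JVM. The sketch below is not what the RMI tools generate. It is a hand-rolled illustration that uses java.lang.reflect.Proxy as the stub and a small dispatcher class as the skeleton, with a direct method call standing in for the network. All names (StubSkeletonSketch, Calculator, CalculatorServer, Skeleton, stubFor) are hypothetical, and as a simplification the skeleton here does not itself implement the remote interface.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

final class StubSkeletonSketch {

    // Hypothetical interface shared by the stub and the server object.
    public interface Calculator {
        int add(int a, int b);
    }

    // The server object: the only place where the work is actually done.
    static final class CalculatorServer implements Calculator {
        @Override
        public int add(int a, int b) {
            return a + b;
        }
    }

    // Skeleton: receives a described call and invokes it on the server object.
    static final class Skeleton {
        private final Class<?> remoteInterface;
        private final Object serverObject;

        Skeleton(Class<?> remoteInterface, Object serverObject) {
            this.remoteInterface = remoteInterface;
            this.serverObject = serverObject;
        }

        Object dispatch(String methodName, Class<?>[] parameterTypes, Object[] arguments)
                throws Exception {
            Method method = remoteInterface.getMethod(methodName, parameterTypes);
            return method.invoke(serverObject, arguments);
        }
    }

    // Stub: implements the interface locally and forwards every call to the skeleton.
    // In real RMI a network connection sits between these two objects.
    static <T> T stubFor(Class<T> remoteInterface, Skeleton skeleton) {
        InvocationHandler forwarder = (proxy, method, arguments) ->
                skeleton.dispatch(method.getName(), method.getParameterTypes(), arguments);
        Object stub = Proxy.newProxyInstance(
                remoteInterface.getClassLoader(), new Class<?>[] {remoteInterface}, forwarder);
        return remoteInterface.cast(stub);
    }

    public static void main(String[] args) {
        Skeleton skeleton = new Skeleton(Calculator.class, new CalculatorServer());
        Calculator stub = stubFor(Calculator.class, skeleton);
        System.out.println(stub.add(2, 3)); // prints 5
    }
}
```

The client code in main() sees only the Calculator interface, so it cannot tell whether it is talking to the server object or to a proxy for it.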


If you are a CORBA programmer, you'll notice the conspicuous absence of IDL and ORBs in Figure 37.6. (IDL and ORBs are required by CORBA because it is language-neutral.) The stub and skeleton classes are automatically generated by the rmic compiler from the server object. (The rmic compiler is a standard JDK tool.) These classes are true Java classes and do not rely on an external IDL. No ORB is required because the Java RMI is a pure Java solution. The client object and stub communicate using normal Java method invocations, and so do the skeleton and the server object. The stub and the skeleton communicate via a remote reference layer.

The remote reference layer supports communication between the stub and the skeleton. If the stub communicates with more than one skeleton instance (not currently supported), the stub object communicates with the multiple skeletons in a multicast fashion. The RMI API currently only defines classes that support unicast communication between a stub and a single skeleton. The remote reference layer may also be used to activate server objects when they are invoked remotely.

The remote reference layer on the local host communicates with the remote reference layer on the remote host via the RMI transport layer. The transport layer sets up and manages connections between the address spaces of the local and remote hosts, keeps track of objects that can be accessed remotely, and determines when connections have timed out and become inoperable. The transport layer uses TCP sockets, by default, to communicate between the local and remote hosts. However, other transport layer protocols, such as SSL and UDP, may also be used.

Figure 37.7 illustrates the three-tier layering used to implement Java RMI. In this expanded view of the model, the client object invokes the methods of the local stub of the server object. The local stub uses the remote reference layer to communicate with the server skeleton. The remote reference layer uses the transport layer to set up a connection between the local and remote address spaces and to obtain a reference to the skeleton object.

FIGURE 37.7. The three-tier layering of Java RMI.

In order for a server object to be accessed remotely, it must register itself with the remote registry. It does this by associating its object instance with a name. The remote registry is a process that runs on the server host, and is created by running the rmiregistry program, another JDK tool.

The remote registry maintains a database of server objects and the names by which these objects can be referenced. When a client creates an instance of a server object's interface (that is, its local stub), the transport layer on the local host communicates with the transport layer on the remote host to determine if the referenced object exists and to find out the type of interface the referenced object implements. The server-side transport layer uses the remote registry to access this information. A separate process, referred to as the Java RMI Activation System Daemon, supports the activation of remote objects. The Java RMI Activation System Daemon is run by executing the rmid program of the JDK on the remote system.
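Registration and lookup can be sketched as follows. To stay self-contained, the sketch makes several assumptions that differ from the setup described above: it creates a registry inside its own JVM with LocateRegistry.createRegistry() instead of running the rmiregistry program, it uses an arbitrary free port, and it pins RMI to the loopback address. The names RegistrySketch, Greeter, GreeterImpl, and the binding name "greeter" are hypothetical. The code is written for a current JDK.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

final class RegistrySketch {

    // Hypothetical remote interface.
    public interface Greeter extends Remote {
        String greet(String name) throws RemoteException;
    }

    // Hypothetical server object.
    static final class GreeterImpl implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    static String demo() throws Exception {
        // Assumption: keep every endpoint on the loopback address so that
        // the sketch runs on a single machine.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        // In-process registry on any free port (stand-in for the rmiregistry tool).
        Registry registry = LocateRegistry.createRegistry(0);
        GreeterImpl server = new GreeterImpl();
        Greeter stub = (Greeter) UnicastRemoteObject.exportObject(server, 0);
        try {
            // Server side: associate the exported object with a name.
            registry.bind("greeter", stub);

            // Client side: look the name up and invoke a method through the stub.
            // A separate client program would first obtain the registry with
            // LocateRegistry.getRegistry(host, port).
            Greeter client = (Greeter) registry.lookup("greeter");
            return client.greet("RMI");
        } finally {
            // Unexport both objects so that the JVM can exit.
            UnicastRemoteObject.unexportObject(server, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Even though both sides live in one JVM here, the call to greet() still travels through the stub and the transport layer rather than going directly to GreeterImpl.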

Passing Arguments and Returning Values

In order for a client object to pass an argument as part of a remote method invocation, the type of the argument must be serializable. A serializable type is a primitive or reference type that can be written to and read from a stream. In practice, all Java primitive types are serializable, and so are all classes and interfaces that implement or extend the Serializable interface. The Serializable interface is defined in the java.io package.


NOTE: Chapter 40, "Using Object Serialization and JavaSpaces," covers object serialization in more detail.

Object references are used within the JVM that contains the object. When a local object is passed as an argument to a remote method invocation, the local object is copied from the local JVM to the remote JVM. Only non-static and non-transient field variables are copied.
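The copy semantics described above can be observed without any networking, because RMI marshals arguments using object serialization. The sketch below uses a plain ObjectOutputStream and ObjectInputStream in memory as a stand-in for the trip between JVMs. The names PassByCopySketch, Order, and copyViaSerialization are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class PassByCopySketch {

    // Hypothetical argument type for a remote method.
    static final class Order implements Serializable {
        private static final long serialVersionUID = 1L;

        final String item;             // copied
        final int quantity;            // copied
        transient String cachedLabel;  // transient: not copied

        Order(String item, int quantity) {
            this.item = item;
            this.quantity = quantity;
            this.cachedLabel = quantity + " x " + item;
        }
    }

    // Writes the object to a byte array and reads it back, which yields a
    // separate copy, much as the receiving JVM would see it.
    static <T extends Serializable> T copyViaSerialization(T original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    public static void main(String[] args) throws Exception {
        Order original = new Order("widget", 3);
        Order copy = copyViaSerialization(original);

        System.out.println(copy != original);   // prints true: a distinct object
        System.out.println(copy.item);          // prints widget
        System.out.println(copy.cachedLabel);   // prints null: transient field was skipped
    }
}
```

One practical consequence is that changes the remote method makes to its copy of an argument are not visible to the caller.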

When a remote object is passed via a remote method invocation within the same JVM, the reference to the remote object is passed. This is because the remote JVM already contains the object being referenced.

When an object is returned by a server object as the result of a remote method invocation, the object is copied from the remote JVM to the local JVM.

Objects and Remote Method Invocation

The Java distributed object model is a natural extension of the Java object model used within a single JVM. It implements RMI in an easy-to-use fashion and places minimal requirements on objects in order for them to be accessed remotely. These requirements are as follows:

Chapter 38 builds a simple distributed application that shows how each of these requirements is met. Chapter 39, "Working with Remote Objects," provides more advanced examples of implementing remote objects.

Distributed Application Security

The Java distributed object model implements security through the use of class loaders and security managers in the same way that it does for applications and applets. The class loader trusts classes that are loaded from the local host. Classes are not allowed to be loaded from the network unless a security manager is in place that permits remote class loading.

An applet security manager is automatically put into place for applets as they are loaded. The security manager used in distributed Java applications is the RMISecurityManager class. An instance of this class should be set via the setSecurityManager() method of the System class at the beginning of the execution of a client or server object. Less restrictive security managers can be developed by subclassing RMISecurityManager and overriding its methods.

Transport Security

Because RMI uses TCP/IP for network communication, it is subject to the vulnerabilities of the TCP/IP protocol suite. JDK 1.2 enhancements to RMI provide the capability to create custom sockets on a per-object basis. Custom sockets can enable RMI to use Netscape's Secure Sockets Layer protocol to protect information as it is communicated between local and remote objects. This is accomplished by creating a custom RMISocketFactory.
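A custom socket factory might look like the following sketch. The class name CountingSocketFactory and its counter are hypothetical. The factory creates ordinary TCP sockets so that the example stays self-contained; the comment in createSocket() marks the place where a secure socket could be returned instead.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.rmi.server.RMISocketFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class CountingSocketFactory extends RMISocketFactory {

    private final AtomicInteger clientSockets = new AtomicInteger();

    // Called when RMI needs an outgoing connection to a remote host.
    @Override
    public Socket createSocket(String host, int port) throws IOException {
        clientSockets.incrementAndGet();
        // A secure variant could instead return a socket obtained from
        // javax.net.ssl.SSLSocketFactory.getDefault().createSocket(host, port).
        return new Socket(host, port);
    }

    // Called when RMI needs to listen for incoming connections.
    @Override
    public ServerSocket createServerSocket(int port) throws IOException {
        return new ServerSocket(port);
    }

    int clientSocketsCreated() {
        return clientSockets.get();
    }

    public static void main(String[] args) throws IOException {
        CountingSocketFactory factory = new CountingSocketFactory();
        try (ServerSocket server = factory.createServerSocket(0);
             Socket client = factory.createSocket("127.0.0.1", server.getLocalPort());
             Socket accepted = server.accept()) {
            System.out.println(factory.clientSocketsCreated()); // prints 1
        }
    }
}
```

To make RMI use such a factory for the whole JVM, it would be installed once with RMISocketFactory.setSocketFactory(). For per-object control, the RMIClientSocketFactory and RMIServerSocketFactory interfaces can be passed when an object is exported.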

Authentication and Access Control

Authentication is the process of verifying the identity of an individual or an object that acts on the individual's behalf. Access control is the process of restricting access to resources or services based on an object or individual's identity. Authentication and access control work hand in hand. Without strong authentication, unscrupulous individuals may be able to masquerade as trusted individuals. Without access control, authentication has no teeth.

Authentication and access control are important in distributed applications. For example, you may want to limit the objects that are able to remotely invoke the methods of a particular server object to those objects that execute on a specific host or set of hosts, or that act on behalf of a particular individual.

The RMI API does not provide classes and interfaces that directly support authentication and access control. However, these capabilities may be built on top of the classes that are provided by the RMI API. For example, the getClientHost() method of the RemoteServer class can be used by a server object to determine the name of the host from which a remote method invocation is initiated. This may be used to limit RMI access to a specified list of hosts, but this approach is not foolproof. There are ways for malicious hosts to masquerade as trusted hosts. However, it may be used to provide a limited degree of protection. More advanced authentication and access control can be implemented through the use of digital certificates in the overall distributed application supported by RMI.
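A host-based check of the kind just described might be sketched as follows. The class HostAllowList and its methods are hypothetical helpers, not part of the RMI API; only RemoteServer.getClientHost() and ServerNotActiveException come from java.rmi.server. The addresses shown are documentation placeholders. As the text notes, this offers only limited protection because a host's identity can be forged.

```java
import java.rmi.server.RemoteServer;
import java.rmi.server.ServerNotActiveException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class HostAllowList {

    private final Set<String> allowedHosts;

    HostAllowList(String... allowedHosts) {
        this.allowedHosts = new HashSet<>(Arrays.asList(allowedHosts));
    }

    // Pure lookup, kept separate so that it can be exercised without RMI.
    boolean permits(String host) {
        return allowedHosts.contains(host);
    }

    // Intended to be called at the start of a remote method implementation.
    // Throws SecurityException unless the caller's host is on the list.
    void checkCaller() {
        final String caller;
        try {
            caller = RemoteServer.getClientHost();
        } catch (ServerNotActiveException e) {
            // Thrown when the current thread is not servicing a remote call.
            throw new SecurityException("No remote method invocation is in progress", e);
        }
        if (!permits(caller)) {
            throw new SecurityException("Host not permitted: " + caller);
        }
    }

    public static void main(String[] args) {
        HostAllowList allowList = new HostAllowList("192.0.2.10", "192.0.2.11");
        System.out.println(allowList.permits("192.0.2.10"));  // prints true
        System.out.println(allowList.permits("203.0.113.5")); // prints false
    }
}
```

A server object would call checkCaller() as the first statement of each remote method it wants to protect. The entries in the list need to match the form in which getClientHost() reports the client, so it is worth logging a few values before relying on the check.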

Firewalls may be used to protect distributed applications that run on an intranet. They are typically used to restrict access to the distributed application to those hosts that are on a corporate intranet or on a selected segment of an intranet. However, firewalls introduce problems of their own. If a firewall exists in the communication path between client and server objects, it can prevent remote method invocations from occurring. Fortunately, JavaSoft recognized this problem, and the RMISocketFactory class provides the capability for RMI to be used with a firewall. This class uses alternative approaches to client/server communication that can be used to circumvent the security restrictions imposed by many firewalls.

Summary

In this chapter you covered background information about approaches to designing and implementing distributed applications, and learned how Java's distributed object model compares to these approaches. You delved into the details of Java's distributed object model, and learned how RMI is implemented using a three-tiered protocol set. You then investigated the security issues involved with using RMI. In the next chapter, you'll cover RMI in more detail and use it to develop sample distributed applications.


© Copyright 1998, Macmillan Publishing. All rights reserved.