Chapter 15

The Networking Package

by Mike Fletcher


This chapter serves as an introduction to the package containing Java's networking facilities. It covers the classes, interfaces, and exceptions that make up the java.net package.

Unless otherwise noted, classes, exceptions, and interfaces are members of the java.net package. The full package name is given for members of other packages, such as java.io.IOException. Method names are shown followed by parentheses (), such as close().

These descriptions are not intended to be a complete reference. For a more detailed description of the components of the java.net package and the arguments of their various methods, see Appendix C, "The Java Class Library."

Classes

The classes in the networking package fall into three general categories:

Keep in mind that some of the java.net classes (such as URLConnection and ContentHandler) are abstract classes and cannot be directly instantiated. Subclasses provide the actual implementations for the different protocols and contents.

Table 15.1 lists all the classes in the package along with brief descriptions of the functionality each provides.

Table 15.1. Classes of the java.net package.

Class Purpose
URL Represents a Uniform Resource Locator
URLConnection Retrieves content addressed by URL objects
Socket Provides a TCP (connected, ordered stream) socket
ServerSocket Provides a server (listening) TCP socket
DatagramSocket Provides a UDP (connectionless datagram) socket
DatagramPacket Represents a datagram to be sent using a DatagramSocket object
InetAddress Represents a host name and its corresponding IP number or numbers
URLEncoder Encodes text in the x-www-form-urlencoded format
URLStreamHandler Subclasses implement communications streams for different URL protocols
ContentHandler Subclasses know how to turn MIME objects into corresponding Java objects
SocketImpl Subclasses provide access to TCP/IP facilities

The URL Class

The URL class represents a Web Uniform Resource Locator. Along with the URLConnection class, the URL class provides access to resources located on the World Wide Web using the HTTP protocol or on the local machine using file: URLs.

Constructors. The constructors for the URL class allow the creation of absolute and relative URLs. One constructor takes a whole String as a URL; other constructors allow the protocol, host, and file to be specified in separate String objects. The class also provides for relative URLs with a constructor that takes another URL object for the context and a String as the relative part of the URL.

Constructor Description
URL( String url ) Takes the entire URL as a String.
URL( String protocol, String host, int port, String file ) Takes each component of the URL as a separate argument.
URL( String protocol, String host, String file) As above, but uses the default port number for the protocol.
URL( URL context, String file ) Replaces the file part of the URL with the second argument.

Methods. The methods for the URL class retrieve individual components of the represented URL (such as the protocol and the host name). The class also provides comparison methods for determining whether two URL objects reference the same content.

Probably the most important method is getContent(). This method returns an object representing the content of the URL. Another method, openConnection(), returns a URLConnection object that provides a connection to the remote content. The connection object then can be used to retrieve the content, as it can be with the getContent() method.
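As a minimal sketch of the constructors and component methods just described, the following class builds the same URL three ways and prints its parts. The class name UrlPartsDemo, the resolve() helper, and the www.example.com address are assumptions made for illustration; nothing is retrieved from the network. Recent Java releases mark these URL constructors as deprecated, but they still behave as described here.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlPartsDemo {
    // Resolves a relative reference against a context URL (the two-argument
    // URL(URL, String) constructor) and returns the result as text.
    static String resolve(String context, String relative) throws MalformedURLException {
        URL base = new URL(context);
        return new URL(base, relative).toString();
    }

    public static void main(String[] args) throws MalformedURLException {
        // Hypothetical address used only for illustration; nothing is fetched.
        URL whole = new URL("http://www.example.com:8080/docs/index.html");
        System.out.println(whole.getProtocol()); // http
        System.out.println(whole.getHost());     // www.example.com
        System.out.println(whole.getPort());     // 8080
        System.out.println(whole.getFile());     // /docs/index.html

        // The same URL assembled from its separate components.
        URL parts = new URL("http", "www.example.com", 8080, "/docs/index.html");
        // Compare the textual forms; URL.equals() may try to resolve host names.
        System.out.println(parts.toString().equals(whole.toString())); // true

        // Replaces the file part of the context URL with "other.html".
        System.out.println(resolve("http://www.example.com:8080/docs/index.html", "other.html"));
    }
}
```

Comparing the textual forms rather than calling equals() keeps the sketch independent of any name lookup.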

The URLConnection Class

The URLConnection class does the actual work of retrieving the content specified by URL objects. This class is an abstract class; as such, it cannot be directly instantiated. Instead, subclasses of the class provide the implementation to handle different protocols. The subclasses know how to use the appropriate subclasses of the URLStreamHandler class to connect and retrieve the content.

Constructor. The only constructor provided for the URLConnection class takes a URL object and returns a URLConnection object for that URL. However, because URLConnection is an abstract class, it cannot be directly instantiated. Instead of using a constructor, you will probably use the URL class openConnection() method. The Java runtime system creates an instance of the proper connection subclass to handle the URL.

Methods. The getContent() method acts just like the URL class method of the same name. The URLConnection class also provides methods to get information such as the content type of the resource or HTTP header information sent with the resource. Examples of these methods are getContentType(), which returns what the HTTP content-type header contained, and the verbosely named guessContentTypeFromStream(), which tries to determine the content type by observing the incoming data stream.

The getInputStream() method returns a java.io.InputStream object that reads data from the connection. For URLs that provide for output, there is a corresponding getOutputStream() method. The remaining URLConnection methods deal with retrieving or setting class variables.
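The following sketch shows one way to read a resource through a URLConnection obtained from openConnection(). So that it runs without a network, it assumes a file: URL pointing at a temporary file; the class name UrlConnectionDemo and the helpers tempFileUrl() and fetch() are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;

class UrlConnectionDemo {
    // Writes text to a temporary file and returns a file: URL that points at it.
    static URL tempFileUrl(String text) throws IOException {
        File file = File.createTempFile("urlconnection-demo", ".txt");
        file.deleteOnExit();
        Files.write(file.toPath(), text.getBytes("UTF-8"));
        return file.toURI().toURL();
    }

    // Opens a connection to the URL and reads everything its input stream offers.
    static String fetch(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int count;
            while ((count = in.read(buffer)) != -1) {
                collected.write(buffer, 0, count);
            }
            return collected.toString("UTF-8");
        }
    }

    public static void main(String[] args) throws IOException {
        URL url = tempFileUrl("hello from a file: URL");

        // Ask the connection what kind of content it believes it is carrying.
        URLConnection connection = url.openConnection();
        try (InputStream in = connection.getInputStream()) {
            System.out.println("content type: " + connection.getContentType());
        }

        System.out.println(fetch(url));
    }
}
```

The same fetch() helper should work for an http: URL, although that case depends on the network and the remote server.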

Variables. Several protected members describe aspects of the connection, such as the URL connected to and whether the connection supports input or output. A variable also notes whether or not the connection uses a cached copy of the object.

The Socket Class

A Socket object is the Java representation of a TCP connection. When a Socket is created, a connection is opened to the specified destination. Stream objects can be obtained to send and receive data to the other end.

Constructors. The constructors for the Socket class take at least two arguments: the name (or IP address) of the host to connect to, and the port number on that host to connect to. The host name can be given as either a String or as an InetAddress object. In either case, the port number is specified as an integer. An optional third boolean argument selects a stream or datagram connection.

Constructor Description
Socket( String host, int port, boolean stream ) Takes the hostname and port to contact, and whether to use a stream (true) or datagram connection.
Socket( String host, int port ) As above, but defaults to a stream connection.
Socket( InetAddress host, int port, boolean stream ) Uses an InetAddress object to specify the hostname rather than a String.
Socket( InetAddress host, int port ) As above, but defaults to a stream connection.

Methods. The two most important methods in the Socket class are getInputStream() and getOutputStream(), which return stream objects that can be used to communicate through the socket. A close() method is provided to tell the underlying operating system to terminate the connection. Methods also are provided to retrieve information about the connection such as the local and remote port numbers and an InetAddress representing the remote host.

The ServerSocket Class

The ServerSocket class represents a listening TCP connection. Once an incoming connection is requested, the ServerSocket object returns a Socket object representing the connection. In normal use, another thread is spawned to handle the connection. The ServerSocket object is then free to listen for the next connection request.

Constructors. Both constructors for this class take as an argument the local port number to listen to for connection requests. One constructor also takes a second argument giving the maximum number of connection requests that may be queued while waiting to be accepted.

Constructor Description
ServerSocket( int port, int count ) Takes the port number to listen for connections on and the maximum number of pending connections to queue.
ServerSocket( int port ) As above, but uses a default queue length.

Methods. The most important method in the ServerSocket class is accept(). This method blocks the calling thread until a connection is received. A Socket object is returned representing this new connection. The close() method tells the operating system to stop listening for requests on the socket. Also provided are methods to retrieve the host name the socket is listening on (in InetAddress form) and the port number being listened to.
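To see a ServerSocket and a Socket working together, consider the following sketch of a one-shot echo exchange on the local machine. The class name EchoDemo and its helper methods are assumptions for illustration. To keep the example short, the background thread both accepts and handles the single connection, which is a simplification of the usual arrangement in which a new thread is spawned after accept() returns.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class EchoDemo {
    // Wraps a socket's input stream for line-oriented reading.
    static BufferedReader reader(Socket socket) throws IOException {
        return new BufferedReader(new InputStreamReader(socket.getInputStream(), "UTF-8"));
    }

    // Wraps a socket's output stream; "true" flushes on every println().
    static PrintWriter writer(Socket socket) throws IOException {
        return new PrintWriter(new OutputStreamWriter(socket.getOutputStream(), "UTF-8"), true);
    }

    // Starts a listener on the loopback address, sends one line to it, and
    // returns the line that comes back.
    static String echoOnce(String message) throws IOException {
        // Port 0 asks the operating system for any free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread handler = new Thread(() -> {
                // accept() blocks until the client below connects.
                try (Socket peer = server.accept()) {
                    writer(peer).println(reader(peer).readLine());
                } catch (IOException e) {
                    System.err.println("Error on socket operation:\n" + e);
                }
            });
            handler.start();

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(5000); // give up rather than wait forever
                writer(client).println(message);
                return reader(client).readLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(echoOnce("hello, socket"));
    }
}
```

Using getLocalPort() to discover the port is what makes it possible to let the system choose a free one.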

The DatagramSocket Class

The DatagramSocket class represents a connectionless datagram socket. This class works with the DatagramPacket class to provide for communication using UDP (User Datagram Protocol).

Constructors. Because UDP is a connectionless protocol, you do not have to specify a host name when creating a DatagramSocket, only the port number on the local host. A second constructor takes no arguments. When this second constructor is used, the port number is assigned arbitrarily by the operating system.

Constructor Description
DatagramSocket( int port ) Creates a socket on the specified port number.
DatagramSocket() Creates a socket on an available port.

Methods. The two most important methods for the DatagramSocket class are send() and receive(). Each takes as an argument an appropriately constructed DatagramPacket (described in the following section). In the case of the send() method, the data contained in the packet is sent to the specified host and port. The receive() method blocks execution until a packet is received by the underlying socket, at which time the data is copied into the packet provided.

A close() method is also provided, which asks for the underlying socket to be shut down, as is a getLocalPort() method, which returns the local port number associated with the socket. This last method is particularly useful when you let the system pick the port number for you.

The DatagramPacket Class

DatagramPacket objects represent one packet of data that is sent using UDP (using a DatagramSocket).

Constructors. The DatagramPacket class provides two constructors: one for outgoing packets and one for incoming packets. The incoming version takes as arguments a byte array to hold the received data and an int specifying the size of the array. The outgoing version also takes the remote host name (as an InetAddress object) and the port number on that host to send the packet to.

Constructor Description
DatagramPacket( byte[] buffer, int length ) Creates a packet to receive the specified number of bytes into the given buffer.
DatagramPacket( byte[] buffer, int length, InetAddress addr, int port ) Creates a packet to send the specified number of bytes from the given buffer to the host and port given.

Methods. Four methods in the DatagramPacket class allow the data, datagram length, and addressing (InetAddress and port number) information for the packet to be extracted. The methods are named, respectively, getData(), getLength(), getAddress(), and getPort().
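The two packet constructors and the send() and receive() methods of DatagramSocket can be sketched as follows. The example sends one datagram from the local machine to itself; the class name DatagramDemo, the sendToSelf() helper, and the 512-byte buffer size are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class DatagramDemo {
    // Sends text in one UDP packet over the loopback interface and returns
    // what the receiving socket read.
    static String sendToSelf(String text) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 lets the system choose; getLocalPort() reveals the choice.
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(5000); // receive() would otherwise block forever

            // Outgoing packet: buffer, length, destination address, destination port.
            byte[] payload = text.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(payload, payload.length,
                                           loopback, receiver.getLocalPort()));

            // Incoming packet: only a buffer and its size.
            byte[] buffer = new byte[512];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            receiver.receive(incoming);

            // getLength() reports how many bytes of the buffer were filled.
            return new String(incoming.getData(), 0, incoming.getLength(),
                              StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sendToSelf("hello, datagram"));
    }
}
```

UDP makes no delivery guarantee, so real code should expect that receive() may time out even when send() succeeded.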

The InetAddress Class

The InetAddress class represents a host name and its IP numbers. The class itself also provides the functionality to obtain the IP number for a given host name, similar to the C gethostbyname() function on UNIX and UNIX-like platforms.

Constructors. There are no explicit constructors for InetAddress objects. Instead, you use the static class method getByName(), which returns a reference to an InetAddress. Because some hosts may be known by more than one IP address, there also is a getAllByName() method, which returns an array of InetAddress objects.

Methods. In addition to the static methods just listed, the getHostName() method returns a String representation of the host name that the InetAddress represents; the getAddress() method returns an array of the raw bytes of the address. The equals() method compares address objects. The class also supports a toString() method, which returns the host name and IP address in textual form.
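As a small sketch of getByName() and getAddress(), the following class turns the raw address bytes back into dotted-decimal text. It assumes an IPv4 address given as a numeric string, which getByName() accepts without consulting a name service; the class name InetAddressDemo, the dotted() helper, and the address 192.168.1.200 are made up for this example.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class InetAddressDemo {
    // Formats the raw bytes of an IPv4 address as dotted decimal.
    static String dotted(InetAddress address) {
        byte[] raw = address.getAddress();
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < raw.length; i++) {
            if (i > 0) {
                text.append('.');
            }
            // Java bytes are signed, so mask to get the 0-255 value.
            text.append(raw[i] & 0xff);
        }
        return text.toString();
    }

    public static void main(String[] args) throws UnknownHostException {
        // A numeric address is parsed directly; no lookup is needed.
        InetAddress address = InetAddress.getByName("192.168.1.200");
        System.out.println(dotted(address)); // 192.168.1.200
        System.out.println(address);         // textual form from toString()
    }
}
```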

The URLEncoder Class

The URLEncoder class provides a method to encode arbitrary text in the x-www-form-urlencoded format. The primary use for this format is when you are encoding arguments in URLs for CGI scripts. Nonprinting or punctuation characters are converted to a two-digit hexadecimal number preceded by a percent (%) character. Space characters are converted to plus (+) characters.

Constructors. There is no constructor for the URLEncoder class. All the functionality is provided by means of a static method.

Methods. The URLEncoder class provides one static class method, encode(), which takes a String representing the text to encode and returns the translated text as a String.
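A short sketch of the encoding follows. It assumes the two-argument form of encode(), which names the character encoding explicitly; later Java releases deprecate the one-argument form described above because it depends on the platform's default encoding. The class name UrlEncoderDemo and the sample text are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class UrlEncoderDemo {
    // Encodes text in the x-www-form-urlencoded format using UTF-8.
    static String encode(String text) throws UnsupportedEncodingException {
        return URLEncoder.encode(text, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // '=' and '&' become %3D and %26; the space becomes '+'.
        System.out.println(encode("name=Mike Fletcher&lang=Java"));
        // prints name%3DMike+Fletcher%26lang%3DJava
    }
}
```

When building a query string, encode each name and value separately so that the separating = and & characters are not themselves escaped.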

The URLStreamHandler Class

The subclasses of the URLStreamHandler class provide the implementation of objects that know how to open communications streams for different URL protocol types. More information on how to write handlers for new protocols can be found in Chapter 24, "Developing Content and Protocol Handlers."

Constructors. The constructor for the URLStreamHandler class cannot be called because URLStreamHandler is an abstract class.

Methods. Each subclass provides its own implementation of the openConnection() method, which opens a connection to the URL specified as an argument. The method should return an appropriate subclass of the URLConnection class.
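The relationship between a handler and the connection it returns can be sketched with a toy protocol. Everything named here is hypothetical: the demo: protocol, the DemoHandler class, and the read() helper exist only for this example, and the "content" of a demo: URL is simply the text after the colon. The handler is passed directly to a URL constructor, so no factory has to be installed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical handler for a made-up "demo:" protocol.
class DemoHandler extends URLStreamHandler {
    @Override
    protected URLConnection openConnection(URL target) {
        // Return a URLConnection subclass that knows how to serve this protocol.
        return new URLConnection(target) {
            @Override
            public void connect() {
                connected = true; // nothing to open for this toy protocol
            }

            @Override
            public InputStream getInputStream() throws IOException {
                // The "remote content" is just the path part of the URL.
                return new ByteArrayInputStream(getURL().getPath().getBytes("UTF-8"));
            }
        };
    }

    // Builds a URL that uses this handler and reads its content as text.
    static String read(String spec) throws IOException {
        URL url = new URL(null, spec, new DemoHandler());
        try (InputStream in = url.openStream()) {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] buffer = new byte[256];
            int count;
            while ((count = in.read(buffer)) != -1) {
                collected.write(buffer, 0, count);
            }
            return collected.toString("UTF-8");
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(read("demo:hello"));
    }
}
```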

The ContentHandler Class

Subclasses of the ContentHandler abstract class are responsible for turning a raw data stream for a MIME type into a Java object of the appropriate type.

Constructors. Because ContentHandler is an abstract class, ContentHandler objects cannot be instantiated. An object implementing the ContentHandlerFactory interface decides what the appropriate subclass is for a given MIME content type.

Methods. The important method for ContentHandler objects is the getContent() method, which does the actual work of turning the data read using a URLConnection into a Java object. This method takes as its argument a reference to a URLConnection that provides an InputStream positioned at the beginning of the representation of an object.
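As a rough sketch of these ideas, the following hypothetical handler turns text/plain content into a String, and a small factory chooses it by MIME type. The names TextContentHandler and FACTORY are assumptions for this example, and the factory is called directly rather than being installed in the runtime. A temporary file stands in for remote content so that the example runs without a network.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.ContentHandler;
import java.net.ContentHandlerFactory;
import java.net.URLConnection;
import java.nio.file.Files;

// Hypothetical handler that turns a text stream into a String object.
class TextContentHandler extends ContentHandler {
    @Override
    public Object getContent(URLConnection connection) throws IOException {
        // The connection supplies a stream positioned at the start of the data.
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int count;
            while ((count = in.read(buffer)) != -1) {
                collected.write(buffer, 0, count);
            }
            return collected.toString("UTF-8");
        }
    }

    // Hypothetical factory: knows one MIME type and returns null for the rest.
    static final ContentHandlerFactory FACTORY =
        mimeType -> "text/plain".equals(mimeType) ? new TextContentHandler() : null;

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("handler-demo", ".txt");
        file.deleteOnExit();
        Files.write(file.toPath(), "plain text body".getBytes("UTF-8"));

        ContentHandler handler = FACTORY.createContentHandler("text/plain");
        System.out.println(handler.getContent(file.toURI().toURL().openConnection()));
    }
}
```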

The SocketImpl Class

The SocketImpl abstract class provides a mapping from the raw networking classes to the native TCP/IP networking facilities of the host. This means that the Java application does not have to concern itself with the operating system specifics of creating network connections. At runtime, the Java interpreter loads the proper native code for the implementation, which is accessed by means of a SocketImpl object. Each Socket or ServerSocket then uses the SocketImpl object to access the network.

This scheme also allows for flexibility in different network environments. An application does not have to bother with details such as being behind a firewall because the interpreter takes care of loading the proper socket implementation (such as one that knows how to use the SOCKS proxy TCP/IP service).

Tip
SOCKS provides TCP and UDP access through a firewall. A SOCKS daemon runs on the firewall (or the inside machine of a DMZ setup). Clients on the inside network call up the SOCKS daemon and ask it to make a connection to an outside host. The daemon connects to the outside host directly or through another SOCKS daemon. SOCKS is pretty cool because the client application doesn't even know it's there if things are set up properly. For more information about SOCKS, take a look at this URL:
http://www.socks.nec.com/socks5.html

Unless you are porting Java to a new platform or adding support for something such as connecting through a firewall, you probably will never see or use SocketImpl.

Constructors. The SocketImpl abstract class has one constructor that takes no arguments.

Methods. The methods provided by the SocketImpl class look very familiar to anyone who has done socket programming under a UNIX variant. All the methods are protected and may be used only by subclasses of SocketImpl that provide specific socket implementations.

The create() method creates a socket with the underlying operating system. It takes one boolean argument that specifies whether the created socket should be a stream (TCP) or datagram (UDP) socket. Two calls, connect() and bind(), cause the socket to be associated with a particular address and port.

For server sockets, there is the listen() method, which tells the operating system how many connections may be pending on the socket. The accept() method waits for an incoming connection request. It takes another SocketImpl object as a parameter, which represents the new connection once it has been established.

To allow reading and writing from the socket, the class provides the getInputStream() and getOutputStream() methods, which return a reference to the corresponding stream. Once communication on a socket is finished, the close() method may be used to ask the operating system to close the connection. The remaining methods allow read access to the member variables as well as a toString() method for printing a textual representation of the object.

Variables. Each SocketImpl object has four protected members:

Member Description
fd A java.io.FileDescriptor object used to access the underlying operating system network facilities.
address An InetAddress object representing the host at the remote end of the connection.
port The remote port number, stored as an int.
localport The local port number, stored as an int.

Exceptions

Java's exception system allows for flexible error handling. The java.net package defines five new exceptions, which are described in the following sections. All these exceptions provide the same functionality as any java.lang.Exception object. Because each exception is a subclass of java.io.IOException, the exceptions can be handled with code such as that in the following fragment:

try {
    // Code that might cause an exception goes here
} catch( java.io.IOException e ) {
    System.err.println( "Error on socket operation:\n" + e );
    return;
}

This code could be put inside a for loop; for example, when trying to create a Socket to connect to a heavily loaded host.
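One way such a loop might look is sketched below. The class name RetryConnect, the method connectWithRetry(), and the attempt count and pause length are all assumptions for illustration rather than recommended values.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class RetryConnect {
    // Tries to connect up to maxAttempts times, pausing between failures.
    // Returns the connected Socket, or null if every attempt failed.
    static Socket connectWithRetry(InetAddress host, int port,
                                   int maxAttempts, long pauseMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return new Socket(host, port);
            } catch (IOException e) {
                // Every java.net exception is an IOException, so one catch suffices.
                System.err.println("Error on socket operation (attempt "
                                   + attempt + "):\n" + e);
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // A local listener stands in for the heavily loaded host.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket socket = connectWithRetry(server.getInetAddress(),
                                             server.getLocalPort(), 3, 100L);
            System.out.println(socket != null ? "connected" : "gave up");
            if (socket != null) {
                socket.close();
            }
        }
    }
}
```

Returning null after the last attempt is only one possible design; rethrowing the final exception would give the caller more detail.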

The UnknownHost Exception

The UnknownHostException exception is thrown when a host name cannot be resolved into a machine address. The most probable causes for this exception are listed here:

Tip
If you are sure that you are using the right host name and are still getting this exception, you may have to fix the name-to-IP number mapping. How to go about this depends on the platform you are using. If you are using DNS, you must contact the administrator for the domain. If you are using Sun's NIS, you must have the system administrator change the entry on the NIS server. Finally, you may have to change the local machine's host file, usually named hosts or HOSTS (/etc/hosts on UNIX variants, \WINDOWS\HOSTS on Windows 95). In any case, using the IP number itself to connect to the host should work.
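A minimal sketch of handling this exception follows. The class name ResolveDemo and the describe() helper are made up for this example, and the name no-such-host.invalid is assumed to be unresolvable because the .invalid top-level domain is reserved for that purpose. A numeric address is resolved without any lookup, which is why connecting by IP number sidesteps the problem.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class ResolveDemo {
    // Reports either the address a name resolves to or the failure to resolve it.
    static String describe(String host) {
        try {
            InetAddress address = InetAddress.getByName(host);
            return host + " -> " + address.getHostAddress();
        } catch (UnknownHostException e) {
            return host + " could not be resolved";
        }
    }

    public static void main(String[] args) {
        // A numeric address needs no name service at all.
        System.out.println(describe("127.0.0.1"));
        // Assumed not to exist; the lookup itself may take a moment to fail.
        System.out.println(describe("no-such-host.invalid"));
    }
}
```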

The UnknownService Exception

The URLConnection class uses the UnknownServiceException exception to signal that a given connection does not support a requested facility such as input or output. If you write your own protocol or content handlers and do not override the default methods for getting input or output stream objects, the inherited method throws this exception. An application to which a user can give an arbitrary URL should watch for this exception. (Users being the malicious creatures they are!)
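The inherited behavior can be sketched with a connection subclass that overrides nothing except connect(). The class name NoStreamsConnection and the supportsInput() helper are hypothetical, and the www.example.com URL is used only to construct the object; no connection is made to it.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.UnknownServiceException;

// Hypothetical connection that leaves the stream methods at their defaults.
class NoStreamsConnection extends URLConnection {
    NoStreamsConnection(URL url) {
        super(url);
    }

    @Override
    public void connect() {
        connected = true; // nothing to open
    }

    // Returns false if the connection signals that it cannot provide input.
    static boolean supportsInput(URLConnection connection) throws IOException {
        try {
            connection.getInputStream().close();
            return true;
        } catch (UnknownServiceException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        URLConnection connection = new NoStreamsConnection(new URL("http://www.example.com/"));
        System.out.println(supportsInput(connection)); // false
    }
}
```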

The Socket Exception

The SocketException exception is thrown when there is a problem using a socket. One possible cause is that the local port you are asking for is already in use (that is, another process already has the socket open). Some operating systems might wait for a period of time after a socket has been closed before allowing it to be reopened.

Another cause is that the user cannot bind to that particular port. On most UNIX systems, ports numbered less than 1024 cannot be used by accounts other than the root or superuser account. This is a security measure because most well-known services reside on ports in this range. Normal users are not able to start their own server in place of the system version. While you are developing a service, you may want to run the server on a higher numbered port. Once the service has been developed and debugged, you can move it to the normal port.

The SocketException exception is also thrown if you try to use the setSocketImplFactory() method of the Socket or ServerSocket class when the SocketImplFactory already has been set. Usually, the Java interpreter sets this to a reasonable value for you, but if you are writing your own socket factory (for example, to provide sockets through a firewall), this exception may be thrown.

The Protocol Exception

The ProtocolException exception is raised by the underlying network support library. It is thrown by a native method of the PlainSocketImpl class when the underlying socket facilities return a protocol error.

The MalformedURL Exception

The URL class throws the MalformedURLException exception if it is given a syntactically invalid URL. One cause can be that the URL specifies a protocol that the URL class does not support. Another cause is that the URL cannot be parsed. A URL for the HTTP or FILE protocols should have the following general form:

protocol://hostname[:port][/path/.../path]/object

In this syntax, the following components are used:

Component Description
protocol The protocol to use to connect to the resource (http or file).
hostname[:port] The host name to contact, optionally followed by a colon (:) and the port number to connect to (for example, kremvax.gov.su:8000). The host name also may be given as an IP address.
[/path/.../path] The (optional) path to the object, separated by / characters.
object The name of the actual object itself.

The exact syntax for a URL depends on the protocol. The complete URL specification can be found in RFC 1738 (see Chapter 23, "Introduction to Network Programming," for details on retrieving RFC documents, or check out the World Wide Web Consortium's site at http://www.w3.org/ for the latest version).
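A quick way to see both causes of MalformedURLException is to try constructing a few URLs. In this sketch, the class name UrlCheck, the isWellFormed() helper, and the bogus: protocol are assumptions for illustration; constructing a URL object does not contact the host.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlCheck {
    // Returns true if the URL class accepts the text as a URL.
    static boolean isWellFormed(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            System.err.println("Rejected: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Follows the general form shown above.
        System.out.println(isWellFormed("http://kremvax.gov.su:8000/path/object")); // true
        // Cannot be parsed: there is no protocol part.
        System.out.println(isWellFormed("kremvax.gov.su/path/object"));             // false
        // Parses, but names a protocol the URL class has no handler for.
        System.out.println(isWellFormed("bogus://kremvax.gov.su/object"));          // false
    }
}
```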

Other Exceptions

In addition to the exceptions in the java.net package, several methods throw exceptions from the java.io package. The most common of these is java.io.IOException, which is thrown when there is a problem reading a Web resource by the URL class or if there is a problem creating a Socket object.

Interfaces

The java.net package defines three interfaces. These interfaces are used primarily behind the scenes by the other networking classes rather than by user classes. Unless you are porting Java to a new platform or are extending it to use a new socket protocol, you probably will have no need to implement these interfaces in a class. They are included here for completeness and for those people who like to take off the cover and poke around in the innards to find out how things work.

The SocketImplFactory Interface

The SocketImplFactory interface defines a method that returns a SocketImpl instance appropriate to the underlying operating system. The socket classes use an object implementing this interface to create the SocketImpl objects they need to use the network.

The URLStreamHandlerFactory Interface

Classes that implement the URLStreamHandlerFactory interface provide a mapping from protocols such as HTTP or FTP into the corresponding URLStreamHandler subclasses. The URL class uses this factory object to obtain a protocol handler.

The ContentHandlerFactory Interface

The URLConnection class uses the ContentHandlerFactory interface to obtain ContentHandler objects for different content types. The interface has one method, createContentHandler(), which takes the MIME type for which a handler is desired as a String.

Summary

This chapter provided a quick introduction to the networking facilities that the java.net package provides. Appendix C, "The Java Class Library," contains more detailed information about the specific arguments and return types for the various methods.