Chapter 3

Exploiting the Network


Because Java applets are usually prohibited from accessing files or other resources on the system where they are running, it's sometimes difficult to find really useful tasks for applets to do. However, one useful thing that applets can do is access the network. While interacting with a user, applets can retrieve new data from the network or request information from the user and return that information to the server. There may still be some restrictions; for example, Netscape Navigator currently permits an applet to communicate only with the host from which it was loaded. Nevertheless, with the proper support on the server, an applet can get anything it needs from the network.

Additionally, the built-in networking support in the Java library makes exploiting the network easy. This chapter explains several ways to access the network from applets.

Retrieving Data Using URLs

One of the nicest aspects of the core Java library is that it provides built-in classes for using the Internet and the World Wide Web. The URL class and its associated classes are particularly useful, because they provide a simplified, high-level interface to common network operations, especially document retrieval.

A Uniform Resource Locator (URL) can be used to fetch a document of some sort: text, HTML, image, video, audio, or some other type of document, such as a Java class file. The Java library makes it extremely easy to fetch a document from the network using a URL.

Retrieving Typed Data Objects

Here's a short code fragment that shows how to fetch one document using a URL:

URL home = new URL("http://www.utdallas.edu/~glv/");
Object page = home.getContent();

That's actually all there is to it (under some circumstances), although there is usually a little more error handling to do. There are certainly other options that you can use if you want, but it's nice to know that the basics of parsing the URL, finding the remote host, opening the connection, initializing the protocol, requesting the appropriate document, and reading it into a local buffer are all taken care of by the Java library.

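There are actually several ways to create a new URL object, using different types of information:

A single string containing the complete URL
An existing URL object (the context) plus a string containing a relative URL
Separate protocol, host, and file strings
Separate protocol, host, and file strings, plus an integer port number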

You will probably use the first method frequently, the second less often, and the last two methods rarely. Why go to the trouble of parsing a URL into its constituent parts when the URL class will do it for you?
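As a rough sketch of the constructor forms that java.net.URL offers, the following builds the same address four ways. The file name index.html is a made-up example, not something taken from the server named in the text, and the class name UrlConstructors is only for illustration. Recent JDK releases mark these constructors as deprecated, so expect compiler warnings there.

```java
import java.net.MalformedURLException;
import java.net.URL;

final class UrlConstructors {
    /** Builds one address four ways and returns each URL in string form. */
    static String[] buildAll() throws MalformedURLException {
        // 1. From a single string holding the complete URL.
        URL whole = new URL("http://www.utdallas.edu/~glv/index.html");

        // 2. From an existing URL (the context) plus a relative URL.
        URL base = new URL("http://www.utdallas.edu/~glv/");
        URL relative = new URL(base, "index.html");

        // 3. From separate protocol, host, and file strings.
        URL parts = new URL("http", "www.utdallas.edu", "/~glv/index.html");

        // 4. As above, with an explicit port number.
        URL withPort = new URL("http", "www.utdallas.edu", 80, "/~glv/index.html");

        return new String[] {
            whole.toExternalForm(), relative.toExternalForm(),
            parts.toExternalForm(), withPort.toExternalForm()
        };
    }

    public static void main(String[] args) throws MalformedURLException {
        String[] forms = buildAll();
        for (int i = 0; i < forms.length; i++) {
            System.out.println(forms[i]);
        }
    }
}
```

None of these calls opens a network connection; they only parse and store the pieces of the address.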

When you create a URL object, it doesn't open a network connection automatically. That doesn't happen until the object needs to open the connection to satisfy some other request. For example, in the previous code fragment, the URL object connected to host www.utdallas.edu when the getContent call was made.

You might be wondering how to use the document once you fetch it, because the page variable in the example was declared as an Object. The actual type of object that is returned is determined by the data format of the document. If the URL points to an image in GIF format, for example, the object returned will be an Image object. Usually, when you retrieve a URL, you will have some idea of what kind of object you will get. (If you're interested in the details of how the mechanism works, it's explained in Chapter 17, "Network-Extensible Applications with Factory Objects.")

Accessing the Raw URL Stream

If you don't want to get the entire contents of the document all at once, or if you want to operate on the raw byte stream, there is another method. Instead of calling getContent, you can arrange to read the data yourself. The openStream method returns an instance of InputStream from which you can read the document a byte at a time if that suits your needs. By the time the openStream method returns, the protocol has been initialized, and the desired document has been requested; the first byte you read from the input stream will be the first byte of the document.
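A minimal sketch of reading the raw stream might look like the following. The class name RawUrlReader and the 4096-byte chunk size are arbitrary choices for illustration; the only library calls assumed are URL.openStream and the usual InputStream methods.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

final class RawUrlReader {
    /** Reads every byte the URL's stream delivers and returns them. */
    static byte[] readAll(URL source) throws IOException {
        // The connection is opened and the document requested here.
        InputStream in = source.openStream();
        try {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            // read() returns -1 once the end of the document is reached.
            while ((n = in.read(chunk)) != -1) {
                collected.write(chunk, 0, n);
            }
            return collected.toByteArray();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass any URL on the command line, for example a file: URL.
        byte[] data = readAll(new URL(args[0]));
        System.out.println(data.length + " bytes read");
    }
}
```

Reading in chunks rather than a byte at a time is only a convenience; the stream delivers the same bytes either way.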

Actually, a lot of the work for handling URLs is done behind the scenes by a URLConnection object. In fact, when you ask for the input stream, the URL object simply asks its URLConnection object for the input stream and returns it to you. If you need to, you can get a direct handle to the URLConnection object associated with a particular URL object by calling the openConnection method.

Why would you want to have direct access to the connection object? You might want to learn some additional information about the document, not just the document contents. The URLConnection object has several methods that return such information. Here are a few that are commonly useful:

getContentEncoding    The data encoding used for transport
getContentLength      The length of the document in bytes
getContentType        The MIME media type of the document
getExpiration         The document expiration time
getLastModified       The last-modified date of the document

Some protocols can't provide all those values, and the ones that can may not be able to provide them all for every document (for instance, not all documents have an expiration time). Therefore, you should be prepared to take appropriate default action if a particular value is not available.
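One way to handle missing values is sketched below. It relies on the conventions URLConnection uses for unavailable values: null for the string-valued methods, -1 for the length, and 0 for the two dates. The class name DocumentInfo and the word "unknown" as a default are illustrative choices, not part of the library.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.util.Date;

final class DocumentInfo {
    /** Formats the values, substituting "unknown" for any that are missing. */
    static String format(String type, String encoding, int length,
                         long expires, long modified) {
        return "type=" + (type == null ? "unknown" : type)
            + " encoding=" + (encoding == null ? "unknown" : encoding)
            + " length=" + (length < 0 ? "unknown" : String.valueOf(length))
            + " expires=" + (expires == 0 ? "unknown" : new Date(expires).toString())
            + " modified=" + (modified == 0 ? "unknown" : new Date(modified).toString());
    }

    /** Connects to the URL and describes the document it names. */
    static String describe(URL source) throws IOException {
        URLConnection c = source.openConnection();
        String text = format(c.getContentType(), c.getContentEncoding(),
                             c.getContentLength(), c.getExpiration(),
                             c.getLastModified());
        // Asking for header values connects implicitly, so release the stream.
        c.getInputStream().close();
        return text;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(describe(new URL(args[0])));
    }
}
```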

Posting Data to a URL

There's another reason you may want to manipulate a URLConnection object directly: You may want to post data to a URL, rather than just fetching a document. Web browsers do this with data from online forms, and your applets might use the same mechanism to return data to the server after giving the user a chance to supply information.

As of this writing, posting is only supported to HTTP URLs. This interface will likely be enhanced in the future: the HTTP protocol supports two ways of sending data to a URL ("post" and "put"), while FTP, for example, supports only one. Currently, the Java library sidesteps the issue, supporting just one method ("post"). Eventually, some mechanism will be needed to enable applets to exert more control over how URLConnection objects are used for output.

To prepare a URL for output, you first create the URL object just as you would if you were retrieving a document. Then, after gaining access to the URLConnection object, you indicate that you intend to use the connection for output using the setDoOutput method:

URL gather = new URL("http://www.foo.com/cgi-bin/gather.cgi");
URLConnection c = gather.openConnection();
c.setDoOutput(true);

Once you finish the preparation, you can get the output stream for the connection, write your data to it, and you're done:

DataOutputStream out = new DataOutputStream(c.getOutputStream());
out.writeBytes("name=Bloggs%2C+Joe+David&favoritecolor=blue");
out.close();

You might be wondering why the data in the example looks so ugly. That's a good question, and the answer has to do with the limitation mentioned previously: Using URL objects for output is only supported for the HTTP protocol. To be more accurate, version 1.0 of the Java library really only supports output-mode URL objects for posting forms data using HTTP.

For mostly historical reasons, HTTP forms data is returned to the server in an encoded format, where spaces are changed to plus signs (+), line delimiters to ampersands (&), and various other "special" characters are changed to three-character escape sequences (a percent sign followed by two hexadecimal digits, such as %2C for a comma). The original data for the previous example, before encoding, was the following:

name=Bloggs, Joe David
favoritecolor=blue
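You don't have to do this encoding by hand. The sketch below uses the library's java.net.URLEncoder class to encode each name and value, then joins the pairs with ampersands. The class name FormEncoder and the array-of-pairs representation are assumptions made for the example, and the two-argument encode method shown here comes from releases later than 1.0.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FormEncoder {
    /** Encodes name/value pairs as application/x-www-form-urlencoded data. */
    static String encode(String[][] fields) throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                body.append('&');  // separates one field from the next
            }
            body.append(URLEncoder.encode(fields[i][0], "UTF-8"));
            body.append('=');
            body.append(URLEncoder.encode(fields[i][1], "UTF-8"));
        }
        return body.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String[][] fields = {
            { "name", "Bloggs, Joe David" },
            { "favoritecolor", "blue" }
        };
        // prints name=Bloggs%2C+Joe+David&favoritecolor=blue
        System.out.println(encode(fields));
    }
}
```

The names and values are encoded separately so that the equal signs and ampersands that structure the data are left alone.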

If you know enough about HTTP that you are curious about the details of what actually gets sent to the HTTP server, here's a transcript of what might be sent to www.foo.com if the example code listed previously were compiled into an application and executed:

POST /cgi-bin/gather.cgi HTTP/1.0
User-Agent: Java1.0
Referer: http://www.foo.com/cgi-bin/gather.cgi
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Content-type: application/x-www-form-urlencoded
Content-length: 43

name=Bloggs%2C+Joe+David&favoritecolor=blue

Java takes care of building and sending all those protocol headers for you, including the Content-length header, which it calculates automatically. The reason you currently can send only forms data is that the Java library assumes that's all you will want to send. When you use an HTTP URL object for output, the Java library always labels the data you send as encoded form data.

Once you send the forms data, how do you read the resulting output from the server? The URLConnection class is designed so that you can use an instance for both output and input. It defaults to input-only, and if you turn on output mode without explicitly setting input mode as well, input mode is turned off. If you do both explicitly, however, you can both read and write using a URLConnection:

c.setDoOutput(true);
c.setDoInput(true);

The only unfortunate thing is that, although URLConnection was designed to make such things possible, version 1.0 of the Java library doesn't support them properly. As of this writing, a bug in the library prevents you from using a single HTTP URLConnection for both input and output.
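Later releases of the Java library do allow a single HTTP URLConnection to be used for both output and input. The following is a sketch of how that might look on such a release, not something that works on version 1.0. The class name FormPoster is hypothetical, and the data passed in is assumed to be encoded already.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLConnection;

final class FormPoster {
    /** Posts already-encoded form data and returns the server's reply as text. */
    static String post(URL target, String encodedForm) throws IOException {
        URLConnection c = target.openConnection();
        c.setDoOutput(true);
        c.setDoInput(true);
        c.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");

        // Write the form data first ...
        OutputStream out = c.getOutputStream();
        try {
            out.write(encodedForm.getBytes("US-ASCII"));
        } finally {
            out.close();
        }

        // ... then ask for the input stream to read the server's response.
        InputStream in = c.getInputStream();
        try {
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                reply.write(chunk, 0, n);
            }
            return reply.toString("UTF-8");
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the URL to post to; args[1] is the encoded form data.
        System.out.println(post(new URL(args[0]), args[1]));
    }
}
```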

Communication Using Sockets

Occasionally, you might find that network communication using URLs is too inflexible for your needs. URLs are great for document retrieval, but for more complicated tasks it's often easier to just open a connection directly to some server and do the protocol handling yourself. Applets can use the low-level networking facilities of the Java library directly, bypassing the URL-based mechanisms. The Java networking support is based on the socket model. Like the URL class, Java's Socket classes are easy to use.

When you create a socket, you supply a host for connection (either a String containing the hostname or an InetAddress) and an integer port number on that host. Simply creating the object causes the connection to be made:

Socket comm = new Socket("www.javasoft.com", 80);

Either the socket object will successfully create the connection, or it will throw an exception.

Once that's done, the Socket class really has only three interesting methods:

close              Closes the connection
getInputStream     Gets the input stream for the socket
getOutputStream    Gets the output stream for the socket

Using the stream objects, you can read from the socket or write to it, communicating with the process on the other end of the network connection. As with all InputStream and OutputStream objects, you can use any of the filter streams defined in the java.io package or write your own to help with the I/O you need to do. (See Chapter 5, "Building Special-Purpose I/O Classes," for more details about stream classes.) Once the socket is connected, you can write to it and read from it using the same operations you use on files or other I/O streams. When you are ready to end the communication, you can close the connection.
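To make the sequence concrete (connect, write, read, close), here is a small sketch of a client that sends one line and reads one line back. So that it can be tried without any outside server, it includes a throwaway echo server that listens on a free local port. The names LineClient, exchange, and startEchoOnce are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

final class LineClient {
    /** Connects, sends one line, and returns the first line of the reply. */
    static String exchange(String host, int port, String line) throws IOException {
        Socket comm = new Socket(host, port);  // connects immediately or throws
        try {
            OutputStream out = comm.getOutputStream();
            out.write((line + "\n").getBytes("US-ASCII"));
            out.flush();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(comm.getInputStream(), "US-ASCII"));
            return in.readLine();
        } finally {
            comm.close();
        }
    }

    /** Starts a one-shot echo server on a free local port, for trying the client. */
    static ServerSocket startEchoOnce() throws IOException {
        final ServerSocket listener = new ServerSocket(0);
        Thread t = new Thread(new Runnable() {
            public void run() {
                try {
                    Socket peer = listener.accept();
                    BufferedReader in = new BufferedReader(
                        new InputStreamReader(peer.getInputStream(), "US-ASCII"));
                    String got = in.readLine();
                    peer.getOutputStream().write((got + "\n").getBytes("US-ASCII"));
                    peer.close();
                } catch (IOException e) {
                    // The demo server simply stops on any I/O problem.
                }
            }
        });
        t.setDaemon(true);
        t.start();
        return listener;
    }

    public static void main(String[] args) throws IOException {
        ServerSocket listener = startEchoOnce();
        System.out.println(exchange("127.0.0.1", listener.getLocalPort(), "hello"));
        listener.close();
    }
}
```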

There's really only one other small detail to know about sockets: Each of the constructors can also take a third parameter in addition to the host and port. The extra parameter is a boolean value that indicates whether you want a stream or a datagram socket. If the third parameter is true, the socket is a stream socket (the default). If it is false, a datagram socket is created. Stream sockets involve a little more overhead than datagram sockets, but with a substantial benefit: Stream sockets are reliable. The connection might be cut off during use, but as long as it's active, bytes are guaranteed to be delivered in the same order they are sent. Datagram sockets, on the other hand, are unreliable: Packets can be dropped, and the recipient will never know they were sent. There are some circumstances where datagram sockets are appropriate, but most Java programmers use stream sockets almost exclusively.
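For datagram traffic, the java.net package also provides separate DatagramSocket and DatagramPacket classes. The sketch below uses those classes, rather than the boolean constructor parameter, to send one packet across the loopback interface. The class name DatagramDemo, the 512-byte buffer, and the two-second timeout are arbitrary choices for illustration. Because a datagram can be lost, the receiver sets a timeout instead of waiting forever.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

final class DatagramDemo {
    /** Sends one datagram to a local receiver and returns the text received. */
    static String sendAndReceive(String text) throws IOException {
        InetAddress loopback = InetAddress.getByName("127.0.0.1");
        DatagramSocket receiver = new DatagramSocket(0, loopback);  // any free port
        DatagramSocket sender = new DatagramSocket();
        try {
            // Give up after two seconds rather than wait for a lost packet.
            receiver.setSoTimeout(2000);

            byte[] payload = text.getBytes("US-ASCII");
            sender.send(new DatagramPacket(payload, payload.length,
                                           loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[512];
            DatagramPacket incoming = new DatagramPacket(buffer, buffer.length);
            receiver.receive(incoming);  // throws if nothing arrives in time
            return new String(incoming.getData(), 0, incoming.getLength(), "US-ASCII");
        } finally {
            sender.close();
            receiver.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(sendAndReceive("ping"));
    }
}
```

Note that there is no connection step: each packet carries its own destination address and port.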

A Socket Redirection Server

What can an applet do if its networking access is restricted? If it can't access the network at all, it might not be very useful. But the more common situation, where an applet is permitted to connect back only to the machine from which it was fetched, is a little inconvenient but isn't a serious barrier if the proper support exists on the source machine.

If you are writing applets that need to connect to other machines while they're running and the applets can't get by with connecting only to the source machine, you may want to run a relay server. The relay server process runs on your Web server, accepts socket connections on behalf of your applets, finds out where they really want to connect, and then forwards the connections on to the real destination. This has disadvantages; for instance, it can increase the load on your server machine (and the network nearby) if your applets are used frequently. Nevertheless, it is an effective way of enabling your applets to get access to data when they are not permitted to fetch it directly.

The following three code listings show how to build such a server and the applet interface to it. To save space, this implementation is quite simple and crude, and could be improved in many ways, but it does illustrate the concepts. The listings' weaknesses are pointed out along the way, and perhaps you'll be able to improve them to suit your needs.

The BouncedSocket Class

Listing 3.1 shows the BouncedSocket class, which is a rough replacement for the Socket class. Socket is declared final in the Java library, so this replacement can't be a subclass of the real thing; that's unfortunate, because BouncedSocket would be more useful as a subclass of Socket. BouncedSocket works the same way as Socket, however, so it can be used in the same circumstances.


Listing 3.1. BouncedSocket.java.
/*
* BouncedSocket.java                 1.0 96/03/04 Glenn Vanderburg
*/

package COM.MCP.Samsnet.tjg;

import java.io.*;
import java.net.*;

/**
* A replacement for the Socket class which redirects through a
* SocketBounceServer, to connect to hosts which would otherwise
* not be allowed.
*
* @version     1.0, 03 Mar 1996
* @author      Glenn Vanderburg
*/

public
class BouncedSocket {

    // The Socket class is final, so unfortunately I can't extend it.  That
    // means this class can't be used as a drop-in replacement.  Oh, well.

    // The place we *really* want to connect to ...
    private InetAddress realaddr;
    private String realhost;
    private int realport;
    
    // The real Socket object which we use to communicate
    private Socket realsock;

    public static final int DEFAULTSERVERPORT = 12223;

    /**
     * Creates a new BouncedSocket
     *
     * @param host The real host that we ultimately want to talk to
     * @param port The port number on host
     * @param bouncehost The host where the SocketBounceServer is running
     * @param bounceport The SocketBounceServer port
     */
    public BouncedSocket(String host, int port, String bouncehost, int bounceport)
            throws IOException, UnknownHostException
    {
        // Remember the real destination so getPort() and toString() report it.
        realhost = host;
        realport = port;

        realsock = new Socket(bouncehost, bounceport);
        DataOutputStream out
            = new DataOutputStream(realsock.getOutputStream());
        DataInputStream in
            = new DataInputStream(realsock.getInputStream());

        out.writeBytes("Host: " + host + "\nPort: " + port + "\n");
        out.flush();

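        // readLine() strips the line terminator and returns null at end of stream.
        String ack = in.readLine();
        if (ack == null) {
            throw new IOException("bounce server closed the connection");
        }
        else if (ack.equals("UnknownHost")) {
            throw new UnknownHostException(host);
        }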
        else if (ack.startsWith("IOException: ")) {
            throw new IOException(ack.substring(13));
        }
        else if (ack.startsWith("Connected: ")) {
            realaddr = InetAddress.getByName(host);
        }
        else {
            throw new IOException(ack);
        }
    }

    /**
     * Gets the address to which the socket is connected.
     */
    public InetAddress getInetAddress() {
        return realaddr;
    }

    /**
     * Gets the remote port to which the socket is connected.
     */
    public int getPort() {
        return realport;
    }

    /**
     * Gets the local port to which the socket is connected.
     */
    public int getLocalPort() {
        return realsock.getLocalPort();
    }

    /**
     * Gets an InputStream for this socket.
     */
    public InputStream getInputStream() throws IOException {
        return realsock.getInputStream();
    }

    /**
     * Gets an OutputStream for this socket.
     */
    public OutputStream getOutputStream() throws IOException {
        return realsock.getOutputStream();
    }

    /**
     * Closes the socket.
     */
    public synchronized void close() throws IOException {
        realsock.close();
    }

    /**
     * Converts the Socket to a String.
     */
    public String toString() {
        return "BouncedSocket[addr=" + realaddr + ",port=" + realport
            + ",localport=" + realsock.getLocalPort()
            + " via SocketBounceServer, addr=" + realsock.getInetAddress()
            + ",port=" + realsock.getPort() + "]";
    }
}

A BouncedSocket uses a real socket to communicate with the server, but it needs extra variables to store information about the real goal of the communication. Like the real Socket class, BouncedSocket initializes the connection when the object is created. It connects to the server and uses a very simple protocol to tell the server where to connect. The server returns an indicator of the success or failure of its own connection attempt. If something went wrong, BouncedSocket throws an exception; otherwise, the constructor returns, and the connection is established.

All the real work of BouncedSocket is done either in the constructor or by the real Socket object. The rest of the methods simply supply information about the connection or forward socket operations to the real socket.
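The client half of the protocol is small enough to isolate into two functions, which makes the message formats easier to see. This is only a sketch: the class name BounceHandshake and its method names are invented here and are not part of the listings. It assumes the reply line has already had its line terminator removed, as readLine does.

```java
import java.io.IOException;
import java.net.UnknownHostException;

final class BounceHandshake {
    /** Formats the two-line request the client sends right after connecting. */
    static String request(String host, int port) {
        return "Host: " + host + "\nPort: " + port + "\n";
    }

    /**
     * Interprets the server's one-line reply. Returns the address text that
     * follows "Connected: " on success; throws an exception otherwise.
     */
    static String checkAck(String ack, String host) throws IOException {
        if (ack == null) {
            throw new IOException("bounce server closed the connection");
        }
        if (ack.equals("UnknownHost")) {
            throw new UnknownHostException(host);
        }
        if (ack.startsWith("IOException: ")) {
            throw new IOException(ack.substring("IOException: ".length()));
        }
        if (ack.startsWith("Connected: ")) {
            return ack.substring("Connected: ".length());
        }
        throw new IOException("unexpected reply: " + ack);
    }

    public static void main(String[] args) throws IOException {
        System.out.print(request("www.javasoft.com", 80));
        System.out.println(checkAck("Connected: www.javasoft.com", "www.javasoft.com"));
    }
}
```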

The SocketBounceServer Class

Listing 3.2 shows the server side of the operation: the SocketBounceServer class. This is the simplest part, largely because it uses a helper class, SocketBouncer, to do most of the work.


Listing 3.2. SocketBounceServer.java.
/*
* SocketBounceServer.java            1.0 96/03/04 Glenn Vanderburg
*/

package COM.MCP.Samsnet.tjg;

import java.io.IOException;
import java.net.*;

/**
* A server which forwards socket operations to a host which may
* not be accessible to another host.
*
* @version     1.0, 03 Mar 1996
* @author      Glenn Vanderburg
*/

public
class SocketBounceServer {

    static int portnum = 12223;

    public static void
    main (String args[]) {
        if (args.length == 1) {
             portnum = Integer.valueOf(args[0]).intValue();
        }

        try {
            ServerSocket listener = new ServerSocket(portnum);

            while (true) {
                Socket connection = listener.accept();
                Thread t = new Thread(new SocketBouncer(connection));
                t.start();
            }
        }
        catch (IOException e) {
            System.err.println("IO Error creating listening socket on port "
                               + portnum);
            return;
        }
    }
}

SocketBounceServer is a stand-alone application rather than an applet. It creates a ServerSocket so that it can camp on a port and wait for clients to connect. Each time a connection is accepted, the server simply creates a new SocketBouncer instance to handle that particular connection, starts the SocketBouncer running in its own thread, and goes back to wait for another connection.

The SocketBouncer Class

The SocketBouncer class is the interesting part of the server side. In reality, a single SocketBouncer instance handles the communication in only one direction, and it takes two of them to handle one client. The first one is responsible for the rest of the connection setup. It must find out from the client which machine to connect to, make the new connection, and then create the other SocketBouncer object (also in a separate thread, to avoid deadlocks). Only then can it begin forwarding data from the client. Listing 3.3 shows the code for SocketBouncer.


Listing 3.3. SocketBouncer.java.
/*
* SocketBouncer.java                 1.0 96/03/04 Glenn Vanderburg
*/

package COM.MCP.Samsnet.tjg;

import java.io.*;
import java.net.*;

/**
* Handles bouncing for one BouncedSocket client, in one direction only.
*
* @version     1.0, 03 Mar 1996
* @author      Glenn Vanderburg
*/

public
class SocketBouncer implements Runnable {

    private Socket readsock;
    private Socket writesock;

    public
    SocketBouncer (Socket readsock) {
        this.readsock = readsock;
    }

    public
    SocketBouncer (Socket readsock, Socket writesock) {
        this.readsock = readsock;
        this.writesock = writesock;
    }

    public void
    run () {
        if (writesock == null) {
            String host;
            int port;
            DataInputStream in;
            DataOutputStream out;
            
            try {
                in = new DataInputStream(readsock.getInputStream());
                out = new DataOutputStream(readsock.getOutputStream());
            
                String line = in.readLine();
                if (line != null && line.startsWith("Host: ")) {
                    host = line.substring(6);
                }
                else {
                    out.writeBytes("IOException: expecting hostname\n");
                    out.flush();
                    readsock.close();
                    return;
                }

                line = in.readLine();
                if (line != null && line.startsWith("Port: ")) {
                    port = Integer.valueOf(line.substring(6)).intValue();
                }
                else {
                    out.writeBytes("IOException: expecting port number\n");
                    out.flush();
                    readsock.close();
                    return;
                }

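                try {
                    writesock = new Socket(host, port);
                }
                catch (UnknownHostException e) {
                    out.writeBytes("UnknownHost\n");
                    out.flush();
                    throw e;
                }
            
                out.writeBytes("Connected: " + writesock.getInetAddress()
                               + "\n");
                out.flush();
            }
            catch (IOException e) {
                // Setup failed, so close whatever was opened and give up.
                // (Closing in a finally block would also close the sockets
                // after a successful setup, before any data was forwarded.)
                try {
                    readsock.close();
                    if (writesock != null) {
                        writesock.close();
                    }
                }
                catch (IOException ignored) {
                }
                return;
            }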
            
            Thread t = new Thread(new SocketBouncer(writesock, readsock));
            t.start();
        }

        try {
            InputStream in = readsock.getInputStream();
            OutputStream out = writesock.getOutputStream();
            byte b[] = new byte[32768];
            int l;
        
            while ((l = in.read(b)) > 0) {
                out.write(b, 0, l);
            }

            out.close();
        }
        catch (IOException e) {
        }
        finally {
            try {
                readsock.close();
            }
            catch (Throwable t) {
            }
        }
    }
}

SocketBouncer handles the server end of the simple protocol that was introduced in BouncedSocket. That protocol is just one of the weak points of this example implementation. It is clumsy, and the error handling is poor. Furthermore, once the connection is completely established, there is no way for the server to communicate information about error conditions to the BouncedSocket object on the client side so that it can throw an appropriate exception there. A more thorough, robust socket proxy protocol would be a better choice.

Another weakness is that the example doesn't check for sockets that have been idle for a long period of time. If a network connection is broken, one or more of the threads in the server might wait for a very long time.
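One possible way to deal with idle connections, on releases of the Java library that provide Socket.setSoTimeout, is to put a time limit on each read. The sketch below is a variation on the forwarding loop from Listing 3.3. The class name IdleAwareCopier and the decision to treat a timeout like end of stream are assumptions made for the example, not part of the original listings.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class IdleAwareCopier {
    /**
     * Copies bytes from the socket to the output until end of stream, or
     * until the socket has been silent for idleMillis milliseconds.
     * Returns the number of bytes copied.
     */
    static long copy(Socket from, OutputStream to, int idleMillis)
            throws IOException {
        from.setSoTimeout(idleMillis);  // reads now fail after this much silence
        InputStream in = from.getInputStream();
        byte[] b = new byte[32768];
        long total = 0;
        try {
            int l;
            while ((l = in.read(b)) > 0) {
                to.write(b, 0, l);
                total += l;
            }
        }
        catch (SocketTimeoutException idle) {
            // The peer went quiet; stop forwarding as if the stream had ended.
        }
        to.flush();
        return total;
    }
}
```

The caller remains responsible for closing both sockets, just as the finally block in SocketBouncer does.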

In spite of these weaknesses, the example implementation illustrates the basic concepts of a connection-forwarding server, which may be the solution to your applet's communication needs.

Summary

Applets can use the Java library's networking classes to get help from remote servers, allowing them to perform useful tasks. Applets can use URLs to fetch resources from standard network servers (such as HTTP servers or FTP servers), and the resources can be typed media objects or simply streams of character data. In some situations, a URL can be used to send information back from the applet to the server. For more general client/server interactions, the applet can use a socket to perform complicated interactions with specialized servers that perform a part of the applet's function.

Current applet security restrictions allow an applet to make network connections only to the machine from which the applet was loaded, but with the help of a server on that machine, an applet can effectively connect to any machine. This chapter contains an example of such a socket relay server.