Chapter 25

Intranet Boilerplate Library


Experienced word processors know one of the most valuable features of their word processing software is the ability to reuse documents, and parts of documents, without retyping them. For even the fastest typists, copying a paragraph, page, or an even larger part of one document to another makes words-per-minute measurements obsolete. Compared to typing a block of text from scratch, using the paste command is an obvious gain in efficiency.

Although virtually everyone who uses a word processor can profit from its cut, copy, and paste features to save time and typing, organizations that generate large volumes of similar documents will find boilerplate text invaluable. In this chapter, we'll look at how two such organizations, the Public Inquiries Department of a government agency and a law firm, can build an Intranet Boilerplate Library.

Your steps in putting your Intranet boilerplate library together involve assembly, analysis, and conversion of available legacy word processing documents. Then, just as you did in Part III of this book, you'll prepare to go live with the data on your Intranet by configuring the Web server MIME map; building an HTML page of document descriptions and hyperlinks; and configuring the browser helper applications. If you read Chapter 13, "Word Processing on the Web," you are already very familiar with these steps and I won't waste your time repeating them here.

Preparing to Serve Existing Documents

If you're interested in creating an Intranet boilerplate library, your organization is probably already running some sort of operation that reuses the same documents repeatedly. The example organizations in this chapter (the Department of Public Inquiries, or DPI, and a law office) both do. The former answers letters from the public or from legislators about government programs, often answering the same questions again and again. The latter assembles legal briefs and other legal documents, many of which are made up largely of boilerplate language.

In both cases, it's likely that a large number of reusable documents are already available for your Intranet boilerplate library. Your organization might or might not have organized these documents into a formalized process, so you may have to search out documents that can be used. Although you probably won't want to search customers' hard disks or fileserver directories for candidate documents, you'll want to encourage your customers to contribute information for your library.

Handling Word Processor File Formats

You'll recall from our earlier work that you'll need to get a survey of the word processors in use within your organization. If you determine that more than one program is popular, you'll need to deal with the file format issue on the Web server. You can do this in a couple of ways:

HTML Presentation on the Server

As you've learned, the basic setup of Web pages containing straightforward, clickable lists of available documents is quite easy. Adding a little subject-matter organization is simple, too, using hyperlinks to create nested menu listings and adding explanatory text to the pages. In just a few minutes, you can present a useful list of available word processing documents to your customers.
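A page like this can also be generated rather than typed by hand. What follows is only a minimal sketch in Python, not a method taken from this book: the function name `build_listing`, the file names, and the descriptions are all invented for illustration.

```python
from html import escape


def build_listing(title, documents):
    """Return an HTML page that lists (href, description) pairs as clickable links."""
    items = "\n".join(
        f'  <li><a href="{escape(href, quote=True)}">{escape(text)}</a></li>'
        for href, text in documents
    )
    return (
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n"
        f"<ul>\n{items}\n</ul>\n"
        "</body></html>\n"
    )


# Hypothetical file names and descriptions, for illustration only.
page = build_listing(
    "Boilerplate Library",
    [
        ("benefits/eligibility.doc", "Eligibility rules: standard reply"),
        ("benefits/appeals.wpd", "How to appeal a decision"),
    ],
)
print(page)
```

Saving the printed output as an .html file in the Web server's document tree gives customers a clickable list. The descriptions are the part worth spending time on, because they are what customers read when choosing a document.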

Obviously, this is an effective but limited way of doing business. Its advantage is that you can get it up and running very quickly. Later in this chapter, I'll have a bit more to say about how you can better organize the Web site presentation to the end user.

Client Configuration

Client configuration has been dealt with in several earlier chapters, so I will address it here only by way of review. It boils down to two steps: 1) MIME configuration in the Web browser to reference the relevant helper application; and 2) the actual installation of the word processor helper application (for example, Microsoft Word). See Chapter 13 for step-by-step instructions.
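The two steps amount to a pair of lookups: file extension to MIME type, then MIME type to helper application. The Python sketch below only illustrates that chain; it is not how any particular browser stores its settings. The MIME type strings and program paths are assumptions to be checked against your own server and desktops.

```python
import os

# Assumed extension-to-MIME-type map; confirm each entry against your Web server's MIME map.
MIME_MAP = {
    ".doc": "application/msword",
    ".rtf": "application/rtf",
    ".wpd": "application/wordperfect5.1",
}

# Hypothetical helper application paths; yours will differ.
HELPERS = {
    "application/msword": r"C:\MSOFFICE\WINWORD\WINWORD.EXE",
    "application/rtf": r"C:\MSOFFICE\WINWORD\WINWORD.EXE",
    "application/wordperfect5.1": r"C:\WPWIN\WPWIN.EXE",
}


def helper_for(filename):
    """Return (mime_type, helper_path) for a file name, based on its extension."""
    extension = os.path.splitext(filename)[1].lower()
    mime_type = MIME_MAP.get(extension, "application/octet-stream")
    return mime_type, HELPERS.get(mime_type)


print(helper_for("reply-to-senator.doc"))
print(helper_for("unknown.xyz"))
```

If either lookup fails, the browser has no helper to start, which is why both the server's MIME map and the client's helper configuration have to agree on the type name.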

Note
Some office applications support the notion of licensing and installing one copy on a server, as opposed to a full client installation on every desktop. The main advantages are that it saves disk space on the clients and gives system administrators an easier job configuring and upgrading the application. The disadvantages are that it takes slightly longer to load the application across the network and that some applications don't distinguish individual user preferences based on a login ID. That can lead to a situation where, say, I set the default file location to point to my drive, and then you override it by setting it to point to your drive. We would be in an endless loop unless the application stored user preferences on each client or kept each user's preferences separately on the server.

Once you have taken these steps, your Intranet boilerplate library's customers can use their Web browsers to retrieve, view, and interact with documents as necessary, just as they would with any other Web-page hyperlink. Customers' word processor helper applications are fired off as documents are accessed. For example, clicking hyperlinks pointing to WordPerfect files causes that program to start with a copy of the retrieved document loaded. Having located the necessary document, customers are all set to save, print, and edit the document for their own purposes. More importantly, for purposes of the Intranet boilerplate library, customers are able to use the downloaded documents as the framework for new ones, with boilerplate language intact. They'll even be able to assemble documents from multiple original source documents.

Note
Web browser helper applications, such as your customers' word processors, always operate on a copy of the original document. Your original document on your Web server remains unchanged until you change it. Customers can freely change the documents they've opened with a Web browser helper application for their own needs, all without touching the original.

Use of Your Intranet Boilerplate Library

Government agencies receive a lot of mail, including letters from the public, from legislators exercising constituent service and general legislative oversight, and from other government agencies. As does your Help Desk (see Chapter 23, "Intranet Help Desk"), your DPI gets repeated, similar questions and probably has a large library of canned responses to these common questions. You don't want to send an often-photocopied form letter in response to such questions, particularly to an influential legislator who may have control over your agency's budget. As a result, your DPI finds itself creating what it wants to appear to be original responses to these inquiries.

In many cases, such responses are nothing more than a cobbling together of several off-the-shelf, stock paragraphs, with a few personalizing edits to make the letter look original. Your word processor's copy-and-paste function is a critical part of this process, but you might not have a way of easily finding the particular stock paragraphs you want, or an easy way of assembling those paragraphs into a completely new document.

In a way, law firms are like DPIs, in that they generate loads of documents consisting largely of standard, off-the-shelf language, and also develop altogether new documents that often include lengthy quotations from legal opinions, court cases, and other existing documents. Attorneys give cut-and-paste documents to clerks for typing, with text from prior briefs, photocopied pages from court decisions, and the like. Much of the text may have come from existing documents already online as word processor document files, and some court systems are making electronic copies of court documents available. It goes without saying that it profits the firm if the cut-and-paste documents can be assembled by the clerical staff from available electronic boilerplate.

High-volume DPI organizations and large government agencies (such as the U.S. Social Security Administration), which receive hundreds of thousands of inquiries a year, may be able to contract for an industrial-strength document management system to meet these needs. Depending on its size and way of doing business, a law firm may need to do the same thing. Like other custom applications, these systems may come at very high cost. Although very large operations may require such custom document management systems, your Intranet boilerplate library can replicate most of their features at substantially lower cost. As with the data warehouse packages, you face trade-offs between cost and capability.

The Boilerplate Home Page

Let's revisit an aspect of the boilerplate project that I touched on lightly above, namely the creation of what you might call its home page. This is just like any other Web home page, consisting of ordinary HTML markup, introductory text, graphics (if you want to include them), and a top-level set of clickable hyperlinks. For your word-processing staff, you may want to configure their Web browsers to start with this home page; if not, be sure to include a link to your home page on whatever startup page they use.

The hyperlinks on the home page may be organized in several ways. The best may be an organization by subject, with the home page main links leading down a hierarchy of subject matter that enables customers to perform a top-down search for documents matching their needs. For example, you might provide just ten or fewer broad top-level subjects with branches leading to more specific subjects. Although it's not a boilerplate operation by any means, take a look at the Yahoo Search Engine home page, shown in Figure 25.1, for a famous example of the hierarchical approach.

Figure 25.1: The Yahoo top-level subject layout rises above a massive hierarchical structure.

Walk Down the Subject Tree to Live Documents

Each of the Yahoo search engine pages presents a dozen or so very broad subject categories, such as Science & Technology or Business and Economy. Selecting one of these links takes you to progressively more specialized subjects within the general subject matter tree, finally leading you to individual Web pages and documents you can view with your Web browser.

Your Intranet boilerplate library can be laid out just the same way, with a subject matter breakdown based on the documents in your own library. In fact, you can almost certainly port over the structure of your existing paper file cabinet, with top-level links representing, say, drawers, each of which contains organized folders of individual documents.
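If the documents already sit in directories that mirror the drawers and folders, the subject tree can be produced from the directory tree. The Python sketch below is one assumed way to do that. It puts the whole tree on a single page as nested lists, and the function name `tree_to_html` is made up for this example; a true multi-page hierarchy would write one page per directory instead.

```python
import os
from html import escape
from urllib.parse import quote


def tree_to_html(root, base_url=""):
    """Render a directory tree as nested <ul> lists.

    Directories (drawers, folders) become list headings; files become links.
    """
    lines = ["<ul>"]
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        url = f"{base_url}/{name}" if base_url else name
        if os.path.isdir(path):
            lines.append(f"<li>{escape(name)}")
            lines.append(tree_to_html(path, url))
            lines.append("</li>")
        else:
            # quote() percent-encodes spaces and similar characters in the link target.
            lines.append(f'<li><a href="{quote(url)}">{escape(name)}</a></li>')
    lines.append("</ul>")
    return "\n".join(lines)
```

Calling `tree_to_html` on the top-level library directory returns the markup to drop into the body of the home page. Renaming directories to readable subject names before running it gives customers better headings than raw folder names.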

The only difference between your Intranet boilerplate library and these search engine pages is that when customers reach the individual-document level and select a hyperlink, the document pops up in their word processor rather than in the Web browser. At this point, customers have a live document with which they can do more than just look. For example:

The last item in the preceding list deserves a bit more attention here. Your word processor can be custom-tailored to the needs of your Intranet boilerplate library operation. Besides the actual library of reusable documents you've created on your Intranet, you can also configure and share document style sheets and formatting templates; macro commands for inserting pieces of boilerplate too short to put in your library; custom spelling-checker dictionaries; and many other word-processing-specific facilities. Doing so not only adds efficiency to your operation, but it also brings about uniformity in the look of your documents and in the way they are produced.

Although it's true these things are not really part of your Intranet, they're commonsense means of streamlining the work that a boilerplate operation like a DPI or law firm does.

Note
You may recall the discussion in Chapter 14 of the critical difference between spreadsheets saved as plain text (simple tabular rows and columns of numbers) and live spreadsheets (with formulae and macros already built in). This same reasoning applies to custom document management systems. Being able to retrieve the text of a boilerplate document on-screen may be one thing; being able to retrieve it directly into your word processor and use it immediately (with style sheets and macros attached) is something else altogether.

Documents and More Documents

Customers' Web browsers are still active after having retrieved a document into their word processors. As a result, there's nothing to stop them from popping back into the Web browser window and locating and retrieving additional documents. Unless your boilerplate library has all its stock paragraphs in just a few large documents (which would defeat many of your purposes), it's likely that customers will need to open several documents to retrieve all the necessary stock language to assemble their final documents. Continued browsing and retrieving documents, however, can result in multiple copies of the customer's word processor running concurrently. With some word processors, each new retrieval loads a new copy of the customer's word processor, not just a new document in the already-open copy of the word processor. This can result in a badly cluttered screen (and possibly badly confused customers). It can also result in out-of-memory errors.

It's possible to manage this multiple-document situation in several ways:

Managing Your Boilerplate Library

Even the longest-lived, largest boilerplate operations will occasionally generate a completely original document. More frequently, though, incremental changes in stock language are made. In either case, it's important to ensure that new and revised documents get placed in your library so that everyone can retrieve them. Keeping your library up-to-date is important, and how you go about doing it goes back to some of the organizational choices you considered back in the early chapters of this book.

If your boilerplate operation is a large and/or critical one, with frequent document changes that must be put in place quickly, you'll probably want a Web server right in the department, with a local Webmaster to update the documents promptly, rather than relying on a central MIS department to maintain one for you.

Supervisors or others with the authority to update documents on your Intranet boilerplate library Web server can use facilities such as the TCP/IP File Transfer Protocol (FTP) to upload files to the Web server from their PCs or workstations. As you learned in Chapter 9, because you have the TCP/IP networking infrastructure to support World Wide Web services, you also have FTP and many other networking capabilities with which to supplement the Web services on your Intranet.
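Where updates are frequent, the upload can be scripted instead of done by hand. The following Python sketch uses the standard `ftplib` module. The helper names `upload_document` and `publish`, and whatever host, account, and directory you pass in, are placeholders for this example rather than values from this chapter.

```python
from ftplib import FTP


def upload_document(ftp, local_path, remote_name):
    """Upload one file in binary mode over an already logged-in FTP connection."""
    with open(local_path, "rb") as handle:
        return ftp.storbinary(f"STOR {remote_name}", handle)


def publish(host, user, password, remote_dir, local_path, remote_name):
    """Log in, change to the library directory, and upload a single document."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(remote_dir)
        return upload_document(ftp, local_path, remote_name)
```

Word processor files must travel in binary mode, which is why the sketch uses `storbinary`; a text-mode transfer can alter line endings and corrupt the document.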

Netscape Navigator 2.0 even has FTP upload capabilities built in, so customers don't need to learn to use a stand-alone FTP client (such as CuteFTP). You can access this feature when browsing an FTP server by pulling down the File menu and selecting Upload File, but first you'll need to specify a username, and possibly a password, in the FTP URL. For example:

ftp://poweruser:password@ftp.yourcompany.com
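To see which part of such a URL is which, you can take it apart with Python's standard `urllib.parse` module. This is purely an illustration of the URL's structure, not something Netscape requires you to do.

```python
from urllib.parse import urlsplit

# The example URL from the text, split into its components.
parts = urlsplit("ftp://poweruser:password@ftp.yourcompany.com")

print("scheme:  ", parts.scheme)
print("username:", parts.username)
print("password:", parts.password)
print("host:    ", parts.hostname)
```

Everything between `ftp://` and the `@` sign is the login. Because the password sits in the URL in plain view, it is worth leaving it out where the browser will prompt for it instead.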

The Netscape File Upload dialog box is shown in Figure 25.2. Note that if you attempt to do this with the Microsoft IIS FTP server, you must turn off the Allow only anonymous connections checkbox in the Properties dialog of the FTP service. File and directory permissions for the NT user account will also come into play, as you would expect.

Figure 25.2: The Netscape Navigator 2.0 FTP File Upload dialog.

Tip
Because government staff work is often reviewed by managers, and legal documents by paralegals and other attorneys, you can use the FTP upload facility in Netscape, or in a dedicated FTP client program, to make draft documents accessible to reviewers, who also use their Web browsers. Setting up a publicly writeable, shared review directory on an FTP server enables reviewers to download the drafts with their Web browsers and edit them, just as your customers download boilerplate documents. With a point and a click in the Web browser, the reviewer loads the draft right into the word processor.

Network drives can also be used to keep your Intranet boilerplate library up to date. If the system running the Web server shares its directory where your boilerplate files are located, supervisors running client workstations in that same domain can map the remote filesystem as a network drive and then just copy new documents over (using Explorer or the DOS prompt). This can even be accomplished across domains, provided that proper trust relationships are established between the Windows NT Server Domain Controllers. Connecting your client machines to shared drives on the Windows NT Server is absolutely the most convenient way for your customers to keep the files on the Web server updated on a regular basis. Files can be dragged and dropped in Explorer more easily than they can be uploaded via FTP.
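For supervisors who would rather not drag files by hand, the same copy can be scripted against the mapped drive. Here is a minimal Python sketch. The function name `sync_new_documents` is invented for illustration, and it assumes the mapped drive appears as an ordinary directory path.

```python
import os
import shutil


def sync_new_documents(source_dir, share_dir):
    """Copy files that are missing from the share, or newer than the copy there.

    Returns the list of file names that were copied.
    """
    copied = []
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        dst = os.path.join(share_dir, name)
        if not os.path.isfile(src):
            continue  # this sketch skips subdirectories
        if not os.path.exists(dst) or os.path.getmtime(src) > os.path.getmtime(dst):
            shutil.copy2(src, dst)  # copy2 also carries over the modification time
            copied.append(name)
    return copied
```

A call might pass a local drafts directory as `source_dir` and the mapped drive letter and path of the Web server's boilerplate directory as `share_dir`. Both are assumptions about your own setup.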

Indexing Your Boilerplate Library

You could use a workaround, maintaining parallel plain-text copies of all your documents just for indexing. In such a situation, you'd index only the plain-text versions but somehow contrive to retrieve the original documents through custom CGI or ISAPI applications. This would be quite clumsy. A better alternative is to use waisindex on the originals. (See Chapter 21 for detailed information about WAIS.)

You can also index your RTF documents, if you're using RTF as a means of document portability among different word processors. RTF is a plain-text file format, much like PostScript, with plain-text markup commands included right in with the document's text. As such, the full indexing power of waisindex will be available to you to index all your RTF documents (as well as PostScript documents and, where their text is stored uncompressed, Adobe Portable Document Format (PDF) ones). However, you may want to create your own waisindex stop files to exclude RTF-specific markup, such as font names and the like, from your indexes. Take a look at the bit of RTF that follows this paragraph, which is just one line from the top of a document, declaring the document to be RTF and specifying the available fonts:

{\rtf1\defformat\mac\deff2 {\fonttbl{\f0\fswiss Chicago;}{\f2\froman New York;}{\f3\fswiss Geneva;}{\f4\fmodern Monaco;}{\f5\fscript Venice;}{\f6\fdecor London;}{\f8\fdecor San Francisco;}{\f11\fnil Cairo;}{\f16\fnil Palatino;}{\f20\froman Times;}{\f21\fswiss Helvetica;}

As you can see, there are a number of RTF-specific commands, such as the opening {\rtf1\defformat\mac\deff2, along with several font names, roman New York, modern Monaco, and the like. Because these font names contain words that might otherwise appear in your document text, you'll want to craft your stop list carefully. Otherwise, searches for words such as New York or Cairo will generate hits on the font names rather than on the substantive text; a search for New York, for example, might bring you back every single RTF document in your index.

As an alternative to extending your stop files, you can create waisindex document format description files that exclude the RTF or PostScript markup. These define the structure of your documents so you can do fielded indexing. You can then exclude all RTF markup codes from customer searches.
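To make the idea of excluding markup concrete, here is a small Python sketch that drops the font table and the RTF control words before pulling out index terms. It is not waisindex and does not use its format description syntax. It is a simplified illustration that ignores escaped braces and other RTF subtleties, and the sample text in it is invented.

```python
import re


def remove_group(rtf, keyword):
    """Remove every brace-delimited group that begins with the given control word.

    Simplification: escaped braces (\\{ and \\}) are not handled.
    """
    start_token = "{\\" + keyword
    out = []
    i = 0
    while i < len(rtf):
        if rtf.startswith(start_token, i):
            depth = 0
            while i < len(rtf):  # skip ahead to the brace that closes this group
                if rtf[i] == "{":
                    depth += 1
                elif rtf[i] == "}":
                    depth -= 1
                    if depth == 0:
                        i += 1
                        break
                i += 1
        else:
            out.append(rtf[i])
            i += 1
    return "".join(out)


def index_terms(rtf):
    """Return lowercase words from the document text, leaving RTF markup out."""
    body = remove_group(rtf, "fonttbl")               # font names never reach the index
    body = re.sub(r"\\[a-zA-Z]+-?\d* ?", " ", body)   # control words such as \rtf1 or \pard
    body = body.replace("{", " ").replace("}", " ")
    return [word.lower() for word in re.findall(r"[A-Za-z]+", body)]


# Invented sample: a short font table followed by one sentence of letter text.
sample = (
    r"{\rtf1\mac\deff2 {\fonttbl{\f0\fswiss Chicago;}{\f2\froman New York;}}"
    r"\pard\plain\f2 Thank you for your letter about Cairo.}"
)
print(index_terms(sample))
```

With the font table removed, a search for Cairo finds the sentence that mentions it, while Chicago and New York no longer match merely because those fonts are declared.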

Depending on the extent and nature of the overall library of documents on your Intranet and your customers' indexing needs, you may want also to look into commercial full-text indexing tools.

Tip
With any commercial full-text indexing product, if you need RTF support, be sure to ask for specific details about how the product's RTF support works. As described previously with respect to waisindex, you'll want the package to exclude, or otherwise work around, all the RTF markup in your documents, so as not to get false hits in your customer searches.

Summary

In this chapter, you've seen a practical application for your Intranet, again combining several of the facilities about which you've learned. As you've seen in previous chapters, this flexibility, used in combination with your imagination, can result in significant added value for your Intranet. Creating an Intranet boilerplate library like the ones described in this chapter involves:

In the next chapter, you take the ideas discussed and illustrated in the past several chapters and create a completely new use for the Intranet. I'll show you how to put a new twist on a commonly used business application: the slide show presentation.