
Platinum Edition Using HTML 4, XML, and Java 1.2
(Publisher: Macmillan Computer Publishing)
Author(s): Eric Ladd
ISBN: 078971759x
Publication Date: 11/01/98



Implementing freeWAIS

In October 1989, a group of companies (Dow Jones, Thinking Machines Corporation, Apple Computer, and KPMG Peat Marwick) saw the need for an easy way to provide text-based information systems at the corporate level and decided to do something about it. Their goal was to create a search system that was easy to use, flexible, built on an established standard, and able to search large amounts of distributed information in various formats. In April 1991, the group released the first Internet version of Wide Area Information Servers (abbreviated WAIS and pronounced “ways”).

The benefits of WAIS are ease of use (for clients and developers), full-text search capability, and support for a variety of document types. It also has a far-reaching knowledge base: a query can draw on remote databases, and the results of one search can serve as the example for the next (query by example). Those results can point to a more appropriate server, and so on until the desired result is found.

Almost any time you encounter a discussion of WAIS on the Internet, freeWAIS will also be mentioned. The term freeWAIS is fairly self-explanatory: it is a freeware version of WAIS. Much of the material in this section is adapted directly from Bill Schongar’s comprehensive discussion of WAIS in Chapter 12 of Special Edition Using CGI.

Implementing freeWAIS on UNIX

Most WAIS tools are still primarily designed for use on UNIX servers. These tools include the servers themselves as well as the client scripts. It only makes sense, therefore, that one of the most significant public extensions to original WAIS functions first appeared on UNIX servers. freeWAIS-SF, designed by the University of Dortmund, Germany, takes advantage of built-in document structures to make more sense out of queries. It even enables you to specify your own document types for its use.

In addition, freeWAIS-SF gives you more power to search the way you want to search. Wild cards, “sounds-like” searches, and more conditions for what does and doesn’t match are all components that make finding what you’re looking for much easier. You no longer have to worry about whether the author wrote “Color” or “Colour,” “Center” or “Centre.”

Unlike many things that you use with your server, especially in the UNIX world, the freeWAIS-SF package is easy to install. A shell script leads you through the basic configuration by asking questions; when you finish answering the questions, you’re finished installing freeWAIS-SF.

At the time of this writing, the current version of freeWAIS-SF was 2.2.10. You can obtain the freeWAIS-SF package at the following site:

ftp://ftp.germany.eu.net/pub/infosystems/wais/Unido-LS6/

If you want the original freeWAIS instead (which you can certainly use, although it was last updated in 1996), you can get it from the Center for Networked Information Discovery and Retrieval. To get the main distribution directory so that you can choose the appropriate build, visit the following site:

ftp://cnidr.org/pub/NIDR.tools/freewais/

Whichever freeWAIS build you use is distributed as a tar archive compressed with GNU zip (gzip). To unpack the build, enter a command such as the following:

gunzip -c freeWAIS-0.X-whatever.tar.gz | tar xvf -

freeWAIS comes with its own longer set of installation instructions within the distribution, so double-check the latest information for the build that you obtain to make sure that you don’t skip any steps.
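
The unpacking command shown earlier can be wrapped in a small shell function that first checks that the archive is readable and then extracts it into a directory of your choice. This is only a sketch: the function name unpack_build and the paths in the usage line are hypothetical, and nothing here is part of the freeWAIS distribution.

```shell
# unpack_build ARCHIVE DESTDIR
# Hypothetical helper: unpacks a gzip-compressed tar archive into DESTDIR.
unpack_build() {
  archive=$1
  dest=$2

  # Refuse to continue if the archive file is missing.
  if [ ! -f "$archive" ]; then
    echo "no such archive: $archive" >&2
    return 1
  fi

  mkdir -p "$dest" || return 1

  # Rough sanity check: list the contents and discard the listing.
  # If tar cannot read the stream, stop before extracting anything.
  gunzip -c "$archive" | tar tf - > /dev/null || return 1

  # Same pipeline as in the text, with tar running inside DESTDIR.
  gunzip -c "$archive" | (cd "$dest" && tar xf -)
}
```

A call such as unpack_build freeWAIS-0.X-whatever.tar.gz /usr/local/src (both arguments are placeholders) would then leave the distribution's files under the second directory.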

Implementing freeWAIS on Windows NT

A port of freeWAIS 0.3 is available for Windows NT from EMWAC (the European Microsoft Windows NT Academic Centre) in its WAIS Toolkit. EMWAC’s current version of the toolkit is 0.7; check with EMWAC before obtaining the toolkit to find out whether a later version is available. Versions are available for all types of Windows NT: 386-based, Alpha, and PowerPC. You can obtain the toolkit from the following site:

ftp://emwac.ed.ac.uk/pub/waistool/

After you obtain the ZIP file, decompress it to retrieve the files that compose the distribution. Move them to an NTFS drive partition, and then rename the file Waisindx.exe to Waisindex.exe.

If you plan to use the entire WAIS Toolkit with your server, put all three .exe programs into the %SYSTEMROOT%\SYSTEM32 directory (usually C:\WINNT\SYSTEM32).


If you are using UNIX, the WAIS program to query the WAIS indexes is called WAISQ. The query tool provided for Windows NT is called WAISLOOK. Keep this in mind when you see references to WAISQ, and just substitute WAISLOOK if you are using Windows NT.

Building a WAIS Database

Now that you have the software installed and running, you are ready to make a database (a set of index files).

The WAISINDEX program looks through your files and creates an index that the WAIS query tool can use later. This index consists of seven distinct files that are either binary or plain text, as shown in Table 31.3.

Table 31.3 WAIS Index Database Files

File Extension  Purpose                                                       File Type

.Cat            A catalog of indexed files, with a few lines of information   text
                about each one.
.Dct            A dictionary of indexed words.                                binary
.Doc            A document table.                                             binary
.Fn             A filename table.                                             binary
.Hl             A headline table, featuring the descriptive text used to      binary
                identify documents that the search returns.
.Inv            An inverted file index.                                       binary
.Src            A structure for describing the source.                        text
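
Because an index is complete only when all seven files exist, a script can check for them after indexing. The following is a minimal sketch, not a freeWAIS tool: the function name missing_index_files is hypothetical, and it assumes the files share one base name and carry lowercase extensions on UNIX (Table 31.3 prints them capitalized).

```shell
# missing_index_files DB
# Hypothetical helper: DB is the base name of the index, without extension.
# Prints every expected index file that is absent and returns 0 only when
# all seven files are present.
missing_index_files() {
  db=$1
  status=0

  # Assumption: lowercase extensions, one file per row of Table 31.3.
  for ext in cat dct doc fn hl inv src; do
    if [ ! -f "$db.$ext" ]; then
      printf '%s\n' "$db.$ext"
      status=1
    fi
  done

  return $status
}
```

Running missing_index_files with the base name you gave the indexer prints nothing when the database is complete; any file names it does print tell you which parts of the index are absent.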


