Platinum Edition Using HTML 4, XML, and Java 1.2
(Publisher: Macmillan Computer Publishing)
Author(s): Eric Ladd
ISBN: 078971759x
Publication Date: 11/01/98



The files with the extensions listed in Table 31.3 all share the same first name, as in Index.cat, Index.dct, Index.doc, and so on. You can name the first file anything you want, but if the file containing the HTML for the search form is called Index.html, INDEX is what you should use for the database. If your HTML file is called Default.htm (as it would be using EMWAC’s HTTP server), DEFAULT is the correct first name for your database.


Many Web servers have built-in support for WAIS databases and determine which files to look at by matching the first name of the HTML file with the first name of the database files. Therefore, naming your database files correctly is important if you expect the built-in support to function.
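The naming rule above can be checked with a small sketch. The function names are hypothetical, and comparing names without regard to case is an assumption rather than something every server guarantees; the extensions used in the usage example are the ones mentioned in the text.

```python
from pathlib import PurePath


def database_base_name(html_filename: str) -> str:
    """Return the first name (the part before the extension) of the HTML file."""
    return PurePath(html_filename).stem


def matches_database(html_filename: str, database_filename: str) -> bool:
    """True if the HTML file and a database file share the same first name.

    The comparison ignores case, which is an assumption made for this sketch.
    """
    html_stem = PurePath(html_filename).stem.lower()
    database_stem = PurePath(database_filename).stem.lower()
    return html_stem == database_stem


if __name__ == "__main__":
    for name in ("Index.cat", "Index.dct", "Default.doc"):
        print(name, matches_database("Index.html", name))
```

A check like this could run before you publish a search page, so that a mismatched name is caught before the server silently fails to find the index.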

The command line options that you use when executing WAISINDEX determine these database files’ contents. You might want to use a variety of options, depending on your objective and the nature of the files that you want to index. The following is a simple command line to create an index:

waisindex -d Data\database1 Data\*.html

This command line uses only one option, the -d switch, which specifies that the next argument is the name that you want to give the index. The command specifies that the name is DATABASE1 and that the database is to reside in the DATA directory. Arguments following the switches are the file names to index. In this example, the command indexes all the HTML files (those with an .html extension) in the DATA directory.
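If you rebuild the index from a script, the same command line can be assembled programmatically. The following is a minimal sketch: the helper names are hypothetical, only the -d switch described above is used, and expanding the wildcard in Python rather than leaving it to WAISINDEX is a design choice of this example.

```python
import glob
import os


def find_html_files(directory):
    """Return the .html files in `directory`, sorted for a stable order."""
    return sorted(glob.glob(os.path.join(directory, "*.html")))


def build_waisindex_command(index_path, files):
    """Return the argument list for: waisindex -d <index_path> <files...>"""
    if not files:
        raise ValueError("no files to index")
    # -d names the index; everything after it is a file to index.
    return ["waisindex", "-d", index_path, *files]


if __name__ == "__main__":
    files = find_html_files("Data")
    if files:
        print(build_waisindex_command(os.path.join("Data", "database1"), files))
```

The resulting list can be handed to a process-launching call such as subprocess.run; keeping the arguments as a list avoids quoting problems with file names that contain spaces.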

One of the more powerful features of WAISINDEX is that it enables you to index a variety of file types. To find out exactly which file types your version supports, check your version’s documentation. The versions of WAISINDEX vary in the file-type support that they offer. In particular, freeWAIS-SF enables you to specify your own document types, and the EMWAC Toolkit supports such formats as Microsoft’s Knowledge Base.

Accessing the WAIS Database

If your Web server has built-in support for WAIS (as many Web servers do), accessing the WAIS database is quite simple. You just create an HTML file to make the query and put the file in the same directory as the WAIS database files. (Remember that the first names of the HTML file and the database files must match.)

The HTML itself could not be simpler. Listing 31.18 shows a sample. All you have to do is include an <ISINDEX> tag somewhere on the page, and the Web server does the rest.

Listing 31.18 A Sample WAIS Search HTML


<HTML>
<HEAD>
<TITLE>Sample WAIS Search</TITLE>
</HEAD>
<BODY>
<H1>Sample WAIS Search</H1>
This page has a built-in index. Give it a whirl!
<P>
<ISINDEX>
</BODY>
</HTML>

If your Web server doesn’t support WAIS directly, you must use a CGI script to access the data. You might also want to use a script when you need to format the output or filter the input.

Your script must gather data from a fill-in form and run a query against the WAIS index, and then format the data appropriately for the visitor.
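The input and output halves of such a script might look like the following sketch. The form field name `query`, the function names, and the layout of the results page are all assumptions made for illustration; the query step itself is left out here.

```python
import html
from urllib.parse import parse_qs


def extract_query(query_string, field="query"):
    """Pull the search term out of a CGI query string.

    `field` is a hypothetical form field name. Extra whitespace is
    collapsed as a simple example of filtering the input.
    """
    values = parse_qs(query_string).get(field, [""])
    return " ".join(values[0].split())


def render_results(term, lines):
    """Wrap result lines in a small HTML page, escaping all visitor text."""
    items = "\n".join(f"<LI>{html.escape(line)}</LI>" for line in lines)
    return (
        "<HTML>\n<HEAD><TITLE>Search Results</TITLE></HEAD>\n<BODY>\n"
        f"<H1>Results for {html.escape(term)}</H1>\n"
        f"<UL>\n{items}\n</UL>\n</BODY>\n</HTML>"
    )


if __name__ == "__main__":
    term = extract_query("query=web+server")
    print(render_results(term, ["Data/page1.html", "Data/page2.html"]))
```

Escaping the search term before echoing it back keeps a visitor from injecting markup into the results page.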

Your script can do the same thing that Web servers with built-in WAIS support do: call the WAISQ (or WAISLOOK) program. You can test this call from the following command line:

waisq -d -http Data\database1 stuff

In this simple example, you run a query against the DATA\DATABASE1 index files, using stuff as the query term. The result is returned on STDOUT as properly formatted HTML code, which makes it well suited for use in a CGI script.
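A script could issue the same call and pass the output along to the browser. This is only a sketch: the function names are hypothetical, the argument order simply mirrors the command line shown above (check your version's documentation before relying on it), and the `runner` parameter exists so the logic can be exercised without WAISQ installed.

```python
import subprocess


def build_waisq_command(database, term):
    """Mirror the command line from the text: waisq -d -http <database> <term>."""
    return ["waisq", "-d", "-http", database, term]


def run_query(database, term, runner=subprocess.run):
    """Run the query and return whatever the program wrote to STDOUT."""
    completed = runner(
        build_waisq_command(database, term),
        capture_output=True,  # collect STDOUT instead of printing it
        text=True,            # decode bytes to str
        check=True,           # raise if the program exits with an error
    )
    return completed.stdout


def cgi_response(body):
    """Prefix the HTML body with the header a CGI script must emit."""
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    try:
        print(cgi_response(run_query("Data\\database1", "stuff")))
    except (OSError, subprocess.CalledProcessError) as error:
        print(cgi_response(f"<P>Search is unavailable: {error}</P>"))
```

Passing the arguments as a list rather than a single shell string means the visitor's query term is never interpreted by a command shell.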

WAIS is so popular that dozens of scripts are available in the public domain for managing your queries. The following are the three most generic and useful scripts:

  WAIS.PL ftp://ftp.ncsa.uiuc.edu/Web/httpd/Unix/ncsa_httpd/cgi/wais.tar.Z
  Son-of-WAIS.PL http://dewey.lib.ncsu.edu/staff/morgan/son-of-wais.html
  Kid-of-WAIS.PL http://www.cso.uiuc.edu/grady.html http://jordal.cso.uiuc.edu/kidofwais.pl

Implementing Excite for Web Servers

Architext’s popular search engine, Excite for Web Servers (EWS), is available for SunOS, Solaris, HP-UX, SGI Irix, AIX, BSDI UNIX, and Windows NT. EWS is a full-featured, fast indexing search tool based on the same technology as the Excite search service. Despite being a commercial search engine, it is available for use on your Web site for free. The only restriction in the user license is that you cannot use it to provide services for a third party (by establishing a service to compete with Excite, for example).

Excite enables people to enter queries in ordinary language, without using specialized query syntax. Excite claims that EWS understands plain English queries such as, “How to stay healthy by eating well” or “Learn to speak Tagalog.” Queries using concepts are more likely to produce effective results than simple keyword searches, according to the company.



Use of this site is subject to certain Terms & Conditions, Copyright © 1996-2000 EarthWeb Inc.
All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited. Read EarthWeb's privacy statement.