Platinum Edition Using HTML 4, XML, and Java 1.2
Searching with OraCom WebSite Server
O'Reilly's WebSite server for Windows NT includes the company's WebIndex indexing and WebFind searching tools. WebIndex can index the full text of every page in the server's directory structure or only selected parts of the directories. WebFind runs as a CGI program and is a conventional search tool: it does keyword searches and supports the AND and OR operators. O'Reilly publishes a book (or manual), Building Your Own WebSite, that goes into considerable detail about setting up and using its WebSite server. Before you install the company's software, you can read all about it at
The following site is running WebSite and has set up several search databases:
Searching with Netscape SuiteSpot Servers
Netscape SuiteSpot Standard and Professional Editions run on Windows NT and UNIX. They include a built-in indexing and searching system, although Netscape's lower-priced FastTrack Server does not. You can find out more about Netscape's servers at
Searching with Microsoft Index Server
Designed for zero maintenance and complete Web-site indexing, Microsoft's Index Server search engine supports multiple languages (Dutch, U.S. and International English, French, German, Italian, Spanish, and Swedish) and attempts to index by content type as well as contents. It can index documents in several formats: text in a Microsoft Word document, statistics on a Microsoft Excel spreadsheet, or the content of an HTML page. Index Server enables the user to search using both keywords and content types. You may read about Index Server and download a free copy at
Index Server requires NT 4.0 and is designed to work with Microsoft's Internet Information Server (IIS).
Considerations when Adding a Search Engine to Your Site
For purposes of this discussion, assume that you have evaluated the alternatives and decided that you need to implement a site-resident search engine.
Perhaps you don't like the idea of sending your users off to a commercial index, and your Web server doesn't have a built-in search capability, or maybe you just want more control. Before you get started, however, you should consider the type of search engine you want to use.
Indexing Versus Grepping Search Engines
Two main approaches can be used for creating an online search facility for your Web site: indexing and grepping.
Indexing search engines predigest your Web site and create indexes containing all its words. The major commercial Web search engines, such as AltaVista, Lycos, Excite, and WebCrawler, are all indexing engines. In fact, it is not practical to search the whole Web with the grepping method: to do so, the search engine would have to either add the full text of every site to a database or search every site in real time.
With an indexing search engine, when a user requests a search, the search engine needs to refer only to the index to find relevant pages. Because indexes are often a small fraction of the size of the documents indexed, this takes much less time. More important, such an approach makes the major commercial search engines practical by enabling them to store only the indexes of sites rather than site images. Indexing search engines generally employ more sophisticated searching algorithms to improve their chances of returning relevant documents.
Although easy to implement, most grepping search engines are somewhat limited in the types of search queries they support. Grepping, after all, is a rather brute-force method of searching: each file is opened and then scanned for the search terms. The amount of system resources consumed by these activities can limit the sophistication of the search strategies. Most grepping engines use only simple keyword searches, although some offer searching via regular expressions.
To determine which searching method to employ, you must first decide what kinds of search services you want to offer and how many resources (both disk space and processor time) to dedicate to those services.
Evaluating Performance and Processor Efficiency
As you might imagine, a big difference exists between the performance and efficiency of grepping and indexing search engines.
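The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any of the products named here work internally: the three-page "site" in PAGES and all function names are hypothetical, and a real engine would read files from disk and store its index persistently.

```python
import re
from collections import defaultdict

# Hypothetical miniature "site": page name -> page text.
PAGES = {
    "index.html": "Welcome to our Web site about search engines",
    "grep.html": "A grepping engine scans every file for the search terms",
    "index-engine.html": "An indexing engine builds an index of all words on the site",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def grep_search(pages, term):
    """Grepping: scan the full text of every page on each query."""
    term = term.lower()
    return sorted(name for name, text in pages.items() if term in tokenize(text))


def build_index(pages):
    """Indexing: predigest the pages once into a word -> page-names map."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def index_search(index, term):
    """Answer a query from the index alone, without rereading any page."""
    return sorted(index.get(term.lower(), set()))


INDEX = build_index(PAGES)  # built once, ahead of any query

print(grep_search(PAGES, "engine"))   # rescans all three pages
print(index_search(INDEX, "engine"))  # a single dictionary lookup
```

Both calls return the same pages. The grepping version repeats the full scan on every query, whereas the indexing version pays the scanning cost once in build_index and then spends disk space (here, memory) on the stored index.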
One approach that adds no disk overhead and only a small amount of processor overhead is to have someone else maintain your index and run the search process. An example of this approach is Pinpoint, from Netcreations (http://www.netcreations.com/pinpoint/). This commercial service sends its robot to your site about once a month. Netcreations maintains the site index on its own site and also maintains and runs the search engine. You maintain a query form on your site that points to the Pinpoint URL. Some trade-offs, of course, exist for this type of solution. You give up a lot of control over what is indexed, how it is indexed, and how often the index is updated. In addition, search queries are likely to run more slowly when conducted over the Internet.
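To make the "query form that points to a remote URL" idea concrete, here is a rough sketch of the request such a form would produce when submitted with the GET method. The endpoint and the parameter names (site, q) are hypothetical placeholders; the text does not give Pinpoint's actual URL format, so nothing here should be read as that service's interface.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint of a remotely hosted search service (an assumption,
# not Pinpoint's real address).
REMOTE_SEARCH_URL = "http://search.example.com/query"


def build_search_url(site_id, keywords):
    """Build the URL a GET query form would request for the given keywords."""
    # Hypothetical parameter names: "site" identifies your site's index,
    # "q" carries the user's keywords.
    params = {"site": site_id, "q": " ".join(keywords)}
    return REMOTE_SEARCH_URL + "?" + urlencode(params)


url = build_search_url("my-site", ["search", "engine"])
print(url)
```

Every query travels to the remote host and back, which is why searches through a hosted service tend to be slower than searches against a local index.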
Copyright © 1996-2000 EarthWeb Inc. All rights reserved. Reproduction whole or in part in any form or medium without express written permission of EarthWeb is prohibited.