Andrew Fry (http://www.freerange.com/home/talent/af.html), president of FreeRange Systems in Seattle, says, "You can learn to write HTML in a day. It can take a lifetime to learn to write great HTML."
This book is about making effective sites on the World Wide Web. Mr. Fry is right: almost anyone can learn to write HTML in just a few hours. Many people in this field complain that their competitors are 16 years old.
But putting up a site and putting up an effective site are two different things. This chapter describes the difference and gives several quick techniques for transforming a good site into an effective site.
The running joke in this industry is the one where the client calls the Webmaster and asks, "How much does a Web site cost?" The Webmaster answers, "That depends. What do you want the site to do?" to which the client answers, "I don't know. What can you do with a Web site?" The Webmaster responds, "That depends. How much do you want to spend?"
Web sites are so new that no one knows what they should cost. There are reports of people spending tens of thousands of dollars to put up a few pages. Then you hear about the high school sophomore who puts up 12 pages for $400. This book takes the position that the best measure of a site is not its price or its page count but its effectiveness.
Effectiveness, of course, is more difficult to pin down. It's tough to define, and the hard data needed to measure it is even tougher to get. As the sample site, this book uses a site for which hard data is available: a real estate broker's site. Some of the details of this site have been simplified to make the points of each chapter clear. The reader can visit the live site at http://www.dse.com/gsh/ to compare the chapter examples with what is actually on the Web.
Although the joke above may stretch the issue, it is often the
case that prospective owners have little idea why they want a
Web site. They just know that they need to be "on the Web,"
or that management has decreed that they "put up a site."
Note: Many different terms are used to identify the people associated with a Web site. The Webmaster has overall responsibility for developing and maintaining the site. The site owner is the person or organization who benefits from the site. The owner typically hires the Webmaster, sometimes as an employee and sometimes as a contractor. Before the site owner commits to putting up a site, he or she is a prospective owner, or prospect. The developer actually writes the HTML and the programs that implement the site. If the Webmaster doesn't do the development, then the developer works for the Webmaster. Where the distinction between HTML (a markup language) and software (written in a programming language) is important, the terms "HTML writer" and "programmer" are used as specific roles for the developer. For business purposes, the client is the same as the site owner. For technical purposes, "client" refers to the software in a client-server architecture. For example, the Netscape Navigator Web browser is client software. The visitor to a site is also known as its user. Sometimes the visitor will have a special role: if the visitor buys something at an online store, he is a customer as well as a user.
One of the tasks of the expert Webmaster is to help clients define their purposes in putting a site on the Web. These purposes can be considered at three levels:
There are at least three goals an organization might have in setting up a Web site:
An example of a site set up to trigger direct action is http://www.mcp.com/. You can buy books through that site.
An example of a site set up to trigger delayed action is a realtor's site, http://www.dse.com/gsh/. Most people visiting a realtor's site are not ready to buy or sell their home today. But they can use the site to learn about the market and the realtor, and to learn some skills like how to prepare a home for sale. When they are ready to list their home, they can come back to the Web site and fill out a form to contact the realtor.
The General Motors site, http://www.gm.com/, is set up
to trigger indirect action. You can't (yet) buy a car through
the Web. But you can use the Web to compare models and features,
and take that information to a dealer to select a car.
Tip: The most effective sites tend to be direct action sites because there is a tangible measure of their effectiveness that allows the sites to be improved over time. Rather than trust users to remember your site when they are ready to buy, convert delayed action and indirect action sites into direct action sites by letting users sign up for more information.
Many clients are not sure why they want a Web site, so it's impossible for them to set objectives or participate in the design process. Here's a process the site developer can use to help clients clarify their goals and objectives:
Marketing experts use the acronym AIDA to describe the marketing and sales process:
On the Web, "attention" translates to visits. A site can't be effective if no one visits it. Three ways to bring in users to a Web site are:
When users arrive at the site, hold their interest. Whereas lots of sites use impressive graphics, sounds, and even animation, most experienced Webmasters acknowledge that the most effective interest-holding tool is content. We use content here to mean information that is of use or interest to users, even if they do not take the desired action.
The Displaced Cajun Pages at http://www.webcom.com/dp-cajun/, shown in Figure 1.1, have great content. This site entertains users with information about Cajun food, music, and culture. If users have any interest in things Cajun, they will tour their way through the site.
Figure 1.1: Displaced Cajun home page.
In a Direct Action site, the goal is to bring users to the point where they decide to take action. In the example of the Displaced Cajun page, by the time users find their way to the order form, they have had a grand tour of mouthwatering Cajun recipes. Many of these users have decided (one hopes) to order some Cajun food.
In a Delayed Action site, the site should bring users to the point where they make a decision to remember this site. They may bookmark the page or enter it in a site-monitoring service such as URL-minder.
An Indirect Action site combines many of the features of a Direct Action and a Delayed Action site. The Webmaster wants users to find out enough about the product to make a decision, but users can't take action on the Web site.
Often, an effective strategy is to turn the Indirect Action site
into a Direct Action one: offer users a form to fill out that
allows them to register their interest. These registrations are
later turned over to people who qualify the user and close the
deal. For a commercial site, these "closers" are called
salespeople. For, say, a political party, the closers might be
volunteers who follow up with users to help them "get out
and vote."
Tip: Part of the Webmaster's responsibility in providing content is to ensure that the content is findable. This requirement implies:
It is the responsibility of the Webmaster to help clients set realistic objectives for their site, based on their experience and research. Irresponsible Webmasters promise everything, deliver little or nothing, and leave a trail of angry clients saying that "the Web doesn't work."
Professional designers don't leave their clients at all. They build effective sites that meet objectives and work to maintain the site to enhance its effectiveness.
To help clients visualize the recommended site, the Webmaster can use presentation tools to prepare a treatment or storyboard of each page. Usually, the first graphic should show a high-level view of the site: where is the home page and what can the visitor access from there? How many "layers" are there to the site?
In the sample site, the realtor wants to attract people to list their homes with that company. Signing a listing agreement is not something most folks are likely to do over the Web, so a reasonable goal is to have people call the realtor and schedule an appointment, making it a Delayed Action site.
If the site is successful, users might be interested in filling out a form on the Web site to tell the realtor they are interested in the realtor's services, making it a Direct Action site.
Figure 1.2 shows a first cut at a storyboard.
Figure 1.2: First cut at a storyboard.
Once the client decides to move forward with the project, the Webmaster gathers information from the client for each of the pages. For example, what makes this realtor unique? Why would someone choose this realtor instead of a competitor?
Many clients are able to supply print ads, brochures, and other
collateral material to help the Webmaster get started. Although
the Webmaster may need additional material to provide effective
content, existing copy is a good place to start.
Caution: Do not give in to the temptation to just copy existing print material onto the Web site. The Web is an interactive medium with content needs that are different from traditional media. Commercial messages in traditional media are often an interruption of something else (such as entertainment). Such messages must get in, get people's attention, deliver their content, and get out quickly before they lose interest (or the budget runs out). Web visitors are intelligent, curious, and want the site to give them access to large amounts of content. They are using the Web to seek out that content; it is not an interruption, and good content will hold their attention. Use the client's existing print material to learn about the client's business: identify the unique elements that make them stand out from others in their industry. Learn why someone would use their products or services. Use this information, in turn, to design a content-rich site that addresses the needs of the Web visitor.
The Webmaster should also get started on graphics at this point. For many sites, the only graphic needed may be the client's logo. The client should supply a clean copy that the Webmaster can scan into the computer.
Figures 1.3, 1.4, and 1.5 show some of the pages of a more fully
developed storyboard for the realtor's site.
GSH: Welcome

About GSH
GSH is one of Virginia's largest real estate brokers, with over 400 agents serving southeast Virginia (a region known as Hampton Roads).

Table of Contents
This site contains information of interest to homeowners and home buyers in southeast Virginia (Hampton Roads).

Information for Prospective Buyers

Finding a Home
Qualifying for a Loan
Closing Costs
Buyers' Broker
Special Offer for Buyers

Thank you for taking the time to fill out this form. For a limited time, anyone buying their home through GSH will receive __________________. (Fill in special offer here.)

Name:____________________________
Address:______________________________________________
City:____________________ State:__________ Zip Code:___________

I currently _____ own my home. _____ rent my home.

Questions for Homeowners
Questions for Renters
Housing Preferences
I need ______ bedrooms.

[Send Form] [Clear Form]
Many developers begin with a basic knowledge of HTML but have not been taught how to make a site effective. Figure 1.6 shows a prototype of the sample site that might be built by such a developer. The complete code for this page is on the CD-ROM.
Figure 1.6: Sample site welcome page.
Figure 1.6 shows how the prototype page looks when viewed with
one particular browser: Netscape Navigator 2.0. When viewed with
a different browser, the same code can produce a page that looks
quite different.
Note: The HyperText Markup Language, or HTML, consists of tags that are written between angle brackets. Many tags have attributes: fields that give more specific information to the browser about how to interpret the tag. For example, <BODY BACKGROUND="texture.gif"> is an instance of the BODY tag. It has one attribute, which tells the browser to use the file texture.gif as the background image. Although standards exist, not all Web browsers understand all tags. If a browser does not understand a tag or extension, it is supposed to silently ignore it. This behavior allows browser vendors to introduce their own tags and attributes, called extensions.
The original versions of HTML (versions 0 and 1) were experiments that worked well. So well, in fact, that the industry decided to improve the language. Representatives from all over the world contributed ideas and voted on them to come up with HTML 2.0. Count on any browser being able to read HTML 2.0. Beyond 2.0, all browsers are not created equal.
HTML 3.0's development started out just like 2.0's: good ideas poured in and members of the working group voted on how to best design the language. But this kind of voting and discussion takes time, and things on the Web change fast.
First, Netscape Communications and then other browser developers came up with their own tags and attributes to do things that couldn't be done in HTML 2.0. As the HTML 3.0 standard emerged, Netscape's tags were not always consistent with HTML 3.0.
Then the HTML 3.0 working group decided to disband, and develop specific enhancements piecemeal. The draft HTML 3.0 proposal was allowed to expire. Some browser vendors picked up parts of the old HTML 3.0 standard, some incorporated others, and some vendors just ignored the whole thing and stayed at HTML 2.0.
Because Netscape Navigator could do so much more than so many other browsers, many HTML writers began to use Netscape tags as though they were part of HTML. Sometimes they announced that their pages were "enhanced for Netscape" and sometimes they just let the users figure it out.
The result is that users can get completely different effects depending on which browser they use. To make things even more complicated, the browsers display some colors differently on different platforms. And in most browsers, the user can change elements like the font, the font size, and some of the colors.
To deal with this complex situation, some Webmasters write a different version of the page for each major browser. Sometimes this step is necessary to get a particular effect the client wants. But it takes a lot more work and it never stops because next year's browsers will have a new set of capabilities and limitations.
Expert Webmasters solve the problem using a process like this one:
Some Webmasters wonder why they should supply ALT attributes. "After all," they reason, "Lynx is most heavily used among college students, and that's not who my client is trying to reach."
Most people do have graphical browsers, but those graphics take quite a while to download. Webmasters who keep track report that often about 30 percent of their users have turned graphics downloading off, so they see only the contents of the ALT attributes. Then, if a graphic looks interesting, they'll load just that graphic.
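One way to keep ALT text from being forgotten is to check each page mechanically before it goes live. The following is a minimal sketch in Python, not a tool from the CD-ROM; the names MissingAltFinder and images_missing_alt are made up for this illustration, and the sample markup is hypothetical.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the SRC of every IMG tag that has no ALT attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag != "img":
            return
        attributes = dict(attrs)
        if "alt" not in attributes:
            self.missing.append(attributes.get("src", "(no SRC)"))


def images_missing_alt(html_text):
    """Return the SRC values of images that lack ALT text."""
    finder = MissingAltFinder()
    finder.feed(html_text)
    finder.close()
    return finder.missing


# Hypothetical page fragment: one image with ALT text, one without.
sample = (
    '<A HREF="next.html"><IMG SRC="Graphics/next.gif" ALT="Next Page"></A>'
    '<IMG SRC="Graphics/house.gif">'
)
print(images_missing_alt(sample))
```

The sketch only reports images with no ALT attribute at all. It does not judge whether the text that is present would make sense to a user who has graphics turned off.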
Other developers wonder why so much emphasis is put on Netscape Navigator, and they're right to ask. In its purest form, the problem-solving process above should say, "Consider using a browser-specific tag to get the effect you want." But currently, a huge percentage of the market is using the Netscape browser. If and when that changes, update the process.
To find out what browsers people are using and what tags those browsers are capable of handling, visit BrowserWatch and BrowserCaps. BrowserWatch at http://www.browserwatch.com/ keeps track of which browser people are using when they visit the site. Although the numbers are not completely representative, they do give an idea of trends and rough market share.
BrowserCaps at http://www.objarts.com/bc/ allows users to test their browser using a standard set of tests. The results are posted on the site and show what features each browser is capable of.
Most pages can be made to look nice and be effective in HTML 2.0 or at least HTML 3.0. If a developer must deliver a browser-specific version of the page, there are several ways to do that. One of the better ways is with a CGI script as described in Chapter 7, "Extending HTML's Capabilities with CGI."
The process below gives a quick and dirty division between Netscape Navigator and most other browsers. Just remember that when you prepare a browser-specific page, you pay for it later in extra maintenance time.
In the <HEAD> section of the page, add the following line:
<META HTTP-EQUIV=REFRESH CONTENT="0;URL=/path/to/netscape-enhanced/page.html">
The META tag is used to carry information not specified in other HTML tags. Netscape Navigator treats an HTTP-EQUIV value of REFRESH as an instruction to request a new page a specific number of seconds after it reads the current page.
With a refresh time of zero, the Netscape browser reads the first page, then immediately loads the second page. The second page can be Netscape-enhanced since only Netscape Navigator or Netscape Navigator-compatible browsers like Microsoft's Internet Explorer observe the REFRESH directive.
This method is fairly crude because it only serves to separate Netscape Navigator from non-Netscape browsers. There are better techniques available using CGI. To take advantage of CGI, you must know how to program.
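To give a feel for what the CGI approach involves, here is a minimal sketch of a script that picks a page based on the browser's identification string. It is written in Python purely for illustration and is not the technique presented in Chapter 7. The function names and the plain-page path are hypothetical, and the "Mozilla/" test is an assumption about how Netscape-compatible browsers identify themselves.

```python
import os

# Hypothetical page paths; substitute the real locations on your server.
ENHANCED_PAGE = "/path/to/netscape-enhanced/page.html"
PLAIN_PAGE = "/path/to/plain/page.html"


def choose_page(user_agent):
    """Pick a page based on the browser's User-Agent string."""
    # Assumption: Netscape Navigator identifies itself with a string that
    # begins "Mozilla/". Compatible browsers often do the same, so this is
    # only a rough test.
    if user_agent.startswith("Mozilla/"):
        return ENHANCED_PAGE
    return PLAIN_PAGE


def redirect_response(user_agent):
    """Build the CGI response: a Location header followed by a blank line."""
    return "Location: " + choose_page(user_agent) + "\n\n"


if __name__ == "__main__":
    # A CGI server passes the browser's User-Agent header to the script
    # in the HTTP_USER_AGENT environment variable.
    print(redirect_response(os.environ.get("HTTP_USER_AGENT", "")), end="")
```

Because the decision is made on the server, you can add more cases later (one per browser family, say) without touching the pages themselves. The maintenance cost of keeping several versions of a page still applies.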
For most Webmasters, the CGI language of choice is Perl. A good starting point for learning Perl is Learning Perl, by Randal Schwartz (O'Reilly & Associates, Inc., 1993). This book is often referred to online as the "Llama book" because of the animal that appears on its cover.
Chapter 3, "Deciding What to Do About Netscape," gives more specific recommendations on how to deal with Netscape Navigator.
The CD-ROM contains an example of a Netscape-specific version of a page.
Listing 1.1 shows the HTML for that page.
Listing 1.1 List11.html-A Netscape-Enhanced Page
<HTML>
<HEAD>
<TITLE>Nikka: Welcome</TITLE>
</HEAD>
<BODY BACKGROUND="Graphics/graybg.gif" TEXT="#FFFFFF" LINK="#0F792C" ALINK="#830581" VLINK="#9400D3">
<CENTER>
<P ALIGN=Center>
<PRE> </PRE>
<A HREF="General/5.aboutNikka.shtml">
<IMG SRC="Graphics/NGlogo3aBW.gif" HEIGHT=90 WIDTH=216 ALT="Nikka Galleria Logo" BORDER=0></A><BR>
</P>
<PRE> </PRE>
</CENTER>
<CENTER>
<P ALIGN=Center>
<A HREF="Talent/1.Index.shtml">Enter Here</A></P>
<PRE> </PRE>
<CENTER>
<P ALIGN=Center>
<A HREF="General/9.repLogo.shtml">Represented by Kerry Reilly</A>
</P>
<P ALIGN=Center> Boston</P>
<P ALIGN=Center> Chicago</P>
<P ALIGN=Center> Charlotte</P>
<P ALIGN=Center> Atlanta</P>
<P ALIGN=Center>
<A HREF="#more"><IMG SRC="Graphics/more.gif" ALT="More" BORDER=0 HEIGHT=39 WIDTH=93></A>
</P>
</CENTER>
<PRE> </PRE>
<A NAME="more"></A>
<CENTER>
<P ALIGN=Center>
<A HREF="General/1.tableOfContents.shtml">Table Of Contents</A>
</P>
<P ALIGN=Center>
<A HREF="General/9~1.representation.shtml">Representation</A>
</P>
<P ALIGN=Center>
<A HREF="General/5.aboutNikka.shtml">About Nikka</A>
</P>
</CENTER>
<CENTER>
<P ALIGN=Center>
<IMG SRC="Graphics/previousG.gif" ALT="Previous Page" BORDER=0 HEIGHT=39 WIDTH=93>
<A HREF="General/1.tableOfContents.shtml">
<IMG SRC="Graphics/next.gif" ALT="Next Page" BORDER=0 HEIGHT=39 WIDTH=93></A>
</P>
<P ALIGN=Center>
<A HREF="General/4.searchForm.shtml">
<IMG SRC="Graphics/search.gif" ALT="Search for Art" BORDER=0 HEIGHT=39 WIDTH=93>
</A>
<A HREF="General/1.tableOfContents.shtml">
<IMG ALT="Contents" SRC="Graphics/contents.gif" BORDER=0></A>
<A HREF="Talent/2.Artists.shtml">
<IMG ALT="Artists" SRC="Graphics/artists.gif" BORDER=0></A>
</P>
</CENTER>
<PRE> </PRE>
<FONT SIZE=2>
<CENTER>
All contents Copyright&copy; 1995<BR>
<A HREF="General/5.aboutNikka.shtml">Nika Marketing &amp; Communications Group</A><BR>
All rights reserved.
</CENTER>
<CENTER>
<P ALIGN=Center>
<A HREF="http://stats.internet-audit.com/cgi-bin/stats.exe/0001222">
<IMG ISMAP BORDER=0 SRC="http://g1.internet-audit.com/act/ZQ0001222.gif" ALT="Make your visit count, load this image." HEIGHT=16 WIDTH=16></A>
</P>
</CENTER>
</FONT>
<FONT SIZE=1>
<CENTER>
<P ALIGN=Center>
This site produced by <A HREF="http://www.dse.com/">DSE, Inc.</A>
</P>
<P ALIGN=Center>
Last modified: <EM>December 14, 1995<BR></EM>
URL: <EM>http://www.dse.com/nikka/index.html</EM>
</P>
</CENTER>
</FONT>
</CENTER>
</BODY>
</HTML>
One of the tensions in the Web developer community is between the Web as an advertising medium and the Web as an information medium.
The Web is a new kind of medium with different rules from print, or TV, or radio. The whole industry is trying to find out what works best for the Web. One of the things we know for sure is that people go to the Web to look for information. Many people think that the Web is like a library. When these people see blatant ads on the Web, they feel much as they would if they went to the library and found mostly product catalogs.
The Web can certainly be used to promote commercial products, but advertising on the Web takes a different form than advertising in other media. In print, for example, you can pay to run an ad or you can write an article. Even in print, many people consider articles that you write or articles written about you to be much more effective than ads alone.
A Web site must do more than advertise. The immediate objective is to draw people to the site by promising information, then delivering on that promise. If qualified clients come to the realtor's site to learn how to prepare their home for sale or how to set the price, they'll stick around to find out about the realtor's services. Many of them will decide to use that realtor.
As content for the new site comes in, the Webmaster may be concerned that the site is starting to look like a patchwork quilt. The site may have great content, but if users can't find what they're looking for, they won't stay around to hunt for it.
The solution to this problem is to use a style guide. A style guide for Web pages is to a Web site what a conventional style guide is to print authors.
The style guide does three things. First, it ensures that the site has a consistent look and feel. If users follow a link out of the site, the change in sites should be apparent to them.
Second, it gives the Webmaster a starting point. You can often get a basic version of a new site up very quickly by selecting the right style guide and putting in the copy. Some of the newer word processors do the same thing for desktop publishers and call them templates.
Third, the style guide is a checklist to help make sure that you haven't forgotten anything. Sometimes you might choose to deviate from your standard, but you should never just "forget."
The best style guides are those developed in-house. Many firms use several different style guides, depending on the goals of a particular site. Visit the HTML Writers Guild site at http://www.hwg.org/. In the Guild pages, you will find many different style guides to review.
Figure 1.7 shows how the sample site is improved by applying the recommendations of a style guide. The two graphics at the top of the page are links to the company home page and the Realtor's home page respectively. A standard set of navigational buttons has been added at the top and bottom. Standard contact and support information has been added in a footer.
Figure 1.7: Sample site using style guide.
Listing 1.2 shows the code for that page.
Listing 1.2 List12.html-A Page Based on a Style Guide
<HTML>
<HEAD>
<TITLE>GSH: Welcome!</TITLE>
</HEAD>
<BODY>
<A HREF="Homeowners/3.WhyGSH.html"><IMG ALT="GSH Logo" SRC="Graphics/gshlogo.gif"></A>
<A HREF="Homeowners/4.WhoIsRose.html"><IMG ALT="Rosemarie Morgan" SRC="Graphics/rose.gif"></A>
<BR>
<IMG ALT="Previous Page" SRC="Graphics/previousG.gif">
<A HREF="General/2.Credits.shtml"><IMG ALT="Next Page" SRC="Graphics/next.gif"></A>
<A HREF="welcome.html"><IMG ALT="Contents" SRC="Graphics/contents.gif"></A>
<A HREF="General/4.IndexOfPages.shtml"><IMG ALT="Index" SRC="Graphics/index.gif"></A>
<P>
<H1>Content goes here</H1>
<HR CLEAR=left>
<IMG ALT="Previous Page" SRC="Graphics/previousG.gif">
<A HREF="General/2.Credits.shtml"><IMG ALT="Next Page" SRC="Graphics/next.gif"></A>
<A HREF="welcome.html"><IMG ALT="Table of Contents" SRC="Graphics/contents.gif"></A>
<A HREF="General/4.IndexOfPages.shtml"><IMG ALT="Index" SRC="Graphics/index.gif"></A>
<A HREF="General/3.Help.shtml"><IMG ALT="Help" SRC="Graphics/help.gif"></A>
<HR>
<H3>Comments to Author</H3>
<ADDRESS>
<A HREF="mailto:rmorgan@infi.net"> rmorgan@infi.net</A><BR>
<A HREF="http://www.dse.com/GSH/">Rosemarie Morgan</A><BR>
GSH Real Estate<BR>
4521 E. Honeygrove Road<BR>
Virginia Beach, Virginia 23455-6007<BR>
<P>
Phone: 1-800-472-9700 or 804-552-6437 (24-hours)
<P>
FAX: 1-804-460-5536</ADDRESS>
<HR>
<PRE>All contents Copyright &copy; 1995 Rosemarie Morgan
All rights reserved.</PRE>
<P>
<A HREF="http://www.halsoft.com/html-val-svc/"><IMG SRC="Graphics/valid_html.gif" ALT="HTML 2.0 Checked!"></A>
<A HREF="http://stats.internet-audit.com/cgi-bin/stats.exe/0001222">
<IMG ISMAP BORDER=0 SRC="http://g1.internet-audit.com/act/ZQ0001222.gif" ALT="Make your visit count, load this image."></A>
<BR>
<H6>This web site produced by <A HREF="http://www.dse.com/welcome.html">DSE, Inc.</A></H6>
Last modified: <EM>February 6, 1995</EM><BR>
URL: <EM>http://www.dse.com/GSH/welcome.html</EM>
</P>
</BODY>
</HTML>
There is no such thing as the "best" style guide. Every organization and every site has different needs. The style guides listed at the HTML Writers Guild site are a starting point; your organization will want to choose one of those, then tailor it to meet your objectives. Here are a few of the elements you may want to address in your style guide:
Chapter 6, "Reducing Maintenance Costs with Server-Side Includes," takes up the use of SSIs in site design.
Some sites exhibit a behavior that might be called "T1-itis." The Webmaster has a big, wide connection to the Net such as T-1 and thinks that everyone has a link like that. In fact, about half the users on the Net use a dial-up connection, typically through a 14,400-bps modem. A 40- or 50-K graphic can take 30 seconds or more to load over a connection like that.
Chapter 4, "Designing Faster Sites,"
and Chapter 5, "Designing Graphics
for the Web," give recommendations on how to reduce file
size and download time dramatically. At the design level, be sure
to keep graphics physically as small as possible (without making
them unusable). Use only one per page. If it's important to have
several per page, make them very small (thumbnails) and link the
small graphic to a larger version.
Tip |
It's a good idea to warn users before serving them a graphic that's more than 20K or so. Use thumbnails or text, and let them decide whether or not to download it. |
Here's a general formula for estimating how long a page will take to download. Every time the browser has to go back to the server for an image, figure about a second of overhead before anything even begins to transfer. Then figure a transfer rate on a 14.4-Kbps modem of between 1,000 and 2,000 bytes per second. On fairly good lines, you might average 1,700 bytes per second.
The listing pages have five buttons, a logo, and a picture of the house. The graphics and the text typically come to over 90K. So that's 7 seconds of overhead and about 67 seconds of download time.
The average user has long since hit the Stop button. The photos are about 400 pixels by 300 pixels and are 256 colors deep. If they are resized to, say, 80 pixels by 60 pixels, they'll come down under 5K even if the colors are still 8 bits.
Make just that one change and the page download time drops to about 25 seconds, or even less when you figure that the buttons and logo graphic are already in the cache on advanced browsers like Netscape Navigator or Microsoft Internet Explorer. In that case, the whole page can load in 5 or 6 seconds.
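The estimate is easy to turn into a small calculation. The sketch below is one way to write it in Python, treating the transfer rate as bytes per second and assuming 1K means 1,024 bytes; the function name and its defaults are choices made for this illustration. Note that the 67-second transfer figure quoted for the 90K listing page corresponds to a rate near 1,375 bytes per second, which falls inside the 1,000 to 2,000 range.

```python
def estimate_download_seconds(total_bytes, request_count,
                              bytes_per_second=1700.0,
                              overhead_per_request=1.0):
    """Estimate page load time over a dial-up link.

    total_bytes          -- combined size of the text and graphics
    request_count        -- number of trips back to the server
    bytes_per_second     -- assumed sustained transfer rate
    overhead_per_request -- assumed delay before each transfer starts
    """
    if bytes_per_second <= 0:
        raise ValueError("bytes_per_second must be positive")
    overhead = request_count * overhead_per_request
    transfer = total_bytes / bytes_per_second
    return overhead + transfer


# The listing page: about 90K in all, fetched as seven graphics.
page_bytes = 90 * 1024
print(round(estimate_download_seconds(page_bytes, 7)))
print(round(estimate_download_seconds(page_bytes, 7, bytes_per_second=1375)))
```

At the 1,700 bytes-per-second average the page comes in at roughly a minute, and at the slower rate it takes about 74 seconds. Either way, it is far longer than most users will wait.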
To change the size of a scanned image, use a graphics package appropriate for your platform. Many Webmasters use the Macintosh for graphics work, even if they serve Web pages from a UNIX machine.
If the image is a GIF, use Graphics Converter (a shareware program available at ftp://ftp.uwtc.washington.edu/pub/Mac/Graphics/). Windows users may want to check out http://world.std.com/~mmedia/lviewp.html. If the image was scanned as a tagged image file format (TIFF) file and then converted to the Joint Photographic Experts Group (JPEG) standard, go back into Adobe Photoshop or a similar program and change the TIFF; then reconvert it to JPEG. Resizing JPEGs is a delicate process and can degrade the appearance of the image. If the graphics work is done by a service bureau, make sure they follow the guidelines of the HTML Writers Guild in making thumbnails. For the best quality, they should brighten the image a bit before they shrink it. There's a tutorial on this subject on one of the HTML Writers Guild pages at http://www.hwg.org/.
Figure 1.8 shows what a listing page looks like with smaller graphics.
Figure 1.8: Thumbnail on a listing page.
Once a site is approved by the client and goes live on the Net, the client begins to enjoy the benefits of a Web site. He or she begins to ask for modifications: additional features and content that didn't make it into the original design.
As the new site grows, the number and names of the pages make the development machine a very confusing place.
Most Web sites are hosted on UNIX machines and many Webmasters run UNIX on their development machines. The techniques described in this section can be applied to any platform. The software described is often bundled into UNIX or is available free over the Internet. Similar software can be purchased for Windows machines and Macintoshes.
The concept is called configuration control. It's a way of keeping many different versions of each file without having to juggle dozens of backup tapes or save multiple full copies.
In a UNIX machine's documentation, look up Source Code Control System (SCCS). Some machines also have the Revision Control System (RCS), which is a newer piece of software to do the same thing. With either of these programs, you can check a file in, check it out, lock it so no one else can change it, and keep a version history.
If you use SCCS, you can set up the configuration control system with just four steps:
Add the characters %W% %G% to each page inside a comment:
<!--%W% %G%-->
After the file is checked in, these characters provide the file name, version, and date.
sccs create filename
Your HTML file will be copied to the SCCS directory and the prefix "s." will be added to the beginning of the file name. The original file will be set to read-only. Now when you want to check out a read-only copy of the file, type:
sccs get filename
If you want to edit the file, get an editable copy with:
sccs edit filename
To put a file away after you have been working on it, type:
sccs delta filename
SCCS prompts you to describe the changes you have made. Type as
many lines as you like and finish with a blank line.
Finally, if you decide to abandon any changes and revert to the
stored version, just type:
sccs unedit filename
Now you have some protection if you make changes you later regret or if the client asks for a change and then changes his or her mind. To get, say, revision 1.5, you just type:
sccs get -r1.5 filename
If you use RCS, the same principles apply. Call the subdirectory RCS and add the characters $Id$ to each page inside a comment. (RCS keywords are case-sensitive.)
To put the file under revision control, use:
ci filename
The ci program prompts you for a description.
The checkout command (co) gets a copy of the file. To edit the file, type:
co -l filename
(Note that the parameter is a lowercase "l" and not a one.) When you're done with the file, check it back in with the ci command.
As with SCCS, you can go back to previous versions. To get, say, revision 1.5, type:
co -r1.5 filename
Now set up your machine like this:
The root directory of your server is probably called htdocs. Wherever that root is, make a directory for each site you develop. Give it the name that the site will have on the live server.
To set up a site for a client named Bob, with a URL on your machine like http://www.xyz.com/Bobs/, make a directory in the root directory of your development machine called Bobs. Make a parallel directory for the configuration control system. If the client's site is called Bobs, the SCCS directory would be Bobs/SCCS.
Inside the development directory, put the first page that will come up. To see what the default name is, look for DirectoryIndex in the srm.conf configuration file. On some machines, the server may look at localhost_srm.conf.
By default, these configuration files are in /usr/local/etc/httpd/conf/ but check with your system administrator to find out for sure. Also by default, DirectoryIndex is index.html but on some machines it may be set to welcome.html or some other value. In addition, some servers like Apache allow you to have multiple default names.
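One way to check the default page name is to search the configuration file for the directive. The following is a minimal sketch, assuming a POSIX shell. The function name show_directory_index is hypothetical, and the default path is the one given above, which may differ on your machine.

```shell
#!/bin/sh
# Print the DirectoryIndex value(s) from a server configuration file.
# show_directory_index is a hypothetical helper name; the default path
# is the one mentioned in the text and may differ on your system.
show_directory_index() {
    conf=${1:-/usr/local/etc/httpd/conf/srm.conf}
    if [ ! -r "$conf" ]; then
        echo "cannot read $conf" >&2
        return 1
    fi
    # Match the directive at the start of a line (ignoring leading blanks
    # and case), then strip the directive name so only the values remain.
    grep -i '^[[:space:]]*DirectoryIndex[[:space:]]' "$conf" |
        sed 's/^[[:space:]]*[^[:space:]]*[[:space:]]*//'
}
```

Call it with no argument to read the default file, or pass another path, such as the localhost_srm.conf file, as the first argument. If it prints more than one name, the server accepts several default names.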
Let's assume that the server defaults to index.html. Put the site's home page in the index.html file in the site directory. Then, set up subdirectories for each of the major sections of the site and set up a subdirectory for graphics.
In your in-house style guide, specify the name of standard directories like the graphics directory. Call it Graphics or graphics or anything else you like. The important thing is to always call it the same thing and put it in the same place so you don't have to guess each time.
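The layout described above can be created the same way for every client with a small helper. This is only a sketch: make_site_skeleton and the section names are hypothetical, "graphics" stands in for whatever name your style guide specifies, and the home page name assumes the server defaults to index.html.

```shell
#!/bin/sh
# Create the skeleton for one client site under the current directory.
# make_site_skeleton and the section names are hypothetical; "graphics"
# stands in for whatever name your style guide specifies.
make_site_skeleton() {
    site=$1
    shift
    # Site directory, its SCCS directory, and the standard graphics directory.
    mkdir -p "$site/SCCS" "$site/graphics" || return 1
    # One subdirectory per major section named on the command line.
    for section in "$@"; do
        mkdir -p "$site/$section" || return 1
    done
    # Empty placeholder for the home page, assuming DirectoryIndex is index.html.
    [ -e "$site/index.html" ] || : > "$site/index.html"
}
```

Run from the server's root directory, a call such as make_site_skeleton Bobs listings contact would produce Bobs/SCCS, Bobs/graphics, Bobs/listings, Bobs/contact, and an empty Bobs/index.html. An existing index.html is left alone.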
Within each subdirectory, name the files so that your machine sorts them in the order they should appear on the site. Name major pages starting with 1~XXX, 2~YYY, and so on. Give subpages under them names like 1~1.XXX.html and 1~1~1.XXX.html.
Now you can tell at a glance which pages belong where. When you put in Previous and Next arrows, you know which pages they should link to even after the structure of the site has changed.
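Because the names sort in site order, the targets for the Previous and Next arrows can be read straight from a sorted listing. The sketch below assumes a POSIX shell with awk. The helper name neighbors and the page names in the usage note are hypothetical.

```shell
#!/bin/sh
# Print the pages that sort immediately before and after a given page.
# neighbors is a hypothetical helper; it assumes file names contain no
# newlines and looks only at the .html files in one directory.
neighbors() {
    dir=$1
    page=$2
    # LC_ALL=C gives a plain byte-order sort, the same on every machine.
    ls "$dir" | grep '\.html$' | LC_ALL=C sort |
        awk -v page="$page" '
            # First line after the match is the Next page.
            found && !printed { print "next: " $0; printed = 1 }
            # On the match, report the line seen just before it.
            $0 == page { print "previous: " (prev == "" ? "(none)" : prev); found = 1 }
            { prev = $0 }
            END { if (found && !printed) print "next: (none)" }
        '
}
```

In a directory holding 1~Intro.html, 2~Listings.html, and 3~Contact.html, the call neighbors . "2~Listings.html" reports 1~Intro.html as previous and 3~Contact.html as next. Check that the sort order on your machine matches the reading order you intend before relying on it.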
After you have produced a stable version of the client's site, check it into the configuration control system. The exact commands depend on whether you use SCCS, RCS, or some other package. But get that stable version put away and make sure your backup system is backing up the configuration control system directory.
Before you show the pages to your client, test them on a local server. If you have any CGI scripts or server-side includes, test them now.
Next, put all the pages up on a private Web site and test them. Here's one process for uploading pages. You can use a different approach, depending on how you get to your site:
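If you use SCCS, make sure no files are still checked out for editing:
find Bobs -type d ! -name SCCS -exec sccs check {} \;
This works because the find command traverses the directory tree looking for files that match its tests. The test -type d causes it to limit itself to directories only, and ! -name SCCS skips the SCCS directories themselves.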
If you use RCS, list the files that are still locked:
find Bobs -type f -name "*,v" -exec rlog -L -R {} \;
This works because the find command traverses the directory tree looking for files that match its tests. The test -type f causes it to limit itself to regular files only. The test -name "*,v" says look only at files whose names end in ",v", which denotes an RCS file.
cd ..
tar cvfF Bobs.tar ./Bobs
The cd changes to the directory above the one we want to archive. This directory is where the tar file will go. The tar "c" parameter says to create a new archive. The "v" says to be verbose and report each file archived.
gzip Bobs.tar
ftp www.xyz.com
login: Your user name
password: Your password (does not echo)
cd /usr/local/etc/httpd/htdocs
put Bobs.tar.gz
quit
gunzip -c Bobs.tar.gz | tar xvf -
find Bobs -type f -exec chmod 666 {} \;
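The packaging steps above can be collected into one helper so they are run the same way each time. This is a sketch under stated assumptions: package_site is a hypothetical name, and it uses the --exclude option, which GNU tar and bsdtar accept, rather than the flags shown above. Older tar programs may not support that option.

```shell
#!/bin/sh
# Package a site directory for upload, leaving out revision-control directories.
# package_site is a hypothetical helper. The --exclude option is an
# assumption: GNU tar and bsdtar accept it, but some older tar programs do not.
package_site() {
    site=$1
    # Refuse to run if the site directory is missing.
    [ -d "$site" ] || { echo "no such directory: $site" >&2; return 1; }
    # c = create, f = write to the named file; skip SCCS and RCS directories.
    tar --exclude=SCCS --exclude=RCS -cf "$site.tar" "./$site" || return 1
    # Compress the archive; -f overwrites an older $site.tar.gz if present.
    gzip -f "$site.tar"
}
```

Run it from the directory above the site, for example package_site Bobs, and then upload the resulting Bobs.tar.gz. Listing the archive with gunzip -c Bobs.tar.gz | tar tf - before uploading confirms what it contains.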
Now that all the files in the site are on the server, make sure that all the links work as you expect. Make sure that the pages download as quickly as you expect them to. If you need to make any changes, make sure you check out the files from the configuration control system, make the changes, and check them back in.
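A rough first pass at the link check can be made on the files themselves before testing over the Net. The sketch below is not a substitute for exercising the live server. The helper name check_local_links is hypothetical, it looks only at double-quoted href values, and it relies on the -o option of grep, which most current grep programs provide but which is an assumption about yours.

```shell
#!/bin/sh
# Report relative links in a site's HTML files that point to missing files.
# check_local_links is a hypothetical helper and only a rough check: it
# looks at double-quoted href values, skips links with a scheme (http:,
# mailto:, ...) or a leading "#" or "/", and does not contact the server.
check_local_links() {
    site=$1
    find "$site" -type f -name '*.html' | while IFS= read -r page; do
        dir=$(dirname "$page")
        # Pull out each href="..." value, one per line, without the quotes.
        grep -io 'href="[^"]*"' "$page" | sed 's/^[^"]*"//; s/"$//' |
        while IFS= read -r link; do
            case $link in
                ''|*:*|'#'*|/*) continue ;;   # empty, has a scheme, fragment only, or absolute
            esac
            target=${link%%#*}                # drop any #fragment
            [ -e "$dir/$target" ] || echo "$page: broken link $link"
        done
    done
}
```

A call such as check_local_links Bobs prints one line for each relative link whose target file does not exist and prints nothing when every target is found. Image references and links to other servers still need the online checks described below.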
There are several checks you will want to make once the pages are accessible over the Net. To become familiar with these checks, start with Doctor HTML at http://imaginware.com/RxHTML.cgi. Doctor HTML lets you run multiple tests, including a spell checker, a simple structure validator, and a rough performance tester. (Figs. 1.9 through 1.11 show a sample run with Doctor HTML.)
Figure 1.9: Doctor HTML checks the overall structure of the document.
Figure 1.10: Doctor HTML makes suggestions to improve the images on the page.
Figure 1.11: Each hyperlink is exercised to make sure it leads somewhere.
You may want to run other online checks. In particular, you will want to run a formal validator, described in Chapter 2, "Reducing Site Maintenance Costs Through Testing and Validation."
When the site has passed all the tests, print out each page of the site and take it to the client for final review. Have the client initial each page as he or she reviews it. Then make any final corrections that the client requests and print two copies of the final set, one for Bob's files and one for the Webmaster's.
Make sure clients understand that the best sites are regularly
updated. Leave clients with all the pages of the site, along with
a change sheet. Encourage clients to fill out a change sheet for
every page they want to change each month-more frequently if possible.
When you get the changes, implement them on the site to keep the
site current. Chapter 41, "How to
Keep Them Coming Back for More," describes monthly maintenance
in detail.
Tip
Some Webmasters have the client send them the changes electronically. Others find that it takes more work for the client to describe the change in text than it does for them to mark up a paper copy of the existing page. By asking them to mark up the paper copy, and having them fill out a change sheet, you get a paper trail showing who asked you to do what. This documentation can be useful if there's ever a question about why or when some part of the site was changed.
Each month, get the change sheets from the client. Check out the relevant files, make the changes requested, and check the files back in. Follow the same procedure used initially to install the site, including checking out the pages with Doctor HTML and a formal validator.
Some Webmasters use a one-stop submission service such as SubmitIt. Others find that they get the best results by submitting sites to each search engine individually. Visit Yahoo at http://www.yahoo.com/ and look at a list of search engines. Submit the site to the major directories.
Note that some developers charge clients as much as $25 per listing to submit their name. Be careful that the client does not conclude that this money is paid to the listing service. Many of the best Webmasters charge a flat fee to submit a site to a certain number of listing services and make sure their clients understand that they are paying for the Webmaster's time to enter the data.
Here are some places on the Net where HTML authors can get more information: