A program called httpd_monitor, in the support/ subdirectory of the Apache distribution, can be run against the scoreboard file to give a picture of the state of all the child processes and whether they're just starting, active, sleeping, or dead. It can give you a good idea of whether your settings for MaxSpareServers and MinSpareServers are decent. Consider it a close equivalent to the system command iostat.
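For instance, if httpd_monitor consistently shows a large pool of idle children, the spare-server directives in httpd.conf can be tightened. The values below are only illustrative starting points, not recommendations:
StartServers 5
MinSpareServers 5
MaxSpareServers 10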
You can increase performance over the standard setup in many ways, including smarter configuration of your resources, features that can be turned off for better performance, and even tuning at the operating-system and hardware level. All these factors make the difference between a regular Web server and a high-performance Web server.
Most non-hardware improvements fall into three categories: those that reduce the load on the CPU, those that reduce the amount of I/O to the disk, and those that reduce the memory requirements.
Server-side includes (SSI) are directives embedded in HTML that the server must preprocess, which increases both the disk access load and the CPU load. The CPU penalty comes from having to parse the HTML file looking for the includes; parsing a file is more intensive than just reading it and spitting it out to the socket.
The disk access penalty comes from having to make two, three, four, or more separate disk accesses to pull together the page being served. For example, a typical SSI document might need a header and a footer pulled into memory before it can be served. That's three disk accesses to assemble the document, instead of one. If the inline HTML files were large, the difference wouldn't be as great; because such files are usually small, the disk access penalty is relatively large. The problem is compounded by any CGI script that might be included as well: an SSI page with two CGI scripts included probably takes at least twice the performance hit of a single CGI script that renders the whole page in the first place.
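As a simple illustration, a parsed document like the following (the file names here are hypothetical) forces the server to scan the page for directives and then fetch the header and footer as separate reads:
<!--#include virtual="/header.html" -->
<p>The body of the page goes here.</p>
<!--#include virtual="/footer.html" -->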
Apache uses special files, known as .htaccess files, for controlling access to directories. Searching directories for .htaccess files is fairly painful. Because .htaccess files work hierarchically, when a request is made for /path/path2/dir1/dir2/foo, Apache looks for an .htaccess file in every directory along the path. In the example of /path/path2/dir1/dir2/foo, that's at least five lookups, a significant disk access load that's best avoided if possible.
To solve the problem of too many disk hits, put anything currently controlled through your .htaccess files into the access.conf configuration file, or even srm.conf. If you still need .htaccess files and can narrow them down to a specific subdirectory, you can have the server look for .htaccess files only in that subdirectory by using AllowOverride.
Suppose that your document root is in /www/htdocs, and you want to turn off the searching for all .htaccess files except those in /www/htdocs/dir1/dir2 and everywhere below. You would put something like the following into your access.conf configuration file:
<Directory /www/htdocs>
Options All
AllowOverride None
</Directory>
<Directory /www/htdocs/dir1/dir2>
Options All
AllowOverride All
</Directory>
It's important that the directories are listed in that order so that the second <Directory> section doesn't take precedence over the first.
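Likewise, directives that would normally live in an .htaccess file can be moved into an equivalent <Directory> section in access.conf; the directory path and password file below are hypothetical:
<Directory /www/htdocs/private>
AuthType Basic
AuthName Private
AuthUserFile /www/conf/users
require valid-user
</Directory>
The directives behave the same, but Apache no longer has to search the filesystem for them on every request.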
.asis files are distinguished by having their HTTP headers embedded directly in the file itself. They are a useful optimization for certain types of files, such as server-push animations, which need to set their own headers and are usually dished out by CGI scripts. The usual server-push CGI script has the additional overhead of assembling the images on-the-fly, whereas with an .asis file the whole stream can be concatenated into one file, reducing the I/O hit as well as the memory and CPU load.
See As-Is Files, p. 701
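A minimal .asis file looks something like the following, assuming the handler has been mapped in srm.conf with the usual AddType httpd/send-as-is asis line; the headers at the top of the file go to the client exactly as written, followed by a blank line and the body:
Status: 200 OK
Content-type: text/html

<html>
<body>This page is sent exactly as stored, headers and all.</body>
</html>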
The only thing you lose by using an .asis file is the ability to do timed pushes, where a lapse of time between frames is implemented as a sleep() (a system call that pauses a program for a defined number of seconds). But because server-push is also bandwidth-limited, many consider the ability to do timed pushes a dubious feature anyway.
Certainly one goal for the site administrator should be to automate the rotation of access and error logs. Even a lightly loaded server will generate a couple of megabytes of log activity per day. Left unchecked, your disk space could dry up fast.
The most basic element of logfile rotation is to get the Web server to stop writing to the old log and start writing to a new one without disrupting service to outside users. The most straightforward way to accomplish this is to rename the log just slightly and send a SIGHUP signal to the parent process. "Just slightly" means renaming it to something like access_log.0 on the same hard disk and the same partition. Why? Each child has a file descriptor open to the logfile. When you rename the file, that file descriptor still points to the same actual log right up until the child receives the echo of the SIGHUP from the parent process. At that point the old descriptor is closed, a new one is obtained, and a new access_log is created. This is pretty much the only way to guarantee that no traffic reports are lost while rotating logs.
Here is an example script that performs a logfile rotation:
#!/bin/sh
logdir=/usr/local/etc/httpd/logs    # name of the log directory
acclog=access_log                   # name of the access log
errlog=error_log                    # name of the error log
pidfile=$logdir/httpd.pid           # file that stores the parent's process ID
mv $logdir/$acclog $logdir/$acclog.0
mv $logdir/$errlog $logdir/$errlog.0
kill -HUP `cat $pidfile`
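To automate the rotation, run the script from cron. A crontab entry like the following (the script name and location are hypothetical) rotates the logs every night at midnight:
0 0 * * * /usr/local/etc/httpd/support/rotate_logs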