

Chapter 37
Managing an Internet Web Server

by Steve Burnett

In this chapter
Controlling Server Child Processes
Using the Scoreboard File
Increasing Efficiency in the Server Software
Automating Logfile Rotation
Understanding Security Issues
Other Tuning Issues

One of the biggest strengths of the Apache Web server is that it’s highly tunable. Just about every feature that imposes any sort of extra server load is an option, which means you can sacrifice features for speed if you need to do so. That said, Apache is designed for speed and efficiency. Even with all of Apache’s features, you’ll probably swamp a full T1 worth of bandwidth before exhausting the resources of a well-constructed Web server’s hardware, be it a Linux system or something else.

Apache has also been designed to give site administrators control over where to draw the line between security and functionality. For some sites with many internal users, such as an Internet service provider, it is important to control which functionality can be used where. Meanwhile, a Web design shop might want complete flexibility, even if it means that an errant Common Gateway Interface (CGI) script could expose a security hole or do damage (incidentally, many feel that CGI in general is one big security risk).

Controlling Server Child Processes


See “Starting Up Apache,” p. 677

As you learned in Chapter 35, “Getting Started with Apache,” Apache uses the concept of a swarm of semi-persistent daemons, sometimes also called children, running and answering queries simultaneously. Although the size of that swarm varies, there are limits to how large it can get and how quickly or slowly it can grow. This size issue is critical. One of the main performance problems with older servers that executed a fork() system call at every request was that there was no limit to the total number of simultaneous daemons. When the machine’s main memory was consumed and it started swapping to disk, the machine would effectively lock up and become unusable. This was colloquially called daemon-spamming.

Other server software lets you specify a fixed number of processes, with the “fork for every request” behavior kicking in if all the children are busy when a new request comes in. This is also not the best model. Not only do many people set that fixed number too high (having 30 children running when only five are needed can hinder performance), but this design also removes the protection against daemon-spamming.

So, the Apache model is to start out with a certain number of persistent processes, and make sure that you always keep some number (actually, a range somewhere between a minimum and a maximum) of “spare” processes to handle a wave of simultaneous requests. If you have to launch a few more processes to maintain the minimum number of spares, no problem. If you find yourself with more idle servers than your maximum number of spares, the excess idle ones can be killed. There’s a maximum number of processes, beyond which no more will be launched, to protect the machine against daemon-spamming.

The algorithm to protect against too many processes bogging down or killing a server is configured by using the following configuration directives in /usr/local/apache/httpd.conf:


StartServers    10
MinSpareServers 5
MaxSpareServers 10
MaxClients      150

These numbers are the defaults. They say that when Apache launches, 10 children (StartServers) are automatically launched, regardless of the request load at start. If all 10 children are swamped, more are forked until all requests can be answered as fast as they’re received. Apache then keeps at least five (MinSpareServers) but not more than 10 (MaxSpareServers) idle servers on hand to deal with spikes in requests (that is, when a sudden burst of requests comes in well within half a second of each other). Incidentally, these spikes are often caused by browsers that open a separate TCP connection for each inline image in a page in an attempt to improve perceived performance to the user, often at the expense of the server and network.


NOTE:  These directives are part of the core feature set of Apache and should be available on any version of Apache.
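The spare-server policy described above can be sketched as a single decision function. This is a simplified illustration in Python, not Apache’s actual code: the function name and its parameters are hypothetical, and a real server may launch or kill processes gradually rather than in one step.

```python
def spare_adjustment(total, busy, min_spare=5, max_spare=10, max_clients=150):
    """Return how many children to launch (positive) or kill (negative).

    total -- number of child processes currently running
    busy  -- number of those children currently serving a request
    The defaults mirror the directive values shown in the text.
    """
    idle = total - busy
    if idle < min_spare:
        # Restore the minimum number of spares, but never exceed the hard cap.
        headroom = max(0, max_clients - total)
        return min(min_spare - idle, headroom)
    if idle > max_spare:
        # Only the surplus idle children are candidates for removal.
        return -(idle - max_spare)
    # The idle count is already inside the allowed range.
    return 0
```

For instance, with 10 children of which 8 are busy, the function asks for 3 more so that 5 are idle again; with 150 children all busy, it asks for none, because the hard limit has been reached.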

Usually a stable number of simultaneous child processes is reached, but if the requests are just pouring in (you’ve installed the Pamela Anderson Fan Club page on your site, for example), you might reach the MaxClients limit. At that point, requests will queue in your kernel’s “listen” queue, waiting to get served. If still more pour in, your visitors will eventually see a “connection refused” message. However, this is still preferable to leaving the number of simultaneous processes unlimited, because the server would just launch children with wild abandon and start daemon-spamming, resulting in nobody getting any response from the server at all.
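A rough model of what happens to a burst of simultaneous requests once the limit is reached might look like the following. It is only a sketch: the function name is hypothetical, and the backlog value of 128 is an assumption, since the real size of the listen queue depends on the kernel and its configuration.

```python
def handle_burst(requests, max_clients=150, backlog=128):
    """Split a burst of simultaneous requests into three outcomes.

    served  -- handled immediately by a child process
    queued  -- waiting in the kernel's listen queue
    refused -- turned away because the queue is full
    """
    served = min(requests, max_clients)
    queued = min(requests - served, backlog)
    refused = requests - served - queued
    return {"served": served, "queued": queued, "refused": refused}
```

Under these assumptions, a burst of 200 requests leaves 50 waiting in the queue, while a burst of 400 fills the queue and turns 122 visitors away.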

It’s recommended that you don’t adjust MaxClients, because 150 is a good number for most systems. However, you might be itching to see how many requests you can handle with that sixteen-processor Sun Enterprise 10000 with two gigabytes of RAM; in that case, setting MaxClients much higher makes sense. On the opposite end of the spectrum, you might be running the Web server on a machine with limited memory or CPU resources, and you might want to make sure that Apache doesn’t consume all resources, even at the cost of possibly not being able to serve all requests that come to your site. In that context, setting MaxClients lower makes sense.
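One common rule of thumb, offered here as an assumption rather than anything Apache prescribes, is to pick a MaxClients value that keeps all children in physical memory so the machine never starts swapping. The sketch below does that arithmetic; the function name and the sample memory figures are hypothetical, and you would need to measure the per-child memory use on your own system.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, per_child_mb):
    """Estimate a MaxClients value that fits in physical memory.

    total_ram_mb -- physical memory installed in the machine
    reserved_mb  -- memory set aside for the OS and other programs
    per_child_mb -- measured memory use of one server child
    """
    usable = total_ram_mb - reserved_mb
    if usable <= 0 or per_child_mb <= 0:
        raise ValueError("no memory left for server children")
    # Whole children only; always allow at least one.
    return max(1, int(usable // per_child_mb))
```

With 64MB of RAM, 24MB reserved, and 2MB per child, the estimate is 20, which would argue for lowering MaxClients well below 150 on such a machine.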

Using the Scoreboard File

Because the multiprocess model described in the preceding section requires reliable communication between the parent and child processes, the most cross-platform method of performing that communication was chosen. This is a scoreboard file, in which each child has a chunk of space to which it is authorized to write. The parent httpd process watches that file to get a status report and to make decisions about whether to launch more child processes or kill idle processes.
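The idea of a scoreboard file can be illustrated with a small sketch in which every child owns one fixed-size slot and the parent reads all of them. This is not Apache’s real file format: the slot layout, the status markers, and all function names below are hypothetical and chosen only for illustration.

```python
import os
import struct
import tempfile

# Hypothetical slot layout: one status byte plus an unsigned request counter.
SLOT = struct.Struct("=cI")
IDLE, BUSY = b"_", b"W"  # illustrative status markers


def create_scoreboard(path, slots):
    """Create the file with one idle slot per possible child."""
    with open(path, "wb") as f:
        f.write(SLOT.pack(IDLE, 0) * slots)


def write_slot(path, index, status, requests):
    """Called by a child: overwrite only its own slot."""
    with open(path, "r+b") as f:
        f.seek(index * SLOT.size)
        f.write(SLOT.pack(status, requests))


def read_scoreboard(path):
    """Called by the parent: return a (status, requests) pair per slot."""
    with open(path, "rb") as f:
        data = f.read()
    return [SLOT.unpack_from(data, off)
            for off in range(0, len(data), SLOT.size)]


def count_idle(path):
    """Count the children currently reporting themselves as idle."""
    return sum(1 for status, _ in read_scoreboard(path) if status == IDLE)


def demo():
    """Create a four-slot scoreboard, mark one child busy, count the idle."""
    path = os.path.join(tempfile.mkdtemp(), "scoreboard")
    create_scoreboard(path, 4)
    write_slot(path, 2, BUSY, 7)  # child 2 reports that it is busy
    return count_idle(path)
```

Because each child writes only at its own offset, children never overwrite one another’s status, and the parent can combine a count like this with the spare-server limits to decide whether to launch or kill processes.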

At first, this file was located in the /tmp directory. However, because of problems with Linux setups that regularly clear out /tmp directories (causing the server to go haywire), the scoreboard file has since been moved into the /var/log/ directory. You can set the exact location of the scoreboard file with the ScoreBoardFile directive.

