by Paul Doyle
One reason why the Web has grown so quickly is that it is a very open environment. Services are set up for anyone who cares to use them, quite often with the aim of attracting as many users as possible.
Sometimes, though, services need to be restricted so that only designated people can use them. Restrictions can apply to files or directories, with different levels of access for different users. With such restrictions in place, authenticating the identity of all users who attach to the server is a priority.
This chapter explains how user authentication works and tells you how to set up and administer user accounts on an Apache Web server. This chapter is not about Perl per se, but is intended primarily to serve as a foundation for the rest of the chapters in Part III, "Authentication and Site Administration."
A read-only Web service that is open to everyone presents no particular security problems. The files are made available in read-only mode, and the Web server process presents the files to any users who request them. As long as basic precautions are taken with regard to access rights to the files and the parent directory, the service is secure.
A service in which the user has write access to one or more files is a little more complex. The Guestbook example in Chapter 2, "Introduction to CGI," is an example of this type of service. Actually, the user does not really write to any files; the Web server process does that on the user's behalf. Making such a service secure means granting the Web server process the appropriate access rights (read-only to all files except the ones that it needs to write to) and making sure that your CGI program does not offer the end user any security loopholes, such as executing arbitrary strings involving user input through an HTML form.
The type of service with which this chapter is concerned has access restrictions that involve one or more of the following complexities:
Restrictions such as these are more relevant when you are using CGI programs than when plain old HTML files are involved. To understand why, you need to look at the way in which a Web server works.
A Web server consists of a networked computer running special software under a special user ID. The fact that all three elements (hardware, software, and process) are referred to as Web servers in different contexts isn't very helpful. For clarity, I'll use the following definitions:
So the httpd runs as the httpd process under the httpd user ID on the Web server.
The central issue here is that all file accesses on the Web server are performed by an httpd process. For this process to be capable of serving up files from many different directories, it must run under a user ID with relatively liberal access rights. If the process is to run CGI programs, too, it probably will require generous write access throughout the same directories.
This access is a security risk, however, because the process executes CGI programs under the httpd user ID on behalf of users from elsewhere on the Internet. In effect, you are allowing complete strangers to run programs on your system with a privileged user ID. A CGI program potentially can do anything on the server machine that a user who logged in under the httpd user ID could do, including reading, writing, creating, and deleting files. So if you're not careful, you can open your server to attack from anywhere on the Internet.
By default, the httpd user ID is the ID of the process that executes it. This user ID should not be root! Create a special user ID with limited privileges to start the httpd. Then you can start the httpd by using this user ID. Alternatively, if you are using the Apache httpd, use the User and Group configuration directives (described in "Apache Server Configuration Directives" later in this chapter) to get httpd to switch user IDs at startup time.
The server's user ID is, of course, subject to access restrictions in the same way that any other user ID on the system is. But because you need to give the server read and write access to so many places, and because you also allow it to perform tasks on behalf of complete strangers, you need to introduce an extra layer of access control.
It is important to note here that this extra layer involves user IDs and passwords that belong to the server, not to the host system. In other words, the Web server has a password file that is separate from the /etc/passwd file on UNIX systems. Having a logon account on a system does not guarantee that a user can use a restricted Web page on the server, even if the user has read access to that page when logged on interactively. The opposite also is true: a user who doesn't have a logon ID can access restricted Web pages on the server if that user has the appropriate Web server user ID and password.
The extra security layer takes a different form for each HTTP server package, but in essence, this layer has two strands:
The "User Authentication on the Apache Server" section later in this chapter describes how access restrictions work on the Apache server and how to implement restrictions that are appropriate for your site. First, however, you need to consider the issues that are involved in verifying a user's identity.
The effectiveness of your server's access restrictions depends on whether you can authenticate the identity of users who attempt to attach to your server. If you can't confirm that a user is who he says he is, you may as well make everything read-only and remove all sensitive information that you don't want to publish to the world.
Fortunately, you can check the identity of users in many ways. This section describes the most useful methods, starting with a simple unencrypted password check and moving to secure HTTP using public key cryptography.
User ID and Password The most basic kind of authentication is a simple user ID and password check against a list of user IDs and passwords in a file. Users initially connect to a CGI script on the server that challenges them for a user ID and password. If a user enters a valid combination, the script displays another page or sends an HTTP redirect header to the browser to force it to load the other page.
This rather facile approach to authentication is weak for three reasons:
The following sections examine these reasons in detail.
Privacy Given the openness of the Internet, you should assume that all transmissions (messages, Web pages, or, in this case, parts of an HTTP request) can be intercepted. If a transmission is sent in plain text and is intercepted, its contents become known to the person who intercepted it. If the transmission is encrypted in some way and is intercepted, the original content will not be known to the interceptor without substantial extra effort.
Encrypting the content of transmissions on the Internet is, therefore, a means of ensuring privacy. If a user ID and password are sent in plain-text mode, they might be used subsequently in a so-called impostor attack. This type of attack occurs when somebody other than the owner of the user ID-password pair attempts to use the pair to access the server. Encrypting the user ID and password before they are sent to the server protects them from the bad guys and helps ensure user authentication. Encryption by itself, however, does not provide authentication.
Verification Assume that the user ID and password have been discovered by a person who should not have access to a service. This person may have made this discovery in any of several ways: intercepting a user ID-password pair sent as plain text; intercepting and successfully decrypting an encrypted user ID-password pair, although this event is unlikely; or watching the real owner of the pair type them, which is a more likely event. If no other checks are in place, this miscreant can then access the service from any Internet location.
You can encrypt transmissions in such a way that:
These methods are described in detail in "Public Key Cryptography" later in this chapter.
Manageability The simple user ID-password methodology outlined in the preceding section is simple only if it is used to control access to a single location. If this method is extended to several CGI programs that form a single system, coordinating the activities of all the CGI programs can be difficult. Users may have authenticated themselves upon accessing one CGI program, but they will have to authenticate themselves again if they access the same program a second time (or access a different CGI program that forms part of the same system).
The basic idea outlined here, however, can be developed to the stage at which a single CGI script manages access to a set of other files, so that users need to validate themselves only one time. This type of system is examined in detail in Chapter 9, "Understanding CGI Security."
User ID and Password Summary A simple user ID-password mechanism has flaws. The basic principle is sound, though: Users must say who they are (user ID) and then prove it (password). The mechanism is adequate as it stands for services in which security is a low priority, but it needs to be developed a little to allow for really secure transactions. The following section, "Public Key Cryptography," outlines the current best technology to achieve secure user authentication and describes some of the products that use it.
Public Key Cryptography The main problem with the simple user ID-password scheme outlined in the preceding sections is the fact that the origin of the user ID-password pair cannot be verified. The problem of adequate verification extends to many other areas on the Internet, including the contents of messages themselves. But in this case, you're concerned only with ensuring adequate verification of a user's identity before allowing that user to access your CGI scripts or Web pages.
Public key cryptography is a method of transmitting data from a sender to a recipient in such a way that nobody other than the recipient can read the data and the recipient can be certain of the identity of the sender.
The basic idea underlying public key cryptography is the use of a pair of keys:
The public and private keys are derived simultaneously, using a special algorithm in such a way that messages encrypted with a person's public key can be decrypted only with that person's private key. Likewise, messages encrypted with a person's private key can be decrypted only with that person's public key.
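The relationship between the two keys can be seen in a toy version of the RSA scheme. The sketch below is illustrative only: the primes, the exponent, the message value, and the helper name apply_key are assumptions chosen for the example, the numbers are far too small to be secure, and real products do not use raw arithmetic like this.

```python
# Toy RSA key pair. Tiny primes for illustration only; never use in practice.
# Requires Python 3.8+ for the modular inverse form of pow().

p, q = 61, 53                  # two small primes (hypothetical example values)
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)        # used only while deriving the keys
e = 17                         # public exponent, chosen coprime with phi
d = pow(e, -1, phi)            # private exponent: inverse of e modulo phi

public_key = (e, n)            # published freely
private_key = (d, n)           # kept secret by the owner


def apply_key(value, key):
    """Transform an integer smaller than n with one half of the key pair."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65                   # a message encoded as a number below n

# Encrypted with the public key, readable only with the private key.
ciphertext = apply_key(message, public_key)
recovered = apply_key(ciphertext, private_key)

# Transformed with the private key, checkable by anyone with the public key.
signature = apply_key(message, private_key)
verified = apply_key(signature, public_key)
```

Both recovered and verified come back equal to message, which is the two-way property described above: whatever one key does, only the other key undoes.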
Key Pairs Suppose that I want to send a message to you, using public key cryptography. You have a public and a private key. You tell me your public key, perhaps by including it in your e-mail signature, but you keep your private key to yourself. This is what happens:
The details of how messages are encrypted and decrypted with particular key strings are beyond the scope of this book.
This transmission is secure in the sense that anybody else who receives the message while it travels across the network in encrypted form will be unable to determine the original content of the message without knowing the value of your private key. In theory, someone could crack the code and read the message, but the amount of effort required runs into so many thousands of hours on a powerful computer that the possibility is a concern only if you are, say, a major world power. Even then, cracking a single transmission is of no use for cracking other transmissions if you change your key pair on a regular basis.
Certificate Authorities The problem with the scheme described in the preceding section is the fact that your public key is unverified. How do I know that the public key really is your public key and not a public key generated by some impostor who wants to intercept messages to you? When public key cryptography is used to protect mail messages, this problem is not too serious: the impostor would have to establish e-mail communication with me for long enough to convince me that he is you, before he transmits the fake public key. This situation is possible, though.
The problem is much more acute when public key cryptography is used to automatically verify transmissions between two Internet hosts, such as a Web server and a client running a Web browser. The reason is that the public key is transmitted during the same dialogue as the transmission of the secure message that is encrypted with that key value. No opportunity exists to develop trust through a person-to-person dialogue, such as an exchange of e-mail messages.
That's where certificate authorities come into the picture. Certificate authorities are companies that are trusted to issue certificates to Internet users and to verify the contents of those certificates at a later stage.
A certificate contains the following information:
Certificates cost money and are issued at the request of the person or organization to which they refer. The certificates are used as a trusted point of referral by the recipient of an encrypted message, to verify that the sender really is who he or she claims to be.
In terms of the earlier example, here's how I would go about sending you an encrypted message, using certificate verification:
As is true of all Internet communications, the possibility always exists that someone will attempt to impersonate the entity with which you are dealing, so as to eavesdrop on your communications. When you deal with a certificate authority, that company's reputation is your guarantee. Attempting to impersonate a certificate authority brings tremendous wrath down on the head of any malefactor, which is a great deterrent to that form of impersonation.
Now suppose that you receive a message from me, encrypted with my private key, and you want to decrypt it while making sure that it really did come from me. This is what would happen:
Remember, all this works because of the unique properties of the public-private key pair. In this example, information on my public key is freely available and verifiable. Only messages that are encrypted with the corresponding private key can be decrypted with this public key. That fact means that nobody can fake messages from me without knowing my private key.
Public key cryptography is an algorithmic method. The algorithms that do the real work (deriving public and private keys, and encrypting and decrypting data) were developed by RSA Data Security, Inc. RSA does not produce any end-user software for performing authentication; instead, it licenses its algorithms to other companies for incorporation into their products.
To ensure secure communications between a server and a browser, you need both the server and the browser to execute these algorithms automatically, behind the scenes, acting under an agreed protocol. The next two sections, "Secure HTTP" and "Secure Sockets Layer," discuss two products that use RSA's public key cryptography technology to authenticate Web communications.
Secure HTTP Secure HTTP (S-HTTP) is an extension of the HTTP protocol developed by Enterprise Integration Technologies (EIT); the National Center for Supercomputing Applications (NCSA); and RSA Data Security, Inc. S-HTTP uses public key cryptography to guarantee the authenticity of signed transmissions, allowing for comprehensive user verification. Although the S-HTTP protocol specification is public, the toolkit necessary to build applications that use it is a commercial product. S-HTTP has not yet become prominent on the Web.
Secure Sockets Layer Netscape Communications has approached the authentication issue from a different angle. Netscape has licensed RSA's public key cryptography technology and used it to develop a security protocol called Secure Sockets Layer (SSL). This layer resides between TCP/IP (the communications layer) and HTTP (the applications layer). Netscape states that SSL will support other application protocols, such as NNTP, but that support has not materialized yet.
Netscape has developed another proprietary extension of the HTTP protocol to support SSL on Web servers. This extension is called https, and URLs that are to be delivered through SSL need to have the prefix https: instead of http:. A Web server that supports SSL normally watches for http requests on port 80 and https requests on port 443. The use of two separate ports makes it possible for a server to communicate securely with clients that support https while providing normal, unauthenticated communications with other browsers.
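The link between the URL prefix and the port can be sketched in a few lines. This is a minimal illustration using Python's standard urllib.parse module; the function name port_for and the host www.example.com are hypothetical, and the port numbers are the defaults quoted above.

```python
from urllib.parse import urlsplit

# Default ports quoted in the text: plain HTTP on 80, HTTP over SSL on 443.
DEFAULT_PORTS = {"http": 80, "https": 443}


def port_for(url):
    """Return the port a client would connect to for the given URL."""
    parts = urlsplit(url)
    # An explicit port in the URL wins; otherwise the prefix decides.
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


plain_port = port_for("http://www.example.com/index.html")
secure_port = port_for("https://www.example.com/orders/")
```

Because the two prefixes map to different ports, one server can offer both kinds of service side by side.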
User Verification Summary The fields of cryptography and secure communications are much too vast to cover in detail in this book, and doing so wouldn't be appropriate anyway; this is a Perl book, after all. But understanding the different types of user authentication is important, especially if you're going to introduce user authentication on your Web server. The last few pages should be enough to give you a flavor for the various types of user authentication and the current trends in authentication technology.
If your server uses S-HTTP or SSL technology, authentication becomes a matter of server configuration, so your Perl programs don't need to concern themselves with it. If your server doesn't use either technology, you must provide authentication yourself, in your Perl programs. Your Internet Service Provider may not be able or willing to provide support of this kind on its server, for example. Chapter 9, "Understanding CGI Security," describes a method for implementing user authentication on a Web server entirely by means of Perl. This method works with or without an authentication-aware protocol such as https or S-HTTP.
If you need to be absolutely certain of the identity of anyone who is accessing your CGI/Perl programs, you have to use a certificate authority via S-HTTP, SSL, or some other method. If you want to make it difficult for people to fake their identity, a simple user ID-password system may be more appropriate. You may decide to combine the two approaches, requiring a user ID and password for your Perl script even if it runs on a secure server. Ultimately, the level of security that you choose depends on the sensitivity of your data and your estimate of the risk involved.
Assuming that you can satisfactorily verify the identity of all
users who attach to your server, you need to implement a strategy
that gives users of your server enough access to do the things
that you want them to be able to do, but not enough access to
do the things that you don't want them to be able to do.
This section describes the specific details of user authentication
on the Apache server.
NOTE
This section focuses on access restrictions for the Apache httpd server only; the chapter can't cover all features of all Web servers. Apache is fairly representative, being a superset of the NCSA httpd server. Apache also is an excellent piece of work and currently is the most popular Web server software in the world.
You can use two basic parameters to restrict access to a service:
Access can be restricted for one or more HTTP access methods (GET, PUT, POST, and so on) for users, groups, IP addresses, subnets, or a combination.
All aspects of configuration of the Apache server are controlled
by a number of configuration files. Each file contains several
configuration directives, each of which controls a specific aspect
of Apache behavior in a specific directory tree. Table 8.1 lists
the configuration files, in the order in which they are processed
by the server. The default file specs shown in the table are relative
to the server root directory.
File | Default File Spec | Override With | Controls |
Server configuration | conf/httpd.conf | httpd's -f command-line switch | Server daemon |
Resource configuration | conf/srm.conf | ResourceConfig directive | Document provision |
Access configuration | conf/access.conf | AccessConfig directive | Access permissions |
Additional configuration directives can be stored in a special file in each directory to provide a fine level of access control. The per-directory configuration file is called .htaccess by default, but you can override this name with the AccessFileName directive (described in "Apache Server Configuration Directives" later in this chapter).
Filtering of Rights The directives in the .htaccess files control server behavior with regard to files in the directory tree in which the .htaccess file is stored. Notice that the directives in a .htaccess file propagate through subdirectories. An attempt to access a file causes the server to look for a file called .htaccess in the directory in which the file is stored, in the parent directory of that subdirectory, in the parent's parent directory, and so on up to the server's document root directory. The .htaccess files found in this fashion are parsed in sequence, with directives in .htaccess files in lower-level subdirectories overriding directives in higher-level directories.
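The lookup described above can be sketched as follows. This is a simplified model, not Apache's own code: the function names, the example paths, and the dictionary representation of directives are assumptions, and treating a later file as simply replacing an earlier file's value for the same directive is a simplifying assumption about how overriding works.

```python
from pathlib import PurePosixPath


def htaccess_candidates(document_root, requested_file, access_file_name=".htaccess"):
    """List the per-directory files that apply to a request, from the
    document root down to the directory that holds the requested file."""
    root = PurePosixPath(document_root)
    directory = PurePosixPath(requested_file).parent
    # relative_to() raises ValueError if the file lies outside the root.
    relative = directory.relative_to(root)
    candidates = [root / access_file_name]
    current = root
    for part in relative.parts:
        current = current / part
        candidates.append(current / access_file_name)
    return [str(path) for path in candidates]


def merge_directives(per_file_directives):
    """Combine directive settings in order; later (lower-level) files win."""
    merged = {}
    for directives in per_file_directives:
        merged.update(directives)
    return merged


doc_root = "/usr/local/etc/httpd/htdocs"
files = htaccess_candidates(doc_root, doc_root + "/staff/reports/q1.html")
```

The list is ordered from the highest-level directory to the lowest, so processing it front to back gives lower-level files the last word.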
Realms, Users, and Groups The information used to determine whether a user has access to a particular directory on the server is specific to the httpd server. The access-control mechanism used by the system on which httpd executes is not involved. So on UNIX systems, the contents of /etc/passwd are not relevant.
Instead, user information is stored in several user and group
files, which can be either plain text or DBM files. Group definitions
can be omitted if access is to be defined on a user-by-user basis.
CAUTION
User and group files should be stored in a location that is not exported by the Web server. Otherwise, users may be able to download them and thereby breach your server's security.
User and group definitions apply to a particular authorization realm. An authorization realm is a set of directories for which access rights are evaluated as a unit. The concept of authorization realms allows a user to access any directory in a designated set on the basis of a single authentication pass. This means that users are prompted for their user ID and password only one time during a session: the first time that they attempt to access a URL within the realm.
Configuration Delimiters Configuration directives appear, one per line, in any of these configuration files. Directives can be grouped by means of the <Directory>...</Directory> and <Limit>...</Limit> delimiters, as follows:
Directory groups can contain Limit groups, but no other nesting of delimiters is permitted. This means that Limit groups may not contain either Limit or Directory groups, and Directory groups may not contain Directory groups.
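The nesting rule can be expressed as a small checker. The sketch below is an illustration only, not part of Apache: the function name check_nesting, the directory path, and the sample directives inside the strings are hypothetical, and the checker looks only at the delimiter lines.

```python
import re

OPEN_TAG = re.compile(r"^<(Directory|Limit)\b[^>]*>$")
CLOSE_TAG = re.compile(r"^</(Directory|Limit)>$")


def check_nesting(config_text):
    """Return a list of nesting problems; an empty list means the text is valid."""
    problems = []
    stack = []  # names of the groups that are currently open
    for number, raw in enumerate(config_text.splitlines(), start=1):
        line = raw.strip()
        opened = OPEN_TAG.match(line)
        closed = CLOSE_TAG.match(line)
        if opened:
            name = opened.group(1)
            # The only nesting permitted is a Limit group inside a Directory group.
            if stack and not (stack == ["Directory"] and name == "Limit"):
                problems.append(f"line {number}: <{name}> may not appear inside <{stack[-1]}>")
            stack.append(name)
        elif closed:
            name = closed.group(1)
            if not stack or stack[-1] != name:
                problems.append(f"line {number}: unexpected </{name}>")
            else:
                stack.pop()
    for name in stack:
        problems.append(f"<{name}> is never closed")
    return problems


valid = """\
<Directory /usr/local/etc/httpd/htdocs/private>
<Limit GET POST>
order deny,allow
deny from all
</Limit>
</Directory>
"""

invalid = """\
<Limit GET>
<Limit POST>
deny from all
</Limit>
</Limit>
"""

valid_problems = check_nesting(valid)
invalid_problems = check_nesting(invalid)
```

The first sample passes because its Limit group sits directly inside a Directory group; the second is rejected because one Limit group contains another.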
The authentication-related configuration directives for the Apache httpd are listed in tables 8.2 through 8.4. Directives related to server configuration are listed in Table 8.2; directives that can be used in local .htaccess files are listed in Table 8.3; and directory-specific configuration directives are listed in Table 8.4.
Notice that a certain amount of overlap occurs among these tables, because some directives can be used in more than one context. Those directives that are relevant to user authentication and access restriction are described in separate sections after each table. Refer to the Apache server documentation for detailed information on all directives, including the ones described in this chapter.
Apache Server Configuration Directives Table
8.2 lists the directives that can be used in the server configuration
files.
Directive | Argument Type | Default Value | Purpose |
AccessConfig | File name | conf/access.conf | Name of file containing access-control directives |
AccessFileName | File name | .htaccess | Name of per-directory access-control file |
BindAddress | IP address | * (all IP addresses) | IP address of server to listen on |
DefaultType | MIME type | text/html | Default type for documents with no MIME type specifier |
DocumentRoot | Directory name | /usr/local/etc/httpd/htdocs | Name of top-level directory from which files will be served |
ErrorDocument | Error code | - | Specifies which document to return in the event of a given error code |
ErrorLog | File name | logs/error_log | Name of server error log file |
Group | Group ID | #-1 | Name or number of user group under which server will run |
IdentityCheck | on/off | off | Whether to try to log remote user names |
MaxClients | Number | 150 | Maximum number of clients that the server will support |
MaxRequestsPerChild | Number | - | Maximum number of requests that each child process handles before it exits |
MaxSpareServers | Number | 10 | Maximum number of desired idle processes |
MinSpareServers | Number | 5 | Minimum number of desired idle processes |
Options | List of options | - | Defines which server features are allowed |
PidFile | File name | logs/httpd.pid | Name of file where server daemon process ID is stored |
Port | Port number | 80 | Port number where server listens for requests |
ResourceConfig | File name | conf/srm.conf | Name of file to read for server resource configuration details |
ServerAdmin | E-mail address | - | E-mail address quoted by server when reporting errors to client |
ServerName | Host name | - | Server's host name |
ServerRoot | Directory name | /usr/local/etc/httpd | Name of directory where httpd is stored |
ServerType | inetd/standalone | standalone | Whether to run as one process per HTTP connection (inetd) or one process to handle all connections (standalone) |
StartServers | Number | 5 | Number of child processes to create at startup |
TimeOut | Number | 200 | Maximum server wait time |
User | User ID | #-1 | User ID under which server will run |
The server configuration directives that are relevant to user authentication are explained in the following sections.
AccessConfig This directive overrides the default access configuration file specification, conf/access.conf, where access-control directives (such as directory-specific restrictions) are supposed to be stored. In fact, you can store these directives either in the access configuration file or in the resource configuration file.
The following directive in the server configuration file tells the server to read access_test.conf for directives instead of conf/access.conf:
AccessConfig conf/access_test.conf
You can tell the server not to look for an access configuration file by using the file spec /dev/null with the AccessConfig directive.
AccessFileName Before the server sends any file to a client, it looks in the directory in which the file is stored for that directory's optional local configuration file. You can override the default file name, .htaccess, by using the AccessFileName directive.
Group Use the Group directive in conjunction with the User directive to control the access rights of the server process. If you start the httpd server process as root, the Group and User directives cause the server to become the designated user in the designated group before answering any requests. By specifying a user ID and group that has access only to those files that you want to export onto the Web, you can avoid accidental exposure of sensitive information.
Apache recommends that you set up a special user ID and user group to run the server process. This user ID normally should have access only to the documents directory within the httpd directory tree (normally, /usr/local/etc/httpd/htdocs). You may want to grant read access to the users' home directory tree as well if you want to allow your users to maintain Web material in their home areas.
IdentityCheck Some Web clients run a daemon that allows the client to provide the user name of the remote user to the Web server on request. This identification is not secure and should not be taken seriously; it may be useful in some cases for crude access counts, but such counts will be incomplete, because most clients do not provide identification of this sort.
Setting the IdentityCheck directive to on instructs httpd to ask clients to identify the remote user and, if an identity is provided, to log this information in the server log file.
Options The Apache httpd allows a great deal of control of the use of extra server features on a directory-by-directory level. The Options directive allows you to turn extra server features on for all directories (if used outside a Directory group) or for a specific directory (if used within a Directory group).
The Options directive takes any combination of the arguments in the following list and turns on the specific feature described by that argument. The directive has two special arguments: All turns on all extra server features, and None turns them all off.
ResourceConfig This directive is quite similar to AccessConfig and is provided largely for backward compatibility. ResourceConfig overrides the default resource configuration file specification, conf/srm.conf, which is where resource control directives are supposed to be stored. In fact, you can store the directives either in the resource configuration file or in the server configuration file.
The following directive in the server configuration file tells the server to read srm_test.conf for directives instead of conf/srm.conf:
ResourceConfig conf/srm_test.conf
You can tell the server not to look for a resource configuration file by using the file spec /dev/null with the ResourceConfig directive.
User Use the User directive in conjunction with the Group directive to control the access rights of the server process. If you start the httpd server process as root, the Group and User directives cause the server to become the designated user in the designated group before answering any requests. By specifying a user ID and group that has access only to those files that you want to export onto the Web, you can avoid accidental exposure of sensitive information.
The argument to the User directive can be either a user ID or a user number preceded by a pound sign (#).
Apache recommends that you set up a special user ID and user group to run the server process. This user ID normally should have access only to the documents directory within the httpd directory tree (normally, /usr/local/etc/httpd/htdocs). You may want to grant read access to the users' home directory tree as well if you want to allow your users to maintain Web material in their home areas.
Apache Directory Directives Table 8.3 lists
the directives that may be applied to individual directories.
These directives are used in server configuration files to override
default settings for a particular directory.
Directive | Argument Type | Purpose |
allow from | List of hosts | Allows access to this directory from the designated IP hosts |
deny from | List of hosts | Denies access to this directory from the designated IP hosts |
order | Evaluation order | Sets the order in which deny and allow directives are applied |
require user | List of user IDs | List of IDs of users who can access a directory |
require group | List of groups | List of groups that can access a directory |
require valid-user | - | Allows access to all users who provide a valid user ID and password |
AuthName | Domain name | Name of authorization domain for a directory |
AuthType | Basic | Type of user authorization (only Basic available) |
AuthUserFile | File name | Name of text file containing list of users and passwords |
AuthDBMUserFile | File name | Name of DBM file containing list of users and passwords |
AuthGroupFile | File name | Name of text file containing list of user groups |
AuthDBMGroupFile | File name | Name of DBM file containing list of user groups |
Options | List of options | Defines which server features are allowed |
AllowOverride | Override list | Specifies which directives can be overridden by local .htaccess file |
The directory configuration directives that are relevant to user authentication are explained in the following sections.
allow from Use the allow from directive to specify which IP hosts are allowed to access a given directory. This directive takes a series of host names as arguments, and allows access from each of the designated hosts. Host names may be fully qualified (as in bilbo.tolkien.org) or partially qualified (as in .tolkien.org). A partially qualified host name (such as .tolkien.org) allows access from all hosts whose name ends in the string supplied (bilbo.tolkien.org, gandalf.tolkien.org, and so on).
Use the order directive (described later in this chapter) to determine the sequence in which the allow from and deny from directives are evaluated.
deny from
Use the deny from directive to specify which IP hosts are not allowed to access a given directory. This directive takes a series of host names as arguments and denies access to the directory from each of the designated hosts. Host names may be fully qualified (as in bilbo.tolkien.org) or partially qualified (as in .tolkien.org). A partially qualified host name (such as .tolkien.org) denies access from all hosts whose name ends in the string supplied (bilbo.tolkien.org, gandalf.tolkien.org, and so on).
Use the order directive (described in the following section) to determine the sequence in which the allow from and deny from directives are evaluated.
order
The allow from and deny from directives have opposite effects; they can be used in tandem to control exactly which IP hosts can and cannot access a particular directory. The order in which these directives are evaluated for a particular directory is significant, however.
Consider the effect on frodo.tolkien.org of allow from .tolkien.org followed by deny from frodo.tolkien.org. The net result is to allow access from all hosts in .tolkien.org except frodo. Now consider the effect of deny from frodo.tolkien.org followed by allow from .tolkien.org. The net result in this case is to allow access from all hosts in .tolkien.org, including frodo.
The order directive allows you to specify whether the allow from or deny from directives are evaluated first. Its argument is either allow,deny or deny,allow. In the first case, deny from directives can override allow from directives; in the second case, allow from directives can override deny from directives.
The following example allows access from frodo.tolkien.org but from no other hosts within the tolkien.org domain:
order deny,allow
deny from .tolkien.org
allow from frodo.tolkien.org
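To check your reasoning about evaluation order, the rules can be modeled in a few lines of Perl. This is only a simplified sketch of the behavior described in this section, not Apache's own code: the names HostMatches and HostAllowed are invented for the illustration, and host matching is reduced to an exact match or a match on a trailing string that starts with a dot.

```perl
#!/usr/local/bin/perl
# Simplified model of how order, allow from, and deny from combine.
# HostMatches and HostAllowed are hypothetical names, not Apache functions.

# True if $host equals $pattern, or if $pattern starts with "." and
# $host ends with $pattern.
sub HostMatches {
    my ($host, $pattern) = @_;
    return 1 if lc($host) eq lc($pattern);
    return ($pattern =~ /^\./ && $host =~ /\Q$pattern\E\z/i) ? 1 : 0;
}

# $order is "deny,allow" or "allow,deny"; $allow and $deny are
# references to lists of host names.
sub HostAllowed {
    my ($order, $allow, $deny, $host) = @_;
    my $allowed = grep { HostMatches($host, $_) } @$allow;
    my $denied  = grep { HostMatches($host, $_) } @$deny;
    if ($order eq 'deny,allow') {
        # Deny is evaluated first and a matching allow overrides it;
        # hosts that match neither list are allowed.
        return ($denied && !$allowed) ? 0 : 1;
    }
    # allow,deny: allow is evaluated first and a matching deny overrides
    # it; hosts that match neither list are denied.
    return ($allowed && !$denied) ? 1 : 0;
}

my @allow = ('frodo.tolkien.org');
my @deny  = ('.tolkien.org');
print HostAllowed('deny,allow', \@allow, \@deny, 'frodo.tolkien.org'), "\n"; # prints 1
print HostAllowed('deny,allow', \@allow, \@deny, 'bilbo.tolkien.org'), "\n"; # prints 0
```

With order deny,allow, frodo matches both lists and the allow wins, whereas bilbo matches only the deny list and is refused.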
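require
Use the require directive to restrict access to a directory to one or more designated users. Any user who attempts to access a restricted directory is challenged; the user must provide a valid user ID and password before the server returns the requested URL.
The require directive can be used to restrict access in three distinct ways: to a list of named users (require user), to the members of one or more groups (require group), or to any user who supplies a valid user ID and password (require valid-user).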
The following directive restricts access to users JohnB and DaveD only:
require user JohnB DaveD
This set of directives restricts access to the membership domain to members of the leaders group:
AuthType Basic
AuthName membership
AuthUserFile /www/staffmembers
AuthGroupFile /www/staffgroups
require group leaders
AuthName
Used with the AuthType, require, and AuthUserFile directives, the AuthName directive sets the authorization realm of the current directory when used inside a Directory group. For an example of using AuthName, see the section on the require directive earlier in this chapter.
AuthType
Apache currently has only one type of user authentication: Basic. The AuthType directive was introduced to allow for the anticipated introduction of other methods at a later stage.
Use the AuthType directive with the AuthName and require directives. For an example, see the require directive section earlier in this chapter.
AuthUserFile
Use the AuthUserFile directive to specify the name of the text file containing user IDs and passwords that is to be used to verify access to the current directory.
Each line of a user definition file contains a user ID, followed by a colon and a password encrypted with the crypt() function, as in the following example:
jeremiah:sn/A4bkdRjylI
ruth:1H.yzi5xcMPbk
This directive should be used in conjunction with the AuthName, AuthType, AuthGroupFile, and require directives.
AuthDBMUserFile
UNIX DBM files are a more efficient way than plain text files of storing user IDs and passwords. Use DBM files if you are dealing with more than a handful of users. Use the AuthDBMUserFile directive to specify the name of the DBM file containing user IDs and passwords that is to be used to verify access to the current directory. This directive should be used in conjunction with the AuthName, AuthType, AuthDBMGroupFile, and require directives.
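As a rough illustration of what a DBM user file holds, the following sketch stores a user ID as the key and the crypt()-encrypted password as the value, using Perl's built-in dbmopen() function. The subroutine name AddDBMUser and the file name are hypothetical, and the sketch assumes that Perl and your Apache server were built against the same DBM library; if they were not, the server may be unable to read the file.

```perl
#!/usr/local/bin/perl
# Sketch only: add one user to a DBM user file.
# AddDBMUser is a hypothetical helper, not part of UserUtil.pl.
# Arguments:
# - DBM file spec (without any extension that the DBM library adds)
# - user name
# - password, already encrypted with crypt()
sub AddDBMUser {
    my ($dbmfile, $user, $crypted) = @_;
    my %users;
    # Bind the associative array to the DBM file, creating it if needed:
    dbmopen(%users, $dbmfile, 0644) ||
        die "Could not open DBM file \"$dbmfile\": $!\n";
    # The user ID is the key; the encrypted password is the value:
    $users{$user} = $crypted;
    dbmclose(%users);
}

# Example call, with a made-up file name:
AddDBMUser("/tmp/staffmembers_dbm_$$", "ruth", "1H.yzi5xcMPbk");
```

Because a DBM file is keyed by user ID, assigning to the same key a second time replaces the old password instead of adding a duplicate entry.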
AuthGroupFile
Use the AuthGroupFile directive to specify the name of the text file containing group definitions.
Each line of a group definition file contains a group name, followed by a colon and a list of the users in the group, as in the following example:
admin: henry martha dave
AuthDBMGroupFile
The group file for a given directory may be a UNIX DBM file rather than a plain text file. If so, use the AuthDBMGroupFile directive, rather than the AuthGroupFile directive, to specify the group definition file.
Options
The Options directive described earlier in this chapter can also be used within a Directory group to control behavior for that directory. For details, refer to the "Options" section earlier in this chapter.
AllowOverride
Directives in the server configuration files can be overridden by directives in local .htaccess files, as described in the following section. As a server administrator, you may not want users to override all server directives. In such a case, use the AllowOverride directive in a Directory group to specify which directives can be overridden.
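The default behavior is to allow the user to override all directives, which is the equivalent of using AllowOverride with an argument of All. Using an argument of None has the opposite effect, telling the server to ignore the contents of any .htaccess files. You can use the following arguments to fine-tune override behavior related to access control: AuthConfig, which lets .htaccess files use the authorization directives (AuthName, AuthType, AuthUserFile, AuthGroupFile, and require), and Limit, which lets them use the host access directives (allow from, deny from, and order).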
The directive AllowOverride Limit in the server configuration file, for example, allows local .htaccess files to control which hosts can access files.
Apache .htaccess Directives
Table 8.4 lists the directives that can be used in the local .htaccess files.
These files override server configuration directives for the directory
in which they reside.
Directive | Argument Type | Purpose
allow from | List of hosts | Allows access to this directory from the designated IP hosts
deny from | List of hosts | Denies access to this directory from the designated IP hosts
order | Evaluation order | Sets the order in which deny and allow directives are applied
require user | List of user IDs | List of IDs of users who can access a directory
require group | List of groups | List of groups that can access a directory
require valid-user | - | Allows access to all users who provide a valid user ID and password
AuthGroupFile | File name | Name of file containing list of user groups
AuthName | Domain name | Name of authorization domain for a directory
AuthType | Basic | Type of user authorization (only Basic available)
AuthUserFile | File name | Name of file containing list of users and passwords
Options | List of options | Defines which server features are allowed in a given directory
These directives can be used in the .htaccess files as well as
in the server configuration files. For details on each of these
directives, refer to "Apache Directory Directives" earlier
in this chapter.
CAUTION
Be careful not to give users too much leeway with .htaccess files. Local .htaccess directives can be used for purposes such as exporting files that would not otherwise be available. The best way to provide security is to use the AllowOverride directive in the server configuration file. The following example provides reasonable protection against accidental or deliberate security breaches:
<Directory>
This code prevents any overriding of directives by means of .htaccess files; explicitly turns off extra server features by means of the Options directive; and allows accesses from all hosts, but only by means of the GET, PUT, and POST methods.
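A configuration matching that description might look something like the following sketch. It is a reconstruction from the description, not the original listing, and the directory path is a hypothetical placeholder that you would replace with your own document directory.

```apache
# Hypothetical directory path; substitute your own.
<Directory /usr/local/etc/httpd/htdocs>
# Ignore any .htaccess files below this directory:
AllowOverride None
# Turn off all optional server features:
Options None
# Accept requests from all hosts, but only for these methods:
<Limit GET PUT POST>
order allow,deny
allow from all
</Limit>
</Directory>
```

You can then relax individual settings for specific subdirectories as the need arises.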
The extra layer of access control required on a Web server that uses user-related access restrictions has a certain amount of maintenance overhead. Aside from setting up the configuration files that define a realm (and the users and groups that have access to it), you need to be able to add and delete users in the various realms, change passwords for users who lose or forget them, and so on.
Fortunately, that task is just the kind of task for which Perl
was brought into this world. The remainder of this chapter describes
some sample Perl scripts that make it easy to administer the httpd's
user accounts.
NOTE
The code for the samples in this chapter is available on the CD-ROM that comes with this book. Copy these files into a directory in your path, and make sure that UserUtil.pl is in your Perl library directory (/usr/local/perl/lib, for example). This file contains the shared subroutines that do all the work.
The first task is to define a user, which means adding a line to a user file that contains the user name, a colon, and the user's password in encrypted format.
Encrypting Passwords
Getting the password into encrypted form is fairly straightforward when you use Perl. This task is one that you're going to want to perform again (in your script for setting passwords for existing users), so write a subroutine to do the job for you and then store it in UserUtil.pl, where it can be shared by several scripts.
Listing 8.1 shows the source for the GetPword() subroutine. The subroutine takes no arguments, prompts the user for the password (twice, to prevent errors), and returns the encrypted password.
Listing 8.1 The GetPword Subroutine
# Subroutine to prompt for and return (encrypted) password.
sub GetPword {
    my ( $pwd1, $pwd2, $salt, $crypted );
    my @saltchars = ('a' .. 'z', 'A' .. 'Z', '0' .. '9');
    print "Enter password: ";
    $pwd1 = <STDIN>;
    chomp($pwd1);
    length($pwd1) >= 8 ||
        die "Password length must be eight characters or more.\n";
    print "Enter the password again: ";
    $pwd2 = <STDIN>;
    chomp($pwd2);
    # Check that they match:
    ($pwd1 eq $pwd2 ) ||
        die "Sorry, the two passwords you entered do not match.\n";
    # Generate a random salt value for encryption,
    # seeding from both the time and the process ID:
    srand(time ^ $$);
    # rand(@saltchars) can select any element, including the last one:
    $salt = $saltchars[rand(@saltchars)] . $saltchars[rand(@saltchars)];
    return crypt($pwd1, $salt);
}
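The crypt() Function
In the UNIX world, the crypt() function looks after the job of encrypting passwords. The function takes two arguments: the string to be encrypted and a two-character salt that varies the result. Decrypting a password from the encrypted form of the password is almost impossible, but comparing a given string with the password is easy: encrypt the string with the same salt and see whether the two encrypted forms match.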
The following list steps through the code to show you how it works:
Adding a User
Adding a user amounts to no more than adding a line that contains the user name and password to the user definition file. You can perform this task by using the SetPword() function, the code for which appears in Listing 8.2.
Listing 8.2 The SetPword Subroutine
# Store a user's password in a user definition file
# Arguments:
# - user file spec
# - user name
# - password
sub SetPword {
    my( $filespec, $user, $pword ) = @_;
    # Open user file for appending:
    open(USERFILE, ">>$filespec") ||
        die "Could not open user file \"$filespec\" for appending: $!\n";
    # Write to the user file
    # ("or" binds more loosely than "||", so die applies to the print itself):
    print USERFILE "$user:$pword\n"
        or die "Failed to write the user/password to file \"$filespec\".\n";
    # Tidy up:
    close USERFILE;
}
This code opens the named user definition file for appending by including the >> append operator in the file specification argument to the open() function. If the file does not already exist, Perl creates it.
The code then writes the supplied user ID and encrypted password (with a colon between them and a new line at the end), closes the user definition file, and returns.
Putting It All Together: Aaddu
The script that you invoke when you actually want to add a user is relatively simple, because most of the work has been separated out into reusable subroutines. Listing 8.3 shows the code for Aaddu.
Listing 8.3 The Aaddu Script
#!/usr/local/bin/perl -I. -T
# Script to add a user to an Apache user file.
require "UserUtil.pl"; # Need utilities
# Takes two arguments:
# - username to add
# - file to add to
# Get the arguments:
($user, $file) = @ARGV;
# Check that we got two arguments:
$file ||
    die "Aaddu: Add user utility for Apache (text) user files.\n",
        "Usage: Aaddu username filespec\n";
$file =~ /(.+)/;
$safefile = $1;
# Get the encrypted password:
$password = &GetPword;
# Store the new username and password:
&SetPword($safefile, $user, $password);
# End
This script simply takes the user name and user definition file as arguments; gets the password interactively, using the GetPword() subroutine; and then calls SetPword() to add the user.
One more detail here. Examine the following mysterious lines:
$file =~ /(.+)/;
$safefile = $1;
These lines are here because the script turns on Perl's taint checking with the -T switch. In this mode, Perl does not allow you to pass an argument from the command line (namely, $file) to a function such as open(), because doing so might compromise security. Making a copy of $file won't work either, because it will be similarly tainted.
So how do you get the file name from $file in a way that won't upset Perl's taint-checking sensibilities? One way is to perform a regular expression match on the contents of $file and then store what was matched. Perl allows this method because it assumes that if you go to this much trouble in your own code, you know what you're doing.
The statement $file =~ /(.+)/ carries out a regular expression match on $file, using the expression (.+). This expression simply matches the entire contents of $file and returns what it found as $1. The script then stashes this result in the new variable $safefile. If you are writing scripts to be executed by other users, you may want to use a more elaborate regular expression to eliminate any suspicious characters from the variable before passing it to open.
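As one possible form of the more elaborate expression mentioned above, the following sketch accepts only file specifications built from a conservative set of characters. The subroutine name UntaintFile and the choice of permitted characters are assumptions made for this illustration; adjust the character class to suit the file names that you actually use.

```perl
#!/usr/local/bin/perl
# Sketch only: a stricter alternative to matching /(.+)/.
# UntaintFile is a hypothetical helper, not part of UserUtil.pl.
# Returns an untainted copy of the file spec, or undef if the spec
# contains ".." or any character other than letters, digits,
# "_", ".", "-", and "/".
sub UntaintFile {
    my ($path) = @_;
    # Refuse attempts to climb out of a directory:
    return undef if $path =~ /\.\./;
    # Capture the whole string only if every character is permitted:
    return undef unless $path =~ m{^([\w.\-/]+)\z};
    return $1;
}

# Example: a shell metacharacter causes the spec to be rejected.
$safefile = UntaintFile("/www/staffmembers");
defined($safefile) || die "Unsafe characters in file specification.\n";
```

A script could call this subroutine in place of the two-line match and die with a usage message whenever it returns undef.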
Avoiding Duplicate User Names
The major difficulty with simply appending new users to a user definition file is that there is no safeguard against the possibility of adding the same user name more than once. The procedure would be much safer if the Aaddu script determined whether a user already existed in a user definition file before trying to add the user.
The UserDefined() function in UserUtil.pl makes just that determination. Listing 8.4 shows the code.
Listing 8.4 The UserDefined Subroutine
# Return 1 if user defined in named text file
sub UserDefined {
    my ( $username, $filespec ) = @_;
    # No file, no user
    open(USERFILE, $filespec) || return 0;
    # Check each line for username
    # (\Q...\E stops characters such as "." in the name
    # from being treated as regular expression operators):
    while (<USERFILE>) {
        if ( /^\Q$username\E:/ ) {
            close USERFILE;
            return 1;
        }
    }
    close USERFILE;
    return 0;
}
The function takes a user name and a user definition file specification as arguments; it returns 1 if the user exists in that file and 0 if the user doesn't exist. This function will also be useful in the opposite context when you want to change the passwords of existing users.
The operation of this function is quite straightforward: It opens the user definition file for reading, and checks each line in the file. If the line starts with the user name followed immediately by a colon, the function returns 1 to confirm that the user is defined.
Notice the use of the close() function before both return 1 and return 0. Placing a single close() statement at the end of a subroutine is not sufficient, because a return statement earlier in the subroutine may prevent the close() statement from being reached. It is, therefore, important to place close() statements immediately before every return point in the subroutine.
Listing 8.5 shows the new, improved Aaddu script.
Listing 8.5 The Aaddu Script with Duplicate Checking
#!/usr/local/bin/perl -I. -T
# Script to add a user to an Apache user definition file.
# Prevents duplicate entries.
require "UserUtil.pl"; # Need utilities
# Takes two arguments:
# - username to add
# - file to add to
# Get the arguments:
($user, $file) = @ARGV;
# Check that we got two arguments:
$file ||
    die "Aaddu: Add user utility for Apache user definition files.\n",
        "Usage: Aaddu username filespec\n";
$file =~ /(.+)/;
$safefile = $1;
# First check that the user does not already exist:
&UserDefined($user, $safefile) &&
    die "User \"$user\" already exists in file \"$safefile\".\n";
# Get the encrypted password:
$password = &GetPword;
# Store the new username and password:
&SetPword($safefile, $user, $password);
# End
The only change from the earlier version of Aaddu is the addition of a call to UserDefined() to check for the existence of the user.
Deleting users is a little less straightforward than adding them. Adding a user is simply a matter of sticking a new user line at the end of a file. Deleting a user, however, involves finding that user in the file and then rewriting the file without that user line but with all others left intact.
The simplest way to perform this task in Perl is to read the entire contents of the user definition file into an associative array, delete the entry that corresponds to the user that you want to drop, and then write the whole array out to the same file. This approach may not be immediately intuitive if you're not used to working with associative arrays, but it will become familiar to you in a short time, as you learn to leverage the power of associative arrays.
Listing 8.6 shows the code for the DeleteUser() subroutine.
Listing 8.6 The DeleteUser Subroutine
# Subroutine to delete a user from a user file
# Input: Username, filespec
sub DeleteUser {
    my ($user, $filespec) = @_;
    my ($thisusr, $thispw, $elem, %passwords);
    # Open the file for reading:
    open(USERFILE, "$filespec") ||
        die "Could not open user file \"$filespec\" for reading: $!\n";
    # Grab the contents of the user file in an associative array:
    while (<USERFILE>) {
        chomp;
        ($thisusr, $thispw) = split(':', $_, 2);
        $passwords{$thisusr} = $thispw;
    }
    close USERFILE;
    # Check that the named user exists:
    exists($passwords{$user}) ||
        die "User \"$user\" not found in file \"$filespec\".\n";
    # Now delete the user from the array:
    delete $passwords{$user};
    # Now write the whole user/password array to the user file:
    # First re-open the user file for writing:
    open(USERFILE, ">$filespec") ||
        die "Could not open user file \"$filespec\" for writing: $!\n";
    # Now write each element of the array in the correct format
    # ("or" makes die apply to the print itself, not to the "\n" string):
    foreach $elem ( keys %passwords ) {
        print USERFILE $elem, ":", $passwords{$elem}, "\n"
            or die "Failed to write user/password to file \"$filespec\": $!.\n";
    }
    close USERFILE;
}
The following list goes through this script a step at a time:
Changing the password of an existing user is trivial now; you've already written the code that does all the work. All you need is the following simple wrapper, Asetpw, to call the UserDefined() and SetPword() subroutines, as shown in Listing 8.7.
Listing 8.7 The Asetpw Script
#!/usr/local/bin/perl -I. -T
# Script to change a password in an Apache user file.
require "UserUtil.pl"; # Need utilities
# Takes two arguments:
# - username to change
# - file containing userid, password
# Get the arguments:
($user, $file) = @ARGV;
# Check that we got two arguments:
$file ||
    die "Asetpw: Change password utility for Apache user definition files.\n",
        "Usage: Asetpw username filespec\n";
$file =~ /(.+)/;
$safefile = $1;
# First check that the user exists:
&UserDefined($user, $safefile) ||
    die "User \"$user\" does not exist in file \"$safefile\".\n";
# Get the encrypted password:
$password = &GetPword;
# Store the new username and password:
&SetPword($safefile, $user, $password);
# End
This listing illustrates just how useful a modular approach to code design can be.
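One point to watch: SetPword() as listed appends a line to the user definition file, so after running Asetpw the file holds two entries for the same user, and a server that stops at the first match would go on checking the old password. If that matters on your system, a replacement subroutine along the following lines could be called from Asetpw instead. This is a sketch only; the name ReplacePword is hypothetical and is not part of the UserUtil.pl described in this chapter.

```perl
#!/usr/local/bin/perl
# Sketch only: replace the password of an existing user in a text
# user definition file. ReplacePword is a hypothetical helper.
# Arguments:
# - user file spec
# - user name
# - new password, already encrypted
# Returns 1 if the user was found and updated, 0 otherwise.
sub ReplacePword {
    my ($filespec, $user, $pword) = @_;
    my (@lines, $found);
    open(USERFILE, $filespec) ||
        die "Could not open user file \"$filespec\" for reading: $!\n";
    @lines = <USERFILE>;
    close USERFILE;
    # Rewrite every line that belongs to this user:
    $found = 0;
    foreach (@lines) {
        if ( /^\Q$user\E:/ ) {
            $_ = "$user:$pword\n";
            $found = 1;
        }
    }
    # Leave the file untouched if there is nothing to change:
    return 0 unless $found;
    open(USERFILE, ">$filespec") ||
        die "Could not open user file \"$filespec\" for writing: $!\n";
    print USERFILE @lines;
    close USERFILE;
    return 1;
}
```

Unlike DeleteUser(), this approach keeps the remaining lines in their original order, because it works on a plain list of lines and not on an associative array.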
A group definition file consists of a series of lines, one per group, each of which contains the name of the group, followed by a colon and a list of space-separated member names. This format is somewhat similar to the format of a user definition file, but the task of adding or deleting a user is more complex. User definition lines have a single name and a single password, which is ideal material for an associative array. Group definition files, on the other hand, have a single group name and multiple member names.
You will still use associative arrays to deal with groups, but you need to do a little extra work to allow for the storage of a list of members as a single value. The approach that you take in this section is to deal with a group file as a whole, reading its contents to and from an associative array. This approach allows you to modularize your code into neat functional elements. This method lends itself particularly well to working with UNIX DBM files, should you decide to use them at a later stage.
Reading Groups
Listing 8.8 shows the source code for GetGroupMembers(), which is stored in UserUtil.pl. GetGroupMembers() is a subroutine that reads the entire contents of a group file into an associative array.
Listing 8.8 The GetGroupMembers Subroutine
# Subroutine to extract group member list from group file
# Input: file spec of group membership file
# Returns: Associative array of groups, members.
sub GetGroupMembers {
    my( $filespec ) = @_;
    my ($thisgrp, $grpmembers, %groupmembers);
    # Just return now if file does not exist:
    -e $filespec || return;
    # Open the group file:
    open(GFILE, "$filespec") ||
        die "Could not open group file \"$filespec\" for reading: $!\n";
    while (<GFILE>) {
        chomp;
        ($thisgrp, $grpmembers) = split(':' , $_, 2);
        # Drop the space that follows the colon, so that spaces do not
        # accumulate each time the file is read and rewritten:
        $grpmembers =~ s/^\s+//;
        $groupmembers{$thisgrp} = $grpmembers;
    }
    close GFILE;
    return %groupmembers;
}
When the input file has been opened, each line is read and split at the colon into a group name ($thisgrp) and a member list ($grpmembers). The member list is a single string containing the user names of all group members, separated by spaces. The associative array %groupmembers is built by adding $thisgrp as a key and $grpmembers as a corresponding value for each line read from the file. Then the entire associative array is returned to the calling routine.
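Because each member list is held as a single space-separated string, it is often handy to turn it into a Perl list before testing for a particular user. The following sketch shows one way to do so; the subroutine names MemberList and IsGroupMember are hypothetical and are not part of UserUtil.pl. Comparing whole names with eq avoids the partial matches that a pattern match on the string can produce when user names contain characters such as hyphens or dots.

```perl
#!/usr/local/bin/perl
# Sketch only: helpers for a member string such as "henry martha dave".
# MemberList and IsGroupMember are hypothetical names.

# Return the members as a list; splitting on ' ' ignores leading
# spaces and treats any run of white space as one separator.
sub MemberList {
    my ($members) = @_;
    return split(' ', $members);
}

# Return 1 if $user is one of the names in the member string, else 0.
sub IsGroupMember {
    my ($members, $user) = @_;
    return scalar(grep { $_ eq $user } MemberList($members)) ? 1 : 0;
}

print IsGroupMember("henry martha dave", "martha"), "\n"; # prints 1
print IsGroupMember("henry martha dave", "mart"), "\n";   # prints 0
```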
Writing Groups
Listing 8.9 shows the source code for SetGroupMembers(), which is stored in UserUtil.pl and which is similar to SetPword().
Listing 8.9 The SetGroupMembers() Subroutine
# Subroutine to store group member list in group file
# Input: file spec of group membership file,
# associative array of groups/users
sub SetGroupMembers {
    my( $filespec, %groups ) = @_;
    my ($grp);
    # Open the group file:
    open(GFILE, ">$filespec") ||
        die "Could not open group file \"$filespec\" for writing: $!\n";
    foreach $grp ( keys %groups ) {
        print GFILE "$grp: $groups{$grp}\n";
    }
    close GFILE;
}
This function does the opposite of GetGroupMembers(): It opens the group definition file for writing and writes out one line per entry in the %groups associative array. Each line is written as the key, followed by a colon and then the corresponding value.
Putting It All Together: Agrpaddu
The source code for Agrpaddu is relatively short, making use of the functionality in UserUtil.pl, as shown in Listing 8.10.
Listing 8.10 The Agrpaddu Script
#!/usr/local/bin/perl -I. -T
# Script to add a user to a group in an Apache group file.
require "UserUtil.pl"; # Need utilities
# Takes three arguments:
# - group to add to
# - username to add
# - file to add to
# Get the arguments:
($group, $user, $file) = @ARGV;
# Check that we got three arguments:
$file ||
    die "Agrpaddu: Utility for adding users to Apache group files.\n",
        "Usage: Agrpaddu groupname username filespec\n";
# Extract filename:
$file =~ /(.+)/;
$safefile = $1;
# Read the current group membership into an associative array:
%groups = &GetGroupMembers($safefile);
# Check if user already in group
# (\Q...\E makes the user name match literally):
$groups{$group} =~ /\b\Q$user\E\b/ &&
    die "User \"$user\" is already a member of group \"$group\".\n";
# Add the user to the group:
$groups{$group} .= " $user";
# Write the array out to the groups file:
&SetGroupMembers($safefile, %groups);
# End
The following list describes what this script does:
The task of deleting users from groups is a little more involved than adding them. Listing 8.11 shows the source for Agrpdelu.
Listing 8.11 The Agrpdelu Script
#!/usr/local/bin/perl -I. -T
# Script to delete a user from a group in an Apache group file.
require "UserUtil.pl"; # Need utilities
# Takes three arguments:
# - group to delete from
# - username to delete
# - file to delete from
# Get the arguments:
($group, $user, $file) = @ARGV;
# Check that we got three arguments:
$file ||
    die "Agrpdelu: Utility for deleting users from Apache group files.\n",
        "Usage: Agrpdelu groupname username filespec\n";
# Extract filename:
$file =~ /(.+)/;
$safefile = $1;
# Read the current group membership into an associative array:
%groups = &GetGroupMembers($safefile);
# Check if user is in group
# (\Q...\E makes the user name match literally):
$groups{$group} =~ /\b\Q$user\E\b/ ||
    die "User \"$user\" is not a member of group \"$group\".\n";
# First make an array of all members of this list
# (splitting on white space keeps names that contain "-" or "." intact):
@oldmembers = split(' ', $groups{$group});
# Clear down the current member list for this group:
$groups{$group} = "";
# now add all but the member to be deleted to a new string:
foreach $member (@oldmembers) {
    if ( $member ne $user ) {
        $groups{$group} .= " $member";
    }
}
# Write the array out to the groups file:
&SetGroupMembers($safefile, %groups);
# End
The following list shows the essential steps:
Again, you get to reuse the GetGroupMembers() and SetGroupMembers() functions.
This chapter describes how user authentication combines user verification with access restrictions to ensure that your server is as open as it needs to be, and no more. This book has a great deal more to say about server security and Perl: