Chapter 19

Chat Rooms




Web-based chat rooms are one of the biggest success stories on the Internet. Though better Internet "chat technology" exists, the widespread availability of Web browsers and their ease of use has made Web chatting a remarkable phenomenon. Special CGI programming issues exist when creating chat rooms, including user tracking, maintaining state, and multiple access serving.

Chat Rooms: Getting a Life on the Internet

I have been living a life on-line in one form or another since 1990, and chat rooms are the reason why. I'm not the only one who has discovered the incredible appeal and even addictive quality of on-line chat rooms. Sysadmins across the world fret and fume about the amount of server activity chat rooms bring. People's lives crumble around them as they fritter away dozens of hours a week within them. Maladjusted misfits taunt other chat room users and maliciously lie for the cheap thrill of hurting other people. Strangers who "meet" in chat rooms often end up spending obscene amounts of money in long-distance phone calls and even plane fare on each other. On balance, chat rooms are fairly destructive creations. And now, I'm going to teach you how to make one.

Prescription for a Chat Room

A user's flowchart for how to use a chat room is a fairly simple, feedback-oriented process:

The last step is repeated until you're tired of chatting, likely many hours later. Of course, this procedure is made interesting by the fact that others are doing the same. In the time span between typing in new comments and having your browser reload the chat room following submission of those comments, other people have done the same. Once the reload is done, you get to read what other people have newly typed into the room. This loop of activity can simulate near-real-time "chatting." Figure 19.1 is a quick view of London Chat, one of my chat rooms.

Figure 19.1: A view of the London Chat room, written by the author.

The CGI programmer responsible for a chat room must take into account the cycle I stated previously. An appreciation of the way a chat room is used will lead to a better chat room design. In addition to making the chat room "user-friendly," there are several organization issues that have to be addressed:

These points only scratch the surface of how a chat room should be organized. There are other reasons for the chat room to need to "know" who a given user is than just to save the user the trouble of typing their name in repeatedly. A knowledge of the state of the stack can be used in more ways than just limiting its growth. Time stamps on messages can be used in other ways, as well. Also, it may be useful to allow certain users, but not others, to enter HTML into their posts. I'll explore these concepts later in this chapter.

Listing 19.1 is the source code of a chat room that addresses the simplest level of user and programmer issues I mentioned earlier. This program is alive and active at

http://www.anadas.com/cgiunleashed/chatrooms/chat.cgi

Listing 19.1. chat.cgi: A functional, functioning chat room.
#!/usr/bin/perl

#
# This program was written by Richard Dice of Anadas Software Development
# as part of the Sams Net "CGI Programming Unleashed" book.  The author
# intends this code to be used for instructional purposes and not for
# resale or commercial gain.
#
# Any questions or comments regarding this program are welcome.  You
# may contact the author by Internet email: rdice@anadas.com
#

# puts all POST query information into the variable $input_line
read(STDIN, $input_line, $ENV{CONTENT_LENGTH});

# replace all '+' coded spaces with real spaces
$input_line =~ tr/+/ /;

# creates array of all data files in $input_line from & separated info
@fields = split(/\&/,$input_line);

#
# decodes hex info for each name/value pair and places pairs in
# %input associative array
#
foreach $i (0 .. $#fields) {
   ($name,$value) = split(/=/,$fields[$i]);
   $name =~ s/%([0-9A-Fa-f]{2})/pack("C",hex($1))/ge;
   $value =~ s/%([0-9A-Fa-f]{2})/pack("C",hex($1))/ge;
   $input{$name} = $value;
}

#
# I put a few of the CGI environment variables in their own variables
# for ease of understanding later on in the program.  also, I create
# a variable $time which records the current time and date
#
$refer = $ENV{HTTP_REFERER};
$ip = $ENV{REMOTE_ADDR};
$browsername = $ENV{HTTP_USER_AGENT};
chop($time = `date`);

# ====== Configuration Variables ======
$progname = 'chat.cgi#Comments';
$baseurl = 'http://www.anadas.com/cgiunleashed/chatrooms';
$html = 0; # set to 1 if HTML is allowed in postings
$maxlines = 100; # sets the size of the stack
$admin_name = 'Richard Dice';
$admin_email = 'rdice@anadas.com';

print "Content-type: text/html\n\n";

#
# the following line traps forms being submitted from invalid locations
#
&print_error if(  (!($refer =~ /\Q$baseurl\E/)) && %input );

&remove_html if !($html);

&print_header;
&print_form;
&update_stack if $input{'comments'} ne '';
&print_stack;
&print_footer;

exit 0;

sub print_error {
   print <<END;
<HTML><HEAD><TITLE>Invalid Submission</TITLE></HEAD>
<BODY>
<P>
The URL of the page which submitted the form which has this CGI program
as its action was not valid.  Please go to
<A HREF=$baseurl/$progname>$baseurl/$progname</A> to
legally access this CGI program.
</BODY></HTML>
END
   exit 1;
}

sub remove_html {

#
# removes every tag: a <, any run of characters other than >
# (newlines included), and the closing >
#
   foreach ( keys %input ) {
      $input{$_} =~ s/<([^>]|\n)*>//g;
   }
}

sub print_header {

   print <<END;
<HTML>
<HEAD><TITLE>Richard's First Example Chat Room</TITLE></HEAD>
<BODY>
<CENTER>
<P><FONT SIZE=+2><B>
My First Chat Room
</B></FONT>
<BR><BR>
<FONT SIZE=-1><B><I>
This chat room is the beginning example of the CGI Programming Unleashed
Chat Room Example.  While fully functional, I've left some "hooks" for
later improvement.
</I></B></FONT>
</CENTER>
<BR><HR>
END

}

sub print_form {

#
# if this is the first access of a chatting session, create the ID#
# and put a message into the handle string
#
   if ( !(defined($input{'id'})) ) {
      $input{'handle'} = 'Put your handle here!';
      ($input{'id'} = $ip) =~ s/\.//g;
      substr($input{'id'},$[,3) = ''; # removes first 3 digits of IP info
      for $i ( 0 .. (length($browsername)-1) ) {
         $input{'id'} += ord(substr($browsername,$[+$i,1));
      } # adds Browser info to end of the string
      $input{'id'} .= ".$$"; # process ID becomes part of ID#
   }

   print <<END;
<FORM METHOD=POST ACTION=$baseurl/$progname>
<PRE>
<B>Name     :</B> <INPUT TYPE="text" SIZE=40 NAME="handle" MAXLENGTH="40" VALUE="$input{'handle'}">
<INPUT TYPE="hidden" NAME="id" VALUE="$input{'id'}">
<A NAME="Comments"><B>Comments :</B>
<TEXTAREA NAME="comments" ROWS=3 COLS=50></TEXTAREA></A>
</PRE>
<BR>
<INPUT TYPE="submit" VALUE="Submit Comments or Update Room">
<INPUT TYPE="reset" VALUE="Clear Form">
</FORM>
END
   print "Current time: $time\n";
}

sub print_stack {

   $printed_lines = 0;

#
# referring to the posts/ directory, ls -1t posts/*.post will force a 1 column
# output, organized most recent file to least recent, of all .post entries
#
   chop(@posts = `ls -1t posts/*.post`);
   foreach ( @posts ) {
      if ( $printed_lines < $maxlines ) {
         open(POST,$_);
         while ( $this_line = <POST> ) {
            print $this_line;
            $printed_lines++;
         }
         close(POST);
      } else {
         unlink $_; # clear old entries from the stack without invoking a shell
      }
   }
}

sub update_stack {

   $post_name = $input{'id'} . ".$$.post";
   $id_snippet = substr($input{'id'},-2,2);
   open(POST,"> posts/$post_name");

# allows users to control line breaks without access to HTML by simply
# hitting ENTER in the comments textarea
   $input{'comments'} =~ s/\cM\n/<BR>\n/g;
   print POST <<END;
<HR>
<!-- $input{'handle'} -->
<!-- ID#: $input{'id'} -->
<TABLE WIDTH=100%><TR><TD ALIGN=LEFT VALIGN=CENTER><B>$input{'handle'}</B>
<FONT SIZE=-2>($id_snippet)</FONT></TD><TD ALIGN=RIGHT VALIGN=CENTER><I>$time</I>
</TD></TR></TABLE>
$input{'comments'}
END
   close(POST);
}

sub print_footer {

   print <<END;
<HR>
<P><FONT SIZE=-1><B>This chat room is maintained by
<A HREF="mailto:$admin_email">$admin_name</A></B></FONT>
</BODY>
</HTML>
END

}

With this small amount of Perl code, we have a working World Wide Web chat room at our disposal. I'll take some time now to discuss how it works and tell you how to make it work on your system.

All form data is passed to chat.cgi using the POST method. The name-value pairs are decoded and placed in a Perl associative array named %input. If the URL is being accessed directly, then %input will be empty because no POST information will come to chat.cgi. Thus, I have chat.cgi check for the existence of fields within %input to decide whether or not the user is accessing the chat room for the first time.
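The decoding step described above can be sketched as a small standalone routine. This is only an illustration: parse_form is a hypothetical name that does not appear in chat.cgi, and the sample query string is made up.

```perl
use strict;
use warnings;

# parse_form() is a hypothetical helper name, not part of chat.cgi.
# It turns "name=value&name=value" form data into a hash.
sub parse_form {
    my ($raw) = @_;
    my %form;
    foreach my $pair (split /&/, $raw) {
        next unless length $pair;                  # skip empty pairs such as "a&&b"
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        foreach ($name, $value) {
            tr/+/ /;                               # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX encodes one byte
        }
        $form{$name} = $value;
    }
    return %form;
}

# Sample data, invented for this example
my %input = parse_form('handle=Molly+Millions&comments=Hi%21');
print "$input{'handle'} says: $input{'comments'}\n";
```

An empty string yields an empty hash, which is the situation chat.cgi sees when the URL is accessed directly.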

If the access is direct, chat.cgi creates an ID# rather than simply perpetuating the existing one. Also, a message is inserted into the form handle field. After the form is submitted for the first time, whatever is in that field upon submission becomes the value of that field upon reload. This doesn't prevent a user from modifying his name after he's started chatting, though. An ID# cannot be changed. ID# will be used for more than just decoration in amendments to the code later in this chapter.

Previous postings are recorded as files in the "posts" subdirectory beneath the directory that holds chat.cgi. Each file is "free-floating HTML" in the sense that they contain HTML markup but aren't by themselves full HTML files. This is permissible because they are meant to be output as part of a greater stream of data that all together is a full and valid HTML file.

The stack of previous posts is limited by comparing the number of lines output against the variable $maxlines. Each time a line within a .post file is output to the Web, a counter is incremented. Once the counter reaches $maxlines, all subsequent .post files are deleted rather than output. The .post files are output in a last-in, first-out order, so the newest post appears at the top of the page. This makes sense in the context of how an HTML page is displayed. I can accomplish this quite simply in my code by using the Perl command

chop(@posts = `ls -1t posts/*.post`);

The logic in this line is nested. I'll start on the inside and work out.

ls is the UNIX command similar to dir in DOS. It provides a listing of the files within a directory. ls -1t posts/*.post provides a one-column listing of all files in the posts/ directory ending with .post, arranged in order of newest first to oldest last.

By placing this command in backticks (``), Perl is instructed to shell to UNIX, perform the command within the backticks, and use the standard output of that command as a return value. I then place the standard output into the array @posts. Each line of standard output will occupy one element in the @posts array.

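Finally, chop() removes the last character from each element in the @posts array, which here is the newline that ends each line of ls output. Now, I have an array with the filenames I need in the order I need. Note that, by default, Perl keeps newline characters attached to the ends of the lines it reads (from both files and stdin), so you have to remove them yourself. Perl 5's chomp() does the same job as chop() but removes only a trailing newline, which makes it the safer choice.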
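If ls is unavailable or lives somewhere unexpected on your system, the same newest-first listing can be built in Perl alone. The following is a sketch, not the method chat.cgi uses; newest_first is a hypothetical helper name, and it assumes the directory name contains no spaces or wildcard characters.

```perl
use strict;
use warnings;

# newest_first() is a hypothetical helper; it mirrors `ls -1t DIR/*.post`.
sub newest_first {
    my ($dir) = @_;
    my @files = glob("$dir/*.post");
    # (stat $file)[9] is the modification time in seconds since the epoch
    my %mtime = map { $_ => (stat $_)[9] } @files;
    return sort { $mtime{$b} <=> $mtime{$a} } @files;
}

# Prints nothing if the posts/ directory is missing or empty
my @posts = newest_first('posts');
print "$_\n" foreach @posts;
```

Because no shell is involved, there are no trailing newlines to remove from the results.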

Tip
Perl is the UNIX uber-toolbox. Not only is it a powerful and highly usable tool in its own right, but it has facilities that allow it easy access to all of the rest of UNIX; backticks (``) are a perfect example of this. I almost go out of my way to find situations where I can put the combination of UNIX and Perl to use.

I have made the "executive decision" to disallow HTML in chat.cgi. I accomplish this in a combination of ways. A flag, called $html, is provided. If it equals zero, then HTML is "turned off." This allows an if statement to apply the following code:

$input{$_} =~ s/<([^>]|\n)*>//g;

This line demonstrates the great power and utter confusion of the Perl s/// command. "s" stands for "substitute." Whatever is between the first and second slashes is the pattern to look for. Whatever is between the second and third slashes is what each match is replaced with. The pattern is expressed in Perl's regular expression syntax. Conceptually, this isn't a difficult topic. However, regular expressions often "bloat" so much that they become scary, so scary that a voluntary PG-13 rating would be appreciated. The string of characters between the first two slashes in the preceding line of code means: "match a <, then any run of characters other than >, newlines included, then the closing >." A lone < or > with no partner is left untouched.

We all just have to get used to it, I guess. Regardless, the net effect (no pun intended) of this incantation is to remove all HTML tags.
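To see what the pattern does and does not match, it helps to run it on a few sample strings. The wrapper name strip_tags and the sample strings below are invented for this illustration; the pattern itself is the one from chat.cgi.

```perl
use strict;
use warnings;

# strip_tags() is a hypothetical wrapper around the pattern from chat.cgi.
sub strip_tags {
    my ($text) = @_;
    $text =~ s/<([^>]|\n)*>//g;   # the same pattern chat.cgi uses
    return $text;
}

print strip_tags("<B>Hello</B> there"), "\n";     # prints "Hello there"
print strip_tags("1 < 2 and that's all"), "\n";   # no closing >, so nothing is removed
```

A tag that spans a line break is removed as well, because the pattern lets newlines appear between the < and the >.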

Another option for disallowing HTML is somewhat simpler:

$input{$_} =~ s/\</\&lt\;/g;
$input{$_} =~ s/\>/\&gt\;/g;

This simply replaces > with &gt; and < with &lt;, the harmless HTML codes for these characters. People can still try to use HTML with these lines in place, but their unsuccessful attempts will be visible for all to see.
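The two substitutions can be wrapped into one routine. The sketch below also escapes the & character, which is my own addition and goes beyond what the two lines above do; without it, a user could still type an entity such as &lt; and have the browser interpret it. The name escape_html is hypothetical.

```perl
use strict;
use warnings;

# escape_html() is a hypothetical helper name.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # must run first, or it would re-escape the entities added below
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

print escape_html('<B>AT&T</B>'), "\n";   # prints &lt;B&gt;AT&amp;T&lt;/B&gt;
```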

Getting chat.cgi to Work for You

It's nice to be able to use someone else's CGI code where appropriate, but sometimes setting up that code to work on your system can be as hard as writing your own CGI program. To help you out, I'll go through and talk about what's needed to make the chat.cgi code work for you.

There are three things to take into account when installing this code on your system:

Near the top of the Perl code, there is a section discussing "Configuration Variables." This section is mostly straightforward, and parts of it are even documented within the code. You should change the variables starting with $admin to your name and e-mail address. $baseurl is the portion of the URL that references your chat.cgi installation up to but not including the final / character. $progname is whatever you call your installation of chat.cgi. As a bit of a trick, a #name can be added after the name of the program in $progname to reference a <A NAME=name> tag that is added elsewhere in the program.

Perl programs can often reference UNIX system calls and programs. Different UNIX systems provide different command paths and directory structures, so any Perl CGI program you acquire might have to be modified to reference these UNIX commands correctly on your local system. Within the Perl CGI program, be on the lookout for system ("...") statements and backticks `...`.

If you do end up having to change the Perl code to correctly reference UNIX commands on your local system, you might have a difficult time finding exactly where these commands are kept. Here is a list of the four things I try before going to the sysadmin to find a command I need on a UNIX system:

With regard to UNIX itself and not the commands available through it, chat.cgi requires that a "posts" subdirectory be made within the directory that contains chat.cgi. Both the directory containing chat.cgi and the "posts" subdirectory must be writable by the Web server. The Web server must also be able to run chat.cgi; both chat.cgi and the directory that contains it must have adequate execute permission to do this. For a discussion on file and directory permissions, type man chmod at your UNIX shell.

Even after you have done an appropriate chmod of the files and directories involved, your CGI programs won't execute unless the Web server is prepared to accept them. To ensure this, chat.cgi must either be located in the /cgi-bin/ directory of your system, or the .cgi Magic MIME type must be enabled in your server's srm.conf file. If you are a sysadmin, these are both easy tasks to accomplish. If you require assistance from your sysadmin to do either of these things, tell them that I said it was all right. Trust me, this will work.

Chat Room Systems and Entry Pages

Listing 19.1, chat.cgi, is a multistate CGI: no .html file is needed as a "springboard" to activate it. In general, I don't feel it necessary to provide an .html bridge for CGI programs. Instead, I make my CGIs "aware" of the context in which they are being invoked. However, in the realm of chat room programming, there are two particular schools of thought on the matter. Neither is right or wrong; they simply reflect different philosophies and methods of organization.

The method I have modeled chat.cgi after is the multistate system. My reasons for this are that it allows users to see what has happened in the discussion stack before actually becoming part of the conversation, and that any chat room CGI will ultimately need the capability to use itself as both source and destination of form information, that is, without going through any gross and unnecessary contortions just to "prove a point" or something.

The other school entails the creation of a static .html page that is a "gateway" into the CGI. There are a couple of reasons that would motivate this sort of setup:

If the programmer wants to have several different rooms in a chat room system, they might still rely on the same CGI program. On the entry page, the form would have a selection of rooms available to the user. The user would select one, and then that selection would become part of that user's set of hidden input tags. The room hidden tag would then be recognized by the chat room CGI program, and if statements could be used to set the variable that decided the name of the directory that held the .post files, for instance.
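As a rough sketch of that idea, the room name from the hidden tag can be mapped to a directory. I use a hash lookup here rather than a chain of if statements; the effect is the same. The room names, directory names, and the helper name posts_dir_for are all invented for this example. Looking the name up in a fixed table also means that an unexpected value from the form can never select an arbitrary directory.

```perl
use strict;
use warnings;

# Hypothetical room names and directory names, for illustration only.
my %room_dir = (
    lobby  => 'posts_lobby',
    london => 'posts_london',
);

# Returns the directory holding the .post files for a room,
# falling back to the lobby for a missing or unknown room name.
sub posts_dir_for {
    my ($room) = @_;
    return $room_dir{'lobby'} unless defined $room && exists $room_dir{$room};
    return $room_dir{$room};
}

print posts_dir_for('london'), "\n";   # prints posts_london
print posts_dir_for('../etc'), "\n";   # unknown room, prints posts_lobby
```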

Extension to the Basic Chat Room

As with any program, there is good, and then there is better. A chat room is better than no chat room at all. A chat room that allows you to carry on private conversations with your best friend or block out any posts from that annoying 13-year-old weenie who wants to call himself !!!!!!THE MAGNIFICENT DEATH GLADIATOR!!!!!! would be better yet. I'll devote most of the rest of this chapter to describing situations that an advanced chat room could address.

Intelligent User Identification

So far, I've introduced a cryptic datum called the ID#. What good is it? Well, so far it's not being used for much. The next few pages will change that.

Why is it a good idea to keep track of users within a chat room? There are two broad categories of reasons: to allow beneficial things for the good users of the chat room and to show the door to troublemakers. Unfortunately, they exist and are legion. I don't think the problem is nearly so great in a members-only chat site; people only feel safe being obnoxious in the extreme when they're anonymous. But if you're planning on building a publicly available, anonymous chat site, you'd better plan some defenses.

The ID# is a hash based on a few individual items:

The most important factors are the first two because these are dependent on information about the person using the chat room. The PID is thrown in just as a way to build up the number a bit more. I take the last two digits of PID into $id_snippet, which is displayed to the chat room. The idea behind this is to give two people the opportunity to use the same handle ("Molly Millions," for instance), but they won't look absolutely identical to other users.

As defined in chat.cgi, the ID# can't be "reverse-engineered" to provide the numeric IP or browser names, even given the algorithm that created it. I have intentionally done this to preserve privacy on-line. It's good enough for the job I'll be using it for; it doesn't have to go any further.
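The ID# calculation from chat.cgi can be restated as a small function to make the arithmetic easier to follow. The function name make_id and the sample IP address, browser string, and process ID are made up for this sketch; the steps follow Listing 19.1.

```perl
use strict;
use warnings;

# make_id() is a hypothetical name; the steps mirror print_form in chat.cgi.
sub make_id {
    my ($ip, $browser, $pid) = @_;
    (my $id = $ip) =~ s/\.//g;      # "192.168.1.20" becomes "192168120"
    substr($id, 0, 3) = '';         # drop the first three digits
    $id += ord($_) foreach split //, $browser;   # add each character code
    return "$id.$pid";              # the process ID becomes part of the ID#
}

# Sample values, invented for this example
my $id      = make_id('192.168.1.20', 'Moz', 1234);
my $snippet = substr($id, -2, 2);   # the part shown next to the handle
print "$id ($snippet)\n";           # prints 168430.1234 (34)
```

Note that the browser information is added numerically rather than appended, which is why the original values cannot be recovered from the result.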

In addition to the preceding, the section of the ID# that relates to numeric IP address and browser name isn't even unique. If someone else enters the chat room with the same browser and from the same IP, chat.cgi will determine that part of the ID# for someone else matches your information. My intention is to then forbid that second user to enter the chat room. This seems reasonable to me as a way of preventing one user from having multiple chat sessions going at the same time. The drawback of potentially having other legitimate users denied access exists, but I don't think it's worth dwelling on. With the predominance of Internet dial-up giving people their own IP addresses and the slim odds that two people from the same lab will want to access one specific chat site simultaneously, this is a restriction I'm willing to allow.

So now that we have an individual ID# for each user, what do we do with this data? I'm going to put them all in a file. This file will have four data fields: ID#, Handle, expiry time, and "blocked status." Here's what I'm going to do:

All this is accomplished with use of an ID#, an auxiliary data file, and a lot of brain-racking code. Well, sort of.
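To make the four-field layout concrete, here is a sketch of how expired records could be filtered out. It assumes one record per line with tab-separated fields, as newchat.cgi does when it splits each line on tabs. The helper name drop_expired and the sample records are invented.

```perl
use strict;
use warnings;

# drop_expired() is a hypothetical helper name. Each record is one line of
# tab-separated fields: ID#, handle, expiry time (epoch seconds), blocked status.
sub drop_expired {
    my ($now, @records) = @_;
    return grep { (split /\t/, $_)[2] > $now } @records;
}

# Sample records, invented for this example
my @userdata = (
    "168430.1234\tMolly Millions\t2000\t",
    "557120.4321\tCase\t1000\tBLOCKED",
);
my @live = drop_expired(1500, @userdata);
print scalar(@live), " live session(s)\n";   # prints 1 live session(s)
```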

Caution
Since two (or more) people could make their form submission very near in time to each other, this auxiliary data file stands the chance of being overwritten. To avoid this, a file-locking mechanism must be devised. You'll see how I do this in my code example.
File locking is one of the fundamental concepts in multi-user systems programming. Be on the lookout for circumstances in your code, chat room or otherwise, where it should be used.
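One widely available way to lock a file in Perl is the built-in flock() function. The sketch below is an illustration under that assumption and is not the mechanism newchat.cgi uses; append_record is a hypothetical name, and the temporary file stands in for the real data file. Keep in mind that flock() locks are advisory, so they only protect against other programs that also call flock(), and they may not behave reliably on some network file systems.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# append_record() is a hypothetical helper, not the routine used in newchat.cgi.
sub append_record {
    my ($file, $record) = @_;
    open(my $fh, '>>', $file) or die "Cannot open $file: $!";
    flock($fh, LOCK_EX)       or die "Cannot lock $file: $!";   # waits for other holders
    print $fh "$record\n";
    close($fh)                or die "Cannot close $file: $!";  # closing releases the lock
    return 1;
}

# A temporary file stands in for the auxiliary data file in this example
my (undef, $datafile) = tempfile();
append_record($datafile, "168430.1234\tMolly Millions\t2000\t");
```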

ChatMaster-The Chat Room Administrator

In the chat rooms I have administered, I've found that users get a sense of security knowing that they're talking to the person "in charge of it all." To that end, I've created a feature in my chat rooms that I call "ChatMaster Mode."

There are three special privileges possessed by the ChatMaster:

ChatMaster mode is activated when a handle is entered into the form that has the value of a special ChatMaster password. There is also error trapping for direct entry of the handle "ChatMaster."

The first two aspects of ChatMaster mode are easily implemented by if statements within the code. Universal blocking is accomplished by storing the term "BLOCKED" as the fourth data field within the auxiliary data file.

Private Messaging

It would be good to provide the users of this chat room with the capability to direct messages to specific people within the room. The form interface will now include a drop-down selection menu that will allow the user to choose a recipient for his or her current message. The default value is everyone, a recognized special term indicating that all members of the chat room should receive the message. The selection menu is created by parsing the @userdata array for a list of current users of the chat room.

From the point of view of the program, the $input{'post_to'} entry tells the &update_stack subroutine whom the post is addressed to, and the recipient becomes part of the post's filename. Also, a special header will be given to private messages. The &print_stack subroutine has been rewritten to be aware of private messages and to ignore all private messages that aren't being sent to the invoking user.
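The filtering decision can be sketched on its own. This assumes the filename convention used in newchat.cgi, where a post is stored as the sender's ID#, then _to_, then the recipient, then _post. The helper name visible_to and the sample ID#s are invented for the illustration.

```perl
use strict;
use warnings;

# visible_to() is a hypothetical helper name.
# Returns 1 if the post stored at $path should be shown to user $my_id.
sub visible_to {
    my ($path, $my_id) = @_;
    (my $name = $path) =~ s{^.*/}{};                    # strip any directory prefix
    my (undef, undef, $recipient) = split /_/, $name;   # sender, "to", recipient
    return 1 if $recipient eq 'everyone';
    return $recipient eq $my_id ? 1 : 0;
}

# Sample filenames and ID#s, invented for this example
print visible_to('newposts/168430.1234_to_everyone_post', '557120.4321'), "\n";      # prints 1
print visible_to('newposts/168430.1234_to_557120.4321_post', '999999.1111'), "\n";   # prints 0
```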

Intelligent user identification, blocking, ChatMaster mode, and private message features are included in newchat.cgi. The source code for this is presented in Listing 19.2, and it resides on the Web at

http://www.anadas.com/cgiunleashed/chatrooms/newchat.cgi

Listing 19.2. newchat.cgi: A more sophisticated chat room that incorporates some of the features discussed in this chapter.
#!/usr/bin/perl
#
# This program was written by Richard Dice of Anadas Software Development
# as part of the Sams Net "CGI Programming Unleashed" book.  The author
# intends this code to be used for instructional purposes and not for
# resale or commercial gain.
#
# Any questions or comments regarding this program are welcome.  You
# may contact the author by Internet email: rdice@anadas.com
#

# ====== Configuration Variables ======
$progname = 'newchat.cgi#Comments';
$baseurl = 'http://www.anadas.com/cgiunleashed/chatrooms';
$html = 0; # set to 1 if HTML is allowed in postings
$maxlines = 100; # sets the size of the stack
$admin_name = 'Richard Dice';
$admin_email = 'rdice@anadas.com';
$cmpswd = 'Aaron Thunderfist'; # Handle entry which gives ChatMaster access
$datafile = 'userdata.dat';
$reset_time = 600; # 600 seconds without update before session is terminated

#
# I put a few of the CGI environment variables in their own variables
# for ease of understanding later on in the program.  also, I create
# a variable $time which records the current time and date
#
$refer = $ENV{HTTP_REFERER};
$ip = $ENV{REMOTE_ADDR};
$browsername = $ENV{HTTP_USER_AGENT};
chop($time = `date`);

#
# read datafile into the @userdata array, remove terminating newlines
# from all array elements
#

#--
while ( -e "lock_file" ) {
   sleep(1);
}
#---
open(LF,"> lock_file");
print LF "Locked!\n";
close(LF);
chmod 0660,'lock_file';
#---
open(DF,$datafile);
chop(@userdata = <DF>);
close(DF);
#---
unlink 'lock_file';
#---

#
# remove all elements of @userdata which are past their expiry time
#
foreach $ud ( @userdata ) {
   @field = split(/\t/,$ud);
   push(@udtemp,$ud) if $field[2] > time;
}
@userdata = @udtemp;
undef(@udtemp);

# puts all POST query information into the variable $input_line
read(STDIN, $input_line, $ENV{CONTENT_LENGTH});

# replace all '+' coded spaces with real spaces
$input_line =~ tr/+/ /;

# creates array of all data files in $input_line from & separated info
@fields = split(/\&/,$input_line);
$input_line = (); # free up memory

#
# decodes hex info for each name/value pair and places pairs in
# %input associative array
#
foreach $i (0 .. $#fields) {
   ($name,$value) = split(/=/,$fields[$i]);
   $name =~ s/%([0-9A-Fa-f]{2})/pack("C",hex($1))/ge;
   $value =~ s/%([0-9A-Fa-f]{2})/pack("C",hex($1))/ge;
   if ($name ne 'block') {
      $input{$name} = $value;
   } else {
      $input{$name} .= $value . "$;";
   }
}
chop($input{'block'}) if defined($input{'block'}); # remove trailing $;

print "Content-type: text/html\n\n";

#
# if this is a first-time access and another user is from the same IP and
# using the same browser as this user, prevent this user from using the chat room
#
if (!defined($input{'id'})) {

#
# "postulate" an ID# for this new user
#
   ($temp_id = $ip) =~ s/\.//g;
   substr($temp_id,$[,3) = ''; # removes first 3 digits of IP info
   for $i ( 0 .. (length($browsername)-1) ) {
      $temp_id += ord(substr($browsername,$[+$i,1));
   } # adds Browser info to end of the string

#
# compares the "postulated id#" against those id#s found in the @userdata
# array... if it finds a match, terminate this login
#
   foreach (@userdata) {
      @field = split(/\t/);
      substr($field[0],index($field[0],'.')) = '';
      if ( $field[0] eq $temp_id ) {
         &id_conflict_error;
      }
   }

#
# ID# checks out okay, so add on PID info and store in %input
#
   $input{'id'} = $temp_id . ".$$";

}

#
# determine whether or not the current user is ChatMaster
#
if ( ($input{'handle'} eq $cmpswd ) && (!($input{'id'} =~ /ChatMaster/)) ) {
   $cmmode = 1;
} elsif ( ($input{'handle'} eq 'ChatMaster' ) &&
($input{'id'} =~ /ChatMaster/) ) {
   $cmmode = 1;
}

#
# if the current user is the chatmaster, adjust user's ID# in both
# $input{'id'} variable and in the @userdata array
#
if ($cmmode) {
   $input{'handle'} = 'ChatMaster';
   $id = $input{'id'};
   if (!($id =~ /ChatMaster/)) {
      substr($input{'id'},$[,0) = 'ChatMaster.';
   }
   foreach ( @userdata ) {
      @field = split(/\t/);
      if (!($field[0] =~ /ChatMaster/)) {
         if ( $field[0] eq $id ) {
            substr($_,$[,0) = 'ChatMaster.';
         }
      }
   }
}

#
# Give an error message if someone tries to be ChatMaster without the
# correct password
#
if ( ($input{'handle'} eq 'ChatMaster' ) &&
(!($input{'id'} =~ /ChatMaster/)) ) {
   &chatmaster_error;
}

#
# the following line traps forms being submitted from invalid locations
#
&print_error if ( (!($refer =~ /\Q$baseurl\E/)) && defined($input{'handle'}) );

#
# If HTML is disallowed, go to the remove_html section... except for
# the ChatMaster
#
&remove_html if ( (!($html)) && (!($cmmode)) );

&print_header;
&print_form;
&update_stack if $input{'comments'} ne '';
&print_stack;
&print_footer;

&update_datafile;

exit 0;

sub print_error {
   print <<END;
<HTML><HEAD><TITLE>Invalid Submission</TITLE></HEAD>
<BODY>
<P>
The URL of the page which submitted the form which has this CGI program
as its action was not valid.  Please go to
<A HREF=$baseurl/$progname>$baseurl/$progname</A> to
legally access this CGI program.
</BODY></HTML>
END
   exit 1;
}

sub chatmaster_error {
   print <<END;
<HTML><HEAD><TITLE>Invalid use of ChatMaster</TITLE></HEAD>
<BODY>
<P>
An attempt was made to access ChatMaster mode illegally.
</BODY></HTML>
END
   exit 1;
}

sub id_conflict_error {
   print <<END;
<HTML><HEAD><TITLE>Multiple Login Sessions not allowed</TITLE></HEAD>
<BODY>
<P>
You may only possess one login session at a time.
<P>
It is (just a little) possible that someone else is using your exact data
port information right now.  If this is the case, you'll have to wait for
them to finish their session in this chat room.
</BODY></HTML>
END
   exit 1;
}

sub remove_html {

#
# removes every tag: a <, any run of characters other than >
# (newlines included), and the closing >
#
   foreach ( keys %input ) {
      $input{$_} =~ s/<([^>]|\n)*>//g;
   }
}

sub print_header {

   print <<END;
<HTML>
<HEAD><TITLE>Richard's More Impressive Chat Room</TITLE></HEAD>
<BODY>
<CENTER>
<P><FONT SIZE=+2><B>
My Second Chat Room
</B></FONT>
<BR><BR>
<FONT SIZE=-1><B><I>
This chat room has more features than its predecessor.  Enjoy!
</I></B></FONT>
</CENTER>
<BR><HR>
END

}

sub print_form {

#
# if this is the first access of a chatting session, put a message into
# the handle string (the ID# has already been created by this point, so
# the test is made on the handle instead)
#
   if (!(defined($input{'handle'}))) {
      $input{'handle'} = 'Put your handle here!';
   }

#
# if the ChatMaster has left ChatMaster mode, strip the special string
# from the ID# in both $input{'id'} and in the @userdata array
#
   if ( (!($cmmode)) && ($input{'id'} =~ /ChatMaster/) ) {
      substr($input{'id'},$[,11) = '';
   }
   foreach ( @userdata ) {
      substr($_,$[,11) = '' if $_ =~ /ChatMaster/;
   }

   print "<FORM METHOD=POST ACTION=$baseurl/$progname>\n";

   foreach ( @userdata ) {
      @field = split(/\t/);
      undef($check);
      if ($field[0] ne $input{'id'} ) {
         $check = ' CHECKED' if &blocked($field[0]);
         print "<INPUT TYPE=checkbox NAME=block VALUE=$field[0]$check>";
         print " <FONT SIZE=-1><I>block $field[1]</I></FONT><BR>\n";
      }
   }

   print <<END;
<PRE>
<B>Name     :</B> <INPUT TYPE="text" SIZE=40 NAME="handle" MAXLENGTH="40"
VALUE="$input{'handle'}">
<INPUT TYPE="hidden" NAME="id" VALUE="$input{'id'}">
<A NAME="Comments"><B>Comments :</B>
<TEXTAREA NAME="comments" ROWS=3 COLS=50></TEXTAREA></A>
To: <SELECT NAME="post_to">
<OPTION VALUE="everyone">Everyone</OPTION>
END
   foreach ( @userdata ) {
      @field = split(/\t/);
      if ( $field[0] ne $input{'id'} ) {
         print "<OPTION VALUE=\"$field[0]\">$field[1]</OPTION>\n";
      }
   }
   print "</SELECT>\n";
   print <<END;
</PRE>
<BR>
<INPUT TYPE="submit" VALUE="Submit Comments or Update Room">
<INPUT TYPE="reset" VALUE="Clear Form">
</FORM>
END
   print "Current time: $time\n";
}

sub print_stack {

    $printed_lines = 0;

#
# referring to the newposts/ directory, ls -1t newposts/*_post will force a
# 1 column output, organized most recent file to least recent, of all _post entries
#
   chop(@posts = `ls -1t newposts/*_post`);
   foreach ( @posts ) {

#
# Reclaim ID# from filename and test to see if that ID# is blocked by either
# the current user or the chatmaster.  Also, skip posting if it's not
# "addressed" to this user.
#
      ($fname = $_) =~ s#^.*/##; # strip "newposts/" so $field[0] is the bare ID#
      @field = split(/_/,$fname);
      if ( $field[2] ne 'everyone' ) {
         next if $field[2] ne $input{'id'}; # skip if not for my ID#
      }
      next if &blocked($field[0]); # if blocked, skip this posting
      next if &cm_blocked($field[0]); # if blocked, skip this posting

#
# prints valid entries, clears old ones from the stack
#
      if ( $printed_lines < $maxlines ) {
         open(POST,$_);
         while ( $this_line = <POST> ) {
            print $this_line;
            $printed_lines++;
         }
         close(POST);
      } else {
         unlink($_); # clear old entries from the stack without a shell call
      }
   }
}

sub update_stack {

   $post_name = $input{'id'} . '_to_' . $input{'post_to'} . '_post';
   $id_snippet = substr($input{'id'},-2,2);
   open(POST,"> newposts/$post_name");

# allows users to control line breaks without access to HTML by simply
# hitting ENTER in the comments textarea
   $input{'comments'} =~ s/\cM\n/<BR>\n/g;

   print POST <<END;
<HR>
<!-- $input{'handle'} -->
<!-- ID#: $input{'id'} -->
<TABLE WIDTH=100%><TR><TD ALIGN=LEFT VALIGN=CENTER>
<B>$input{'handle'}</B> <FONT SIZE=-2>($id_snippet)</FONT></TD>
<TD ALIGN=RIGHT VALIGN=CENTER><I>$time</I></TD></TR></TABLE>
END

   if ( $input{'post_to'} ne 'everyone' ) {
      print POST "<FONT SIZE=+1><B>Private Message</B></FONT>\n";
   }
   print POST "$input{'comments'}";

   close(POST);
}

sub print_footer {

   print <<END;
<HR>
<P><FONT SIZE=-1><B>This chat room is maintained by
<A HREF="mailto:$admin_email">$admin_name</A></B></FONT>
</BODY>
</HTML>
END

}

#
# this routine checks to see whether or not the ChatMaster is blocking a
# given individual
#
sub cm_blocked {

   ($temp_id) = @_;

   foreach ( @userdata ) {
      @field = split(/\t/);
      if ( ($temp_id eq $field[0]) && ($field[3] eq 'BLOCKED') ) {
         return 1;
      }
   }

   return 0;
}

#
# this routine returns true if the ID# passed to it is found within the
# $input{'block'} string
#
sub blocked {

   ($temp_id) = @_;

#
# if the user isn't blocking anyone, return false
#
   return 0 if $input{'block'} eq '';

#
# check for a match between the ID# being checked up on and the list of
# blocked ID#s.  If there is a match, return true
#
   @blocks = split(/$;/,$input{'block'});
   foreach ( @blocks ) {
      return 1 if $temp_id eq $_;
   }

   return 0;
}

sub update_datafile {

#
# don't proceed while $datafile is locked
#
   while ( -e 'lock_file' ) {
      sleep(1);
   }

#
# enable locking mechanism
#
   open(LF,"> lock_file");
   print LF "Locked!\n";
   close(LF);
   chmod 0660,'lock_file';

   foreach ( @userdata ) {
#
# if the particular element in the @userdata array pertains to this user,
# updates this user's expiry time info (and handle, since that too might
# change)
#
      @field = split(/\t/);
      if ($field[0] eq $input{'id'}) {
         $found = 1;
         $field[2] = time + $reset_time;
         $field[1] = $input{'handle'};
      }
#
# if current user is the ChatMaster, write appropriate blocking info into
# @userdata array
#
      if ($cmmode) {
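# compare whole ID#s; a pattern match against $input{'block'} would also
# hit partial ID#s, and an empty block list would match every user
         if ( &blocked($field[0]) ) {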
            if ($#field == 2) {push(@field,'BLOCKED'); }
         }
      }
      $_ = join("\t",@field);
   }

#
# if this is the first time a user is accessing the chat site, add a new
# entry to the user data file (via the @userdata array) to that effect
#
   if (!($found)) {
      push(@userdata,join("\t",$input{'id'},$input{'handle'},time+$reset_time));
   }

#
# rewrite datafile with new info on expiry time and possibly blocking info
#
   open(DF,"> $datafile");
   foreach(@userdata) {
      print DF "$_\n";
   }
   close(DF);

   unlink 'lock_file';
}
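
The lock_file test in update_datafile checks for the file and then creates it in two separate steps, so two copies of the program can both pass the check before either one writes the lock. A minimal sketch of one alternative, assuming the server's Perl and file system support the built-in flock function, is shown here. The routine name write_locked and the ".lock" file name are hypothetical and are not part of the listing above.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Hypothetical helper: rewrite $datafile with @lines while holding an
# exclusive lock on a separate lock file ("$datafile.lock" is an assumed name).
sub write_locked {
    my ($datafile, @lines) = @_;

    # Open (or create) the lock file; flock blocks until no other process
    # holds the lock, so no sleep loop is needed.
    open(my $lock, '>>', "$datafile.lock") or die "Can't open lock file: $!";
    flock($lock, LOCK_EX) or die "Can't lock $datafile.lock: $!";

    # Rewrite the data file one record per line, as update_datafile does.
    open(my $df, '>', $datafile) or die "Can't write $datafile: $!";
    print $df "$_\n" foreach @lines;
    close($df) or die "Can't close $datafile: $!";

    # Closing the lock file handle releases the lock.
    close($lock);
    return scalar @lines;
}
```

For the lock to protect a whole read-modify-write cycle, the code that reads the data file would need to take the same lock before reading, and every program that touches the file would have to follow the same convention.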

Other Chat Room Features and Examples

Chat rooms are a good example of a software system that suffers from "creeping featurism." A lot of things can be added to chat rooms, but it's questionable whether certain features should be. Before coding a new feature, ask yourself a few questions: Is it needed? Is it understandable? Will people actually use it?

For examples of every chat room feature imaginable, you can go to Hulaboy's list of Web chat sites:

http://www.hula.net/~hulaboy/1_chtlnx.htm

Auto-Updates

In general, Web-based chat rooms require that users actively refresh the stack themselves, either by submitting an entry to the stack or by clicking on some sort of update link. With server-push or client-pull mechanisms, the refresh can happen automatically. The HealthyChoice Web site has a very nice chat system. At their site, you can see many of the features I've created here and a few I haven't, including auto-updates:

http://chat.HealthyChoice.com/cgi-bin/Test.exe
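
As a rough illustration of client pull, a CGI program can ask the browser to reload the page on its own by printing a Refresh META tag in the page's HEAD section. The sketch below assumes a browser that honors the tag. The routine name refresh_tag, the 30-second interval, and the example URL are all hypothetical, and because the chat programs in this chapter carry their state in a POST form, the reloaded URL would have to carry the user's ID some other way, such as in the query string.

```perl
use strict;
use warnings;

# Hypothetical helper: build a client-pull tag that tells the browser to
# fetch $url again after $seconds seconds.
sub refresh_tag {
    my ($seconds, $url) = @_;
    return qq{<META HTTP-EQUIV="Refresh" CONTENT="$seconds; URL=$url">\n};
}

# Example use inside the HEAD section of the page (the URL is made up).
print "<HEAD>\n";
print refresh_tag(30, 'http://www.example.com/cgi-bin/chat.cgi?id=1234');
print "</HEAD>\n";
```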

Frames

One of the problems with automatically updating chat rooms is that your page might update while you're in the middle of typing in a new posting, which is not good. One possibility is to combine automatically updating pages with Netscape's frame feature. You could program the room so that the form portion of the page doesn't update, but the stack portion does.

The validity of frames as a feature is a hotly debated topic on the Internet, and I don't mean this as an endorsement of frames at all. Personally, I don't like them. Still, it's interesting to consider them when planning a chat room.
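
For readers who do want to experiment, the split described above might be sketched as a small routine that builds a two-frame page: the stack in the upper frame and the entry form in the lower one. This is only an outline under assumptions. The routine name frameset_page, the frame names, and the 200-pixel form height are hypothetical, and the chat program itself would still have to be taught to serve the stack and the form as separate pages.

```perl
use strict;
use warnings;

# Hypothetical helper: return a frameset page whose top frame shows the
# stack and whose bottom frame holds the entry form.
sub frameset_page {
    my ($stack_url, $form_url) = @_;
    return <<END;
<HTML>
<HEAD><TITLE>Chat Room</TITLE></HEAD>
<FRAMESET ROWS="*,200">
<FRAME SRC="$stack_url" NAME="stack">
<FRAME SRC="$form_url" NAME="form">
<NOFRAMES>
<BODY>This chat room needs a browser that supports frames.</BODY>
</NOFRAMES>
</FRAMESET>
</HTML>
END
}
```

Only the page loaded into the "stack" frame would carry an automatic update, so a reload there leaves whatever is being typed in the "form" frame untouched.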

Previous Postings

With the way chat.cgi and newchat.cgi are coded, postings beyond the $maxlines limit are deleted. This isn't the only way to deal with these postings. Keeping them around could be a good idea, too.

One feature I've seen that I really like is a page of previous postings. The chat system at Bianca's Smut Shack keeps postings around for several days. Rather than being deleted, old postings are moved into a separate area and can be read with a different CGI program. Strictly speaking, this isn't a "chat room" feature, but it's certainly useful and could become part of your chat system. If you're brave at heart, you can find Bianca and her "trolls" at

http://www.bianca.com/
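
One way to keep old postings, sketched here under assumptions, is to move a posting file into an archive directory at the point where the program would otherwise delete it. The routine name archive_post and the archive directory are hypothetical, and this is not a description of how Bianca's system works. Because the chat programs in this chapter reuse the same file name for each user's postings, the sketch adds the current time to the archived name so that later postings don't overwrite earlier ones.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Basename qw(basename);

# Hypothetical helper: move an expired posting into $archive_dir instead
# of deleting it. Returns the new path, or nothing if the move failed.
sub archive_post {
    my ($post_file, $archive_dir) = @_;

    # Create the archive directory the first time it is needed.
    mkdir $archive_dir unless -d $archive_dir;

    # Prefix the file name with the current time to keep postings distinct.
    my $target = "$archive_dir/" . time . '_' . basename($post_file);
    move($post_file, $target) or return;
    return $target;
}
```

Two postings from the same user archived within the same second would still collide, so a busier room would need a finer-grained name. A separate CGI program could then list the archive directory and print the saved postings.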

Sign-Up Boards

Bianca's Smut Shack also attaches a sign-up board to each of its chat rooms. Again, this is not a chat room feature, but it's a great idea: people can leave messages for each other to arrange times to chat, for example.

Dynamic Room Generation

Occasionally, chat room users might want to split off from the main crowd and have a chat room all their own. This can be accomplished with a mechanism very similar to private messaging. InfiNet's popular Talker chat room does a good job in this regard:

http://www2.infi.net/talker/

Alternatives to CGI Chat Rooms

One of the things I find most remarkable about CGI chat rooms is how popular they are given the superior alternatives. Yes, you heard me correctly. CGI chat rooms aren't all that good. Here are a few alternative schemes that do a great job of allowing people to chat with each other.

IRC-Internet Relay Chat

This isn't so much a program as a whole client/server (or peer-to-peer) protocol on the Internet. There are a great number of IRC clients and servers that allow countless net-heads to chat themselves into an early grave. Winsock IRC clients can be found at Stroud's Consummate Winsock Application List.

PowWow

This exceptional Winsock client allows many people to chat simultaneously, each in their own window. While chatting, people can send pictures, sounds, and other files to each other. Again, look for this at Stroud's list.

Java Chat Rooms

I think this is where things are headed. Java is poised to become nearly universally supported as part of the standard Web browser, and Java-based chat rooms are technologically quite capable. The Earthweb Java chat server is tops at this point in time:

http://www.earthweb.com/html/products.html

talk and ytalk-Old UNIX Standbys

The UNIX talk command is an oldie but a goodie. The talk protocol is quite respectable, but the talk environment is limited to two users. ytalk is an extension of talk that uses the same protocol backbone but allows for multiple chatters. Winsock talk and ytalk clients can be found at Stroud's list.

MUD, MUSH, and MOO Systems

These are very nearly entire operating systems meant to be accessed by the telnet command. The earliest of these was the MUD (Multi-User Dungeon), which started as a role-playing game system. These environments are vast and offer features far beyond any other chat system. However, their telnet interfaces often leave much to be desired.

Summary

As long as you have the system resources, I recommend putting a chat room into your Web site. They're technically impressive, they can promote a real feeling of community in your Web site, and they tend to draw crowds, and hits are the name of the game in this business.

When you plan out your chat room, try to keep the following things in mind:

You'll want to plan out a feature set that best serves your users. This might include blocking options or maybe schemes that prevent people from using the same handle. You might want to build a "registered user database" with a login system or maybe an HTTP cookie server instead. Remember that chat rooms are meant to be used by many people simultaneously, so you'll likely need to build a file-locking mechanism. Maybe you'll want to be a ChatMaster in your own house. Private messaging is always fun, too.

Chat rooms are an exceptionally rich field to explore in terms of the things you (or others) might want them to do. Of course, each new feature means more work for the programmer, but that's what we're paid for, right?

Chat rooms don't exist in a vacuum, either. Given their complexity, you're bound to come up against a lot of server-specific issues. Don't fight them. Server configuration can be a powerful ally if you respect how it's meant to be used.

Finally, don't be afraid to consider other options to CGI chat rooms. You might find that, for your needs, something else does a better job. However you decide to approach the job, the best piece of advice I can give you is to plan ahead, and that includes planning for the "unplannable." Don't paint yourself into a corner!