After being introduced to CGI and seeing a working (although not very useful) example in the previous chapter, you're probably eager to learn how CGI can really be put to work. This chapter shows you how to install and use a CGI application that is included on the CD. For those of you who want to understand the inner workings of CGI and HTML forms, I'll discuss some of the details of the programs, which are written in C++ and Visual Basic.
When using forms in HTML, you need a CGI script or application to handle the contents of that form. You will find the source code and binary code for the programs in this section on the CD-ROM. Feel free to use the code as is or to customize it for your purposes. This CGI application is also useful if you know how to write HTML code but don't want to dive into CGI programming.
The HTML file used in this example is called feedback.htm. It enables a customer to send you comments about your Intranet. For lack of a better name, the first program, which is written in C++, is called cgi1.cpp. Just as unimaginative, the Visual Basic program is called cgi2.bas. The Visual Basic program is not mandatory. In other words, the C++ program is useful on its own. However, if you decide not to use the VB program, you will have to make a slight modification to the C++ program.
The list of files in Table 20.1 is what you will be working with in this section.
Filename | Description |
cgi1.cpp | C++ source code for cgi1.exe |
cgi1.mak | Makefile for Visual C++ 4.x |
cgi1.exe | Intel binary CGI application |
cgi2.bas | Visual Basic 4.0 source code |
cgi2.vbp | Visual Basic 4.0 project makefile |
cgi2.exe | Visual Basic program that writes the database |
feedback.htm | HTML document that receives user input |
cgi.mdb | Access database that saves the data |
The C++ source code provided here is an enhancement to the source code that comes with the freeware EMWAC HTTP Web server for Windows NT. A RETURN tag has been added to the information dialog so that after your information is written to a file, you can return to the URL from which cgi1.exe was called. The nasty plus characters have been removed, so strings are now separated by spaces. Newline characters are now accepted.
To use this CGI database system, follow these steps:
Recall from Chapter 5, "What You Need to Know About HTML," that CGI applications can capture only the form fields that you name in the HTML document. The cgi1.exe application captures the user data for the form field names that you establish in your HTML document (for example, feedback.htm) and writes them to a temporary file with the .HFO extension. Each temporary file contains one set of form data. The temporary file is a simple text file. The Visual Basic program that writes the data to the database deletes the temporary files. If you want to e-mail the temporary files using Blat, you will need to modify the Visual Basic program.
This section discusses the why and how of the CGI system. I refer to it as a CGI system because it requires proper integration of several languages and protocols, including HTML, C++, Visual Basic, HTTP, CGI, and ODBC.
Figure 20.1 shows a sample form, defined in HTML, that gathers comments from the user. When the user clicks the button labeled Submit Comments, the form will run the CGI application on the server, which will save the form data to a temporary file before passing it to Visual Basic.
Figure 20.1 : Form processing screen.
Listing 20.1 shows feedback.htm, which is the HTML code that creates the form and invokes the CGI application. To achieve the greatest portability among browsers, this code does not take advantage of any HTML 3.2 features.
Listing 20.1. This feedback.htm file sends form data to the CGI application on the server.
<HTML>
<HEAD>
<TITLE>Suggestions and Comments</TITLE>
</HEAD>
<BODY>
<form action="http://localhost/scripts/cgi1.exe" method="POST">
<H1>Your Comments, Questions and Feedback!</H1>
Please enter your Name:
<BR><INPUT TYPE=text NAME="name" SIZE = 40 MAXLENGTH=40>
<BR>Email address:
<BR><INPUT TYPE=text NAME="email" SIZE = 40 MAXLENGTH=40>
<P>
Enter your comments, questions and/or suggestions in the space below:
<BR>
<TEXTAREA NAME="comments" ROWS=12 COLS=60 MAXLENGTH=3000></TEXTAREA>
<P>
<input type="submit" value="Submit Comments">
</FORM>
</BODY>
</HTML>
The action attribute of the form is the URL of the CGI application. The action attribute of the sample form is as follows:
http://localhost/scripts/cgi1.exe
The scripts/cgi1.exe portion of the action attribute executes the first part of the CGI system on the server, cgi1.exe. In this example, the method attribute is POST, which means the form data will be read from stdin by the CGI application. Notice the names of the input fields (name, email, and comments); you will be checking for those names in the VB program later.
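With POST, the browser sends the fields to stdin as one URL-encoded string such as name=Scott+Zimmerman&email=scottz%40sd.znet.com. Spaces arrive as plus signs, and other special characters arrive as %XX hex escapes. The following is a minimal sketch of decoding a single value, not the code from the listing later in this chapter; the function names decodeFormValue and hexValue are hypothetical.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of cgi1.cpp): numeric value of one hex digit
// that the caller has already validated with isxdigit().
static int hexValue(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return c - 'A' + 10;
}

// Decode one URL-encoded form value in a single pass: '+' becomes a space
// and %XX becomes the character with hex code XX. A '%' that is not followed
// by two hex digits is copied through unchanged.
std::string decodeFormValue(const std::string &in)
{
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
        } else if (in[i] == '%' && i + 2 < in.size() &&
                   isxdigit((unsigned char)in[i + 1]) &&
                   isxdigit((unsigned char)in[i + 2])) {
            out += (char)(hexValue(in[i + 1]) * 16 + hexValue(in[i + 2]));
            i += 2;   // skip the two hex digits just consumed
        } else {
            out += in[i];
        }
    }
    return out;
}
```

Because decoded characters are never scanned again, a literal plus sign that the browser sent as %2B stays a plus sign, while an unescaped + still becomes a space.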
Take another look at the <FORM> tag in Listing 20.1. Notice that the action URL indicates localhost. One of the rules of TCP/IP is that 127.0.0.1 is defined to be the localhost, or loopback address, meaning that this special IP address is always valid to refer to the current machine. The HOSTS file in your Windows directory should indicate that localhost is an alias for 127.0.0.1. This alias makes a convenient way to check your CGI systems on your server, even if you don't have a valid assigned IP address or you aren't connected to the Internet. You will have to modify the URL to refer to your actual server name if you expect the code to work in the Web browsers of your customers.
Visual Basic includes very powerful capabilities for database, file, and string manipulation. And these features just happen to have a high correlation with a dream language for CGI. After you become familiar with VB, you'll see that this list is just the tip of the iceberg of its capabilities. For example, did you know that VB will let you send and receive e-mail on the LAN using MAPI (the Mail API)? Did you know that you can use VB to send a fax? VB programs are even used to control factory automation processes. Imagine using a Web form to automatically initiate a fax transmission, beep your pager, or build a customized pizza.
Pundits claim that VB is slow, but this just isn't true. Of course, VB is slower than a compiled language, but the point is that this usually doesn't matter. Before C/C++/Java programmers get in an uproar, realize that I didn't say always, I said usually. I can back that statement up with two principles that I have seen proven true in countless software development projects.
In the first place, the choice of algorithm is always far more important to performance than any other factor. One VB program can be a factor of 100 times slower than another VB program written to perform exactly the same task. And a C++ program can easily be written in such a way that it runs slower than a similar Visual Basic program. So before you criticize the speed of a language, first analyze the specific bottlenecks in the given program. Otherwise the conversation is meaningless. (By the way, the most likely kinds of programs where C++ will beat VB in benchmark tests involve compute-bound tasks where the CPU is heavily utilized. If you have a function which calculates a formula thousands of times in a loop, write it in C++. On the other hand, if you are developing a GUI client/server database application, the network and database software layers, like ODBC, are more important factors to the overall performance than is the language choice for the high-level application code. Obviously, hardware plays a critical role in performance too, but I assume that will be the same regardless of your choice of programming language.)
Secondly, project schedules for software developers and information systems engineers are tight, so it pays to spend optimization effort only where it counts. As soon as you identify the segment of your code that cries out to be optimized (evidence suggests that this code is often less than one percent of the total code), and assuming that your customers don't have more important things for you to be working on instead (like new features and enhancements), you can rewrite that function in C and put it in a DLL. This way, you get to retain VB for the things it does best, namely user interface construction and database connectivity, and tune only the sections of the program that need the boost. Isn't it better to get it working first and optimize it later?
Many Windows Web servers don't support WinCGI, which is a protocol invented by Bob Denny (author of the O'Reilly WebSite server) for building CGI applications in Visual Basic. Standard CGI requires a shell interface, but VB4 doesn't build that kind of console-mode application. This is unfortunate, but thanks to the infinite malleability of software, it's not insurmountable. You might call the program presented in this chapter a clever kludge (if you believe in the power of positive criticism). It allows you to use CGI with just about any Windows-based Web server and still take advantage of the database power of Visual Basic 4.0 (or greater) Professional Edition or Enterprise Edition.
The C++ program that is developed in this chapter parses the form data into a temporary file before it invokes the Visual Basic application. The VB program runs invisibly while it reads the temporary file passed in by the C++ program and writes the information to the Access database.
By the way, the C++ program is more like an "extended C" program because it doesn't use any object-oriented features. However, it is compiled with Visual C++ as a C++ program and it does use C++ style comments.
The application presented in this chapter has four pieces:
Using Visual Basic and ODBC, you could just as easily connect to an Oracle database or any other database for which an ODBC driver is available instead of the Access database. I need not name your favorite database here because the chances are very high that you can obtain an ODBC driver for it. Once you have an ODBC driver, the rest of the story is all the same.
What does this application do when it is all done? As you saw in Figure 20.1, the HTML file captures user feedback about your Intranet. The CGI program then saves the data into an Access database, which you can query at your leisure. Writing the back-end program in VB opens the door to easy customizations in the future that could take full advantage of the Windows environment. For example, you could run other programs in Access or Visual Basic to process the collection of form data by sending e-mail in a batch fashion. If you run the second-stage programs offline (perhaps from another workstation) or on a daily or weekly basis, you can minimize the hit on system resources while the Intranet server is running. In other words, consider tuning the Web server to perform only the quick acquisition of form data, and save the data processing for later.
The program inserts the form data into a table in an Access database. Each database record includes a timestamp so that you can use a SQL query (in a separate program such as Access) to determine which records have been added within a given date range.
The main goal of this program is to show you the possibilities of CGI with VB. The sample application presented in this chapter is by no means robust. In particular, it lacks substantial error checking. You can easily extend this system in any number of ways. With HTML, VB, ODBC, and the Windows API at your disposal, you are limited only by your imagination.
The disadvantage of using two programs in this manner is that they must pass the form data in a temporary file. This is slower than doing all the work in a single C++ program, which in turn would be slower than using a single ISAPI DLL. But in the case of an Intranet, efficiency of server transactions is probably not going to be as key a factor as it is when you are running a Web site on the Internet. I'm not saying that you shouldn't care if it runs slowly, but the primary goal should be to get a prototype up and running so that you can show it to your customers as soon as possible. Then you can collect their feedback, customize and enhance the program (and the HTML) based on their suggestions, and begin analyzing whether actual performance is really lacking.
CGI was invented from a command-line perspective in a UNIX environment. Unfortunately, having the Web server invoke the CGI application separately for each client request is not the most efficient means of processing form data on Windows NT.
Why is CGI efficiency important? The answer depends mostly on how large your Web site is. If you have only a few visitors per day, you don't need to worry about it. But if you are running a server with dozens of simultaneous client connections, you'll want your server to be tuned as tightly as possible. The new generation of Web servers for Windows makes great gains in CGI efficiency by opening the door to ISAPI. Process Software, Ilar Concepts, Internet Factory, and Microsoft, among other vendors, already support this open specification. (Netscape and WebSite use other efficient CGI alternatives.) ISAPI (Internet Server Application Programming Interface) allows developers to create 32-bit DLLs to run within the memory context of the Web server. Not only does the server avoid reloading the CGI executable for each client hit, but form data is passed into the application, and the HTML response is passed back, using pointers to memory blocks, thus saving a substantial amount of file I/O. Although Visual Basic doesn't work with pointers as handily as C and C++, ISAPI programming in VB is probably no more complex than it is in WinCGI 1.2 or CGI 1.1. Another alternative to CGI is SSI+ (Server Side Includes), which is supported by the WebQuest server from Questar. Only time will tell which of these techniques will ultimately prove to be the most effective. In the meantime, consider many factors when choosing your server, and then use the tools it offers.
The C++ program is called cgi1.cpp. It is an enhanced version of the simple C program presented in Chapter 19, "Getting the Most Out of HTML with CGI." You recall that cgisamp.exe was used to parse the contents of an HTML form. The purpose for modifying the program in this section is to get it to parse the data slightly differently before handing it off to the Visual Basic program.
The C++ program creates a small temporary file each time it is invoked. See Listing 20.2 for the code. The following is a description of the functions in cgi1.cpp:
Listing 20.2. cgi1.cpp runs between the server and the VB program.
/***********************************************************************
 * File: CGI1.CPP
 *
 * Description: This CGI program parses form data and invokes
 * a Visual Basic program to save the data in an ODBC database.
 *
 * This program assumes it is invoked by an HTML form.
 * This program writes form data
 * to a temporary file and then invokes the Visual Basic application
 * that interfaces with the database.
 * Ensure that you compile this script as an NT console mode app.
 *
 * This program is public domain freeware.
 * April, 1996
 * By: Scott Zimmerman
 *
 ***********************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <io.h>
#include <time.h>

char InputBuffer[4096];  // Maximum amount of data user may enter

/* Convert all cOld characters in cStr into cNew characters. */
void strcvrt(char *cStr, char cOld, char cNew)
{
    int i = 0;

    while(cStr[i])
    {
        if(cStr[i] == cOld)
            cStr[i] = cNew;
        i++;
    }
}

/* The string starts with two hex characters.
** Return an integer formed from them.
*/
static int TwoHex2Int(char *pC)
{
    int Hi, Lo, Result;

    Hi = pC[0];
    if('0' <= Hi && Hi <= '9')
        Hi -= '0';
    else if('a' <= Hi && Hi <= 'f')
        Hi -= ('a' - 10);
    else if('A' <= Hi && Hi <= 'F')
        Hi -= ('A' - 10);

    Lo = pC[1];
    if('0' <= Lo && Lo <= '9')
        Lo -= '0';
    else if('a' <= Lo && Lo <= 'f')
        Lo -= ('a' - 10);
    else if('A' <= Lo && Lo <= 'F')
        Lo -= ('A' - 10);

    Result = Lo + 16 * Hi;
    return(Result);
}

/* Decode the given string in-place by expanding %XX escapes. */
void urlDecode(char *p)
{
    char *pD = p;

    while(*p)
    {
        /* Escape: next 2 chars are hex */
        /* representation of the actual character.*/
        if(*p == '%')
        {
            p++;
            /* Cast to unsigned char: isxdigit is undefined for negative values */
            if(isxdigit((unsigned char)p[0]) && isxdigit((unsigned char)p[1]))
            {
                *pD++ = (char)TwoHex2Int(p);
                p += 2;
            }
        }
        else
            *pD++ = *p++;
    }
    *pD = '\0';
}

/* Parse out and store field=value items into the temp file.
** DON'T use strtok here because it is ALREADY used by caller.
*/
void StoreField(FILE *f, char *Item)
{
    char *p;

    p = strchr(Item, '=');
    if(p == NULL)
        return;             /* Skip malformed items that have no '=' */
    *p++ = '\0';

    /* Get rid of those nasty +'s. Do this BEFORE decoding so that
    ** a literal plus sign sent as %2B is preserved. */
    strcvrt(p, '+', ' ');
    urlDecode(Item);
    urlDecode(p);

    /* Keep each field on a single line for the VB program */
    strcvrt(p, '\r', ' ');
    strcvrt(p, '\n', ' ');
    fprintf(f, "%s=%s\n", Item, p);
}

int main(void)
{
    int ContentLength, x, i;
    char *p;
    const char *URL, *whocalledme;
    char datebuf[9], timebuf[9];
    char FileName[_MAX_PATH];
    char cmdbuf[_MAX_PATH + 30];
    FILE *f;

    // Turn buffering off for stdin
    setvbuf(stdin, NULL, _IONBF, 0);

    // Tell the client what we're going to send back
    printf("Content-type: text/html\n\n");

    // Uses a kludgy Ipc method to pass form data to VB
    for (i = 0; i <= 9999; i++)
    {
        // Make a new filename
        sprintf(FileName, "CGI%d.HFO", i);
        // If the file exists, try again. Doesn't handle errors!
        if(access(FileName, 0) == -1)
            break;
    }

    // Open the file
    f = fopen(FileName, "a");

    // Check if open succeeds
    if(f == NULL)
    {
        printf("<HEAD><TITLE>Error: cannot open file</TITLE></HEAD>\n");
        printf("<BODY><H1>Error: cannot open file</H1>\n");
        printf("The file %s could not be opened.\n",FileName);
        printf("</BODY>\n");
        exit(0);
    }

    // Write to the file the address of the client which posted the form data
    whocalledme = getenv("REMOTE_ADDR");
    if(whocalledme == NULL)
        whocalledme = "";       // Variable not set by the server
    fprintf(f, "URL=%s\n", whocalledme);

    // Write to the file the date/time of this hit
    // (_strdate and _strtime are the Microsoft C run-time names)
    _strdate(datebuf);
    _strtime(timebuf);
    fprintf(f, "Date=%s\n", datebuf);
    fprintf(f, "Time=%s\n", timebuf);

    // Get the length of the client input data
    p = getenv("CONTENT_LENGTH");
    if(p != NULL)
        ContentLength = atoi(p);
    else
        ContentLength = 0;

    // Avoid buffer overflow -- better to allocate dynamically
    if(ContentLength < 0)
        ContentLength = 0;
    if(ContentLength > (int)sizeof(InputBuffer) - 1)
        ContentLength = (int)sizeof(InputBuffer) - 1;

    // Get the data from the client (assumes POST method)
    i = 0;
    while(i < ContentLength)
    {
        x = fgetc(stdin);
        if(x == EOF)
            break;
        InputBuffer[i++] = (char)x;
    }
    InputBuffer[i] = '\0';
    ContentLength = i;

    p = getenv("CONTENT_TYPE");
    if(p == NULL)
    {
        fclose(f);
        return(0);
    }

    if(strcmp(p, "application/x-www-form-urlencoded") == 0)
    {
        // Parse the data
        p = strtok(InputBuffer, "&");
        while(p != NULL)
        {
            // Write the field/value pair to the temp file
            StoreField(f, p);
            p = strtok(NULL, "&");
        }
    }
    else
        // Write the whole data to file
        fprintf(f, "Input = %s\n", InputBuffer);

    // What url called me (empty if the browser sent no referer)
    URL = getenv("HTTP_REFERER");
    if(URL == NULL)
        URL = "";

    // Confirm to client
    if(!ferror(f))
    {
        printf("<HEAD><TITLE>Submitted OK</TITLE></HEAD>\n");
        printf("<BODY><h2>Your information has been accepted.");
        printf(" Thank You!</h2>\n");
        printf("<h3><A href=\"%s\">Return</a></h3></BODY>\n", URL);
    }
    else
    {
        printf("<HEAD><TITLE>Server file I/O error</TITLE></HEAD>\n");
        printf("<BODY><h2>Your information could not be accepted\n");
        printf("due to a file I/O error at the server.</h2>\n");
        printf("<h3><A href=\"%s\">Return</a></h3></BODY>\n", URL);
    }

    // Close the file.
    fclose(f);

    // Run the Visual Basic program...
    sprintf(cmdbuf, "start cgi2.exe %s", FileName);
    system(cmdbuf);

    return(0);
}
One of the first things that main() does is to invent a temporary filename. It tries to use CGI0.HFO. If that filename exists, it will increment the number and try CGI1.HFO. This algorithm is pretty inefficient and it doesn't check for errors, but it serves the purpose so you can focus on the more interesting stuff.
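One way to make the missing error case explicit is to separate the name search from the file-system check. The sketch below is an illustration under stated assumptions, not part of cgi1.cpp: pickTempName and its exists parameter are hypothetical names, and the caller is assumed to supply a predicate such as a wrapper around access(name, 0) == 0.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: returns the first name of the form CGI<n>.HFO
// (n = 0..maxIndex) for which exists(name) is false, or an empty string
// when every candidate name is already taken.
template <typename ExistsFn>
std::string pickTempName(ExistsFn exists, int maxIndex)
{
    char name[32];   // "CGI" + up to 11 digits + ".HFO" fits comfortably
    for (int i = 0; i <= maxIndex; ++i) {
        std::snprintf(name, sizeof(name), "CGI%d.HFO", i);
        if (!exists(name))
            return name;
    }
    return std::string();   // no free name: the caller should report an error
}
```

An empty return value tells the caller to send an error page instead of reusing the last name. The sketch still leaves a window between the check and the later fopen in which two simultaneous requests could choose the same name, so it addresses only the exhausted-names case.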
The first item written to the temporary file is labeled URL, but its value comes from the REMOTE_ADDR environment variable, which holds the IP address of the client rather than a true URL. This variable tracks the client. The date and time follow on separate lines. All fields are on lines by themselves, and each field name is separated from its corresponding data by an equal sign. You need to keep these things in mind when you write the VB program.
This program ignores some error-checking, and it blindly assumes that REQUEST_METHOD is POST, so make sure you use that in feedback.htm or your own customized HTML that invokes this program.
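If you would rather check the method than assume it, the standard CGI environment variable REQUEST_METHOD can be tested before any input is read. This is a minimal sketch, not part of the listing; the function names isPostMethod and requestIsPost are hypothetical.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: true only when the method string is exactly "POST".
// A null pointer (variable not set by the server) counts as "not POST".
bool isPostMethod(const char *method)
{
    return method != NULL && std::strcmp(method, "POST") == 0;
}

// Convenience wrapper that reads the CGI environment variable.
bool requestIsPost()
{
    return isPostMethod(std::getenv("REQUEST_METHOD"));
}
```

A natural place to call requestIsPost() would be in main() right after the Content-type header is printed, returning an error page before the temporary file is created when the result is false.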
The interesting thing about the code is that two output files are being written simultaneously. Remember, the CGI application is supposed to send some HTML output back to the browser so that the user won't get stranded on the Web. This task is achieved by the calls to printf, which write to stdout. The HTTP server picks up the stdout data, applies the HTTP protocol, and sends it back to the client.
Meanwhile, you still have to write the form data from the client into the temporary file before you launch the VB program. That is achieved by the fprintf calls. At the end of the program, cgi2.exe is launched through the system() call in the C standard library.
Now you need the VB program to pick up the form data and insert it into the database. This process executes fairly quickly, so you really don't need a user interface. In fact, you want the program to quit as soon as it's finished, with no user involvement at all. Remember, this program is only going to run on the server. I placed all the code in Sub Main and ParseField in the .BAS file to eliminate the unnecessary loading of a .FRM file. See Listing 20.3 for the VB code. This file is cgi2.bas in the \chapter20 directory on the CD. The VB 4.0 project file is cgi2.vbp.
Note: This program was developed using the Visual Basic 4.0 Professional Edition. The program will also work as is with the 4.0 Enterprise Edition, and presumably with VB 5.0. With very minor changes, the program could be made to work with the Visual Basic 3.0 Professional Edition. The files on the CD do not include a setup program for this application. You must have Visual Basic 4.0 installed to get the proper DLLs and for OLE registration of the Jet engine to take place.
Listing 20.3. The VB program, which interfaces with the database.
'------------------------------------------------------------------------
' CGI2.BAS
' This program was written by Scott Zimmerman, April 1996.
' It is public domain freeware.
'------------------------------------------------------------------------
'
Public Sub main()
    Dim szURL As String
    Dim szDate As String
    Dim szTime As String
    Dim szName As String
    Dim szEmail As String
    Dim szComments As String
    Dim db As Database
    Dim rs As Recordset

    ' Open the temporary file with form data and read it into memory
    Open Command$ For Input As #1
    Line Input #1, szURL
    Line Input #1, szDate
    Line Input #1, szTime
    Line Input #1, szName
    Line Input #1, szEmail
    Line Input #1, szComments

    ' Close the temporary file and delete it
    Close #1
    Kill Command$

    ' Open the database and the table
    Set db = OpenDatabase(App.Path & "\cgi.mdb")
    Set rs = db.OpenRecordset("table1", dbOpenTable)

    ' Add a new record to the table. Counter field is
    ' initialized automatically by Jet 3.0.
    rs.AddNew
    rs!When = ParseField(szDate) & " " & ParseField(szTime)
    rs!URL = ParseField(szURL)
    rs!Name = ParseField(szName)
    rs!Email = ParseField(szEmail)
    rs!Comments = ParseField(szComments)

    ' Update the table, close everything and quit
    rs.Update
    rs.Close
    db.Close
    End
End Sub

Private Function ParseField(szText As String) As String
    Dim k As Integer

    k = InStr(szText, "=")

    ' Return the substring following the equals sign
    ParseField = Mid$(szText, k + 1)
End Function
Almost all the code is in Sub Main. The Command$ function is used to retrieve the name of the temporary file passed in by the C++ CGI program. Note that cgi2.exe assumes the order of the fields in the text file written by the C++ program. This isn't robust, but again, that isn't the point. To modify the fields on the HTML feedback form, you also have to modify the VB code. The C++ code should be able to survive without modification if new fields are added to the HTML file, the VB code, and the database.
The following is a sample file written by cgi1.exe before it is passed to cgi2.exe:
URL=127.0.0.1
Date=04/14/96
Time=10:54:13
name=Scott Zimmerman
email=scottz@sd.znet.com
comments=Thank you for building this Intranet!
Notice how each field of the form data has been parsed onto a line by itself with an equal sign serving as the delimiter for the VB program. The URL, Date, and Time fields are obtained automatically. The other fields correspond to data that the user fills out in the HTML form.
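Because every line carries its own field name, a reader of this file does not strictly have to depend on the order of the lines. As a rough illustration in C++ (the chapter's actual reader is the VB program, and readFields is a hypothetical name), the lines can be collected into a map keyed by field name:

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical reader for the temporary-file format: one "name=value" pair
// per line. Lines without '=' are skipped. Only the first '=' splits a line,
// so a value may itself contain '='.
std::map<std::string, std::string> readFields(std::istream &in)
{
    std::map<std::string, std::string> fields;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate CRLF line endings when the stream was opened in binary mode
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1);
        std::string::size_type eq = line.find('=');
        if (eq == std::string::npos)
            continue;
        fields[line.substr(0, eq)] = line.substr(eq + 1);
    }
    return fields;
}
```

The function accepts any input stream, so it works the same with a std::ifstream opened on the .HFO file or with a std::istringstream holding test data. Looking fields up by name means that adding a field to the form would not shift the meaning of the existing ones.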
The temporary input file from the C++ CGI program is deleted as soon as you are through reading it in the VB program. Then the database is opened, and a recordset is created using the dbOpenTable parameter. This parameter permits you to write to the table. The table is called table1 (until you change it).
The fields in the table correspond to the data you capture on the form, as well as the address of the client and the date/time the submission was made to the server. For each field, the program calls the ParseField function to retrieve the substring following the equal sign.
If you designate a key field to use AutoIncrement in Data Manager, the Jet 3.0 engine will automatically take care of incrementing the field for you as new rows are inserted. Therefore, you don't need to supply the value of the key field between the calls to AddNew and Update. See Figure 20.2, which shows the Edit Field dialog in Data Manager 32. Note the Counter checkbox. Only one such field per table should have that value turned on. It is only available for Long Integer type fields.
The database referred to by the Visual Basic program (cgi2.bas) is named cgi.mdb. The VB program assumes that it lives in the same directory as the Visual Basic program itself, which is also the same directory as the cgi1.exe program. The cgi1.exe program does not invoke the cgi2.exe program with a full path name, so they all should go in the \scripts directory. (Your Web server may use a different name, and it is also possible to put cgi2.exe in the PATH or even in the Windows directory.)
You can open the database with either Access 7.0, VisData, or
Data Manager. The last two are utility programs that come with
the VB 4.0 Professional and Enterprise editions. Access comes
with Microsoft Office Professional. This chapter won't go into
the steps to create the database because that information is readily
available in the product documentation for Access and Data Manager.
Table 20.2 summarizes the fields in the sample database.
Field Name | Datatype |
Counter | Long (primary key, required unique) |
URL | Text, length 40 |
When | DateTime 8-byte variant datatype |
Name | Text, length 40 |
Email | Text, length 40 |
Comments | Text, length 255 |
The field named Counter is the key field and is also marked as a Counter type in Data Manager. This means that you don't have to calculate unique values for it because VB will do this automatically when you execute rs.Update. See Figure 20.3 for the Data Manager Edit Index dialog showing the fields of the CGI table.
Figure 20.3 : The Edit Index dialog in Data Manager32 showing the primary key field.
As I've said, the program lacks several important error-checking features. For example, no check is made to ensure that the supplied text does not exceed the field sizes. The field sizes in the database should match the MAXLENGTH attribute of the text input fields in the HTML file. (Note that Listing 20.1 allows up to 3000 characters of comments, while the Comments field in Table 20.2 holds only 255.) The VB program should probably be written to truncate longer text just in case the two sizes should ever become mismatched. As it stands now, the VB program would crash if the input text exceeded the database field size.
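The truncation itself is a one-line operation. The sketch below shows it in C++, under the assumption that you choose to clamp each value in the C++ stage before it is written to the temporary file; the text above suggests doing it in the VB program instead, where Left$ would serve the same purpose. The name clampToField is hypothetical, and the limits of 40 and 255 come from Table 20.2.

```cpp
#include <string>

// Hypothetical helper: limit text to at most maxLen characters so that it
// fits a fixed-size database field. Shorter strings are returned unchanged.
std::string clampToField(const std::string &text, std::string::size_type maxLen)
{
    return text.size() > maxLen ? text.substr(0, maxLen) : text;
}
```

For example, clampToField(comments, 255) would keep a long comment within the Comments field, and clampToField(name, 40) would do the same for the Name field.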
The next chapter gets into more of what you can do with databases on your Intranet. WAIS allows your customers to use a searchable database on the Web without your having to do any programming. You'll be amazed at how easy it is to use the EMWAC WAIS Toolkit with IIS to provide powerful search features on your Intranet.