Too much information is available on the Internet for you to get by without using databases. Every time you look at a search site and every time someone places an order, data is being transferred, searched, and stored. Behind the scenes, a lot of work is going on, and it can be accomplished a number of different ways.
What goes into creating the functionality and maintaining these databases? It can be a lot of work. For large servers with thousands of users, search engines can be complex monsters occupying an entire machine that has more RAM than most people have total hard drive space. Online transactions often require their own special server, in addition to the basic Web server, to store special transaction information and perform all the transaction processing.. This information can get complex and overwhelming really quickly, but it doesn't have to be that way.
Using Windows CGI (WinCGI) functionality, some popular tools, and a little of your time, you can put your own Web-based database functionality together in no time at all. I'm going to save the example for the end of this chapter because you should be aware of the many choices before continuing. This chapter covers the following:
Of course, all this information is followed by the example. So, if you're one of those people who looks at the last page of the mystery novel before reading it, you can just jump there now. But for the more adventurous, I cover all the elements that go into such a choice, so you'll know if it's right for you and what else you could do instead.
A database stores information so that when people look for information in a database, they can find it easily. Like tiny filing cabinets on a hard drive, the database stores this information in just about any organizational method, as long as someone or something can help the searcher find that information later. In the database world, this organization involves file formats and one of two general categories of databases: flat file or not-flat file (relational).
Accounting books, tables on paper, old spreadsheets, lists of information separated by semicolons in a text file-they are all the quintessential representations of what are dubbed flat file databases. What exactly is a flat file database? If you take a list of information and save it to a file in some order, you end up with a database. But how deep does the information in that database go? The information goes just as far as what you put into it. Flat file databases are like a phone book: You can look up a person's name or look up a business' name, but you're limited in what else you can do with the information. If you're looking for a friend you knew years ago, and you find 30 names that match, that's all the information you have to go on. You've used up what information is available to you in that level of the database, and you need more depth of detail to make the right choice-unless you plan to use the brute force method and call every person until you get the right one. Once the information is presented in its original format, there's no more "depth" to what is contained in that file. Sure, you can find other files and other sources of information, but you have to track down the information yourself, the hard way. Instead of having the depth and breadth of what you really need in that instance, the source of information provides only a shallow level of detail. In short, it's "flat."
Don't think for a minute that flat file databases are useless. Far from it! I just used the preceding illustration to point out one of the basic ideas behind a flat file database-it's just one level of data. This kind of database still has a large number of uses, like a phone book, a shopping catalog, a dictionary, a menu-anything that has pieces of data that tie together very tightly and provide what it's designed to with that single tier of information.
Creating a flat file database can be as easy as entering information into a file sequentially, like a typical spreadsheet. Save it to a file, and you have an instant flat file database. The trick is in ordering the data so that you can quickly and easily scan through the file later.
When you need to get more information from data, you have to go beyond a flat file database. If you're looking for a friend's name in the ultimate phone book, you might say, "Find everyone in New York with the last name of Jones, first initial J, who went to school in Texas, and is married to someone with the first initial of R or M." Calling directory assistance won't get you that kind of information-unless directory assistance is a relational database of rather prodigious size and flexibility.
Relational databases can do what their name implies-establish relations between pieces of information. I'm not saying that they're all-knowing and do this work by themselves; I'm just saying that they provide the extra layers of information that you're looking for. So, for example, in a relational database, you might have a table of customers and their addresses, a table of names of people who attended your user conference, and a table of resellers of your product. Using a relational database, you could ask for a list of everyone who attended your user conference but isn't a customer and get the name and address of the reseller closest to that person. You could do this all in one big long command, without any steps in between.
Relational databases have several advantages over flat file databases. In a relational database, tables can point to one source of information; if five different tables include someone's address, for example, they're all really just pointing to some arbitrary storage space that they interpret as the address instead of five copies of that same data floating around. The power of being able to perform complex operations also weighs heavily in the favor of relational databases and makes heavy-duty processing easy.
An important point to come to terms with, though, is that a relational database is not necessarily easy to use. You can't just say, "Give me record #5." You have to be familiar with the syntax of the commands the database uses and how to create (and in some cases interlink) databases effectively. When you see positions for database administrators advertised, you should realize that the position is not just someone who watches over a filing cabinet; keeping all the connections and the data safe, accessible, and up to date is a real endeavor.
Just as you can make choices of different database types, you can make choices for how you access the data. Different methods of communication have their own benefits and drawbacks, but the format of the file you're accessing and what you're accessing it with can sometimes end up making the choice for you.
Typical flat file access is either sequential, in which you just keep reading until you get there, or offset, in which you can either pick specific sections of the file (based on sizes of data chunks you plan on storing) or be slightly more random about the whole process. Delimited ASCII text files, for example, can be addressed through a sequential read and the use of pattern matching; you simply check each line for the existence of some string and then store it (or a pointer to it) if and when it does match. Offset reads allow the program to jump to specific sections of the database. If you have a name field of 30 bytes followed by a phone number field of 30 bytes, for example, you can simply pick a byte position to land yourself at a particular field in a specific record.
Open Database Connectivity (ODBC) provides a translator for letting
different external
programs and databases talk. Like CGI in its own way, ODBC ensures
that incoming information gets placed into what the database considers
a standard form and that outgoing information gets sent back in
an understandable manner to the original program. These translators,
or drivers, come in different flavors for different database formats
and make use of some system files in the background that tie them
into the rest of the system.
ODBC doesn't just help with files on the same system. You can have an ODBC driver that takes care of special networking communications needed to go from a Windows pc to a UNIX machine running an Oracle database, and indeed a whole slew of them are available from different vendors. You still need some other hardware and software on the machine asking the questions to ensure that information can get through and then get back, but the driver takes care of a lot of messy details.
The Structured Query Language (SQL) is a standard for asking questions of more powerful databases. Most flat file databases don't support it because some overhead is involved, and with simple databases, SQL has power that's not necessarily worth the effort. When you need it, though, SQL provides the power to make even the most complex relational databases behave like you want them to.
The concept behind using SQL is that it's reasonably like forming
a sentence. In real life,
you might say, "Get all the strawberries in the big bucket."
In SQL, you would enter SELECT strawberries
from BigBucket.
Using SQL requires having a handle on SQL terms: actions, fields, tables, and conditions. You can use one of the four primary SQL actions: SELECT, INSERT, UPDATE, or DELETE. Because most data operation on the Web, such as search engines, rely on obtaining information from a large database rather than modifying it, we'll leave analysis of those data-changing operations (INSERT, UPDATE, and DELETE) for something you can investigate if your application requires it. All you'll need for that analysis is a good SQL language book, a database, and some time to try it all out.
Getting things from a database is selecting them, so that part's easy. Next, you have to figure out what you're getting. You also have to be careful not to confuse getting with any special instructions like " and don't get any rotten ones!"
In SQL, you specify the fields of information that you want to return-such as phone number, address, or name-that correspond to fields you've created in your database. Because SQL is normally used with large multitable databases, a connection is made; then you choose the portion of the database you want to affect: the table.
Finally, you can apply conditions to the statement-for example, " but only ripe ones." These conditions consist of some field of data and the value it should or shouldn't be. So, you might enter SELECT strawberries from BigBucket WHERE ripe = 'yes'.
SQL has lots of different variations, depending on the database you're accessing. If you need lots of conditions and power, you should familiarize yourself with SQL and whether your database of choice supports it through either a driver or internal functionality.
An unusual option available in Windows CGI is Dynamic Data Exchange (DDE). DDE causes a conversation to take place between two concurrently running applications (assuming they both support DDE). This way, they can share information in real time, and one can even cause the other program to do something. The reason this is unusual is that it's unique to Windows; you won't see it on a UNIX platform because DDE doesn't exist on UNIX. Some variants exist on those other platforms, of course, but none of them are quite as open or as easily accessible as DDE is to most Windows applications.
Using DDE as a database access method requires that the database program be running all the time, but this approach does have some advantages. Although the other program is taking up memory all the time, starting a DDE conversation has very little impact on the system. So, if you start up a tiny application that uses DDE and connects to a DDE-capable database application running on that machine, you can start performing database operations without any real additional work.
The best times to use DDE over ODBC or SQL are when the application supports neither, some other function needs to be performed (such as running a macro that can't be run through SQL), or initial load or access of the database takes an extremely long time. In all these cases, DDE communications to the already-open database may give you the performance or functionality edge that your application needs.
Now that you know what you can store in a database and how information will come to you, you're ready to see what kinds of tools are out there for general use. Depending on your comfort level with various programming tools, you may be more inclined to drift towards one particular solution. It's important, though, that you look at all the options you have available before you decide on a particular path. You may just see that there's an easier or faster way for what you want to do.
Visual Basic developers turn to Visual Basic (VB) time and time again because it provides power and flexibility, making it usable for almost any task. This fact certainly holds true with databases, because VB has both internal functions to access database formats directly and the internal functionality necessary to make short work of processing that information.
Though the object-oriented nature of Visual Basic doesn't help much when you're writing behind-the-scenes applications with no visual components except text output, the ease with which you can create and compile an application makes VB useful for any task. For example, SQL queries are built into VB's implementation of the Database Access Object (DAO) 2.5, enabling you to build in powerful queries as you work.
As a Rapid Application Development (RAD) tool, Visual Basic has more users than any other similar tool on the market; it is used extensively in examples provided by server manufacturers, individual consultants, and thousands of other people all over the place. Because it has so many built-in functions and prebuilt modules available for almost every task, using VB is more like stacking building blocks together than building your own application from scratch. And when you're in a hurry, that's good news.
Borland's Delphi, or "Object Pascal" as some have called it, is close in popularity to Visual Basic when it comes to developing powerful Windows applications quickly. Delphi is built on the same object-oriented principle, but it works just as well for nonvisual applications that just need raw processing horsepower.
With modules and classes, you can plug other components into what you're developing with as little pain as possible, and built-in functions for Delphi (in the form of these modules) give you the quick access you need to get the data.
Starting a C program from scratch to perform database access isn't exactly beginning programming. If you intend to read a text file line by line and print the results, using C is not too bad; but if you want to start creating dynamic sets of data from ODBC data sources, you'll be programming for a while.
Anything and everything that you need to do for database access you can do in C. Although freeware and shareware database access code isn't as rampant for C, you can find examples with source code that you can modify in almost any location on the Internet. The problem is that most modules and code fragments aren't designed to be neatly tied together, so you'll spend lots of time, effort, and frustration doing it yourself, instead of using a tool designed to do the job quicker, like Visual Basic or Delphi.
The benefits of a C program are all in speed and memory overhead. If you want the best response time and the most efficient program, you should use C. But you also need lots of expertise in using it and a good headache medicine for the more complex operations.
When you're dealing with Windows CGI and its lack of direct STDIN/STDOUT capabilities, one interesting side effect is that almost any program in existence can go out and retrieve data from a content file and then do something with it. All the program has to be able to do is read text files and environment variables and understand what it is it's reading in. If a program can take advantage of the ReadPrivateProfileString() function through a DLL call, that makes the task even easier. If you really want to use a tool, but you can't make it read environment variables, you can create a buffer program to help it along.
Buffer programs are nothing more than intermediary programs that have some of the functionality you need but don't do all the work. Say part of what you want to do involves using DDE communications to talk to your database. You have a tool that makes the conversations and other functions easy, but it can't read environment variables without special help, and starting it up every time you need to use it would be too slow. By using a buffer program to move the information to where the other program needs it, such as a specific file it's checking for in a loop, you can use the smaller tool to take care of the basics and the bigger tool to do the real work and return the data you're looking for.
The time has come to build a database. The purpose of the database itself is reasonably simple: You want to provide customers with a way to search for a movie title, get more information on it, and see how much it costs on videotape. You're creating sort of an online video store for this example, but this database would work for any kind of catalog or similar list.
The data in the Movie Catalog consists of the following seven parts:
The items marked with an asterisk (*) are the fields that you can let people search through. Sure, people might want to look for a movie under $20 in the Price field, but that's just another option that you could implement.
As you can see from this information, you don't need to make too much initial data available. People might want to know other possible information about the movie, but you're not creating an online reference here (yet), just a guide to what you have. Because you don't need to go "in depth" with your information, a flat file database will work just fine. Because you might want to do more later, however, you build the example so that you can easily make it more powerful.
The choice of tools to create the database for this endeavor is Microsoft Access. Besides being a cool database tool in general, Access has one or two other tricks that enable you to make this example work with less hassle. In addition, because things are built into tables, you can easily expand with more tables for reviews, orders, and other features later on, and tie them together. Remember, in some cases flat file databases are only a state of mind and implementation. In this case, you're planning for the future and shouldn't limit yourself much. Figure 15.1 shows some of the data being used in this example.
Figure 15.1: Movie catalog data in a Microsoft Access table.
To get at the data for this example, use Visual Basic for two basic reasons: built-in data access and available modules. First, and most important, VB can get data from a Microsoft Access database (MDB) without any extra filters or drivers. By default, Visual Basic applications can take advantage of Microsoft's DAO architecture for accessing databases. Because Access and VB both fit into this standard, an easy conversation occurs between them, saving you lots of work and enabling you to use the more advanced Access database functions later on if you want to. Second, but also important, some freely available VB modules (precreated snippets of VB code) make this whole task a breeze. After all, using databases isn't supposed to make life harder.
In this case, use the Visual Basic module CGI32.BAS, written by Robert Denny. Besides being an all-around smart guy and making cool server software, Bob puts a lot of effort into making other people's code-writing lives easier; CGI32.BAS is a prime example. Instead of needing to dive back into the intricacies of the Windows CGI format, you can rely on this prebuilt module to do the dirty work. Using such modules is absolutely invaluable for both experienced VB developers (to give a starting point for other code or just to use) as well as for beginning VB developers (to not have to worry about stuff you really don't care about yet and to save yourself countless hours of effort).
Tip |
You can find the CGI32.BAS module on any copy of the WebSite server software from O'Reilly and Associates at http://software.ora.com/techsupport/software/cgi32.zip. The functions in CGI32.BAS are looked at in detail in Chapter 14, "WinCGI: The Basics," which will help you better understand what it does for you. |
Because Visual Basic can get the information from the WinCGI input file, and it can automatically talk to an Access database, what do you need to know about communications? Well, getting the information there is only half the battle. To start, you need to be able to format the users' questions so that the database understands them.
Data read by Visual Basic can all be stored in internal variables. That's no trouble. You can, and do, check those variables to see whether anything is in them. So far, so easy. Now you need to talk to the database itself. To do that, SQL comes into play.
Benefits from Visual Basic and Access talking to each other come in a number of forms, but SQL is one of the big ones. Without outside interference, VB can direct SQL queries to the Access database and store the results for ease of use. This way, you get the full power of a relational database, where you can have multiple tables and conditions, but you also can do the basic operations that are currently needed.
Listing 15.1 shows the MOVIES.BAS file, a Visual Basic 4.0 file created to search the Access 2.0 database. The code itself is reasonably short because of the use of CGI32.BAS as an intermediary, which saves your having to put in lots of calls to read the Windows CGI. INI file that contains the data.
Note |
For Listing 15.1, you also need to include an INTER_MAIN subroutine in the application, though the subroutine itself can be empty. The subroutine is used by CGI32.BAS, but what it does doesn't matter as long as it's there. |
Listing 15.1. Visual Basic source code for MOVIES.BAS.
Main Module--
Attribute VB_Name = "Movies"
Sub CGI_Main()
If CGI_Request_method = "GET" Then
SendError
Exit Sub
Else
SendData
Exit Sub
End If
End Sub
--
SendData() subroutine--
Sub SendData()
' Set up some variables for use later
Dim Db As Database
Dim tmpSet As Dynaset
Dim category As String
Dim title As String
Dim director As String
Dim year As String
Dim SQLtext As String
Dim textpart() As String
Dim mycount As Integer
Dim where As String
' Uses the GetSmallField function (CGI32.BAS) to retrieve information
category = GetSmallField("category")
title = GetSmallField("title")
director = GetSmallField("director")
year = GetSmallField("year")
' Send back the basic header
Send ("Content-type: text/html")
Send (" ")
Send ("<title>Movie Finder</title>")
Send ("You were looking for: <p>")
' Start checking those values...
If category <> Null Then
Send ("Category: " & category & "<br>")
mycount = mycount + 1
textpart(mycount) = "category like " & category
End If
If title <> Null Then
Send ("Title: " & title & "<br>")
mycount = mycount + 1
textpart(mycount) = "title like " & title
End If
If director <> Null Then
Send ("Director: " & director & "<br>")
mycount = mycount + 1
textpart(mycount) = "director like " & director
End If
If year <> Null Then
Send ("Year: " & year & "<br>")
mycount = mycount + 1
textpart(mycount) = "year = " & year
End If
If mycount < 1 Then
Send ("<P>You didn't send any data!")
Exit Sub
End If
' Open the Access Database, using Db as the pointer to it
Set Db = OpenDatabase("cgi-win\movies.mdb", False, True)
' Join the array of search criteria together to form a single condition
For a = 1 To mycount
If a = mycount Then
where = where & textpart(mycount)
Else
where = where & textpart(mycount) & " AND "
End If
Next a
' Create one big SQL Query
SQLtext = "Select * from MovieList where " & where
' Go out and search the database, storing the results temporarily in tmpSet
Set tmpSet = Db.CreateDynaset(SQLtext)
' Check for results, and act on them accordingly
If tmpSet.RecordCount = 0 Then
Send ("No such luck, we don't have what you want.<p>")
Else
Send ("Here are the movies which matched your search:<br>")
Do While Not tmpSet.EOF
Send ("Title: " & tmpSet("title"))
Send ("Category: " & tmpSet("category"))
Send ("Year: " & tmpSet("year"))
Send ("Director: " & tmpSet("director"))
Send ("Cost: " & tmpSet("cost"))
Send ("Summary: " & tmpSet("summary"))
Send ("Starring: " & tmpSet("starring"))
Send ("------------------------------ <br>")
tmpSet.MoveNext
Loop
End If
End Sub
SendError Subroutine----
Sub SendError()
Send ("Content-type: text/html")
Send (" ")
Send ("<title>Sorry..</title>")
Send ("This program only accepts the ")
Send ("POST method")
Send ("<br> Please use the 'Back' button ")
Send ("to go to the form, and make sure ")
Send ("it uses the POST method")
End Sub
Tip |
If you're interested in the details of CGI32.BAS, open it up in a text editor or in Visual Basic, and you can see what happens in each of the possible calls. It's very well commented, so you can see the underpinnings such as the GetPrivateProfileString calls, which are doing the bulk of the real work for you. |
Every time one of the search fields contains some information, a little portion of an eventual SQL query is formed and stored in part of an array (named textpart in this example). At the end, this array is checked and joined together into one long string of data appropriate for a query condition. The whole SQL query, including the list of conditions, is communicated to the database, and the results come back.
If no matches are made, regrets are sent back to the users. If results are found, though, they're looped through one by one to generate some HTML for sending back to the users.
After all your work, you should look at everything in action. First, you need an HTML form to let users enter search data. Remember that the users can search for four primary fields: Title, Year, Director, or Category. You therefore need four input boxes in which users can enter searches if they want to.
Listing 15.2 shows the HTML code used for the search form, and Figure 15.2 shows the resulting form. As you can see, it doesn't have to be anything fancy to get the point across, but you can always spruce up the HTML code once the search does what you want it to.
Figure 15:2: The Movie search form.
Listing 15.2. HTML front end for searching.
<title>Movie-o-Rama!</title>
<body bgcolor=#FFFFFF>
<center>
<h2>Movie Database</h2>
To find what you're looking for in our online catalog,
please enter some search criteria in the boxes provided
below.
<hr>
<form method=POST action=/cgi-win/movies.exe>
Movie Title: <input name=title> <br>
Year Released: <input name=year> <br>
Director: <input name=director> <br>
Category: <input name=category> <br>
(Available Categories: Family, Childrens, Sci-Fi, Horror, Action, Classics)<p>
<input type=submit value="Search">
</form>
<hr>
Though the method covered in this chapter is one way of approaching databases, you can do the same job in any number of different ways or with different tools. The best implementation of a database is one that meets your current needs and does so in a way that benefits you the most. Don't aim for SQL and ODBC if all you need is a phone book, and don't shy away from more advanced programming when you need the power these languages give you. Dozens of database books are available for all sorts of situations, whether you're using a proprietary tool or something more popular, and the methodology holds true from one to the next. If you're short on ideas, go to a search engine and look for any of the following words: catalog, database, Delphi, Visual Basic, or search. These words will take you to a variety of resources, and from there you can branch out to whatever destinations suit your interests.
Here's one last thought: Databases don't just have to be the traditional "What's your question?" type. You can create a database that contains entire HTML pages and serve them up to users dynamically. Make use of the QUERY_STRING environment variable by embedding data in your navigation links as follows:
<a href=/cgi-win/makepage.exe?newstuff>
You can then evaluate where people are calling from, along with a host of other information, and serve these users the freshest information in town. The limits are only the boundaries of your imagination-everything on the Internet is data.