Teach Yourself Database Programming
with Visual C++ 6 in 21 days


Day 1
      Choosing the Right Database Technology



Storing data is an essential part of most software applications. Virtually all C++ applications need to persist, or store, data of some kind.

Many applications also need to retrieve data efficiently. These applications typically need to search through data that has been stored in order to retrieve specific information. This need to search for and retrieve data means that an application must use a database.

A variety of database technologies are available to C++ programmers. Today you will explore these database technologies and gain the knowledge you need to choose the appropriate technologies for your applications.

Today you will learn

  How to decide which database technology is appropriate for your Visual C++ applications
  What building your own database in C++ involves, and why it is rarely worthwhile
  The capabilities and limitations of OLE structured storage, record managers (Btrieve), desktop databases (FoxPro and Access), object databases, and relational database servers (Oracle and SQL Server)
  How the various database technologies compare with one another

In addition to covering these topics, you will see how to write the code for implementing each of the various database technologies available to C++ programmers.

Deciding the Appropriate Database Technology for Your Visual C++ Applications

Choosing the right database technology means finding a technology that fills the requirements of your application.

Without knowing the capabilities of the various database technologies, you can easily choose the wrong one for your particular application. In the following sections, you will learn the capabilities of each database technology.

When choosing a database technology, you need to carefully consider the importance of your application's data. It would be easy to think that the data needs to be used only by your application. However, if you write your application with that thought in mind, you will end up creating an application that has a closed, proprietary database that no one else can use or make sense of.

You might think a closed, proprietary database is okay for your application because your application is the only one that needs access to the data. Don't underestimate the value of the data and the need to access the data through more than just your application.

NOTE
If your data is important to you, it's probably important to someone else, who will want access to the data through more than just your application.

Even if you are certain that others will never want to access your data except through your application, what about future implementations of your application? What if your application is a Windows executable, and you need to produce a new version of it that runs behind a Web server and provides information to users with Web browsers? Because of the nature of Web development tools, an open, nonproprietary database can enable you to perform this conversion in much less time than a closed, proprietary database.

In the end, if you do decide to write an application that has a closed database, you will ultimately shorten the life expectancy of your own application. An application that has an open, accessible database and can interoperate with other databases and applications will sooner or later replace yours.

Now you will go through the process of choosing a database technology for an imaginary application. You will examine each database technology and see what each one has to offer. Through this process, you will learn the capabilities (and limitations) of each database technology and how to choose the most appropriate technology for your applications.

The best way to learn to choose a database is by using an example and applying it to each technology. Let's say that your job is to write an application for a company that sells products through television advertising. The company advertises products such as a vegetable slicing machine, a bamboo steamer, 8-track love songs of the 70s, and so on, and offers them for the low, low price of $19.95. Each time the TV commercial airs, the company's 800 line is flooded with calls from buyers.

The salespeople who take these calls have your application running on their computers. They use your application to enter each order so that the product can be shipped and the buyer's money can be collected.

This sounds easy enough. Your application needs to present a window into which the salesperson can enter the order, and then your application must write the information for the order to a data file. Described that way, you might decide it would be easiest to create your own database.

Building Your Own Database in C++

A C++ programmer is usually confident of his ability to write software. After all, if you can master a language as complex and powerful as C++, you can no doubt write any software tool you need, including your own database system.

However, because of the maturity of existing database technology, writing your own database is rarely a productive effort. Although an electrical engineer can perhaps build her own cell phone, doing so makes little practical sense. Existing cell phones are plentiful and inexpensive and adhere to standards that enable them to interoperate with cellular networks and other cell phones.

Likewise, a C++ programmer can build his own database system, but doing so makes little practical sense. Existing databases are plentiful and inexpensive and adhere to standards that enable them to interoperate with computer networks and other applications. You should concentrate on building your application, not on building its database.

Listing 1.1 shows what is required to store structured data in a file on disk to create a rudimentary database. To create this application, run Visual C++ and create a new project as a Win32 console application. You can call the new project anything you want. Calling it something like CPPDb would be appropriate. Create a source file in the project, perhaps called main.cpp, and enter the following code into it.


Listing 1.1.  C++ Code to Write Data to a File

 1:  #include <fstream.h>
 2:  
 3:  struct Date
 4:  {
 5:     int iMonth, iDay, iYear;
 6:  };
 7:  
 8:  struct Product
 9:  {
10:     int iPartNumber;
11:     char szName[80];
12:     double dPrice;
13:  };
14:
15:  struct Customer
16:  {
17:     int iID;
18:     char szName[50];
19:     char szAddress[50];
20:     char szCity[20];
21:     char szState[20];
22:     char szZip[9];
23:  };
24:  
25:  void main()
26:  {
27:     Date dt = { 6, 10, 92 };
28:     Product prod = {122, "Vegamatic", 19.95};
29:     Customer cust = {15, "Seymore Hoskins", "300 Oak St",
                         "Boring", "Oregon", "97203"};
30:     ofstream datafile( "data.dat" , ios::binary );
31:     datafile.write( (char *) &dt, sizeof dt );
32:     datafile.write( (char *) &prod, sizeof prod );
33:     datafile.write( (char *) &cust, sizeof cust );
34:  }

Notice a couple of things about this code. First, you can see that data structures are defined in lines 3-23. These structures are used to write the data to the file in a predictable pattern (lines 31-33). Other routines in the application can also use these structures to read the data from the file and make sense of it.

Build the application. You should receive no errors or warnings. When you run the application, it creates a file called data.dat in your application's directory and writes the data to the file. If you open data.dat with a hex file viewer, or even with Notepad.exe, you will see the data in the file.
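To see what it takes for another routine to read that data back, here is a minimal sketch (it is not part of Listing 1.1, and it assumes the same Date, Product, and Customer structure definitions are visible to it):

#include <fstream.h>

// Reads the three records back in the same order Listing 1.1 wrote them.
// The structure definitions ARE the knowledge needed to decode the file.
void ReadOrderBack()
{
   Date     dt;
   Product  prod;
   Customer cust;

   ifstream datafile( "data.dat", ios::in | ios::binary );
   datafile.read( (char *) &dt,   sizeof dt );
   datafile.read( (char *) &prod, sizeof prod );
   datafile.read( (char *) &cust, sizeof cust );

   cout << prod.szName << " ordered by " << cust.szName << endl;
}

Without those structure definitions, the bytes in data.dat are meaningless, which is exactly the problem discussed next.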

Defining Metadata

The structures used in Listing 1.1 are a kind of metadata, or data about data. This metadata must be defined somewhere, or the data in the file will be unorganized and totally inaccessible.

When building your own C++ database, you define the metadata within your source code. Unfortunately, your C++ source code isn't the best place for the metadata to reside. Anyone who wants to use this data must have access to your source code. This is one of the many limitations of building your own database in C++.

This metadata should, ideally, reside with the data. That way, the data file can be self-describing, and other routines can have easier access to it.

NOTE
Metadata is what makes a database a database. A true database contains a description of its own structure. A database contains both data and metadata.

A C++ Base Class to Handle the Database Work

The other thing to note is that the source code in Listing 1.1 is not very object-oriented.

Using C++, you can write a base class that handles the reading and writing of object data to files on disk. You can call this base class the Persistent class. In the sample application, you can derive an Orders class from the Persistent class, thereby making instances of the Orders class automatically capable of persisting (or saving) themselves to disk.

Sounds great, doesn't it? Unfortunately, C++ has a few limitations that make this Persistent base class approach unworkable. The Persistent base class can't know at runtime how big an object of a derived class is, so it can't persist an object of a derived class to disk by itself. There can also be data members in an object of a derived class that deal with runtime context or contain pointers. It would be very difficult for a Persistent base class to have the intelligence to handle these data members properly.

These problems ultimately mean that you can't write a C++ base class to handle all database work. Some code for persisting data from a class must be contained in the class itself.
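A minimal sketch (not code from this book, just an illustration) makes that conclusion concrete: the base class can own the mechanics of saving, but each derived class still has to supply its own routine that knows which of its members can safely be written to disk.

#include <fstream.h>

// Hypothetical Persistent base class: it drives the save operation,
// but it cannot know the layout of a derived object by itself.
class Persistent
{
public:
   void Save( ofstream& file ) { Serialize( file ); }
protected:
   // Every derived class must implement this itself.
   virtual void Serialize( ofstream& file ) = 0;
};

class Order : public Persistent
{
public:
   int    iPartNumber;
   double dPrice;
   char*  pszRuntimeNote;   // runtime-only pointer; must NOT be written as-is
protected:
   virtual void Serialize( ofstream& file )
   {
      // The derived class decides what is persisted and what is skipped.
      file.write( (char *) &iPartNumber, sizeof iPartNumber );
      file.write( (char *) &dPrice, sizeof dPrice );
   }
};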

Problems with Building Your Own Database

When building your own database in C++, you typically embed the metadata in your source code and must build some code to store and retrieve objects into each and every class that needs persistence.

I haven't talked about how to handle multiple threads and multiple applications accessing the same data file simultaneously. One application can be reading while another is writing, or two can write at the same time and produce garbage in the file. Certainly this would be a common occurrence in our sample application, with multiple salespeople receiving a flood of calls each time a TV commercial airs. Believe me, the source code you need to write to handle the file locking and retrying is not trivial.
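To give you a feel for what that code involves, the following is a rough sketch using the Win32 LockFile and UnlockFile APIs with an arbitrary retry count; a real implementation would also have to manage record offsets, partial writes, and error reporting:

#include <windows.h>

// Try to lock the record's bytes, retrying a few times if another
// process already holds the lock. Returns TRUE if the write succeeded.
BOOL WriteRecordWithLock( HANDLE hFile, const void* pRecord, DWORD dwSize )
{
   for (int iTry = 0; iTry < 10; iTry++)
   {
      if (LockFile( hFile, 0, 0, dwSize, 0 ))        // lock bytes 0..dwSize-1
      {
         DWORD dwWritten;
         SetFilePointer( hFile, 0, NULL, FILE_BEGIN );
         WriteFile( hFile, pRecord, dwSize, &dwWritten, NULL );
         UnlockFile( hFile, 0, 0, dwSize, 0 );
         return TRUE;
      }
      Sleep( 100 );   // someone else has the region locked; wait and retry
   }
   return FALSE;      // gave up; the caller must decide what to do now
}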

For our sample application, building your own database by using C++ and data files on disk forces you to write a lot of code, and no one else could make sense of the data without your source code.

OLE Structured Storage

Within Microsoft's OLE technology is a technology called OLE structured storage (the newer documentation from Microsoft refers to it simply as structured storage). OLE structured storage promises to give other applications the ability to explore the internal structure of your files. OLE structured storage is a storage architecture that enables a file on disk (as well as other storage media) to be divided into a hierarchy of storages and streams. Storages are analogous to operating-system directories or subdirectories. Streams are analogous to files in the operating system. These storages and streams can all exist within a single disk file.

Listing 1.2 shows how to create an OLE structured storage file, create a stream within it at the root storage, and write your order information into the stream. To keep the code clear and brief, the code in Listing 1.2 doesn't check return values for errors.

To create this sample, run Visual C++ and create a new project as a Win32 console application. You can call the new project anything you want. Calling it OLESS might be appropriate. Create a source file in the project, perhaps called main.cpp, and enter the following code into it.


Listing 1.2.  OLE Structured Storage

 1:  #include <windows.h>
 2:  #include <ole2.h>
 3:  struct Date
 4:  {
 5:     int iMonth, iDay, iYear;
 6:  };
 7:
 8:  struct Product
 9:  {
10:     int iPartNumber;
11:     char szName[80];
12:     double dPrice;
13:  };
14:
15:  struct Customer
16:  {
17:     int iID;
18:     char szName[50];
19:     char szAddress[50];
20:     char szCity[20];
21:     char szState[20];
22:     char szZip[9];
23:  };
24:
25:  void main()
26:  {
27:     Date dt = { 6, 10, 92 };
28:     Product prod = {122, "Vegamatic", 19.95};
29:     Customer cust = {15, "Seymore Hoskins", "300 Oak St",
                         "Boring", "Oregon", "97203"};
30:
31:     IStorage * pRootStorage;
32:     IStream * pOrderInfo;
33:
34:     CoInitialize(NULL);
35:
36:     StgCreateDocfile(L"data.dat", 
         STGM_CREATE | STGM_READWRITE | STGM_SHARE_EXCLUSIVE,
         0, &pRootStorage);
37:
38:     pRootStorage->CreateStream(L"Order Information",
         STGM_CREATE | STGM_WRITE | STGM_SHARE_EXCLUSIVE, 
         0, 0, &pOrderInfo);
39:
40:     pOrderInfo->Write(&dt, sizeof(dt), NULL);
41:     pOrderInfo->Write(&prod, sizeof(prod), NULL);
42:     pOrderInfo->Write(&cust, sizeof(cust), NULL);
43:
44:     pOrderInfo->Release();
45:     pRootStorage->Release();
46:
47:     CoUninitialize();
48:  }

Within a single data file, you create a hierarchy of streams and storages (or files and directories). Lines 31 and 32 define pointers to two OLE interfaces (IStorage and IStream). Line 34 calls CoInitialize to initialize the COM libraries. Line 36 calls an API function, StgCreateDocfile, to create the OLE structured storage file. This call creates the root storage and returns a pointer to its IStorage interface in the last parameter, pRootStorage. Line 38 calls the CreateStream function of the IStorage interface through the pRootStorage pointer to create a stream in the root storage. The CreateStream function creates an IStream and returns a pointer to it in the last parameter, pOrderInfo. Lines 40-42 call the IStream Write function to write our order data into the stream. Lines 44 and 45 call Release to release the pOrderInfo and pRootStorage objects. Line 47 uninitializes the COM libraries.

Build the application. You should receive no errors or warnings. When you run the application, it creates an OLE structured storage file called data.dat in your application's directory and writes the data to the file.

Notice that you still had to define the metadata for your order information in your source code. The metadata for the sample application is too complex for OLE structured storage to represent. You need to be able to specify that an order includes an order date (with month, day, and year), a product (with product number, name, and price), and a customer (with name, address, and so on), along with the data types and lengths of those fields. That level of detail cannot be represented using only a hierarchy of storages and streams.

If you use OLE structured storage, another routine or application that wants to open your data file and see the structure of the data won't be able to do so. The only thing it will see is a hierarchy of storages and streams. Other routines and applications still would need access to your source code in order to make sense of your data. Also, OLE structured storage has no inherent file-locking or record-locking capability, so it wouldn't reduce the amount of locking code you would have to write.
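You can see this for yourself. The following sketch (error checking omitted, as in Listing 1.2, and assuming the data.dat file that Listing 1.2 creates) opens the file and enumerates the elements of its root storage; all it can report is each element's name and whether it is a storage or a stream:

#include <windows.h>
#include <ole2.h>
#include <stdio.h>

void ListRootElements()
{
   IStorage * pRootStorage;
   IEnumSTATSTG * pEnum;
   STATSTG stat;

   CoInitialize(NULL);

   StgOpenStorage(L"data.dat", NULL,
       STGM_READ | STGM_SHARE_EXCLUSIVE, NULL, 0, &pRootStorage);

   pRootStorage->EnumElements(0, NULL, 0, &pEnum);
   while (pEnum->Next(1, &stat, NULL) == S_OK)
   {
      // All we learn is the element's name and kind -- nothing about
      // the layout or meaning of the bytes inside it.
      wprintf(L"%s  (%s)\n", stat.pwcsName,
              stat.type == STGTY_STORAGE ? L"storage" : L"stream");
      CoTaskMemFree(stat.pwcsName);
   }

   pEnum->Release();
   pRootStorage->Release();
   CoUninitialize();
}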

OLE structured storage is primarily used by document-centric applications, such as Microsoft Word and Excel, to create data files for documents. Word and Excel files are not self-describing. Any application that wants to access the data in an OLE structured storage file containing Excel data must have knowledge of how the data inside the streams is organized.

If you used OLE structured storage for the sample application, you would still have to write a lot of code and no one else would be able to make sense of your data. This would not be ideal for the database in the sample application.

Record Managers (Btrieve)

Record managers on the market can simplify the data storage piece of the sample application. Let's take a look at a popular record manager, Btrieve, to see what it does.

Btrieve provides a layer of insulation between your application and its data files. In other words, your application doesn't directly talk to the data files. The application talks to Btrieve, and Btrieve talks to the data files.

Btrieve provides an API (application programming interface) for record-based data access from your application. This API enables your application to insert, edit, and delete records from data files. Btrieve also enables your application to search the data files for a certain record, such as an order placed by John Smith on June 1. Your application can tell Btrieve to position its record pointer at this record and read, edit, or delete it.

Listing 1.3 shows the code to open a data file in Btrieve, find a certain record, and display it. This is a code snippet only and will not compile as shown.


Listing 1.3.  Btrieve Example

 1:  #define FILE1_NAME "c:\\data.btr"
 2:  typedef struct
 3:  {
 4:    BTI_LONG  ID;
 5:    BTI_CHAR  FirstName[16];
 6:    BTI_CHAR  LastName[26];
 7:    BTI_CHAR  Street[31];
 8:    BTI_CHAR  City[31];
 9:    BTI_CHAR  State[3];
10:    BTI_CHAR  Zip[11];
11:    BTI_CHAR  Country[21];
12:    BTI_CHAR  Phone[14];
13:  } PERSON_STRUCT;
14:
15:  typedef struct
16:  {
17:    BTI_CHAR networkAndNode[12];
18:    BTI_CHAR applicationID[2];
19:    BTI_WORD threadID;
20:  } CLIENT_ID;
21:
22:  CLIENT_ID       clientID;
23:  PERSON_STRUCT   personRecord;
24:
25:  strcpy((BTI_CHAR *)keyBuf1, FILE1_NAME);
 26:
27:  keyNum  = 0;
28:  dataLen = 0;
29:
30:  status = BTRVID(
            B_OPEN,
            posBlock1,
            dataBuf,
            &dataLen,
            keyBuf1,
            keyNum,
            (BTI_BUFFER_PTR)&clientID);
31:
32:  printf("Btrieve B_OPEN status (c:\\data.btr) = %d\n", status);
33:
34:  /* get the record with key 0 = 263512477 using B_GET_EQUAL */
35:  if (status == B_NO_ERROR)
36:  {
37:     file1Open = TRUE;
38:     memset(&personRecord, 0, sizeof(personRecord));
39:     dataLen = sizeof(personRecord);
 40:     personID = 263512477;    /* this is really a social security number */
41:     *(BTI_LONG BTI_FAR *)&keyBuf1[0] = personID;
42:     keyNum = 0;
43:
44:     status = BTRVID(
            B_GET_EQUAL,
            posBlock1,
            &personRecord,
            &dataLen,
            keyBuf1,
            keyNum,
            (BTI_BUFFER_PTR)&clientID);
45:
46:     printf("Btrieve B_GET_EQUAL status = %d\n", status);
47:     if (status == B_NO_ERROR)
48:     {
49:        printf("\n");
50:        printf("The retrieved record is:\n");
51:        printf("ID:      %ld\n", personRecord.ID);
52:        printf("Name:    %s %s\n", personRecord.FirstName,
                  personRecord.LastName);
53:        printf("Street:  %s\n", personRecord.Street);
54:        printf("City:    %s\n", personRecord.City);
55:        printf("State:   %s\n", personRecord.State);
56:        printf("Zip:     %s\n", personRecord.Zip);
57:        printf("Country: %s\n", personRecord.Country);
58:        printf("Phone:   %s\n", personRecord.Phone);
59:        printf("\n");
60:     }
61:  }

In Listing 1.3, line 1 defines the data filename. Lines 2-23 define two structures and declare an instance of each. The PERSON_STRUCT structure is the definition of the metadata for a data record. The CLIENT_ID structure is used internally by Btrieve to identify the application. Line 25 copies the filename into a variable. Lines 27 and 28 initialize a couple of variables, and then line 30 uses those variables in a call to the Btrieve record manager API to open the data file.

If there are no problems opening the data file, lines 37-42 initialize some variables to use in searching for a particular record in the data file. Line 44 calls the Btrieve record manager API to find the record. Line 47 tests to see whether the record was found in the data file. If it was, lines 49-59 display the data contained in the record.

Btrieve uses a form of data storage known as the Indexed Sequential Access Method (ISAM). Btrieve ISAM files are a highly advanced version of the data files created in Listing 1.1.

Btrieve can index the data in the data files to enable very fast record searches. Btrieve can also handle record locking, so multiple threads and applications can simultaneously access the data files.

Using a record manager is much easier than writing all that code from scratch. In fact, many commercial software packages use the Btrieve record manager. It provides excellent performance (compared to what you can probably write yourself) and is easy to distribute with a commercial application.

However, record managers such as Btrieve do have limitations and can leave some important database work undone. As you see in Listing 1.3, the metadata is defined in your source code. Btrieve doesn't store the metadata within the ISAM files. A Btrieve data file isn't self-describing. No one else can make sense of a Btrieve data file without some outside knowledge of its structure.

NOTE
Btrieve provides a way to store metadata in separate data definition files (DDFs). This isn't ideal because the data files and DDF files can be changed or deleted independently of each other. No mechanism ensures that the metadata is accurate or in sync with the actual data.

The Btrieve record manager never makes use of metadata. Btrieve interprets a record in a data file only as a collection of bytes and doesn't recognize discrete pieces of information within a record. To Btrieve, a product number, name, and price don't exist inside a record. The record is simply a collection of bytes. Because Btrieve doesn't use the metadata, the application must handle all information about the format and type of data in a Btrieve data file. Btrieve does nothing to ensure the integrity of the data within a record. Your application must do all the work of validating the data before it's stored in the data file.

Another limitation of a record manager is a lack of set-based operations on the data. The application must touch each and every record that is involved in any given operation. For example, in the sample application, to discover the total sales volume in dollars, the application needs to iterate through the records of all the orders, adding up the sales amount of each one.
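With the Btrieve API, that totaling loop looks roughly like the following sketch. It reuses the BTRVID calling pattern and variables of Listing 1.3 and assumes a hypothetical ORDER_STRUCT record (declared as orderRecord) containing a dPrice field; like Listing 1.3, it is a snippet only:

/* Walk every order record and total the price field ourselves. */
double dTotalSales = 0.0;
dataLen = sizeof(orderRecord);
keyNum  = 0;

status = BTRVID(B_GET_FIRST, posBlock1, &orderRecord, &dataLen,
                keyBuf1, keyNum, (BTI_BUFFER_PTR)&clientID);

while (status == B_NO_ERROR)
{
   dTotalSales += orderRecord.dPrice;   /* the application touches every record */
   dataLen = sizeof(orderRecord);
   status = BTRVID(B_GET_NEXT, posBlock1, &orderRecord, &dataLen,
                   keyBuf1, keyNum, (BTI_BUFFER_PTR)&clientID);
}

Every record travels to the application just so one number can be computed.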

Set-based operations, however, like those found in true databases (as opposed to record managers), can enable the application to issue a single command to ask the database for the total sales volume in dollars. You will learn more about set-based operations in the sections "Desktop Databases" and "Relational Database Servers."

NOTE
The makers of Btrieve have produced an ODBC (open database connectivity) driver and database engine that sit on top of the Btrieve record manager and provide set-based operations through an ODBC API. Applications can use this ODBC API to access Btrieve data files. However, the Btrieve ODBC API doesn't provide the same level of performance that the Btrieve record manager API provides.

Using a record manager for the sample application would be easier than creating your own C++ database and easier than using OLE structured storage. The location of the metadata still isn't ideal, however. The lack of integrated metadata can limit the capability of other applications to read the data file. For instance, this might prevent the manager from being able to analyze sales data from the database in a spreadsheet. You need to consider these questions regarding this technology and the sample application: Will the record manager provide sufficient open access to the data for other applications? Will the data file become too large for the record manager to handle? Will the performance of the record manager be fast enough, especially with multiple users over a network?

Desktop Databases (FoxPro and Access)

Desktop databases are a class of database software, sometimes called ISAM databases because they use ISAM files. Several desktop databases are on the market. These include Microsoft Access, Microsoft FoxPro, and Borland Paradox. These database products differ from each other in many ways, but they all have certain features and characteristics in common.

Desktop databases store the metadata within their ISAM data files. The data files are self-describing. This enables a variety of applications to readily access the data in desktop databases. Desktop databases have their own languages and data types and include an interpreter to run programs written in their language. You can use the language of a desktop database to build database applications. (These interpreted database languages typically aren't used to build complete commercial applications because of their many limitations.)

The desktop databases are designed to provide standard DBMS (database management system) functionality such as data definition, data manipulation, querying, security, and maintenance. The desktop databases are built specifically to run on personal computers.

C++ programs can use the ODBC (open database connectivity) API to talk to desktop databases. For instance, a C++ program can call ODBC API functions to store and retrieve data in a Microsoft Access database file. You can even use ODBC to send language statements to the Access interpreter (also called the Jet database engine) and then retrieve any data that Access (Jet) might return as a result of that operation.

Listing 1.4 shows some ODBC API function calls. This is a code snippet only and will not compile as shown.


Listing 1.4.  ODBC API Function Calls

 1:  long      lResult;
 2:  SDWORD      cbResult;
 3:  HSTMT       hstmt;
 4:  CDatabase   mfcdb;
 5:
 6:  mfcdb.OpenEx("DSN=MyDataSource;UID=MyUserLogin;PWD=MyPassword;");
 7:
 8:  AFX_SQL_SYNC(::SQLAllocStmt(mfcdb.m_hdbc,&hstmt));
 9:
10:  AFX_ODBC_CALL(::SQLExecDirect(hstmt, (UCHAR FAR*)
                   "SELECT * FROM Orders",SQL_NTS));
11:
12:  while (::SQLFetch(hstmt) != SQL_NO_DATA_FOUND)
13:  {
14:     ::SQLGetData(hstmt, 1, SQL_C_LONG, &lResult, 0, &cbResult);
15:  }
16:
17:  AFX_SQL_SYNC(::SQLFreeStmt(hstmt,SQL_CLOSE));
18:
19:  mfcdb.Close();

Line 4 declares an instance of the MFC CDatabase class. CDatabase encapsulates and simplifies the code for connecting to ODBC databases. Line 6 calls the CDatabase OpenEx function to connect to (or open) a database. Line 8 allocates a statement handle, which enables SQL language statements to be sent to the database to be interpreted. Line 10 calls SQLExecDirect to send a SQL statement, "SELECT * FROM Orders", to the database to be interpreted and executed. Lines 12-15 retrieve the information that the database returns as a result of the SQL statement in line 10. Line 14 places the value of the first field in each record that was returned into the lResult variable. Line 17 frees the statement that was allocated in line 8. Line 19 closes the database connection that was opened in line 6.

Desktop databases index the data and use ISAM for fast record searches. Desktop databases can also handle record locking, so multiple threads and applications can access the data files simultaneously.

NOTE
A key difference between the ISAM files used by Btrieve and the ISAM files used by desktop databases is that desktop databases store metadata inside the ISAM files with the data. This means other programs can make sense of the data without having to obtain your source code.

Desktop databases provide type checking of the data within the records. Whenever an application sends data to a desktop database, the database checks the values and data types to make sure they are appropriate. Thus, the database itself can help ensure the integrity of the data.

Desktop databases provide set-based operations in their programming model. With a single command, an application can perform operations that affect potentially thousands of records. For example, in the sample application, to discover the total sales volume in dollars, the application need issue only a single command to the database, for example,

SELECT SUM(price) FROM Orders
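Using the same ODBC calling pattern as Listing 1.4, retrieving that total takes only one fetch, because the database itself does the adding. The snippet below is a sketch along those lines; it assumes the Orders table and its price column exist, reuses the mfcdb connection opened in Listing 1.4, and omits error checking:

double    dTotal;
SDWORD    cbTotal;
HSTMT     hstmt;

AFX_SQL_SYNC(::SQLAllocStmt(mfcdb.m_hdbc, &hstmt));

AFX_ODBC_CALL(::SQLExecDirect(hstmt,
    (UCHAR FAR*)"SELECT SUM(price) FROM Orders", SQL_NTS));

// A single row comes back containing the grand total.
if (::SQLFetch(hstmt) != SQL_NO_DATA_FOUND)
   ::SQLGetData(hstmt, 1, SQL_C_DOUBLE, &dTotal, 0, &cbTotal);

AFX_SQL_SYNC(::SQLFreeStmt(hstmt, SQL_CLOSE));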

Desktop databases do have some limitations. The raw performance of desktop databases is generally not as good as the performance of straight record managers such as Btrieve.

NOTE
The Btrieve record manager is lean and fast but doesn't provide the programming functionality and data openness that the desktop databases provide. In choosing between a record manager such as Btrieve and a desktop database, you have to balance your need for execution speed, which a record manager can provide, against your need for speedy development time and data openness, which a desktop database can provide.

Accessing ISAM Data over a LAN

ISAM data files from a desktop database can be accessed from a remote machine over a local area network (LAN). (The machine running the application is usually called the client machine, and the machine where the data file resides is usually called the server machine.) However, the capacity and efficiency of accessing ISAM files over a LAN is limited.

When an ISAM data file is accessed over a LAN, the data is processed on the client machine. All the data and indexes must travel from the server machine over the network to the client machine to be processed. This is because all the logic for processing the records exists in the application running on the client machine.

Because all the data and indexes must travel over the network, desktop databases can't be used to build high-capacity client/server applications. I talk more about client/server architectures in the section "Relational Database Servers."

Desktop databases are designed to run on personal computers, so their capacity and throughput are limited. The client/server limitations of ISAM files further hinder the capacity of desktop databases. The documentation for desktop databases typically specifies that they are limited to a dozen or so concurrent users and to data files of 100MB or so in size.

Using a desktop database for the sample application provides many advantages over creating your own database, using OLE structured storage, or using Btrieve. The programming model for desktop databases is more advanced and requires less code. Desktop databases store the metadata in the data file, so the data can more easily be queried by other applications, such as spreadsheets. However, you need to consider these questions regarding this technology and the sample application: If the application uses a desktop database, will the database run fast enough, especially over a network with multiple users? Will the data file become too large for the desktop database to handle?

For raw speed, a rich programming model, data openness, and client/server capability, you need to use a relational database server. I'll explain more about relational database servers after I talk about object databases.

Object Databases

Primitive database technologies store only raw data (bits and bytes) with no metadata in the data file. Desktop and relational databases store data and metadata together to make the data files self-describing. Object databases go one step further. Object databases store data, and the code to act on that data, in the data file. Several object databases on the market provide a broad range of features and capabilities.

Object databases are typically tied to a particular programming language. C++ object databases directly support the type system of the C++ language. In other words, you can use a C++ object database to store instances of C++ classes right in the database.

Listing 1.5 shows how to use a C++ object database to store product information. This is a code snippet only and will not compile as shown.


Listing 1.5.  An Object Database

 1:  #include <string.h>
 2:
 3:  // Header file for the Object Database 
 4:  // Management Group (ODMG) object model.
 5:  #include <odmg.h>
 6:
 7:  // Derive our Product class from d_Object 
 8:  // so Product can persist itself in the database.
 9:  class Product : public d_Object
10:  {
11:  public:
12:     int iPartNumber;
13:     char szName[80];
14:     double dPrice;
15:  private:
16:     d_Ref<Product> next;  // For iterating instances of Product
17:                           // in the database.
18:  };
19:
20:  d_Database db;          // Global instance of the object database.
21:  const char * const db_name = "Products";
22:
23:  void main()
24:  {
25:     db.open(db_name);   // Opens the Products database.
26:     d_Transaction tx;   // Create and begin a transaction.
27:     tx.begin();
28:
29:     // Create a new product instance in the database.
30:     Product *prod = new(&db, "Product") Product; 
31:     prod->iPartNumber = 122;
32:     strcpy(prod->szName, "Vegamatic");
33:     prod->dPrice = 19.95;
34:
35:     tx.commit();      // Commit the additions to the db.
36:     db.close();       // Close the db.
37:  }

In Listing 1.5, line 5 assumes that the object database vendor has provided a header file called odmg.h, which contains the declarations for the d_Object class. The d_Object class is a C++ base class that enables instances of derived classes to be persisted to the object database. Line 9 derives the Product class from the d_Object class, which enables instances of Product to be persisted. The Product class has a d_Ref<Product> member (line 16). d_Ref<> is a smart pointer class provided by the object database vendor that enables references (or pointers) to objects to be stored in the database.

The new operator in line 30 has been overloaded in d_Object to take a pointer to a d_Database instance as a parameter. The call to new in line 30 creates a persistent instance of Product in the database. Lines 31-33 change the values of the data members in this instance of Product (in the database). Line 35 commits the changes to the database, and line 36 closes the database. As you can see, using an object database, you get database functionality for your C++ objects with very little extra code.

This tight integration with the C++ programming language provides great power for designing and building applications that have complex information models. You can use the full power of C++ with encapsulation, inheritance, and polymorphism to reduce complexity.

If you use an Object DBMS, your application has a database that can handle great complexity. C++ classes enable you to model elaborate data entities and their relationships, and an Object DBMS enables you to store instances of those elaborate C++ classes right in the database. However, object databases have limitations, too. Because object databases are so tightly integrated with the programming language of the application, the data tends not to be open or accessible to other applications.

NOTE
Some Object DBMS vendors pledge that their object databases support (or will support in the future) open technologies such as COM, CORBA, XML, and ODBC/OLEDB. Support for these technologies will make object databases more open and accessible. However, support for these open technologies is neither universal nor uniform among Object DBMS vendors. So caveat emptor (let the buyer beware).

Also, with a C++ object database in a client/server environment, the object functionality executes primarily in the client application. Like desktop databases, C++ object databases will do most of their processing on the client machine. This varies between object database implementations, but object databases tend to be client-centric and do not fully take advantage of the server machine in client/server applications.

Relational database servers, however, tend to be better able to take advantage of client/server architectures. The next section talks about relational database servers in more detail.

Using an object database for the sample application could be easier than using your own database, OLE structured storage, Btrieve record manager, or a desktop database. However, consider these questions regarding this technology and the sample application: Would an object database be overkill for the application? Is the data model sufficiently complex and intricate to require a direct mapping of C++ classes into the database? Would this capability justify the added time you would have to spend researching the capabilities of the various object databases? Would the database be open to other applications? Would the performance with multiple users over a network be sufficient? Would performance degrade significantly as the amount of data in the database increases?

Relational Database Servers (Oracle and SQL Server)

Relational database servers are in some ways similar to desktop databases. Relational database servers have their own programming languages, interpreters, and data types. They integrate data and metadata. C++ programs can talk to them through ODBC. The code in Listing 1.4 operates with a relational database server as well as a desktop database.

NOTE
C++ applications that use the ODBC API can interoperate with desktop databases as well as with relational database servers.

Relational database servers provide the rich functionality and data openness of desktop databases while far exceeding desktop databases and record managers in capacity and throughput. Relational database servers can capitalize on client/server architectures much more than desktop databases, record managers, and object databases. They provide true set-based operations.

NOTE
With set-based operations, a relational database can process thousands, even millions, of records at the server machine and then send only a small result set to the client computer. Set-based operations are a powerful tool to make your applications highly scalable. Set-based operations at the server enable relational database servers to do the heavy lifting in large client/server applications.
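As a sketch of what this looks like from C++ (reusing the ODBC calling pattern of Listing 1.4 and assuming a hypothetical discount rule against the Orders table), the client sends a single statement; the server locates and changes every matching row itself and sends back only a status code:

HSTMT hstmt;

AFX_SQL_SYNC(::SQLAllocStmt(mfcdb.m_hdbc, &hstmt));

// One statement travels to the server; the server itself finds and
// updates every matching order, however many thousands there are.
AFX_ODBC_CALL(::SQLExecDirect(hstmt,
    (UCHAR FAR*)"UPDATE Orders SET price = price * 0.9 WHERE price > 100",
    SQL_NTS));

AFX_SQL_SYNC(::SQLFreeStmt(hstmt, SQL_CLOSE));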

Relational database servers are also built to take advantage of server hardware, such as large amounts of RAM and high-performance disk subsystems. If you put a record manager on a RAID disk system, the record manager probably wouldn't know what to do with it. However, if you put a relational database server on a RAID system, it takes advantage of the RAID drives to provide phenomenal throughput and reliability.

Relational database servers have their downside, too. They tend to be more expensive than record managers and desktop databases. (Of course, some relational database servers are more expensive than others.) Relational database servers also are more difficult to integrate with commercial applications. They might have stringent hardware requirements and complex installation processes. Relational database servers also require the periodic attention of a database administrator to tune and maintain them. This also varies between database servers.

Also, compared to object databases, relational database servers are limited in the complexity of the data model they can support. For an application with a highly complex data model, the process of converting from C++ to the relational database server's type system and language interpreter can be very difficult.

Using a relational database server for the sample application would be easier than using your own database, OLE structured storage, or the Btrieve record manager. However, consider the following questions regarding this technology and the sample application: Would the fact that the relational database might be more time-consuming to implement than other technologies be a problem? Is a relational database overkill for the application? Does the application require the capacity and throughput a relational database server provides? Would the cost of the licensing fees for the database server be prohibitive?

How Do the Database Technologies Compare?

Table 1.1 illustrates the relative strengths and weaknesses of the various database technologies. In Table 1.1, a plus sign (+) indicates a strength of the technology, a minus sign (-) indicates a weakness, a blank indicates no particular strength or weakness, and a question mark (?) indicates that it varies between vendors or implementations of the technology.

Table 1.1.  How Do the Database Technologies Compare?

                                        C++      OLE SS   Record   Desktop  Object   RDBMS
                                                          Mgr      Db       Db       Server
Openness of the Data                    -        -                 +        -?       +
Complex Data Models                     -        -        -                 +
Multiuser                               -        -                          +?       +
Performance                             -        -        +                 +?       +
Scalability and Capacity                -        -                          +?       +
Set-based Operations in Code            -        -        -        +        +        +
Set-based Operations at the Server      -        -        -        -        -?       +
Embeddable with Your Application        +        +                          ?        -
Data Validation/Integrity               -        -        -        +        +        +
Code-to-Functionality Ratio             -        -        -        +        ++       +

As you can see from the table, the appropriateness of each of these technologies depends on the requirements of the application. If you need your commercial application to write a small amount of simple data to a temporary file, your own C++ database would probably work fine. If, however, you need to store moderately complex data to a file with
multiple threads or multiple applications using it, consider one of the more advanced database technologies.

Summary

Today's database technology offers a broad spectrum of functionality. Choosing the right database technology means finding one that fills the requirements of your application.

In choosing a database technology, don't underestimate the importance of the data. Consider carefully which database technology is the best steward of the data. A database should ensure the data's integrity and provide appropriate open access to the data for other routines and applications, now and in the future.

Q&A

Q
It looks like I will have to learn so much before I can even start using databases. Shouldn't I just write a database myself so that I can get up and running more quickly?
A
For all but the very simplest applications, it doesn't make sense to invent your own database. You must realize that the time you spend writing your database is time spent not writing your application. It's best to learn once how to use real databases and then apply this valuable knowledge over and over.
Q
If I decide to use a desktop database for my application, which one of the desktop databases (Access, Paradox, FoxPro, and so on) should I use?
A
Which desktop database you should use depends on the requirements of your application. Microsoft Access offers one advantage over the other desktop databases, however. Access does the best job of mimicking the functionality of a relational database server: It uses a version of Structured Query Language (SQL), it can have multiple tables per file, and it can store predefined queries.
Q
Doesn't ODBC provide a single API that can be used with all ODBC databases? If my application uses ODBC, why should I care which database is used underneath?
A
The capabilities of the various database technologies differ fundamentally from each other, and these differences are reflected in the way that the technologies implement the ODBC API. Some databases support a superset of the ODBC API, whereas other databases support only a subset of it. Also, some technologies provide excellent performance through ODBC, whereas others provide very poor performance with ODBC. You need to select the database for your application based on the database's capabilities. If you plan to use ODBC, you need to consider how well the database you want to use supports it.
Q
Object databases look like they provide excellent power and flexibility. Why would I use relational database servers when object databases seem to integrate so well with C++?
A
Relational database servers and object databases each have their own strengths and weaknesses. Be aware that object databases are not yet fully mature, and the capabilities of the different object databases vary greatly. Also remember that relational database servers provide open and accessible databases, whereas object databases generally do not. The need for open access to the data shouldn't be underestimated and might outweigh the other areas in which object databases can be superior.

Workshop

The Workshop quiz questions test your understanding of today's material. (The answers appear in Appendix F, "Answers.") The exercises encourage you to apply the information you learned today to real-life situations.

Quiz

  1. What are the primary benefits of using a record manager (such as Btrieve) rather than inventing your own database routines?
  2. What do the desktop databases provide that record managers do not?
  3. What are the benefits of using a database technology that provides open, accessible data stores?
  4. Which database technologies provide open, accessible data stores?
  5. What is the significance of server-side processing of set-based operations in a client/server architecture?

Exercises

  1. The code in Listing 1.1 creates a data file that contains order information. Write a program that reads the order data from that file.
  2. Decide which database technology would be most appropriate for the sample application described earlier. Create a list of capabilities that the database for this application needs to provide. Justify your decision by comparing the database requirements with the capabilities of the database technology that you have chosen.

© Copyright, Sams Publishing. All rights reserved.