Day 2

Creating Databases

In today's lesson, you learn a working definition of a relational database, as well as the basic elements of a database, including data table, data record, and data field. You also learn the importance of establishing and maintaining data relationships. These are some of the key elements to developing quality databases for your applications.

You also learn Visual Basic database field types, including their names, storage sizes, and common uses. Along the way, you create a programming project that explores the limits, possibilities, and common uses of Visual Basic database field types.

Relational Databases

Before looking at the individual components of relational databases, let's first establish a simple definition. For the purposes of this book, a relational database is defined as a collection of data that indicates relation among data elements; or, to put it even more directly, a relational database is a collection of related data.

In order to build a collection of related data, you need three key building blocks. These building blocks are (from smallest to largest)

Let's look at each of these elements in more depth.

Data Fields

The first building block in a relational database is the data field. The data field contains the smallest element of data that you can store in a database, and each field contains only one data element. For example, if you want to store the name of a customer, you must create a data field somewhere in the database and also give that field a name, such as CustomerName. If you want to store the current account balance of a customer, you must create another field, possibly calling it AccountBalance. All the fields you create are stored in a single database (see Figure 2.1).

Figure 2.1. Examples of data fields in a database.


NOTE: In formal database theory, a data field is often referred to as a data column. Throughout this book, the phrases data field and data column are used interchangeably.

Although it is possible to store more than one data element in a single field (such as first and last name), it is not good database practice to do so. In fact, storing more than one data element in a field can lead to problems when you or other users try to retrieve or update data.

This concept seems simple in theory, but it's not so easy in practice. The CustomerName field discussed earlier is a good example. Assume that you have a database that contains a list of your customers by name, and you need to sort the list by last name. How would this be done? Can you assume that each CustomerName data field contains a last name? Do some contain only a first name? Possibly some contain both first and last names--but in what order (last name, first name or first name, last name)? When you look at this situation, you discover that you're actually storing two data elements in the CustomerName field (first name and last name). For this reason, many databases contain not just the CustomerName data field, but data fields for LastName and FirstName.

When you begin constructing your database, spend time thinking about the various ways you (and your users) need to retrieve useful data. The quality and usefulness of your database rests on the integrity of its smallest element--the data field.

Data Records

Data records are a collection of related data fields. To use the example started earlier, a Customer Record could contain the fields LastName, FirstName, and AccountBalance. All three fields describe a single customer in the database.


NOTE: Formal database theory refers to a data record as a data row. Both data record and data row are used interchangeably throughout this book.

A single data record contains only one copy of each defined data field. For example, a single data record cannot contain more than one LastName data field. Figure 2.2 shows the Company Database with a Customer Record defined. The Customer Record (row) contains three fields (columns).

Figure 2.2. An example of a data record in a database.


Data Table Rows and Columns

By combining data fields and data records, you create the most common element of relational databases--the data table. This element contains multiple data records, and each data record contains multiple data fields (see Figure 2.3).

Figure 2.3. An example of a data table in a database.

Just as each data record contains related data fields (LastName, FirstName, and AccountBalance), each data table contains related records. Data tables have meaningful names (Customer Table or Invoice Table, for example) in the same way that data fields have meaningful names (LastName, FirstName, AccountBalance, and so on). These names help you and other users to remember the contents of the elements (table elements and field elements).

Database Relationships

Just as a data record can contain several related data fields, a database can contain several related tables. Using relationships is a very efficient way to store complex data. For example, a table storing customer names could be related to another table storing the names of items the customer has bought, which could be related in turn to a table storing the names of all the items you have to sell. By establishing meaningful relationships between data tables, you can create flexible data structures that are easy to maintain.

You establish relationships between data tables by using pointer or qualifier fields in your data table.

You use qualifier fields to point to records in other tables that have additional information. Qualifier fields usually describe what's known as one-to-one relationships. A good example of a one-to-one relationship is the relationship between a single customer record and a single record in the shipping address table (see Figure 2.4).

Figure 2.4. An example of a one-to-one relationship between tables.


You use pointer fields to point to one or more records in other tables that have related information. Pointer fields usually describe what are known as one-to-many relationships. A good example of a one-to-many relationship is the relationship between a single customer master record and several outstanding customer orders (see Figure 2.5). One-to-One Relationships One-to-one relationships are used to link records in a master table (such as the Customer Table) to a single related record in another table.

Figure 2.5. customer records relationships

For example, assume you have two types of customers in your Company Database: retail and wholesale. Retail customers get paid commissions on sales, so you need to add a Commission field to the Customers table. Wholesale customers, however, purchase their products at a discount, so you also need to add a Discount field to the Customers table. Now your database users have to remember that, for Retail customers, the Discount field must be left empty, and for Wholesale customers, the Commission field must be left empty. You must remember these rules when adding, editing, and deleting data from the database, and you must remember these rules when creating reports.

This might seem to be a manageable task now, but try adding dozens more data fields (along with the exceptions), and you have quite a mess on your hands! Instead of establishing all data fields for all customers, what you need is a way to define only the fields you need for each type of customer. You can do this by setting up multiple tables in a single database and then setting up relationships between the tables.

In the example illustrated in Figure 2.6, you have added an additional data field: Type.

Figure 2.6. Using a qualifier field to establish a one-to-one relationship.


This data field qualifies, or describes, the type of customer stored in this data record. You can use this type of information to tell you where to look for additional information about the customer. For example, if the Type field is set to Retail, you know you can look for the customer in the Retail Table to find additional information. If the Type field is set to Wholesale, you can find additional information in the Wholesale Table.

By creating the RecordType field, you can establish a one-to-one relationship between records in the Customer Table and the Retail and Wholesale Tables. One-to-Many Relationships One-to-many relationships are used to link records in a master table (such as the Customer Table) to multiple records in another table.

For example, you can keep track of several orders for each customer in your database. If you were not creating a relational database, you would probably add a data field to your customer table called Order. This would contain the last order placed by this customer. But what if you needed to keep track of more than one outstanding order? Would you add two, four, or six more order fields? You can see the problem.

Instead, you can add an additional table (the Orders Table) that can contain as many outstanding orders for a single customer as you need. After you create the Orders Table, you can establish a relationship between the Customer Table and the Orders Table using the LastName field (refer back to Figure 2.4). The LastName field is used as a pointer into the Orders Table to locate all the orders for this customer.

You can use many different approaches to establish relationships between tables. They are usually established through a key field. Key fields are covered in depth in the next section.

Key Fields

Usually, at least one data field in each data table acts as a key field for the table. Key fields in relational databases are used to define and maintain database integrity and to establish relationships between data tables. You create keys in your data table by designating one (field) or more in your table as either a primary key or a foreign key. A data table can have only one primary key, but it can have several foreign keys. The primary key is used to control the order in which the data is displayed. The foreign key is used to relate fields to fields in other (foreign) tables in the database.


NOTE: Key fields are sometimes referred to as index fields or indexes. Both "key fields" and "index fields" are used interchangeably throughout the book. It is important to note that in most PC databases (Xbase, Paradox, Btreive, and so forth), indexes are used only to speed processing of large files and play only a minor role in maintaining table relationships. The Visual Basic database model (.mdb files) and other true relational database models use key fields to establish database integrity rules as well as to speed database search and retrieval.

As mentioned earlier, a data table can have only one primary key. The primary key is used to define a unique record in the data table. In the Customer table, the LastName field is the primary key field for the data table. This means that no two records in that table can have exactly the same value in the LastName fields (see Figure 2.7). Any attempt to add more than one record with an identical primary key would result in a database error.

Figure 2.7. The LastName field is the primary key field of the Customer table.



TIP: The main role of the primary key is to maintain the internal integrity of a data table. For this reason, no two records in a data table can have the same primary key value. Many companies with large customer bases use Social Security numbers or area codes and telephone numbers, because they know they are likely to have more than one customer with the same name. In these cases, the SSN or phone number would be the primary key field.

A data table can have more than one foreign key. It can also have no foreign key at all. In the Orders Table, the LastName field would be defined as a foreign key field. This means that it is a nonunique field in this data table that points to a key field in an external (foreign) table. Any attempt to add to the Orders table a record that contains a value in the LastName field, which does not also exist in a LastName field in the Customer Table, would result in a database error. For example, if the Customer table contains three records (Smith, Amundsen, and Jones), and you try to add a record to the Orders Table by filling the LastName field of the Orders Table with Paxton, you get a database error. By creating foreign key fields in a table, you build data integrity into your database. This is called referential integrity.


TIP: The main role of a foreign key is to define and maintain relationships between data tables in a database. For this reason, foreign key fields are not unique in the data table in which they exist.


NOTE: Database integrity and foreign keys are covered in depth on Day 16, "Database Normalization," and Day 17, "Multiuser Considerations."

Now that you've worked through the basics of database elements in general, let's look at specific characteristics of Visual Basic data fields.

Visual Basic Database Field Types

Visual Basic stores values in the data table in data fields. Visual Basic recognizes 14 different data field types that you can use to store values. Each data field type has unique qualities that make it especially suitable for storing different types of data. Some are used to store images, the results of checkbox fields, currency amounts, calendar dates, and various sizes of numeric values. Table 2.1 lists the 14 database field types recognized by Visual Basic.

The first column contains the Visual Basic data field type name. This is the name you use when you create data tables using the Visual Data Manager from the Toolbar. You learn about using this tool in Day 7, "Using the Visdata Program."

The second column shows the number of bytes of storage taken by the various data field types. If the size column is set to "V," the length is variable and is determined by you at design time or by the program at runtime.

The third column in the table shows the equivalent Visual Basic data type for the associated database field type. This column tells you what Visual Basic data type you can use to update the database field.

Table 2.1. Visual Basic data field types.
Data Field Type Size VBType Comments
BINARY V (none) Limited to 255 bytes
BOOLEAN 1 Boolean Stores 0 or -1 only
BYTE 1 Integer Stores 0 to 255 only
COUNTER 8 Long Auto-incrementing Long type
CURRENCY 8 Currency 15 places to left of decimal, 4 to right
DATETIME 8 Date/Time Date stored on the left of decimal point, time stored on the right
DOUBLE 8 Double
GUID 16 (none) Used to store Globally Unique Identifiers
INTEGER 2 Integer
LONG 8 Long
LONGBINARY V (none) Used for OLE objects
MEMO V String Length varies up to 1.2 gigabytes
SINGLE 4 Single
TEXT V String Length limited to 255



NOTE: It is important to understand the difference between the Visual Basic data field types and the Visual Basic data types. The data field types are those recognized as valid data types within data tables. The data types are those types recognized by Visual Basic when defining variables within a program. For example, you can store the value 3 in a BYTE field in a data table, but you store that same value in an Integer field in a Visual Basic program variable.

Even though it is true that Visual Basic allows programmers to create database applications that can read and write data in several different data formats, all database formats do not recognize all data field types. For example, xBase data fields do not recognize a CURRENCY data field type. Before developing cross data-engine applications, you need to know exactly what data field types are needed and how they are to be mapped to various data formats. The various data formats are covered in Day 9, "Visual Basic and the Microsoft Jet Engine."

A number of things in Table 2.1 deserve additional comment:


WARNING: This behavior is changed from Visual Basic 4.0. In the past, Microsoft Jet would automatically convert the invalid value to a byte value and not report an error. For example, if you enter the value 255 (stored as FF in hexadecimal), Visual Basic stores 255 in the data field. If you enter 260 (stored as 0104 in hexadecimal--it takes two bytes!), Visual Basic stores a decimal 4 in the data field because the right-most byte is set to hexadecimal 04.

Building the Visual Basic 5 Field Data Types Project

The following project illustrates how different Visual Basic data field types store user input. You also see how Visual Basic responds to input that is out of range for the various data field types.

1. Begin by creating a new Visual Basic project (select File | New Project). Using Table 2.2 and Figure 2.8 as guides, populate the Visual Basic form.


WARNING: Notice that you are creating a set of four buttons with the same name, but different Index property values. This is a control array. Control arrays offer an excellent way to simplify Visual Basic coding. However, they behave a bit differently than non-arrayed controls. It is important that you build the controls exactly as described in this table.

Figure 2.8. The form for the Visual Basic data field types project.


Table 2.2. Controls for the Visual Basic data field types project.
Control Property Setting
Project Name prjFieldTypes
Form Name frmFieldTypes
Caption VB5 Basic Data Field Types
CommandButton Name cmdBtn
Caption &Add
Height 300
Index 0
Width 1200
CommandButton Name cmdBtn
Caption &Add
Height 300
Index 0
Width 1200
CommandButton Name cmdBtn
Caption &Update
Height 300
Index 1
Width 1200
CommandButton Name cmdBtn
Caption &Delete
Height 300
Index 2
Width 1200
CommandButton Name cmdBtn
Caption E&xit
Height 300
Index 3
Width 1200
DataControl Name datFieldTypes
DatabaseName FIELDTYPES.MDB (include correct path)
RecordSource FieldTypes

2. Now add the code behind the command button array. Double-click the Add button (or any other button in the array) to bring up the code window. Enter the code from Listing 2.1 into the cmdBtn_Click event.

Listing 2.1. Code for the cmdBtn_Click event.

Private Sub cmdBtn_Click(Index As Integer)
    `
    ` handle button selections
    `
    On Error GoTo LocalError
    `
    Select Case Index
        Case 0 ` add
            datFieldTypes.Recordset.AddNew
        Case 1 ` update
            datFieldTypes.UpdateRecord
            datFieldTypes.Recordset.Bookmark =
ÂdatFieldTypes.Recordset.LastModified
        Case 2 ` delete
            datFieldTypes.Recordset.Delete
            datFieldTypes.Recordset.MovePrevious
        Case 3 ` exit
            Unload Me
    End Select
    Exit Sub
    `
LocalError:
    MsgBox Err.Description, vbCritical, Err.Number
    `
End Sub 


There may be several things in this code segment that are new to you. First, different lines of code are executed based on the button that is pushed by the user. This is indicated by the Index parameter that is passed to the Click event. Second, some error-handling code has been added to make it easy for you to experiment with the data form. You learn more about error-handling in Day 14, "Error Handling in Visual Basic 5.0." Don't worry if this code segment looks a bit confusing. For now, just go ahead and enter the code that is shown here.

Now is a good time to save the project. Save the form as FieldTypes.frm and the project as FieldTypes.vbp. Run the project just to make sure that you have entered all the code correctly up to this point. If you get error messages from Visual Basic, refer back to Table 2.2 and the preceding code lines to correct the problem. Testing the BOOLEAN Data Type Now you can add a text box input control and a label to this form. Set the caption of the label to Boolean:. Set the DataSource property of the text box to datFieldTypes and the DataField property to BooleanField. Set the Text property to blank. Refer to Figure 2.9 for placement and sizing.

Figure 2.9. Adding the BOOLEAN data type input control.


Now run the program. If this is the first time you've run the program, you should see an empty field. Press the Add button to create a new record and then press the Update button to save that record. You see that the first value in the input box is a 0, the default value for BOOLEAN fields. Enter the number 13 in the text box and click the Update button. This forces the data control to save the input field to the data table and update the display. What happened to the 13? It was converted to -1. Any value other than 0, when entered into a BOOLEAN data type field, is converted to -1. Testing the BYTE Data Type Now let's add a label and input control for the BYTE data type field. Instead of picking additional controls from the Toolbox Window and typing in property settings, Visual Basic allows you to copy existing controls. Copying controls saves time, reduces typing errors, and helps to keep the size and shape of the controls on your form consistent.

To copy controls, use the mouse pointer, with the left mouse button depressed, to create a dotted-line box around both the label control and the text box control already on your form (in this case, the label Boolean and its text box). When you release the left mouse button, you see that both controls have been marked as selected. Now click Edit | Copy to copy the selected controls to the Clipboard. Use Edit | Paste to copy the controls from the Clipboard back onto your form.

At this point, Visual Basic asks you whether you want to create a Control Array. Say yes, both times. You then see the two controls appear at the top left of the form. Use your mouse to position them on the form (see Figure 2.10).

Figure 2.10. Copying controls on a form.



TIP: The Textbox and Label controls on this form are part of a control array. Because using control arrays reduces the total number of distinct controls on your forms, they reduce the amount of Windows resources your program uses. You can copy controls as many times as you like--even across forms and projects!

You just created duplicates of the BOOLEAN input control. All you need to do now is change the label caption to Byte and the text box DataField property to ByteField, and you have two new controls on your form with minimal typing. Your form should look like the one in Figure 2.11.

Figure 2.11. Adding the BYTE data type to your form.


Save and run the program. This time, after pressing the Add button, enter the value 256 into the Byte input control and press the Update button. You see that when Visual Basic attempts to store the value to the data table, a runtime error is reported. Byte data fields can only accept positive values between 0 and 255. Trying to save any other value in this data field causes the Microsoft Jet data engine to report an error to Visual Basic. Testing the CURRENCY Data Type Copy the label and text box control again using the mouse to select the controls to be copied, and the Copy and Paste commands from the Edit menu. Change the label Caption property to Currency and the text box DataField property to CurrencyField. Refer to Figure 2.12 for spacing and sizing of the controls.

Figure 2.12. Adding the CURRENCY data type to the form.


Save and run the program and test the CURRENCY data type text box. Press the Add button; enter the value 1.00001; force Visual Basic to save the value to the data table (press the Update button) and see what happens. Try entering 1.23456. When storing values to the CURRENCY data type field, Visual Basic stores only four places to the right of the decimal. If the number is larger than four decimal places to the right, Visual Basic rounds the value before storing it in the data field. Also, you notice that Visual Basic does not add a dollar sign ($) to the display of CURRENCY type data fields. Testing the DATETIME Data Type The Visual Basic DATETIME data type field is one of the most powerful data types. Visual Basic performs extensive edit checks on values entered in the DATETIME data type field. Using DATETIME data type fields can save a lot of coding when you need to make sure valid dates are entered by users.

Create a new set of label and text box controls by copying the label and text box controls again. Change the label caption property to DateTime and the text box DataField property to DateTimeField. Your form should look like the one in Figure 2.13.

Save and run the program. Try entering 12/32/95. As you can see, Visual Basic gives you an error message whenever you enter an invalid date. Now enter 1/1/0 into the Date text box. Notice that Visual Basic formats the date for you.

Figure 2.13. Adding the DATETIME data type to the form.


How does Visual Basic decide what date format to use? The date format used comes from the settings in the Windows 95 Control Panel Regional Settings applet. While you have this program running, experiment by calling up the Windows 95 Regional Settings applet. (From the task bar, select Start | Settings | Control Panel, and then select Regional Settings.) Change the date format settings, and return to your Visual Basic program to see the results.


TIP: The Visual Basic DATETIME data type should always be used to store date values. If you install your program in Europe, where the common date display format is DD-MM-YY instead of the common U.S. format of MM-DD-YY, your program will work without a problem. If you store dates as strings in the format MM/DD/YY or as numeric values in the format YYMMDD, your program will not be able to compute or display dates correctly across international boundaries.

Testing the COUNTER Data Type Now let's test a very special database field type--the COUNTER data type. This data type is automatically set by Visual Basic each time you add a new record to the data table. The COUNTER data type makes an excellent unique primary key field because Visual Basic is able to create and store more than a billion unique values in the COUNTER field without duplication.


NOTE: Actually, the Counter data type is not a true database field type. Instead, the Counter data type is a Long data field with its Attribute property set to AutoIncrField. You won't find the Counter data type listed in the documentation, but you will see references to auto-incrementing fields and see a "Counter" type as an option when you build data fields with the Visual Data Manager.

Copy another label/text box control set onto the form. Change the label caption property to Counter and the text box DataField property to AutoIncrField. See Figure 2.14 for guidance in positioning and sizing the control.

Figure 2.14. Adding the COUNTER data type to the form.


Now save and run the program one more time. Notice that the COUNTER data type already has a value in it, even though you have not entered data into the field. Visual Basic sets the value of COUNTER fields; users do not. Add a new record to the table by pressing the Add button. You see a new value in the COUNTER input control. Visual Basic uses the next available number in sequence. Visual Basic is also able to ensure unique numbers in a multiuser setting. If you have three people running the same program adding records to this table, they will all receive unique values in the Counter text box.


WARNING: You should never attempt to edit the value in the COUNTER text box! If Visual Basic determines that the counter value has been changed, it displays a Visual Basic error message, and you cannot save the record. Even if you reset the value in the COUNTER data field back to its original value, Visual Basic refuses to save the record.

Additional Visual Basic Data Types The rest of the Visual Basic data types (INTEGER, SINGLE, DOUBLE, TEXT, MEMO, BINARY, LONGBINARY, and GUID) are rather unspectacular when placed on a form. The following are some notes on the various Visual Basic data types that you should keep in mind when you are designing your data tables.

Summary

Today you learned the following about relational databases:

You also learned the 14 basic data field types recognized by Microsoft Jet and Visual Basic. You constructed a data entry form that allows you to test the way Visual Basic behaves when attempting to store data entered into the various data field types.

Quiz

1. What are the three main building blocks for relational databases?

2. What is the smallest building block in a relational database?

3. A data record is a collection of related __________.

4. What is the main role of a primary key in a data table?

5.
Can a data table have more than one foreign key defined?

6. List all the possible values that can be stored in a BOOLEAN data field.

7. What is the highest value that can be stored in a BYTE data field?

8. What happens when you attempt to edit a COUNTER data field?

9. How many places to the right of the decimal can be stored in a CURRENCY data field?

10. What Windows Control Panel Applet determines the display format of DATE data fields?

Exercises

Answer questions 1, 2, and 3 based on the data in this table:
SSN Last First Age City St Comments
123-45-6789 Smith Mark 17 Austin TX Trans. from New York.
456-79-1258 Smith Ron 21 New York NY Born in Wyoming.
987-65-8764 Johnson Curt 68 Chicago IL Plays golf on Wed.

1. How many records are in the previous data table?

2. Which field should you select as the primary key?

3. Identify each data field, its Data Field Type, and its VISUAL BASIC Type.

4.
Modify the Visual Basic Data Field Types example from this lesson by creating a checkbox and placing the results in the existing BOOLEAN textbox.