Teach Yourself SQL in 21 Days, Second Edition

Previous chapterNext chapterContents


- Day 10 -
Creating Views and Indexes

Objectives

Today we begin to cover topics that may be new even to programmers or database users who have already had some exposure to SQL. Days 1 through 8 covered nearly all the introductory material you need to get started using SQL and relational databases. Day 9, "Creating and Manipulating Tables," was devoted to a discussion of database design, table creation, and other data manipulation commands. The common feature of the objects discussed so far--databases, tables, records, and fields--is that they are all physical objects located on a hard disk. Today the focus shifts to two features of SQL that enable you to view or present data in a different format than it appears on the disk. These two features are the view and the index. By the end of today, you will know the following:

A view is often referred to as a virtual table. Views are created by using the CREATE VIEW statement. After the view has been created, you can use the following SQL commands to refer to that view:

An index is another way of presenting data differently than it appears on the disk. Special types of indexes reorder the record's physical location within a table. Indexes can be created on a column within a table or on a combination of columns within a table. When an index is used, the data is presented to the user in a sorted order, which you can control with the CREATE INDEX statement. You can usually gain substantial performance improvements by indexing on the correct fields, particularly fields that are being joined between tables.


NOTE: Views and indexes are two totally different objects, but they have one thing in common: They are both associated with a table in the database. Although each object's association with a table is unique, they both enhance a table, thus unveiling powerful features such as presorted data and predefined queries.


NOTE: We used Personal Oracle7 to generate today's examples. Please see the documentation for your specific SQL implementation for any minor differences in syntax.

Using Views

You can use views, or virtual tables, to encapsulate complex queries. After a view on a set of data has been created, you can treat that view as another table. However, special restrictions are placed on modifying the data within views. When data in a table changes, what you see when you query the view also changes. Views do not take up physical space in the database as tables do.

The syntax for the CREATE VIEW statement is

SYNTAX:
CREATE VIEW <view_name> [(column1, column2...)] AS
SELECT <table_name column_names>
FROM <table_name>

As usual, this syntax may not be clear at first glance, but today's material contains many examples that illustrate the uses and advantages of views. This command tells SQL to create a view (with the name of your choice) that comprises columns (with the names of your choice if you like). An SQL SELECT statement determines the fields in these columns and their data types. Yes, this is the same SELECT statement that you have used repeatedly for the last nine days.

Before you can do anything useful with views, you need to populate the BILLS database with a little more data. Don't worry if you got excited and took advantage of your newfound knowledge of the DROP DATABASE command. You can simply re-create it. (See Tables 10.1, 10.2, and 10.3 for sample data.)

INPUTOUTPUT:
SQL> create database BILLS;

Statement processed.
INPUTOUTPUT:
SQL> create table BILLS (
  2   NAME CHAR(30) NOT NULL,
  3   AMOUNT NUMBER,
  4   ACCOUNT_ID NUMBER NOT NULL);

Table created.
INPUTOUTPUT:
SQL> create table BANK_ACCOUNTS (
  2   ACCOUNT_ID NUMBER NOT NULL,
  3   TYPE CHAR(30),
  4   BALANCE NUMBER,
  5   BANK CHAR(30));

Table created.
INPUTOUTPUT:
SQL> create table COMPANY (
  2   NAME CHAR(30) NOT NULL,
  3   ADDRESS CHAR(50),
  4   CITY CHAR(30),
  5   STATE CHAR(2));

Table created.

Table 10.1. Sample data for the BILLS table.

Name Amount Account_ID
Phone Company 125 1
Power Company 75 1
Record Club 25 2
Software Company 250 1
Cable TV Company 35 3
Joe's Car Palace 350 5
S.C. Student Loan 200 6
Florida Water Company 20 1
U-O-Us Insurance Company 125 5
Debtor's Credit Card 35 4

Table 10.2. Sample data for the BANK_ACCOUNTS table.

Account_ID Type Balance Bank
1 Checking 500 First Federal
2 Money Market 1200 First Investor's
3 Checking 90 Credit Union
4 Savings 400 First Federal
5 Checking 2500 Second Mutual
6 Business 4500 Fidelity

Table 10.3. Sample data for the COMPANY table.

Name Address City State
Phone Company 111 1st Street Atlanta GA
Power Company 222 2nd Street Jacksonville FL
Record Club 333 3rd Avenue Los Angeles CA
Software Company 444 4th Drive San Francisco CA
Cable TV Company 555 5th Drive Austin TX
Joe's Car Palace 1000 Govt. Blvd Miami FL
S.C. Student Loan 25 College Blvd Columbia SC
Florida Water Company 1883 Hwy 87 Navarre FL
U-O-Us Insurance 295 Beltline Hwy Macon GA
Company
Debtor's Credit Card 115 2nd Avenue Newark NJ

Now that you have successfully used the CREATE DATABASE, CREATE TABLE, and INSERT commands to input all this information, you are ready for an in-depth discussion of the view.

A Simple View

Let's begin with the simplest of all views. Suppose, for some unknown reason, you want to make a view on the BILLS table that looks identical to the table but has a different name. (We call it DEBTS.) Here's the statement:

INPUT:
SQL> CREATE VIEW DEBTS AS
     SELECT * FROM BILLS;

To confirm that this operation did what it should, you can treat the view just like a table:

INPUT/OUTPUT:
SQL> SELECT * FROM DEBTS;

NAME                     AMOUNT  ACCOUNT_ID
Phone Company            125     1
Power Company             75     1
Record Club               25     2
Software Company         250     1
Cable TV Company          35     3
Joe's Car Palace         350     5
S.C. Student Loan        200     6
Florida Water Company     20     1
U-O-Us Insurance Company 125     5
Debtor's Credit Card      35     4
10 rows selected.

You can even create new views from existing views. Be careful when creating views of views. Although this practice is acceptable, it complicates maintenance. Suppose you have a view three levels down from a table, such as a view of a view of a view of a table. What do you think will happen if the first view on the table is dropped? The other two views will still exist, but they will be useless because they get part of their information from the first view. Remember, after the view has been created, it functions as a virtual table.

INPUT:
SQL> CREATE VIEW CREDITCARD_DEBTS AS
  2   SELECT * FROM DEBTS
  3   WHERE ACCOUNT_ID = 4;
SQL> SELECT * FROM CREDITCARD_DEBTS;
OUTPUT:
NAME                   AMOUNT    ACCOUNT_ID
Debtor's Credit Card       35    4

1 row selected.

The CREATE VIEW also enables you to select individual columns from a table and place them in a view. The following example selects the NAME and STATE fields from the COMPANY table.

INPUT:
SQL> CREATE VIEW COMPANY_INFO (NAME, STATE) AS
  2   SELECT * FROM COMPANY;
SQL> SELECT * FROM COMPANY_INFO;
OUTPUT:
NAME                      STATE
Phone Company                GA
Power Company                FL
Record Club                  CA
Software Company             CA
Cable TV Company             TX
Joe's Car Palace             FL
S.C. Student Loan            SC
Florida Water Company        FL
U-O-Us Insurance Company     GA
Debtor's Credit Card         NJ

10 rows selected.


NOTE: Users may create views to query specific data. Say you have a table with 50 columns and hundreds of thousands of rows, but you need to see data in only 2 columns. You can create a view on these two columns, and then by querying from the view, you should see a remarkable difference in the amount of time it takes for your query results to be returned.

Renaming Columns

Views simplify the representation of data. In addition to naming the view, the SQL syntax for the CREATE VIEW statement enables you to rename selected columns. Consider the preceding example a little more closely. What if you wanted to combine the ADDRESS, CITY, and STATE fields from the COMPANY table to print them on an envelope? The following example illustrates this. This example uses the SQL + operator to combine the address fields into one long address by combining spaces and commas with the character data.

INPUT:
SQL> CREATE VIEW ENVELOPE (COMPANY, MAILING_ADDRESS) AS
  2   SELECT NAME, ADDRESS + " " + CITY + ", " + STATE
  3   FROM COMPANY;
SQL> SELECT * FROM ENVELOPE;
OUTPUT:
COMPANY                      MAILING_ADDRESS
Phone Company                111 1st Street Atlanta, GA
Power Company                222 2nd Street Jacksonville, FL
Record Club                  333 3rd Avenue Los Angeles, CA
Software Company             444 4th Drive San Francisco, CA
Cable TV Company             555 5th Drive Austin, TX
Joe's Car Palace             1000 Govt. Blvd Miami, FL
S.C. Student Loan            25 College Blvd. Columbia, SC
Florida Water Company        1883 Hwy. 87 Navarre, FL
U-O-Us Insurance Company     295 Beltline Hwy. Macon, GA
Debtor's Credit Card         115 2nd Avenue Newark, NJ

10 rows selected.
ANALYSIS:

The SQL syntax requires you to supply a virtual field name whenever the view's virtual field is created using a calculation or SQL function. This pro- cedure makes sense because you wouldn't want a view's column name to be COUNT(*) or AVG(PAYMENT).


NOTE: Check your implementation for the use of the + operator.

SQL View Processing

Views can represent data within tables in a more convenient fashion than what actually exists in the database's table structure. Views can also be extremely convenient when performing several complex queries in a series (such as within a stored procedure or application program). To solidify your understanding of the view and the SELECT statement, the next section examines the way in which SQL processes a query against a view. Suppose you have a query that occurs often, for example, you routinely join the BILLS table with the BANK_ACCOUNTS table to retrieve information on your payments.

INPUT:
SQL> SELECT BILLS.NAME, BILLS.AMOUNT, BANK_ACCOUNTS.BALANCE,
  2   BANK_ACCOUNTS.BANK FROM BILLS, BANK_ACCOUNTS
  3   WHERE BILLS.ACCOUNT_ID = BANK_ACCOUNTS.ACCOUNT_ID;
OUTPUT:
BILLS.NAME	           BILLS.AMOUNT   BANK_ACCOUNTS.BALANCE   BANK_ACCOUNTS.BANK  
Phone Company	           125            500                     First Federal
Power Company	           75             500                     First Federal
Record Club	           25             1200                    First Investor's
Software Company	      250            500                     First  Federal
Cable TV Company	      35             90                      Credit Union
Joe's Car Palace	      350            2500                    Second Mutual
S.C. Student Loan	      200            4500                    Fidelity
Florida Water Company     20             500                     First Federal
U-O-Us Insurance Company  125            2500                    Second Mutual

9 rows selected.

You could convert this process into a view using the following statement:

INPUT/OUTPUT:
SQL> CREATE VIEW BILLS_DUE (NAME, AMOUNT, ACCT_BALANCE, BANK) AS
  2   SELECT BILLS.NAME, BILLS.AMOUNT, BANK_ACCOUNTS.BALANCE,
  3   BANK_ACCOUNTS.BANK FROM BILLS, BANK_ACCOUNTS
  4   WHERE BILLS.ACCOUNT_ID = BANK_ACCOUNTS.ACCOUNT_ID;

View created.

If you queried the BILLS_DUE view using some condition, the statement would look like this:

INPUT/OUTPUT:
SQL> SELECT * FROM BILLS_DUE
  2   WHERE ACCT_BALANCE > 500;

NAME                         AMOUNT     ACCT_BALANCE     BANK
Record Club                  25         1200             First Investor's
Joe's Car Palace             350        2500             Second Mutual
S.C. Student Loan            200        4500             Fidelity
U-O-Us Insurance Company     125        2500             Second Mutual

4 rows selected.
ANALYSIS:

SQL uses several steps to process the preceding statement. Because BILLS_DUE is a view, not an actual table, SQL first looks for a table named BILLS_DUE and finds nothing. The SQL processor will probably (depending on what database system you are using) find out from a system table that BILLS_DUE is a view. It will then use the view's plan to construct the following query:

SQL> SELECT BILLS.NAME, BILLS.AMOUNT, BANK_ACCOUNTS.BALANCE,
  2   BANK_ACCOUNTS.BANK FROM BILLS, BANK_ACCOUNTS
  3   WHERE BILLS.ACCOUNT_ID = BANK_ACCOUNTS.ACCOUNT_ID
  4   AND BANK_ACCOUNTS.BALANCE > 500;

Example 10.1

Construct a view that shows all states to which the bills are being sent. Also display the total amount of money and the total number of bills being sent to each state.

First of all, you know that the CREATE VIEW part of the statement will look like this:

CREATE VIEW EXAMPLE (STATE, TOTAL_BILLS, TOTAL_AMOUNT) AS...

Now you must determine what the SELECT query will look like. You know that you want to select the STATE field first using the SELECT DISTINCT syntax based on the requirement to show the states to which bills are being sent. For example:

INPUT:
SQL> SELECT DISTINCT STATE FROM COMPANY;
OUTPUT:
STATE
GA
FL
CA
TX
SC
NJ

6 rows selected.

In addition to selecting the STATE field, you need to total the number of payments sent to that STATE. Therefore, you need to join the BILLS table and the COMPANY table.

INPUT/OUTPUT:
SQL> SELECT DISTINCT COMPANY.STATE, COUNT(BILLS.*) FROM BILLS, COMPANY
  2   GROUP BY COMPANY.STATE
  3   HAVING BILLS.NAME = COMPANY.NAME;

STATE     COUNT(BILLS.*)
GA        2
FL        3
CA        2
TX        1
SC        1
NJ        1

6 rows selected.

Now that you have successfully returned two-thirds of the desired result, you can add the final required return value. Use the SUM function to total the amount of money sent to each state.

INPUT/OUTPUT:
SQL> SELECT DISTINCT COMPANY.STATE, COUNT(BILLS.NAME), SUM(BILLS.AMOUNT)
  2   FROM BILLS, COMPANY
  3   GROUP BY COMPANY.STATE
  4   HAVING BILLS.NAME = COMPANY.NAME;
STATE     COUNT(BILLS.*)     SUM(BILLS.AMOUNT)
GA        2                  250
FL        3                  445
CA        2                  275
TX        1                  35
SC        1                  200
NJ        1                  35

6 rows selected.

As the final step, you can combine this SELECT statement with the CREATE VIEW statement you created at the beginning of this project:

INPUT/OUTPUT:
SQL> CREATE VIEW EXAMPLE (STATE, TOTAL_BILLS, TOTAL_AMOUNT) AS
  2   SELECT DISTINCT COMPANY.STATE, COUNT(BILLS.NAME),SUM(BILLS.AMOUNT)
  3   FROM BILLS, COMPANY
  4   GROUP BY COMPANY.STATE
  5   HAVING BILLS.NAME = COMPANY.NAME;

View created.
INPUT/OUTPUT:
SQL> SELECT * FROM EXAMPLE;

STATE     TOTAL_BILLS     TOTAL_AMOUNT
GA        2               250
FL        3               445
CA        2               275
TX        1               35
SC        1               200
NJ        1               35

6 rows selected.

The preceding example shows you how to plan the CREATE VIEW statement and the SELECT statements. This code tests the SELECT statements to see whether they will generate the proper results and then combines the statements to create the view.

Example 10.2

Assume that your creditors charge a 10 percent service charge for all late payments, and unfortunately you are late on everything this month. You want to see this late charge along with the type of accounts the payments are coming from.

This join is straightforward. (You don't need to use anything like COUNT or SUM.) However, you will discover one of the primary benefits of using views. You can add the 10 percent service charge and present it as a field within the view. From that point on, you can select records from the view and already have the total amount calculated for you. The statement would look like this:

INPUT:
SQL> CREATE VIEW LATE_PAYMENT (NAME, NEW_TOTAL, ACCOUNT_TYPE) AS
  2  SELECT BILLS.NAME, BILLS.AMOUNT * 1.10, BANK_ACCOUNTS.TYPE
  3  FROM BILLS, BANK_ACCOUNTS
  4  WHERE BILLS.ACCOUNT_ID = BANK_ACCOUNTS.ACCOUNT_ID;
OUTPUT:
View created.
INPUT/OUTPUT:
SQL> SELECT * FROM LATE_PAYMENT;

NAME                         NEW_TOTAL     ACCOUNT_TYPE
Phone Company                137.50        Checking
Power Company                82.50         Checking
Record Club                  27.50         Money Market
Software Company             275           Checking
Cable TV Company             38.50         Checking
Joe's Car Palace             385           Checking
S.C. Student Loan            220           Business
Florida Water Company        22            Checking
U-O-Us Insurance Company     137.50        Business
Debtor's Credit Card         38.50         Savings

10 rows selected.

Restrictions on Using SELECT

SQL places certain restrictions on using the SELECT statement to formulate a view. The following two rules apply when using the SELECT statement:

Modifying Data in a View

As you have learned, by creating a view on one or more physical tables within a database, you can create a virtual table for use throughout an SQL script or a database application. After the view has been created using the CREATE VIEW...SELECT statement, you can update, insert, or delete view data using the UPDATE, INSERT, and DELETE commands you learned about on Day 8, "Manipulating Data."

We discuss the limitations on modifying a view's data in greater detail later. The next group of examples illustrates how to manipulate data that is in a view.

To continue on the work you did in Example 10.2, update the BILLS table to reflect that unfortunate 10 percent late charge.

INPUT/OUTPUT:
SQL> CREATE VIEW LATE_PAYMENT AS
  2  SELECT * FROM BILLS;

View created.

SQL> UPDATE LATE_PAYMENT
  2  SET AMOUNT = AMOUNT * 1.10;

1 row updated.

SQL>  SELECT * FROM LATE_PAYMENT;

NAME                         NEW_TOTAL     ACCOUNT_ID
Phone Company                137.50        1
Power Company                82.50         1
Record Club                  27.50         2
Software Company             275           1
Cable TV Company             38.50         3
Joe's Car Palace             385           5
S.C. Student Loan            220           6
Florida Water Company        22            1
U-O-Us Insurance Company     137.50        5
Debtor's Credit Card         38.50         4

10 rows selected.

To verify that the UPDATE actually updated the underlying table, BILLS, query the BILLS table:

INPUT/OUTPUT:
SQL> SELECT * FROM BILLS;

NAME                         NEW_TOTAL     ACCOUNT_ID
Phone Company                137.50        1
Power Company                82.50         1
Record Club                  27.50         2
Software Company             275           1
Cable TV Company             38.50         3
Joe's Car Palace             385           5
S.C. Student Loan            220           6
Florida Water Company        22            1
U-O-Us Insurance Company     137.50        5
Debtor's Credit Card         38.50         4

10 rows selected.

Now delete a row from the view:

INPUT/OUTPUT:
SQL> DELETE FROM LATE_PAYMENT
  2   WHERE ACCOUNT_ID = 4;

1 row deleted.

SQL> SELECT * FROM LATE_PAYMENT;

NAME                         NEW_TOTAL     ACCOUNT_ID
Phone Company                137.50        1
Power Company                82.50         1
Record Club                  27.50         2
Software Company             275           1
Cable TV Company             38.50         3
Joe's Car Palace             385           5
S.C. Student Loan            220           6
Florida Water Company        22            1
U-O-Us Insurance Company     137.50        5

9 rows selected.

The final step is to test the UPDATE function. For all bills that have a NEW_TOTAL greater than 100, add an additional 10.

INPUT/OUTPUT:
SQL> UPDATE LATE_PAYMENT
  2  SET NEW_TOTAL = NEW_TOTAL + 10
  3  WHERE NEW_TOTAL > 100;

9 rows updated.

SQL> SELECT * FROM LATE_PAYMENT;

NAME                         NEW_TOTAL     ACCOUNT_ID
Phone Company                147.50        1
Power Company                82.50         1
Record Club                  27.50         2
Software Company             285           1
Cable TV Company             38.50         3
Joe's Car Palace             395           5
S.C. Student Loan            230           6
Florida Water Company        22            1
U-O-Us Insurance Company     147.50        5

9 rows selected.

Problems with Modifying Data Using Views

Because what you see through a view can be some set of a group of tables, modifying the data in the underlying tables is not always as straightforward as the previous examples. Following is a list of the most common restrictions you will encounter while working with views:

Common Applications of Views

Here are a few of the tasks that views can perform:

Views and Security

Although a complete discussion of database security appears in Day 12, "Database Security," we briefly touch on the topic now to explain how you can use views in performing security functions.

All relational database systems in use today include a full suite of built-in security features. Users of the database system are generally divided into groups based on their use of the database. Common group types are database administrators, database developers, data entry personnel, and public users. These groups of users have varying degrees of privileges when using the database. The database administrator will probably have complete control of the system, including UPDATE, INSERT, DELETE, and ALTER database privileges. The public group may be granted only SELECT privileges--and perhaps may be allowed to SELECT only from certain tables within certain databases.

Views are commonly used in this situation to control the information that the database user has access to. For instance, if you wanted users to have access only to the NAME field of the BILLS table, you could simply create a view called BILLS_NAME:

INPUT/OUTPUT:
SQL>  CREATE VIEW BILLS_NAME AS
  2   SELECT NAME FROM BILLS;

View created.

Someone with system administrator-level privileges could grant the public group SELECT privileges on the BILLS_NAME view. This group would not have any privileges on the underlying BILLS table. As you might guess, SQL has provided data security statements for your use also. Keep in mind that views are very useful for implementing database security.

Using Views to Convert Units

Views are also useful in situations in which you need to present the user with data that is different from the data that actually exists within the database. For instance, if the AMOUNT field is actually stored in U.S. dollars and you don't want Canadian users to have to continually do mental calculations to see the AMOUNT total in Canadian dollars, you could create a simple view called CANADIAN_BILLS:

INPUT/OUTPUT:
SQL>  CREATE VIEW CANADIAN_BILLS (NAME, CAN_AMOUNT) AS
  2   SELECT NAME, AMOUNT / 1.10
  3   FROM BILLS;

View Created.

SQL> SELECT * FROM CANADIAN_BILLS;

NAME                         CAN_AMOUNT
Phone Company                125
Power Company                75
Record Club                  25
Software Company             250
Cable TV Company             35
Joe's Car Palace             350
S.C. Student Loan            200
Florida Water Company        20
U-O-Us Insurance Company     125

9 rows selected.
ANALYSIS:

When converting units like this, keep in mind the possible problems inherent in modifying the underlying data in a table when a calculation (such as the preceding example) was used to create one of the columns of the view. As always, you should consult your database system's documentation to determine exactly how the system implements the CREATE VIEW command.

Simplifying Complex Queries Using Views

Views are also useful in situations that require you to perform a sequence of queries to arrive at a result. The following example illustrates the use of a view in this situation.

To give the name of all banks that sent bills to the state of Texas with an amount less than $50, you would break the problem into two separate problems:

Let's solve this problem using two separate views: BILLS_1 and BILLS_2:

INPUT/OUTPUT:
SQL> CREATE TABLE BILLS1 AS
  2  SELECT * FROM BILLS
  3  WHERE AMOUNT < 50;

Table created.

SQL> CREATE TABLE BILLS2 (NAME, AMOUNT, ACCOUNT_ID) AS
  2  SELECT BILLS.* FROM BILLS, COMPANY
  3  WHERE BILLS.NAME = COMPANY.NAME AND COMPANY.STATE = "TX";

Table created.
ANALYSIS:

Because you want to find all bills sent to Texas and all bills that were less than $50, you can now use the SQL IN clause to find which bills in BILLS1 were sent to Texas. Use this information to create a new view called BILLS3:

INPUT/OUTPUT:
SQL> CREATE VIEW BILLS3 AS
  2  SELECT * FROM BILLS2 WHERE NAME IN
  3  (SELECT * FROM BILLS1);

View created.

Now combine the preceding query with the BANK_ACCOUNTS table to satisfy the original requirements of this example:

INPUT/OUTPUT:
SQL> CREATE VIEW BANKS_IN_TEXAS (BANK) AS
  2  SELECT BANK_ACCOUNTS.BANK
  3  FROM BANK_ACCOUNTS, BILLS3
  4  WHERE BILLS3.ACCOUNT_ID = BANK_ACCOUNTS.ACCOUNT_ID;

View created.

SQL> SELECT * FROM BANK_IN_TEXAS;

BANK
Credit Union

1 row selected.

ANALYSIS:

As you can see, after the queries were broken down into separate views, the final query was rather simple. Also, you can reuse the individual views as often as necessary.

The DROP VIEW Statement

In common with every other SQL CREATE... command, CREATE VIEW has a corresponding DROP... command. The syntax is as follows:

SYNTAX:
SQL> DROP VIEW view_name;

The only thing to remember when using the DROP VIEW command is that all other views that reference that view are now invalid. Some database systems even drop all views that used the view you dropped. Using Personal Oracle7, if you drop the view BILLS1, the final query would produce the following error:

INPUT/OUTPUT:
SQL> DROP VIEW BILLS1;

View dropped.

SQL> SELECT * FROM BANKS_IN_TEXAS;
*
ERROR at line 1:
ORA-04063: view "PERKINS.BANKS_IN_TEXAS" has errors


NOTE: A view can be dropped without any of the actual tables being modified, which explains why we often refer to views as virtual tables. (The same logic can be applied to the technology of virtual reality.)

Using Indexes

Another way to present data in a different format than it physically exists on the disk is to use an index. In addition, indexes can also reorder the data stored on the disk (something views cannot do).

Indexes are used in an SQL database for three primary reasons:

What Are Indexes?

Data can be retrieved from a database using two methods. The first method, often called the Sequential Access Method, requires SQL to go through each record looking for a match. This search method is inefficient, but it is the only way for SQL to locate the correct record. Think back to the days when libraries had massive card catalog filing systems. Suppose the librarian removed the alphabetical index cards, tossed the cards into the air, then placed them back into the filing cabinets. When you wanted to look up this book's shelf location, you would probably start at the very beginning, then go through one card at a time until you found the information you wanted. (Chances are, you would stop searching as soon as you found any book on this topic!)

Now suppose the librarian sorted the book titles alphabetically. You could quickly access this book's information by using your knowledge of the alphabet to move through the catalog.

Imagine the flexibility if the librarian was diligent enough to not only sort the books by title but also create another catalog sorted by author's name and another sorted by topic. This process would provide you, the library user, with a great deal of flexibility in retrieving information. Also, you would be able to retrieve your information in a fraction of the time it originally would have taken.

Adding indexes to your database enables SQL to use the Direct Access Method. SQL uses a treelike structure to store and retrieve the index's data. Pointers to a group of data are stored at the top of the tree. These groups are called nodes. Each node contains pointers to other nodes. The nodes pointing to the left contain values that are less than its parent node. The pointers to the right point to values greater than the parent node.

The database system starts its search at the top node and simply follows the pointers until it is successful.


NOTE: The result of a query against the unindexed table is commonly referred to as a full-table scan. A full-table scan is the process used by the database server to search every row of a table until all rows are returned with the given condition(s). This operation is comparable to searching for a book in the library by starting at the first book on the first shelf and scanning every book until you find the one you want. On the other hand, to find the book quickly, you would probably look in the (computerized) card catalog. Similarly, an index enables the database server to point to specific rows of data quickly within a table.

Fortunately, you are not required to actually implement the tree structure yourself, just as you are not required to write the implementation for saving and reading in tables or databases. The basic SQL syntax to create an index is as follows:

INPUT/OUTPUT:
SQL>  CREATE INDEX index_name
  2   ON table_name(column_name1, [column_name2], ...);

Index created.

As you have seen many times before, the syntax for CREATE INDEX can vary widely among database systems. For instance, the CREATE INDEX statement under Oracle7 looks like this:

SYNTAX:
CREATE INDEX [schema.]index
ON { [schema.]table (column [!!under!!ASC|DESC]
     [, column [!!under!!ASC|DESC]] ...)
   | CLUSTER [schema.]cluster }
[INITRANS integer] [MAXTRANS integer]
[TABLESPACE tablespace]
[STORAGE storage_clause]
[PCTFREE integer]
[NOSORT]

The syntax for CREATE INDEX using Sybase SQL Server is as follows:

SYNTAX:
create [unique] [clustered | nonclustered]
       index index_name
on [[database.]owner.]table_name (column_name
    [, column_name]...)
[with {fillfactor = x, ignore_dup_key, sorted_data,
      [ignore_dup_row | allow_dup_row]}]
[on segment_name]

Informix SQL implements the command like this:

SYNTAX:
CREATE [UNIQUE | DISTINCT] [CLUSTER] INDEX index_name
ON table_name (column_name [ASC | DESC],
               column_name [ASC | DESC]...)

Notice that all of these implementations have several things in common, starting with the basic statement

CREATE INDEX index_name ON table_name (column_name, ...)

SQL Server and Oracle allow you to create a clustered index, which is discussed later. Oracle and Informix allow you to designate whether the column name should be sorted in ascending or descending order. We hate to sound like a broken record, but, once again, you should definitely consult your database management system's documentation when using the CREATE INDEX command.

For instance, to create an index on the ACCOUNT_ID field of the BILLS table, the CREATE INDEX statement would look like this:

INPUT:
SQL> SELECT * FROM BILLS;
OUTPUT:
NAME                         AMOUNT     ACCOUNT_ID
Phone Company                125        1
Power Company                75         1
Record Club                  25         2
Software Company             250        1
Cable TV Company             35         3
Joe's Car Palace             350        5
S.C. Student Loan            200        6
Florida Water Company        20         1
U-O-Us Insurance Company     125        5
Debtor's Credit Card         35         4

10 rows selected.
INPUT/OUTPUT:
SQL> CREATE INDEX ID_INDEX ON BILLS( ACCOUNT_ID );

Index created.

SQL> SELECT * FROM BILLS;

NAME                         AMOUNT     ACCOUNT_ID
Phone Company                125        1
Power Company                75         1
Software Company             250        1
Florida Water Company        20         1
Record Club                  25         2
Cable TV Company             35         3
Debtor's Credit Card         35         4
Joe's Car Palace             350        5
U-O-Us Insurance Company     125        5
S.C. Student Loan            200        6

10 rows selected.

The BILLS table is sorted by the ACCOUNT_ID field until the index is dropped using the DROP INDEX statement. As usual, the DROP INDEX statement is very straightforward:

SYNTAX:
SQL> DROP INDEX index_name;

Here's what happens when the index is dropped:

INPUT/OUTPUT:
SQL> DROP INDEX ID_INDEX;

Index dropped.

SQL> SELECT * FROM BILLS;

NAME                         AMOUNT     ACCOUNT_ID
Phone Company                125        1
Power Company                75         1
Record Club                  25         2
Software Company             250        1
Cable TV Company             35         3
Joe's Car Palace             350        5
S.C. Student Loan            200        6
Florida Water Company        20         1
U-O-Us Insurance Company     125        5
Debtor's Credit Card         35         4

10 rows selected.
ANALYSIS:

Now the BILLS table is in its original form. Using the simplest form of the CREATE INDEX statement did not physically change the way the table was stored.

You may be wondering why database systems even provide indexes if they also enable you to use the ORDER BY clause.

INPUT/OUTPUT:
SQL> SELECT * FROM BILLS ORDER BY ACCOUNT_ID;

NAME                         AMOUNT     ACCOUNT_ID
Phone Company                125        1
Power Company                75         1
Software Company             250        1
Florida Water Company        20         1
Record Club                  25         2
Cable TV Company             35         3
Debtor's Credit Card         35         4
Joe's Car Palace             350        5
U-O-Us Insurance Company     125        5
S.C. Student Loan            200        6

10 rows selected.
ANALYSIS:

This SELECT statement and the ID_INDEX on the BILLS table generate the same result. The difference is that an ORDER BY clause re-sorts and orders the data each time you execute the corresponding SQL statement. When using an index, the database system creates a physical index object (using the tree structure explained earlier) and reuses the same index each time you query the table.


WARNING: When a table is dropped, all indexes associated with the table are dropped as well.

Indexing Tips

Listed here are several tips to keep in mind when using indexes:

Indexing on More Than One Field

SQL also enables you to index on more than one field. This type of index is a composite index. The following code illustrates a simple composite index. Note that even though two fields are being combined, only one physical index is created (called ID_CMPD_INDEX).

INPUT/OUTPUT:
SQL> CREATE INDEX ID_CMPD_INDEX ON BILLS( ACCOUNT_ID, AMOUNT );

Index created.

SQL> SELECT * FROM BILLS;

NAME                         AMOUNT         ACCOUNT_ID
Florida Water Company            20                  1
Power Company                    75                  1
Phone Company                   125                  1
Software Company                250                  1
Record Club                      25                  2
Cable TV Company                 35                  3
Debtor's Credit Card             35                  4
U-O-Us Insurance Company        125                  5
Joe's Car Palace                350                  5
S.C. Student Loan               200                  6

10 rows selected.

SQL> DROP INDEX ID_CMPD_INDEX;

Index dropped.
ANALYSIS:

You can achieve performance gains by selecting the column with the most unique values. For instance, every value in the NAME field of the BILLS table is unique. When using a compound index, place the most selective field first in the column list. That is, place the field that you expect to select most often at the beginning of the list. (The order in which the column names appear in the CREATE INDEX statement does not have to be the same as their order within the table.) Assume you are routinely using a statement such as the following:

SQL> SELECT * FROM BILLS WHERE NAME = "Cable TV Company";

To achieve performance gains, you must create an index using the NAME field as the leading column. Here are two examples:

SQL> CREATE INDEX NAME_INDEX ON BILLS(NAME, AMOUNT);

or

SQL> CREATE INDEX NAME_INDEX ON BILLS(NAME);

The NAME field is the left-most column for both of these indexes, so the preceding query would be optimized to search on the NAME field.

Composite indexes are also used to combine two or more columns that by themselves may have low selectivity. For an example of selectivity, examine the BANK_ACCOUNTS table:

ACCOUNT_ID     TYPE             BALANCE     BANK
1              Checking         500         First Federal
2              Money Market     1200        First Investor's
3              Checking         90          Credit Union
4              Savings          400         First Federal
5              Checking         2500        Second Mutual
6              Business         4500        Fidelity

Notice that out of six records, the value Checking appears in three of them. This column has a lower selectivity than the ACCOUNT_ID field. Notice that every value of the ACCOUNT_ID field is unique. To improve the selectivity of your index, you could combine the TYPE and ACCOUNT_ID fields in a new index. This step would create a unique index value (which, of course, is the highest selectivity you can get).


NOTE: An index containing multiple columns is often referred to as a composite index. Performance issues may sway your decision on whether to use a single-column or composite index. In Oracle, for example, you may decide to use a single-column index if most of your queries involve one particular column as part of a condition; on the other hand, you would probably create a composite index if the columns in that index are often used together as conditions for a query. Check your specific implementation on guidance when creating multiple-column indexes.

Using the UNIQUE Keyword with CREATE INDEX

Composite indexes are often used with the UNIQUE keyword to prevent multiple records from appearing with the same data. Suppose you wanted to force the BILLS table to have the following built-in "rule": Each bill paid to a company must come from a different bank account. You would create a UNIQUE index on the NAME and ACCOUNT_ID fields. Unfortunately, Oracle7 does not support the UNIQUE syntax. Instead, it implements the UNIQUE feature using the UNIQUE integrity constraint. The following example demonstrates the UNIQUE keyword with CREATE INDEX using Sybase's Transact-SQL language.

INPUT:
1> create unique index unique_id_name
2> on BILLS(ACCOUNT_ID, NAME)
3> go
1> select * from BILLS
2> go
OUTPUT:
NAME                         AMOUNT     ACCOUNT_ID
Florida Water Company        20         1
Power Company                75         1
Phone Company                125        1
Software Company             250        1
Record Club                  25         2
Cable TV Company             35         3
Debtor's Credit Card         35         4
U-O-Us Insurance Company     125        5
Joe's Car Palace             350        5
S.C. Student Loan            200        6

Now try to insert a record into the BILLS table that duplicates data that already exists.

INPUT:
1> insert BILLS (NAME, AMOUNT, ACCOUNT_ID)
2> values("Power Company", 125, 1)
3> go
ANALYSIS:

You should have received an error message telling you that the INSERT command was not allowed. This type of error message can be trapped within an application program, and a message could tell the user he or she inserted invalid data.

Example 10.3

Create an index on the BILLS table that will sort the AMOUNT field in descending order.

INPUT/OUTPUT:
SQL> CREATE INDEX DESC_AMOUNT
     ON  BILLS(AMOUNT DESC);

Index created.
ANALYSIS:

This is the first time you have used the DESC operator, which tells SQL to sort the index in descending order. (By default a number field is sorted in ascending order.) Now you can examine your handiwork:

INPUT/OUTPUT:
SQL> SELECT * FROM BILLS;

NAME                         AMOUNT     ACCOUNT_ID
Joe's Car Palace             350        5
Software Company             250        1
S.C. Student Loan            200        6
Phone Company                125        1
U-O-Us Insurance Company     125        5
Power Company                75         1
Cable TV Company             35         3
Debtor's Credit Card         35         4
Record Club                  25         2
Florida Water Company        20         1

10 rows selected.
ANALYSIS:

This example created an index using the DESC operator on the column amount. Notice in the output that the amount is ordered from largest to smallest.

Indexes and Joins

When using complicated joins in queries, your SELECT statement can take a long time. With large tables, this amount of time can approach several seconds (as compared to the milliseconds you are used to waiting). This type of performance in a client/server environment with many users becomes extremely frustrating to the users of your application. Creating an index on fields that are frequently used in joins can optimize the performance of your query considerably. However, if too many indexes are created, they can slow down the performance of your system, rather than speed it up. We recommend that you experiment with using indexes on several large tables (on the order of thousands of records). This type of experimentation leads to a better understanding of optimizing SQL statements.


NOTE: Most implementations have a mechanism for gathering the elapsed time of a query; Oracle refers to this feature as timing. Check your implementation for specific information.

The following example creates an index on the ACCOUNT_ID fields in the BILLS and BANK_ACCOUNTS tables:

INPUT/OUTPUT:
SQL> CREATE INDEX BILLS_INDEX ON BILLS(ACCOUNT_ID);

Index created.

SQL> CREATE INDEX BILLS_INDEX2 ON BANK_ACCOUNTS(ACCOUNT_ID);

Index created.

SQL> SELECT BILLS.NAME NAME, BILLS.AMOUNT AMOUNT, BANK_ACCOUNTS.BALANCE
  2  ACCOUNT_BALANCE
  3  FROM BILLS, BANK_ACCOUNTS
  4  WHERE BILLS.ACCOUNT_ID = BANK_ACCOUNTS.ACCOUNT_ID;

NAME                         AMOUNT     ACCOUNT_BALANCE
Phone Company                125        500
Power Company                75         500
Software Company             250        500
Florida Water Company        20         500
Record Club                  25         1200
Cable TV Company             35         90
Debtor's Credit Card         35         400
Joe's Car Palace             350        2500
U-O-Us Insurance Company     125        2500
S.C. Student Loan            200        4500

10 rows selected.
ANALYSIS:

This example first created an index for the ACCOUNT_ID on both tables in the associated query. By creating indexes for ACCOUNT_ID on each table, the join can more quickly access specific rows of data. As a rule, you should index the column(s) of a table that are unique or that you plan to join tables with in queries.

Using Clusters

Although we originally said that indexes can be used to present a view of a table that is different from the existing physical arrangement, this statement is not entirely accurate. A special type of index supported by many database systems allows the database manager or developer to cluster data. When a clustered index is used, the physical arrangement of the data within a table is modified. Using a clustered index usually results in faster data retrieval than using a traditional, nonclustered index. However, many database systems (such as Sybase SQL Server) allow only one clustered index per table. The field used to create the clustered index is usually the primary key field. Using Sybase Transact-SQL, you could create a clustered, unique index on the ACCOUNT_ID field of the BANK_ACCOUNTS table using the following syntax:

SYNTAX:
create unique clustered index id_index
on BANK_ACCOUNTS(ACCOUNT_ID)
 go

Oracle treats the concept of clusters differently. When using the Oracle relational database, a cluster is a database object like a database or table. A cluster is used to store tables with common fields so that their access speed is improved.

Here is the syntax to create a cluster using Oracle7:

SYNTAX:
CREATE CLUSTER [schema.]cluster
(column datatype [,column datatype] ... )
[PCTUSED integer] [PCTFREE integer]
[SIZE integer [K|M] ]
[INITRANS integer] [MAXTRANS integer]
[TABLESPACE tablespace]
[STORAGE storage_clause]
[!!under!!INDEX
| [HASH IS column] HASHKEYS integer]

You should then create an index within the cluster based on the tables that will be added to it. Then you can add the tables. You should add tables only to clusters that are frequently joined. Do not add tables to clusters that are accessed individually through a simple SELECT statement.

Obviously, clusters are a very vendor-specific feature of SQL. We will not go into more detail here on their use or on the syntax that creates them. However, consult your database vendor's documentation to determine whether your database management system supports these useful objects.

Summary

Views are virtual tables. Views are simply a way of presenting data in a format that is different from the way it actually exists in the database. The syntax of the CREATE VIEW statement uses a standard SELECT statement to create the view (with some exceptions). You can treat a view as a regular table and perform inserts, updates, deletes, and selects on it. We briefly discussed the use of database security and how views are commonly used to implement this security. Database security is covered in greater detail on Day 12.

The basic syntax used to create a view is

CREATE VIEW view_name AS
SELECT field_name(s) FROM table_name(s);

Here are the most common uses of views:

Indexes are also database design and SQL programming tools. Indexes are physical database objects stored by your database management system that can be used to retrieve data already sorted from the database. In addition, thanks to the way indexes are mapped out, using indexes and properly formed queries can yield significant performance improvements.

The basic syntax used to create an index looks like this:

CREATE INDEX index_name
ON table_name(field_name(s));

Some database systems include very useful additional options such as the UNIQUE and CLUSTERED keywords.

Q&A

Q If the data within my table is already in sorted order, why should I use an index on that table?

A An index still gives you a performance benefit by looking quickly through key values in a tree. The index can locate records faster than a direct access search through each record within your database. Remember--the SQL query processor doesn't necessarily know that your data is in sorted order.

Q Can I create an index that contains fields from multiple tables?

A No, you cannot. However, Oracle7, for instance, allows you to create a cluster. You can place tables within a cluster and create cluster indexes on fields that are common to the tables. This implementation is the exception, not the rule, so be sure to study your documentation on this topic in more detail.

Workshop

The Workshop provides quiz questions to help solidify your understanding of the material covered, as well as exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix F, "Answers to Quizzes and Exercises."

Quiz

1. What will happen if a unique index is created on a nonunique field?

2. Are the following statements true or false?

Both views and indexes take up space in the database and therefore must be factored in the planning of the database size.

If someone updates a table on which a view has been created, the view must have an identical update performed on it to see the same data.

If you have the disk space and you really want to get your queries smoking, the more indexes the better.

3. Is the following CREATE statement correct?

SQL> create view credit_debts as
     (select all from debts
     where account_id = 4);
4. Is the following CREATE statement correct?
SQL> create unique view debts as
     select * from debts_tbl;
5. Is the following CREATE statement correct?
SQL> drop * from view debts;
6. Is the following CREATE statement correct?
SQL> create index id_index on bills
     (account_id);

Exercises

1. Examine the database system you are using. Does it support views? What options are you allowed to use when creating a view? Write a simple SQL statement that will create a view using the appropriate syntax. Perform some traditional operations such as SELECT or DELETE and then DROP the view.

2. Examine the database system you are using to determine how it supports indexes. You will undoubtedly have a wide range of options. Try out some of these options on a table that exists within your database. In particular, determine whether you are allowed to create UNIQUE or CLUSTERED indexes on a table within your database.

3. If possible, locate a table that has several thousand records. Use a stopwatch or clock to time various operations against the database. Add some indexes and see whether you can notice a performance improvement. Try to follow the tips given to you today.


Previous chapterNext chapterContents


© Copyright, Macmillan Computer Publishing. All rights reserved.