-- 28 --

From There to Here: Database Migration

The world is filled with legacy database applications that need to be rightsized for a number of reasons. Mainframe-based applications are often downsized to provide greater functionality, lower connectivity costs, and lower maintenance costs. Small applications that support a small number of users--;or even a single user--;are upsized so that more users can benefit from the features that they offer. Another reason for database migration is the "year 2000 problem"--;a legacy database may use only two digits to store the year portion of a date, whereas Oracle stores both century and year.

Data migration is, indeed, an unending activity. When 90 percent of the legacy applications that were implemented in the early 1970s have been rightsized, programmers will face the task of migrating legacy applications from later in the decade. In fact, developing database applications is the process of creating tomorrow's legacy applications. Of course, unless your own money is funding the migration of an application, you must build a solid business case that justifies a return on the investment; the availability of a faster CPU or a glitzier user interface is not a sufficient reason for making the significant investment that is required to rightsize an information system.

The intent of this chapter is to focus on the issues that pertain to migrating an application to an Oracle database. You can't rely on a standard recipe to accomplish this job; every migration is somewhat unique. Instead of a recipe, I use this chapter to outline the major tasks that you'll face regardless of the application. Of course, the details of migrating/reengineering the application code is beyond the scope of this book.

What Is Migration?

When I use the term migration, I am referring to the process of converting a database application to the Oracle database. Obviously, I'm assuming that the application is using some other database for information storage. This database might fit one of the following categories:

For greater efficiency, you should have plenty of free disk space on your target platform and a network connection between the legacy platform and the target platform.

Migrating an application to an Oracle database consists of three main tasks:

The following sections look at each subtask in greater detail.

Migrating Database Structures

Migration is the first critical task that you must accomplish; reengineering is the second. Migration consists of creating an Oracle table that is equivalent in structure to each legacy table or file. I refer to these Oracle tables as the migrated tables. The role of the migrated tables is that of an intermediary--;they store the legacy data before it is cleaned up and restructured.

Reengineering consists of the following tasks:

Figure 28.1 provides a graphical representation of the migration and reengineering tasks.

Figure 28.1 The migration and reengineering tasks.

The legacy database structures that must be migrated depend on the type of database:

I highly recommend the use of a data modeling tool for the migration of a legacy database. You'll want a tool that offers broad support for desktop and server databases, particularly if you're working with a variety of database products. (Your improved productivity will quickly cover the cost of a data modeling tool.) You can think of a data modeling tool as "light" CASE. In other words, a data modeling tool provides valuable support during the analysis phase of a project without forcing you to make a commitment to an entire CASE life cycle.

In the next section, I use ERwin, produced by Logic Works, to support the task of database migration. Although many similar products are available, I've found ERwin to be particularly useful.

Migrating Database Structures Using ERwin ER Data Modeler

ERwin is a Windows-based tool and supports the IDEF1X data modeling standard. In IDEF1X, the logical view of a data model consists of entities, attributes, primary and foreign keys, relationships, and role names. The physical view consists of the tables, columns, and datatypes. ERwin enables you to define domains that can then be shared by several columns. You can switch between the logical and physical views of the data model. Based on both the logical and physical models, ERwin can generate a schema for the target database. The generated schema can be stored to a file or processed directly by the target database.

ERwin provides support for both reverse and forward engineering. Reverse engineering is the process of connecting to a database, capturing data dictionary information, and constructing a graphical representation of the data model. Forward engineering is the process of developing a data model and generating a schema for the target database.

The first step is to identify the target server by selecting the appropriate radio button. (See Figure 28.2.) To illustrate this process, I've selected one of the sample Watcom databases that comes with PowerBuilder.

Figure 28.2 Selecting the target server.

Once you've identified the target server, ERwin must connect to that database. Select Watcom Connection from the Server menu. ERwin displays a dialog box in which you specify the username, password, and database. (See Figure 28.3.)

Figure 28.3 Connecting to the target server.

Once ERwin is connected to the target server, synchronization may begin. ERwin uses the term synchronization to refer to the process of aligning the ERwin data model with that of the target server. The default option is to synchronize ERwin with the target server. Select Synchronize ERwin with Watcom from the Server menu. A list of unsynchronized Watcom tables appears on the right side of the window. (See Figure 28.4.)

Figure 28.4 Synchronizing ERwin with the WATCOM tables.

Highlight both tables--;sales_rep and sales_summary--;and click Import. The tables appear in the list box on the bottom of the window. (See Figure 28.5.) Click Execute to begin synchronization.

Figure 28.5 Selecting the tables to reverse engineer.

ERwin automatically generates a graphical layout of the target server's tables and captures any relationships that exist in the target server. Figure 28.6 illustrates how the sales_rep and sales_summary tables are related by the sales rep ID.

Figure 28.6 Reverse engineering layout of data model.

Now that you've captured the data model from the Watcom database, you may begin the process of generating the schema for Personal Oracle. Select Target Server from the Server menu. This time, select Oracle as the target server. (See Figure 28.7.)

Figure 28.7 Selecting Oracle as the target server.

To connect to Personal Oracle, select Oracle Connection from the Server menu. ERwin displays a dialog box in which you specify the username, password, and database. (See Figure 28.8.) For database, remember to specify oracle7.

Figure 28.8 Connecting to Personal Oracle.

To generate the schema, select Oracle Schema Generation from the Server menu. ERwin displays a window containing many options for schema generation. Using the default settings, click Generate to begin creating the sales_rep and sales_summary tables for Personal Oracle. (See Figure 28.9.)

Figure 28.9 Selecting the schema generation options.

ERwin displays each SQL statement submitted to Personal Oracle for execution. When it has finished generating the schema, ERwin displays the total number of statements that were processed successfully. (See Figure 28.10.)

Figure 28.10 Personal Oracle schema is successfully generated.

Moving the Data

Once you've migrated the database structures, the task of data conversion begins. This activity is typically a two-step process, and it is rarely straightforward. First, you transfer the contents of the legacy structures to the migrated Oracle tables. Second, you copy the contents of the migrated Oracle tables to the target Oracle tables. Along the way, however, you'll discover erroneous data, illegal values, foreign keys whose values don't exist as primary key values--;in short, it's a messy job, but somebody has to do it!

Many methods are available for moving the data. This chapter presents two different tools for accomplishing this task:

Using SQL*Loader

As you read in Chapter 24, "Saving and Loading Data," SQL*Loader is one of the Database Administration Tools that loads the contents of flat or ASCII files into an Oracle database. To use SQL*Loader for data migration, here is a procedure you can follow:

  1. Unload the data from the legacy database into one or more flat files using whatever legacy tools are at your disposal. Remember that the flat files might require a considerable amount of disk storage.
  2. Transfer the flat files to location that is accessible to SQL*Loader. One option is to store the files on a network file server.
  3. Use SQL*Loader to load the contents of each file into the Personal Oracle database.

Using the Data Pipeline in PowerBuilder 4.0/5.0

The data pipeline tool, which was introduced in version 4.0 of PowerBuilder, is flexible and intuitive. Because the data pipeline is a PowerBuilder system object, it has its own attributes, events, and functions; in other words, you can control the creation and execution of a data pipeline from a PowerBuilder application.

To create a data pipeline, click the Data Pipeline icon; it looks like two aqua cylinders connected by a pipe. A window appears from which you can select an existing data pipeline. (See Figure 28.11.) Instead, click New.

Figure 28.11 PowerBuilder: Creating a new data pipeline.

Using its existing database profiles, PowerBuilder displays two lists: a list of source databases and a list of destination databases. Because you're interested in transferring data from the Watcom database to Personal Oracle, select ABNC Sales DB (v4) as the Source Connection and PO7 as the Destination Connection; then click OK. (See Figure 28.12.)

Figure 28.12 Choosing the source and destination database profiles.

Another window appears with a list of the tables that exist in the source connection. Highlight the sales_rep table and click Open. (See Figure 28.13.)

Figure 28.13 Selecting a table for the data pipeline.

PowerBuilder displays a box containing the columns belonging to the sales_rep table. (See Figure 28.14.) By default, no columns are selected for the data pipeline. Click each column to select it; the column name appears at the top of the window in the Selection List. (See Figure 28.15.)

Figure 28.14 The table structure is displayed.
Figure 28.15 Choosing the columns to be included.

Select File [vb] Design. The next window contains the data pipeline description along with execution parameters. (See Figure 28.16.) For example, the Commit field contains the value 100, signifying that a SQL Commit will be performed for every 100 rows that are inserted. To execute the data pipeline, select Options [vb] Execute. When the data pipeline execution finishes, PowerBuilder displays the execution statistics at the bottom of the window. (See Figure 28.17.)

Figure 28.16 The data pipeline options are displayed.
Figure 28.17 The pipeline transfer is finished.

Transforming the Application Logic

Be warned: You cannot buy or build a silver bullet that will automatically and flawlessly translate the legacy application into a client-server application. True, some specialized tools address a specific category of legacy environments, but they can be very expensive; in addition, these tools might impose a methodology that doesn't match the needs of your development environment.

If you merely translate the legacy application logic to the target application development environment, you will have missed an important opportunity to reengineer the application. To provide a simple example, consider a legacy information system that can produce 90 different management reports. Instead of blindly converting each of these reports to the new development environment, take some time to investigate how many of these reports are actually read. The number could be much lower than you expect. In addition, you might need to reengineer some of the reports that are used to make them more useful for the system's users.

Obviously, the first step in migrating the legacy application logic is to become thoroughly familiar with it. Identify the business rules of the legacy application and, where appropriate, use PL/SQL to implement these rules in the form of database triggers and stored programs. Please refer to Chapter 23, "Enforcing Business Rules with Database Triggers," for more information.

Migrating from Another Oracle Database to Personal Oracle

Migrating from another Oracle database--;whether it's from the Oracle Enterprise Server or Oracle Workgroup Server--;is a fairly straightforward process. Use the Export tool to extract the database objects that you want to migrate and use the Import tool to load those database objects into the Personal Oracle database. Be aware that because Personal Oracle doesn't support distributed database operations (except for the Enterprise Edition), you won't be able to use objects such as database links or snapshots in Personal Oracle.

Application Development Environments:
Making an Appropriate Choice

I'm not suggesting that only the tools described in this book are worthy of consideration; other excellent products should also be evaluated. However, the viability of a development tool--;the quality of each new release, its compatibility with new releases of operating systems and databases, and the existence of add-on products from third-party vendors--;depends greatly on its acceptance by the market.

You should consider the following factors in choosing an application development environment:

Summary

The focus of this chapter is database migration.