Several examples in Chapter 6 "Creating Basic Queries," used subqueries to define subsets of data from which to draw records in the main SELECT statement. However, what if you need to combine the results of two separate SELECT statements?
Suppose that it is time to send out invitations to the Tasmanian Annual Appreciation Picnic. You want to send invitations to all current customers (ones who bought something this year), suppliers, and employees.
You have customer information stored in table CUSTOMER.DBF
of the database TASTRADE.DBC. Table 7.1 shows the appropriate
customer fields.
Customer_Id | Character | |
Contact_Name | Character | |
Company_Name | Character | |
Address | Character | |
City | Character | |
Region | Character | |
Postal_Code | Character | |
Country | Character |
Because you want to include only customers who made purchases in the current year, you need to use ORDERS.DBF to identify records for current year purchases. Then match the customer ID field, Customer_Id, with a record in CUSTOMER.DBF. The SQL SELECT statement in the following example extracts the needed records:
SELECT Customer.contact_name, Customer.company_name, Customer.address,; Customer.city, Customer.region, Customer.postal_code, Customer.country; FROM tastrade!customer INNER JOIN tastrade!orders ; ON Customer.customer_id = Orders.customer_id; WHERE YEAR(Orders.order_date) = 1998 ; INTO CURSOR MyResult
TIP |
By fully qualifying the table names, you can successfully run SELECTs from any directory. |
These commands display the records in a browse window. The first thing to observe is the use of an alias name, as defined in the database, rather than the table name when specifying the fields. Instead of repeating the table name with each field, you can use a local alias to reduce the amount of typing and the spelling errors associated with typing SQL SELECT statements. Note that this alias is independent of the work-area alias.
Observe also that the selected records do not appear in any particular order. If you want to see the customers in order by name, you need to add the following clause to the SELECT statement:
ORDER BY Customer.Contact_Name
After putting the records in order, you see that the names of many customers occur several times. You get this result because the SELECT statement includes a record for each customer order in ORDERS.DBF during 1995. Adding the DISTINCT clause at the beginning of the field list includes each customer one time only. The SELECT now looks like the following:
SELECT DISTINCT Customer.contact_name, Customer.company_name, ÂCustomer.address,; Customer.city, Customer.region, Customer.postal_code, Customer.country; FROM tastrade!customer INNER JOIN tastrade!orders ; ON Customer.customer_id = Orders.customer_id; WHERE YEAR(Orders.order_date) = 1995; ORDER BY Customer.contact_name ; INTO CURSOR MyResult
This command generates a list of all active customers in 1995, listed alphabetically by contact name. However, the ORDER BY clause is no longer needed. When you include DISTINCT, SELECT automatically orders the records alphabetically based on the field sequence. As long as you place the fields in order beginning with the first one you want to sort on, you do not need a separate ORDER BY clause.
You can perform the same SELECT with improved performance in yet another way. Rather than use SELECT DISTINCT, select all the records and then include a GROUP BY clause. When SELECT uses DISTINCT, it checks all the fields in the added record to see whether anything has changed. On the other hand, GROUP BY works with the result table and combines records with the same selected group field or fields. In this case, group on CONTACT_NAME. The resulting SELECT, shown in the following example, executes faster:
SELECT Customer.contact_name, Customer.company_name, Customer.address,; Customer.city, Customer.region, Customer.postal_code, Customer.country; FROM tastrade!customer INNER JOIN tastrade!orders ; ON Customer.customer_id = Orders.customer_id; WHERE YEAR(Orders.order_date) = 1995; GROUP BY Customer.contact_name ; INTO CURSOR MyResult
Using similar logic, you can retrieve records of suppliers from
which you purchased products during the year. Table 7.2 shows
the appropriate Supplier fields.
Supplier_Id | ||
Contact_Name | ||
Company_Name | ||
Address | ||
City | ||
Region | ||
Postal_Code | ||
Country |
The required SELECT statement to retrieve supplier names and addresses selects suppliers based on which products Tasmanian Traders sold during the year. Product-supplier information appears in the Products file. Therefore, you need to work from ORDERS, through ORDER_LINE_ITEMS, and then Products to identify the current suppliers. The following SELECT captures this information:
SELECT Supplier.contact_name, Supplier.company_name, ; Supplier.address, Supplier.city, Supplier.region, ; Supplier.postal_code, Supplier.country; FROM tastrade!supplier INNER JOIN tastrade!products; INNER JOIN tastrade!order_line_items; INNER JOIN tastrade!order_line_items ; ON Orders.order_id = Order_line_items.order_id ; ON Products.product_id = Order_line_items.product_id ; ON Supplier.supplier_id = Products.supplier_id; WHERE YEAR( Orders.order_date) = 1995; GROUP BY Supplier.contact_name ; INTO CURSOR MyResult
This SELECT is similar to the customer SELECT, except that it requires several files to determine which suppliers to invite. Both SELECT statements use fields in tables to select records for the result table, even though those fields do not appear in the result. This is not a problem. Selection criteria looks at the source tables, not the result table.
Finally, you need an employee list. Table 7.3 shows the appropriate
Employee fields. Unfortunately, the Employee table includes
only a hire date; it doesn't include a termination date.
Perhaps everyone is so happy working for Tasmanian Traders that
no one ever leaves, and all the employees are model employees.
LAST_NAME | ||
FIRST_NAME | ||
ADDRESS | ||
CITY | ||
REGION | ||
POSTAL_CODE | ||
COUNTRY |
This SELECT is the simplest of the three:
SELECT Employee.last_name, Employee.first_name, ; Employee.address, Employee.city, ; Employee.region, Employee.postal_code, ; Employee.country; FROM tastrade!employee; GROUP BY Employee.last_name, Employee.first_name ; INTO CURSOR MyResult
You can now run each of these three SELECT statements and obtain three separate mailing lists. However, what you really want is a single list. For that job, the UNION clause helps.
UNION combines information from two or more separate SELECT statements in a single cursor or table. To use UNION to combine the result sets of multiple SELECT commands, you must follow these rules:
Observe that the SELECTs for CUSTOMER and SUPPLIER have a contact name of 30 characters, which includes both the first and last names. On the other hand, EMPLOYEE uses a separate field for first and last names, although the sum of their lengths is also 30. Another difference is that CUSTOMER and SUPPLIER have a company field; EMPLOYEE does not.
To combine these result sets, you must reconcile these differences. Listing 7.1 combines the three SELECTs to create a single result table.
* Creates an annual picnic invitation list from customers, * suppliers, and employees for Tasmanian Traders * Create mailing list of employees, suppliers and customers SELECT Customer.contact_name, Customer.company_name, ; Customer.address, Customer.city, ; Customer.region, Customer.postal_code, ; Customer.country; FROM tastrade!customer INNER JOIN tastrade!orders; ON Customer.customer_id = Orders.customer_id; WHERE YEAR(Orders.order_date) = 1995; UNION ; SELECT SPACE(40) AS Company_Name, ; PADR(ALLTRIM(Employee.first_name) + ' ' + ; ALLTRIM(Employee.last_name),30) AS Contact,; Employee.address, Employee.city, Employee.region, ; Employee.postal_code, Employee.country ; FROM tastrade!employee ; UNION ; SELECT Supplier.contact_name, Supplier.company_name, ; Supplier.address, Supplier.city, Supplier.region, ; Supplier.postal_code, Supplier.country ; FROM tastrade!supplier INNER JOIN tastrade!products ; INNER JOIN tastrade!order_line_items ; INNER JOIN tastrade!order_line_items ; ON Orders.order_id = Order_line_items.order_id ; ON Products.product_id = Order_line_items.product_id ; ON Supplier.supplier_id = Products.supplier_id ; WHERE YEAR(Orders.order_date) = 1995 ; INTO CURSOR MyResult
Notice the use of the following clause:
SPACE(40) as Company
This expression is a placeholder that corresponds to the company field in the CUSTOMER and SUPPLIER files. This clause fills the company field in the combined result table with spaces because the EMPLOYEE file has no company field. Without it, Visual FoxPro cannot perform the UNION correctly, and an error occurs.
You must put a placeholder in any SELECT statement in which you need to satisfy the rule that all fields, data types, and widths must match. You can put it in the first SELECT statement of a group to reserve space for a field that exists only in later SELECTs. Alternatively, as in this example, you can include it in a later SELECT to match the columns of the first (master) SELECT.
To ensure that employees are not confused with customers or suppliers, you could replace the preceding clause with the following:
PADR("Tastrade",30) AS Company_Name
Next, the SELECT statement concatenates the employee first- and lastname fields to match the contact names in CUSTOMER and SUPPLIER. It uses ALLTRIM with the employee's first name to remove trailing blanks and uses the PADR() function to produce a field that is of equal length to the other SELECTs. However, a blank must then be added to separate it from the employee's last name. The employee first name can store up to 10 characters, and the last name can have 20 characters, according to the Employee table structure. Thus, with the addition of a blank between these two fields, the total field size might exceed the 30-character limit of CONTACT. In that case, Visual FoxPro truncates the last character.
You can define placeholders for variables of other types as well. The following are some examples:
.T. AS InStock 000.00 AS UnitPrice 00000 AS OnHand {//} AS OrderDate
Observe that in the case of numeric values, the picture used determines the size of the field and number of decimal places.
Sometimes, a field exists in all SELECTs combined with UNIONs, but the field sizes differ. Suppose, for the sake of this example, that the Employee files use a 30-character address field. The Address field in CUSTOMER and SUPPLIER has 60 characters. Although you do not have to do anything when subsequent fields are smaller, you could pad the employee address in the third SELECT with 30 trailing blanks, as follows:
PADR(AEmployee.Address,30) AS Address
TIP |
Make sure that field definition in the first SELECT is large enough for its associated fields in the other SELECTs. If it is not, Visual FoxPro truncates the data. |
NOTE |
A set of SELECT statements combined by multiple instances of UNION is actually one line of code. Therefore, any error occurring anywhere within a set of SELECTs connected with UNION causes the entire statement to fail. Often, you cannot easily determine which SELECT caused the failure. Therefore, consider testing each SELECT individually before combining them with UNION. |
The following are some other considerations in combining SELECTs with UNION:
CAUTION |
Surprisingly, you can put a single ORDER BY or INTO clause in any of the SELECT statements, and Visual FoxPro uses it appropriately for the entire result table. However, if more than one SELECT has an ORDER BY or INTO clause, Visual FoxPro displays the nonspecific error message Unrecognized phrase/keyword in command. |
NOTE |
To sort the output of the company picnic list, you could include the clause ORDER BY Company_Name. However, the program would fail. In individual SELECT statements, you can sort the result table by adding an ORDER BY clause, followed by the name of the field, as follows: |
Perhaps you also noticed that the records in the result set were already sorted by contact name. Whenever you create a UNION between two or more SELECTs, and you do not specify a sort order, Visual FoxPro automatically sorts them, using the selected field order to define a default sort. Thus, the Tasmanian Trader picnic SELECT sorts by contact.
Unlike the basic SELECT statement, which automatically includes duplicate records as long as they match the selection criteria, UNION performs an automatic DISTINCT. This means that it checks each record added to the result table to ensure that no other records in the result table match it exactly, field for field. Obviously, this process takes additional time. To help, VFP places the records in a default sorted order if the code does not specify an order.
If you know that the SELECT statements do not create duplicate records, or you don't care about the duplicates that might be created, you can replace UNION with UNION ALL. Adding ALL eliminates the test for duplicates, thus reducing the overall execution time.
Visual FoxPro uses UNION only between queries, not subqueries. Suppose that you want to see the names of all employees who do not live in a country where you have suppliers or customers. (Pink-slip time!) You might want to perform the following query:
SELECT Em.First_Name, Em.Last_Name ; FROM \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!EMPLOYEE Em ; WHERE Country NOT IN (SELECT customer.Country ; FROM \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!CUSTOMER, ; UNION ; SELECT supplier.Country FROM \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!SUPPLIER) ; INTO CURSOR MyResult
Visual FoxPro does not support this use of UNION. In fact, it generates the error SQL Invalid use of union in subquery. Rather, you need to ask the following:
SELECT Em.First_Name, Em.Last_Name ; FROM EMPLOYEE Em ; WHERE Country NOT IN (SELECT customer.Country) ; FROM \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!CUSTOMER) ; OR Country NOT IN (SELECT supplier.Country ; FROM \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!SUPPLIER) ; INTO CURSOR MyResult
Following is a summary of Visual FoxPro's UNION rules:
Actually, you have been creating inner joins for the past chapter and a half. An inner join includes only the records from each table that match a join condition.
The following SQL SELECT command from Visual FoxPro 6 shows the syntax that supports the JOIN argument:
SELECT Customer.contact_name, Customer.company_name, ; Customer.address, Customer.city, ; Customer.region, Customer.postal_code, ; Customer.country; FROM tastrade!customer INNER JOIN tastrade!orders; ON Customer.customer_id = Orders.customer_id ; INTO CURSOR MyResult
These SELECTs include records for customer IDs that exist in both CUSTOMER.DBF and ORDERS.DBF.
In an inner join, Visual FoxPro gets from the first table the records that have at least one corresponding record in the second table.
Suppose that you want to create a SQL SELECT that sums the quantity of each product sold by Tasmanian Traders. The following SELECT counts the sales, using ORDER_LINE_ITEMS:
SELECT Oi.Product_ID, SUM(Oi.Quantity) AS Total_Sales, ; Pr.Product_Name ; FROM \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!ORDER_LINE_ITEMS Oi, ; \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!PRODUCTS Pr ; WHERE Oi.Product_ID = Pr.Product_ID ; GROUP BY Oi.Product_ID ; INTO CURSOR MyResult
The problem with this SELECT is that it includes only records for items with sales. You might have products without sales. To include them, you need an outer join.
Outer joins come in three flavors: left outer join, which includes all records for the table on the left side of the join statement and only matching records from the table on the right; right outer join, which includes all records from the table on the right side of the join condition and only matching records from the table on the left; and full outer join, which includes all records from both tables and matches up those that it can.
Left Outer Join One way to implement a left outer join is to perform two SELECTs: one for all records with children and one for all records without children. Then combine the results of these two SELECTs with a UNION. Listing 7.2 creates the necessary list.
NOTE |
To show that this example really does include products without sales, modify the PRODUCTS.DBF table in \VFP\SAMPLES\DATA before running this program. Add a few records with new Product_ID values. Include at least product names (for example, Discontinued Product #1) to help you identify the ones that were added. |
* Creates an outer-join to list all products and their sales. SELECT Oi.Product_ID, SUM(Oi.Quantity) AS TotalSales, ; Pr.Product_Name ; FROM \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!ORDER_LINE_ITEMS Oi, ; \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!PRODUCTS Pr ; WHERE Oi.Product_ID = Pr.Product_ID ; GROUP BY Pi.Product_ID ; UNION ALL ; SELECT Pr.Product_ID, 0, Pr.Product_Name ; FROM \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!PRODUCTS Pr ; WHERE Product_ID NOT IN (SELECT DISTINCT Product_ID ; FROM \VFP\SAMPLES\MAINSAMP\DATA\TASTRADE!ORDER_LINE_ITEMS) ; INTO CURSOR MyResult */ Using a JOIN clause SELECT Products.product_id,; SUM(Order_line_items.quantity) AS totalsales, Products.product_name; FROM tastrade!products LEFT OUTER JOIN tastrade!order_line_items ; ON Products.product_id = Order_line_items.product_id; GROUP BY Products.product_id ; INTO CURSOR MyResult
TIP |
A left outer join should never have an overlap of records. Therefore, use UNION ALL to optimize performance when you're combining the two record sets. |
Right Outer Join A right outer join is required when you want all the records from the table on the right side of the join criteria. The following example shows the use of a right outer join:
SELECT Customer.Cust_id,; SUM(Invoice.Amount) AS totalsales ; FROM Customer RIGHT OUTER JOIN Invoices ; ON Customer.Cust_id = Invoices.Cust_id; GROUP BY Cust_id ; INTO CURSOR MyResult
NOTE |
A right outer join combines all the records that satisfy the relational criteria, as well as all child records without parent records. |
This example will produce a result that includes only the names of customers who have invoices, but it also includes any invoices that have no corresponding customer records.
NOTE |
You might object to the right outer join, saying that sound referential integrity would prevent "orphaned" invoice records. The answer to your objection would lie in the business rules for the system. Some businesses might, for instance, want to record cash sales as invoices without customers. |
Full Outer Join What happens if you want all customers and all invoices, and you want the ones that match to be combined? The answer lies in the full outer join. The following example shows the full outer join syntax:
SELECT Customer.Cust_id,; SUM(Invoice.Amount) AS totalsales ; FROM Customer FULL OUTER JOIN Invoices ; ON Customer.Cust_id = Invoices.Cust_id; GROUP BY Cust_id ; INTO CURSOR MyResult
This type of join will produce a set of records that includes all customers and all invoices, and for those invoices that have customers, the data will be combined into one record.
A self-join is a query that needs to form a relation between two fields in the same table. See the section "Creating Self-Referencing Relations" in Chapter 4 "Advanced Database Management Concepts"; there, you used a self-join in an employee file to find the names of each employee's supervisor. Listing 7.3 shows the necessary SELECT statement to generate the required result.
* Creates a self-join to find the suppliers * of both products 16 and 17. SELECT E1.cEmpId, ; E1.cLastName AS Employee_Last_Name, ; E1.cFirstName AS Employee_First_Name, ; E2.cLastName AS Supervisor_Last_Name, ; E2.cFirstName AS Supervisor_First_Name ; FROM \USINGFOX.300\DATA\PTOFSALE!EMPL2 E1, ; \USINGFOX.300\DATA\PTOFSALE!EMPL2 E2 ; WHERE E1.cSupervisor = E2.cEmpId ; INTO CURSOR MyResult
Observe that to form a self-join, you must open the table more than once, with a different alias for each occurrence. This example opens the file EMPL2.DBF, using the simple character aliases E1 and E2. It then forms a relation between the supervisor ID in one instance and the employee ID in the other.
Another situation that calls for a self-join occurs when you need to find records that match two or more occurrences of the same field. Suppose that you want to know which Tasmanian Traders supplier, if any, provides both product ID 16 and 18. Because this request requires comparing products across multiple records, a simple query will not solve it. Listing 7.4 opens PRODUCT.DBF twice, once to find each product. If the supplier ID for both is the same, it retrieves the name of the supplier for the result table.
* Creates a self-join to find the suppliers of both products 16 and 18. * Ensure the Products.Supplier_Id for both products 16 and 18 are the same. SELECT Pa.Product_Name, Pb.Product_Name, ; Su.Company_Name, Pa.Product_ID, Pb.Product_id ; FROM TASTRADE!PRODUCTS Pa,; TASTRADE!PRODUCTS Pb,; TASTRADE!SUPPLIER Su ; WHERE Pa.Product_ID = ' 16' AND ; Pb.Product_ID = ' 18' AND ; Su.Supplier_ID = Pb.Supplier_ID AND ; Pa.Supplier_ID = Su.Supplier_ID ; INTO CURSOR MyResult * Visual FoxPro 6.0 CLOSE ALL SELECT 0 USE products ALIAS products SELECT 0 USE products ALIAS products_a AGAIN SELECT 0 USE supplier SELECT Products.product_name, Products_a.product_name, ; Supplier.company_name, Products.product_id, Products_a.product_id; FROM tastrade!products INNER JOIN tastrade!supplier; INNER JOIN tastrade!products Products_a ; ON Products.supplier_id ==Supplier.supplier_id ; ON Products.supplier_id == Products_a.supplier_id; WHERE Products.product_id == ' 16'; AND (Products_a.product_id == ' 18'); ORDER BY Products.product_id ; INTO CURSOR MyResult
A poorly designed query can require minutes or even hours to return a result set that a properly designed query can return in seconds. This section examines techniques that improve the performance time of queries.
Rushmore can improve the performance of most queries when used properly. However, many developers don't understand how Rushmore really works; therefore, they create queries that do not perform at their optimal level.
Rushmore uses existing indexes whenever possible to process a query. If an index does not exist, Rushmore creates a "virtual" index in memory for the request. However, creating an index takes more time than using an existing one.
Many developers, knowing that Rushmore uses indexes, try to "help" it along by setting the order of the tables before running the query. In commands that use the Rushmore optimizable FOR clauses, such as BROWSE, LIST, and SCAN, this practice slows the command. It does not turn off Rushmore, however. Rushmore finds the records that match the criteria; then it has to go back to the index to determine the order in which to display them.
When you create queries using SELECT, Visual FoxPro ignores any established order for selected tables. Therefore, you do not need to go back and turn off indexes to benefit from Rushmore. But setting them doesn't help either. You still have to phrase all WHERE clause criteria using Rushmore-optimizable clauses for the best performance.
The main criterion in determining whether a clause is Rushmore-optimizable is whether it exactly matches a current index expression. If it does, Rushmore optimizes the expression. For this reason, many developers create a separate index tag on each field in the table (this process is called inverting the table). Consider this example:
INDEX ON Company_Name TAG company INDEX ON Employee_ID TAG employee
Here, Visual FoxPro must update each index every time it adds, changes, or deletes a record. Alternatively, you can define a concatenated index like the following:
INDEX ON Company_Name + Employee_Id TAG employee
This index performs searches only slightly slower than indexes on individual fields. It reduces the total number of indexes that Visual FoxPro must maintain, however. If you look at your applications, you usually don't need indexes on every field. In fact, only a few indexes might really be necessary to form relations and queries. A good goal is to minimize the total number of indexes while providing an index for every defined need.
On the other hand, if you need individual indexes on Company and Employee ID, don't create a third index on the concatenation of the two; doing so actually slows Rushmore.
Even a seemingly minor change to the index expression can turn off Rushmore. Using the preceding index expression on Company_Name, for example, Rushmore will not optimize the following expression:
UPPER(Company_Name) = 'FRANS'
The function UPPER() invalidates the use of Rushmore. On the other hand, if you know that the application stores all company names in uppercase, Rushmore will optimize the following expression:
Company_Name = UPPER('frans')
TIP |
Don't control case in conditional statements. Use the InputMask or Format properties of the data input objects to control case. |
If the WHERE clause contains several conditions connected with AND, Rushmore optimizes each condition separately. This situation can result in a partially optimized expression when all the expressions connected with AND are Rushmore-optimizable.
Unless you want to display all records in the table regardless of their delete status, you should create an index on DELETED(). Rushmore uses this index to determine which records it can use when SET DELETED ON is set. If Rushmore has to read the table to determine the delete flag on each record, the SELECT cannot be fully optimized.
NOTE |
You might not have realized that deleted records can cause a problem for Rushmore. SET DELETED ON is equivalent to SET FILTER TO NOT DELETED(). Remember that the command SET DELETED ON can be set elsewhere in the current program, a previous program, the FoxPro CONFIG.FPW file, or even interactively. If you want to skip deleted records and want optimal performance from Rushmore, you must have a tag defined on the DELETED() function. |
Rushmore will not use indexes that contain NOT or FOR when optimizing an expression. However, you can use NOT in the condition. For example, you can have a SELECT like the following:
SELECT Company_Name FROM customer WHERE NOT (State='TX')
As long as an index exists on the field STATE, Rushmore will optimize the expression.
Sometimes, you can benefit from previous Rushmore optimization. Suppose that you begin by searching the ORDERS table for all orders in 1995, as follows:
BROWSE FOR YEAR(Order_Date) = 1995
Next, suppose that you need only orders from customer 'FRANS' in 1995:
BROWSE FOR YEAR(Order_Date) = 1995 AND Customer_ID = 'FRANS'
Now, assume that the following indexes exist:
INDEX ON YEAR(Order_Date) TAG year INDEX ON Customer_ID TAG customer
In this case, Rushmore optimizes the first expression, finding all orders for 1995. When it begins the second browse, it recognizes that it already has information about orders in 1995. Therefore, it examines those records only for customer 'FRANS'.
The biggest potential danger with Rushmore is that it creates a solution set for any given optimizable expression only once. If you use a SCAN FOR clause, Rushmore determines which records to process the first time it executes the FOR. In a shared environment, another user could make a change that would affect Rushmore's solution set while SCAN is processing. However, because Rushmore does not check for changes, you might process the wrong records. For the average application, the benefits of using Rushmore outweigh this remote but possible problem. However, you should be aware of it if you have very high transaction rate tables. If you decide that the potential danger is too great, turn optimization off by using the NOOPTIMIZE option on the command you are using to get the data.
Suppose that you need to reference selected records from a single table and need no special column functions, groups, or orders. SELECT creates a special cursor that effectively reopens the table in a different work area and applies a filter to it. It performs this activity almost instantaneously. Therefore, rather than use this code
SELECT orders SCAN FOR Customer_Name = 'TAZMAN' << commands that process each selected order >> ENDSCAN
you might instead use this code
SELECT * FROM orders WHERE Customer_Name = 'TAZMAN' SCAN << commands that process each selected order >> ENDSCAN
NOTE |
Rushmore will not use a UNIQUE index, and you shouldn't, either. In Chapter 3 "Defining Databases, Tables, and Indexes," I explained why UNIQUE indexes are not recommended except for the rare situation in which you might create one, use it, and then erase it. |
All too often, it seems easy to just include the * character in a SELECT statement to include all the fields in a table. However, you should include only the fields that you absolutely need. The most time-consuming aspect of any SELECT is getting the data for the result, and the fewer fields you ask for, the faster the SELECT will get them.
If you want only DISTINCT records, SELECT compares every field in every record to determine whether the new record is distinct. This process consumes time. Reducing the number of fields in the SELECT reduces the number of comparisons. However, a better solution is to use the GROUP BY clause wherever possible.
Another trap is the assumption that forming relations between SELECT results and other existing tables is easier than including all the fields in the SELECT. Actually, the most likely reason for doing so is reluctance to include all the field references. It is true that large multiple-table SELECTs consume a great deal of memory and often need to be stored partially on disk. Further, the more fields included in the SELECT, the more memory it needs or the more disk access time it requires. Of course, any disk access slows a SELECT considerably. You might be tempted to include only those fields that are necessary to uniquely identify the records in the SELECT and then form relations to other physical tables. This solution generally is not good. The main reason is that accessing all the other physical tables to form and access the relations definitely involves slower disk access.
Cross-tabs are special types of queries in which you define column and row headings of a table and calculate results for the intersecting cells. Suppose that you want to create a monthly sales summary, by customer, for Tasmanian Traders. This report needs to display 12 column headings, each one representing a different month. Each row represents a different customer. You probably would not normally store a table with a structure like the following:
CustomerId Character 8 JanuarySales Numeric 8 2 FebruarySales Numeric 8 2 MarchSales Numeric 8 2 AprilSales Numeric 8 2 MaySales Numeric 8 2 JuneSales Numeric 8 2 JulySales Numeric 8 2 AugustSales Numeric 8 2 SeptemberSales Numeric 8 2 OctoberSales Numeric 8 2 NovemberSales Numeric 8 2 DecemberSales Numeric 8 2
Instead, the data file looks like this:
CustomerId Character 8 SalesDate Date 8 SalesAmt Numeric 8 2
How do you get from the first form to the second? Visual FoxPro includes a special program called VFPXTAB, which converts a SELECT cursor that contains the necessary information to a cross-tab table. To gather the necessary information for the cross-tab, you first need a cursor with three fields:
The first SELECT statement in Listing 7.5 gathers the required data for 1994 and stores it in a cursor named MyTab. The second SELECT creates the cross-tab, using the data from MyTab to provide the row, column, and data. Function SYS(2015) generates a unique name for the cursor.
* Creates a cross-tab */ Create the source information for row, column, and data SELECT Or.Customer_id, Pe.Monthid, ; (Oi.Unit_Price*Oi.Quantity) AS Order_Net ; FROM TASTRADE!ORDERS Or, ; TASTRADE!ORDER_LINE_ITEMS Oi, ; PERIOD Pe ; WHERE YEAR(Oi.Order_Date) = 1994 AND ; MONTH(Oi.Order_Date) = Pe.MonthId AND ; Oi.Order_Id = Or.Order_Id ; GROUP BY Oi.Customer_Id, Pe.MonthId ; INTO CURSOR MyTab */ Create the cross-tab SELECT MyTab.customer_i, MyTab.monthid, SUM(MyTab.order_net); FROM MyTab; GROUP BY MyTab.customer_i, MyTab.monthid; ORDER BY MyTab.customer_i, MyTab.monthid; INTO CURSOR MyXTab DO (_GENXTAB) BROWSE NOMODIFY
NOTE |
In Listing 7.5, you might notice the use of _GENXTAB. This system memory variable, which is created by Visual FoxPro, holds the name of the program to use for creating cross-tabs. The default program is VFPXTAB.PRG, but you can replace that program with one you've written by adding the following in the command window: |
This SELECT requires a special table that is not included
with the Tasmanian Trader example. You can quickly create this
table, based on the structure shown in Table 7.4. Table 7.5 shows
the contents of all 12 records. Place this table in directory
\USINGFOX.500\DATA.
MonthID | Numeric | |
MonthName | Character |
January | ||
February | ||
March | ||
April | ||
May | ||
June | ||
July | ||
August | ||
September | ||
October | ||
November | ||
December |
The SELECT in Listing 7.5 creates a cursor with one record for each customer-month combination. It first has to link ORDERS with ORDER_LINE_ITEMS to calculate the product of the Unit_Price and Quantity. It stores this product in the result set field ORDER_NET. The GROUP BY clause then sums ORDER_NET if the customer ordered more than one item in any month.
The result table from this SELECT is not a cross tabulation. However, you can create a cross-tab report with it if you sort and group by customer ID. First, you need to define report variables such as the following:
JanSales = IIF(MonthName = 'JANUARY', Order_Net, 0)
This expression totals sales for January. You need to create 11 similar expressions, defining total sales for other months. Then add these report variables to the customer ID group footer, and set the calculation method to SUM. This report does not need detail lines-only group footers.
This solution works for a report. However, displaying the information in a form, using the results with another table, or generating a graph would not be as easy. For that reason, you need to use VFPXTAB.
VFPXTAB reads the cursor created by the preceding SELECT and determines the number of distinct columns. Next, it creates a new table with the same first column (Customer_Id, in this case) and columns for each distinct value in the cursor's second field. Then it creates one record for each customer and puts the corresponding total sales in the correct column. The net result is a cross-tab table with 13 columns and 1 row for each customer. Now you can directly use this table to generate reports or graphs.
To execute VFPXTAB, you can call it directly, as follows:
DO \VFP\VFPXTAB
This statement assumes that the root directory for Visual FoxPro is \VFP\. If your root directory is different, adjust this statement appropriately. If you have never run VFPXTAB before, Visual FoxPro must first compile it.
VFPXTAB uses, as input, the table in the current work area. Therefore, the table does not have to be named in the command that executes VFPXTAB. However, VFPXTAB has nine other possible parameters. All these parameters have default values and can be omitted. The parameters are as follow:
Parm1 Specifies the output file/cursor name (default: XTAB.DBF) Parm2 Creates a cursor only (default: .F.) Parm3 Closes the input table after use (default: .T.) Parm4 Shows a progress thermometer (default: .T.) Parm5 Specifies the row field (field number to use as row) (default: 1) Parm6 Specifies the column field (field number to use as column) (default: 2) Parm7 Specifies the data field (field number to use as data) (default: 3) Parm8 Calculates the row total ) (default: .F.) Parm9 Indicates the totaling options (0: sum, 1: count, 2: % of total)
The following command creates a cross-tab with one row per customer; then it sums all the customer's sales by month and displays one column per month:
DO \VFP\VFPXTAB WITH 'CUSTSALE.DBF', .F., .F., .T., 1, 2, 3, .T.
The eighth parameter has been set to .T.; thus, it creates an additional column at the end to total all the sales for the preceding columns. In this case, because the SELECT limited records to a single year, this column represents the annual sales to each customer.
When you add 2 as the ninth parameter, the cross-tab calculates the percentage of total sales attributed to each customer during the year. Then you can sort the resulting cross-tab table to display the customers in order of total sales percentage, as follows:
DO \VFP\VFPXTAB WITH 'CUSTSALE.DBF', .F., .F., .T., 1, 2, 3, .T., 2
CAUTION |
A maximum of 254 unique values are allowed for the "columns" side of the cross-tab. |
Views and queries are almost the same thing. The principal difference is that you can use views to update data in the source tables. Also, Visual FoxPro stores views in the database, not as separate .QPR files; as a result, you can access them only while the database is open. Because you can update data in a view, views provide an excellent way to access and edit data stored in multiple related files. They also provide access to remote data on the same machine or on a remote server. When you are working with server data, you do not have to download all the records to your local machine to accomplish a change. The view can retrieve a subset of data, process it, and return it.
This section starts by showing you how to create views of local data. Creating a view is similar to creating a query. You can create a view in two primary ways:
Both methods open the View Designer. The View Designer is similar to the Query Designer, with the addition of an Update Criteria Page. Suppose that you want to create a view between tables CUSTOMER and ORDERS in Tasmanian Traders. Figure 7.1 shows the opening screen of the View Designer, with the Add Table or View dialog box open and ready to add the second table.
The first five pages and the last page in the page frame provide the same options as queries. For further information about how to use these pages, refer to Chapter 6 "Creating Basic Queries."
The first page, shown in Figure 7.2, displays a list of selected fields.
You can add or remove fields from this list by doing the following:
The order in which you select fields becomes the fields' default order in the selected output list. You can easily change this order by clicking and dragging the button to the left of each field name, moving it up or down in the list. Defining the field order is important because it defines the field order in the result set.
Figure 7.3 shows the Properties dialog box that appears when you click the Properties button in the Fields page. In this dialog box, you can specify validation, display, and mapping options, as well as enter a comment for each field in the view.
Figure 7.3 : The View Field Properties dialog box provides field-property options.
On the Join page of the View Designer, you can set up inner, left, right, and full joins when a view or query has more than one table (see Figure 7.4).
There are two check boxes on the Miscellaneous tab of the View Designer. The one labeled No Duplicates creates a DISTINCT result set, and the one labeled Cross Tabulate creates a cross-tab result. Thus, you can use the Query or View Designer to create the necessary intermediate table for VFPXTAB, described in the preceding section. This check box is disabled if the memory variable _GENXTAB is empty (Visual FoxPro, by default, points _GENXTAB to the VFPXTAB program). You also have to select at least three fields.
On the Filter page, you enter the conditions that are expressed in the WHERE clause of the SELECT statement (see Figure 7.5). Fields do not have to be included in the Selected Fields list to be used in filter conditions.
Figure 7.5 : Exclude records from the view by selecting filter conditions.
The Order By page (see Figure 7.6) determines a sort order for the fields. Order criteria enable you to establish a sort order based on selected fields and expressions. You can sort each selected field in ascending or descending order. The order of the fields in the Ordering Criteria list determines the sort hierarchy. To change the sort hierarchy, simply click and drag the button to the left of each field name.
The Group By page (see Figure 7.7) provides an alternative to using No Duplicates.
The section "Using DISTINCT Versus GROUP BY" earlier in this chapter describes the advantage of using GROUP BY over the DISTINCT clause in SQL statements. The Group By page also enables you to add a HAVING clause by clicking the Having button. Remember that the HAVING clause operates on the selected records in the result set. Therefore, you can use GROUP BY to collect and sum records by customer ID. Then you can select customers who made total purchases of more than $1,000 by using a HAVING clause such as the following:
HAVING TotalPurchases > 1000
For a local table view, Visual FoxPro supports only buffering. All views are buffered with optimistic row buffering by default. By using the CURSORSETPROP() function, however, you can change the buffering mode to one of the following values:
1 Pessimistic row buffering on
2 Optimistic row buffering on (Default)
3 Pessimistic table buffering on
4 Optimistic table buffering on
NOTE |
You cannot turn buffering off for a view. |
To use buffering, you must first turn on MULTILOCKS. Then, using the CURSORSETPROP() function, you can select one of the buffering methods. The following code turns on optimistic table buffering for the CUSTOMER table:
SET MULTILOCKS ON CURSORSETPROP('BUFFERING', 5, 'CUSTOMER')
Notice that this function requires three parameters. The first tells CURSORSETPROP() which cursor property you want to change; in the example, we are changing the BUFFERING property. The second is a numeric value to identify the new value for the property; acceptable values depend on the property being changed. The last parameter identifies the alias to be affected.
NOTE |
The Miscellaneous page of the View Designer contains check boxes for the duplicate-records and cross-tabulate options. The cross-tabulate option is enabled when you select three fields. These fields must represent the X axis, the Y axis, and data for the cross-tab. Record selections include all records, a specified number of values, or a percentage of the values that meet the selection criteria. (See Figure 7.8.) |
The Update Criteria page of the View Designer page frame contains options that are specifically related to the capability of views to update the data that they represent. Figure 7.9 shows the options on this page.
Within a view, you can control which tables and fields the user can update and how Visual FoxPro performs the updates. By default, Visual FoxPro prohibits all field updates in the view. To enable updates, first select the table to which the field belongs, using the table combo box in the upper-left corner of this page. You can select individual tables or all tables involved in the view. When you select a table, Visual FoxPro displays its fields in the center list box. Observe that it does not display all the fields from the table-just those that are included in the output set.
If a table has a primary index, and if that field appears in the output set, a check mark appears in the first column, below the key symbol. You can change the field(s) used to uniquely identify records in the original table. However, you should select only the primary-key field or a candidate-key field. These fields must appear in the selected fields.
The Reset Key button immediately below the table combo box returns the key-field status to the original setting used by the source table.
The Update All button places check marks in the second column of the field list, below the pencil icon. This column determines which fields VFP will update. You can select individual fields for updating by clicking the button in this column next to the field name. Conversely, you can click the Update All button and turn off individual fields that you do not want to update.
By default, the Update All button does not select the key field as an updatable field. Although you can mark the key field as updatable by manually clicking its second column, you normally do not want to update the key field. Before enabling updates to a key field, you should define referential-integrity rules by using the RI Builder or by using the append, delete, and update triggers. Then you should enable updates only from the primary table field.
Even if you mark fields as updatable, SQL does not send the updates
back to the original files unless you also select the Send SQL
Updates option. When you enable updates, you must also select
one of the four update options shown in Table 7.6.
Key Fields Only | Tells VFP to use only the key field in finding records to update in the source data. |
Key and Updatable Fields | Tells VFP to use the key field and all the updatable fields, even if they haven't been changed, to locate records to be updated. |
Key and Modified Fields | Tells VFP to use the key and any modifiable fields that have been modified to locate records to be updated. |
Key and Timestamp | Causes the update to fail if the key value or the time stamp in the original table has changed. (Not all database servers support a time stamp.) This option is disabled for local views. |
NOTE |
In all the options in Table 7.6, the original value for the referenced fields is used to find a record to update. Even the Key and Modified option uses the original values of the modified fields. |
Observe that the most severe test in this sequence is the last one, but it is also the safest. It can cause an update to fail even if the only field in the original table to change is one that is not selected as updatable in the current view.
The next-most-severe test checks the key and updatable fields. Usually, this test does not cause a problem because views cannot update nonupdatable fields. Therefore, changes made by other users are safe.
The option that checks only modified fields leaves open the possibility of overwriting changes to updatable fields in the original table. Overwriting occurs if you do not change an updatable field that someone else changes before you send the update back.
A similar problem occurs if you check only whether the key field has changed-the least severe test. Another user could change one of the other updatable fields, and you could easily overwrite these changes. However, if other users have only add or delete rights to a file, this less restrictive update test performs better because it has fewer fields to check for changes.
The last two options in the Update Criteria page determine which technique to use when the original data is updated. SQL can either delete the original records first and then insert a new one or simply update the existing record. The reason for two methods is to enable for the fact that certain database servers are much faster at deleting and inserting records than they are at updating records. This option creates SQL DELETE and SQL INSERT commands to update the data. For local views, you can choose either of these methods, but there is little reason to use the delete and insert method.
Updating the existing record creates a SQL UPDATE command to update the source table. This option preserves changes to fields that are not used in the SQL WHERE clause.
NOTE |
Although you can include a memo field in a view, it is not updatable. |
Creating a SQL View in Code |
You also can create a view directly within your program code by first opening the database and then using the CREATE SQL VIEW command, as follows: |
In all the examples so far, the selection criteria used to limit the selected records used actual values in the sample portion of the criteria. Unfortunately, this approach limits the query to extracting only a fixed set of records each time the query is run, unless you modify the query. You might perform the following query, using the field Region to select records from the table CUSTOMER of TASTRADE:
SELECT CUSTOMER.Customer_Id, CUSTOMER.Company_Name, ; CUSTOMER.Region, CUSTOMER.Max_Order_Amt ; FROM TASTRADE!Customer ; WHERE CUSTOMER.Region = "PA"; INTO CURSOR MyResult
This WHERE clause returns those records for customers from Pennsylvania (PA). Every time you run the query, however, it returns the same records. To change the records that are returned, you have to modify the statement.
This operation might be fine for users who are familiar and comfortable with working in interactive mode. However, suppose that you need to put the query in a program. Further suppose that you want the user to still be able to select from which state to retrieve records.
One solution involves the use of memory variables. You must define a memory variable to hold the state code. Then you need to create a way to enable the user to enter a state code in this variable. The easiest way is to give the user a simple form that asks for the state. Alternatively, you can give the user a list from which he or she can pick valid state codes.
In either case, the next step is for you to redefine the SELECT statement with a memory variable that holds the selection for the criteria's example rather than use a fixed value. The following statement shows the new SELECT, assuming that the user enters the state code in the lcGetState variable:
SELECT CUSTOMER.Customer_Id, CUSTOMER.Company_Name, ; CUSTOMER.Region, CUSTOMER.Max_Order_Amt ; FROM TASTRADE!Customer ; WHERE CUSTOMER.Region = lcGetState ; INTO CURSOR MyResult
You can try this example interactively simply by assigning a value to lcGetState through the Command window before executing the SELECT. In a program, this technique enables the user to customize the records returned by the SELECT.
This method works for both queries and views. However, views support another way to obtain a value from the user.
Fixing the record-selection criteria at design time is not always possible, or even desirable. Neither do you necessarily want to create a separate form just to prompt for criteria values. Yet you still want the user to determine the condition. You can do so relatively easily by using a view.
First, you need to define a view parameter by choosing Query, View Parameters. In the View Parameters dialog box, you can enter any valid variable name, along with its type. Then click OK to save it. In Figure 7.10, the parameter GroupDiscount has been defined as numeric.
After you define the parameter, place the view parameter in the Example box of the appropriate selection criteria. To tell Visual FoxPro that this parameter is a view parameter, precede it with a question mark. Do not put a space between the question mark and the parameter. For example, you enter the GroupDiscount parameter as follows:
?GroupDiscount
A view parameter is a memory variable. The only difference is that when a view is opened, if the variable does not exist, Visual FoxPro will prompt the user for a value to replace the parameter.
When the view runs, a dialog box appears, asking the user to enter a value of the view parameter. VFP then uses the entered value to select records for the view. If you run the view and respond with a value of 2, VFP creates a view of customers who have a 2 percent discount.
TIP |
You might want to use a parameterized view in a form and not want the user to be prompted for the value while the form loads. In this case, you can change the NODATAONLOAD property for the view in the form's data environment to .T., thus not running the SELECT for the view when it is initially opened. Then, at a later time after you have created the variable and assigned the proper value to it, you can use the Requery() function to execute the SELECT statement that makes up the view definition. |
© Copyright, Sams Publishing. All rights reserved.