Today you will learn about joins. This information will enable you to gather and manipulate data across several tables. By the end of the day, you will understand and be able to do the following:
One of the most powerful features of SQL is its capability to gather and manipulate data from across several tables. Without this feature you would have to store all the data elements necessary for each application in one table. Without common tables you would need to store the same data in several tables. Imagine having to redesign, rebuild, and repopulate your tables and databases every time your user needed a query with a new piece of information. The JOIN statement of SQL enables you to design smaller, more specific tables that are easier to maintain than larger tables.
Like Dorothy in The Wizard of Oz, you have had the power to join tables since Day 2, "Introduction to the Query: The SELECT Statement," when you learned about SELECT and FROM. Unlike Dorothy, you don't have to click you heels together three times to perform a join. Use the following two tables, named, cleverly enough, TABLE1 and TABLE2.
NOTE: The queries in today's examples were produced using Borland's ISQL tool. You will notice some differences between these queries and the ones that we used earlier in the book. For example, these queries do not begin with an SQL prompt. Another difference is that ISQL does not require a semicolon at the end of the statement. (The semicolon is optional in ISQL.) But the SQL basics are still the same.
SELECT * FROM TABLE1
ROW REMARKS ========== ======= row 1 Table 1 row 2 Table 1 row 3 Table 1 row 4 Table 1 row 5 Table 1 row 6 Table 1
SELECT * FROM TABLE2
ROW REMARKS ========== ======== row 1 table 2 row 2 table 2 row 3 table 2 row 4 table 2 row 5 table 2 row 6 table 2
To join these two tables, type this:
SELECT * FROM TABLE1,TABLE2
ROW REMARKS ROW REMARKS ========== ========== ========== ======== row 1 Table 1 row 1 table 2 row 1 Table 1 row 2 table 2 row 1 Table 1 row 3 table 2 row 1 Table 1 row 4 table 2 row 1 Table 1 row 5 table 2 row 1 Table 1 row 6 table 2 row 2 Table 1 row 1 table 2 row 2 Table 1 row 2 table 2 row 2 Table 1 row 3 table 2 row 2 Table 1 row 4 table 2 row 2 Table 1 row 5 table 2 row 2 Table 1 row 6 table 2 row 3 Table 1 row 1 table 2 row 3 Table 1 row 2 table 2 row 3 Table 1 row 3 table 2 row 3 Table 1 row 4 table 2 row 3 Table 1 row 5 table 2 row 3 Table 1 row 6 table 2 row 4 Table 1 row 1 table 2 row 4 Table 1 row 2 table 2 row 4 Table 1 row 3 table 2 row 4 Table 1 row 4 table 2 row 4 Table 1 row 5 table 2 row 4 Table 1 row 6 table 2 row 5 Table 1 row 1 table 2 row 5 Table 1 row 2 table 2 row 5 Table 1 row 3 table 2 row 5 Table 1 row 4 table 2 row 5 Table 1 row 5 table 2 row 5 Table 1 row 6 table 2 row 6 Table 1 row 1 table 2 row 6 Table 1 row 2 table 2 row 6 Table 1 row 3 table 2 row 6 Table 1 row 4 table 2 row 6 Table 1 row 5 table 2 row 6 Table 1 row 6 table 2
Thirty-six rows! Where did they come from? And what kind of join is this?
A close examination of the result of your first join shows that each row from TABLE1 was added to each row from TABLE2. An extract from this join shows what happened:
ROW REMARKS ROW REMARKS ===== ========== ========= ======== row 1 Table 1 row 1 table 2 row 1 Table 1 row 2 table 2 row 1 Table 1 row 3 table 2 row 1 Table 1 row 4 table 2 row 1 Table 1 row 5 table 2 row 1 Table 1 row 6 table 2
Notice how each row in TABLE2 was combined with row 1 in TABLE1. Congratulations! You have performed your first join. But what kind of join? An inner join? an outer join? or what? Well, actually this type of join is called a cross-join. A cross-join is not normally as useful as the other joins covered today, but this join does illustrate the basic combining property of all joins: Joins bring tables together.
Suppose you sold parts to bike shops for a living. When you designed your database, you built one big table with all the pertinent columns. Every time you had a new requirement, you added a new column or started a new table with all the old data plus the new data required to create a specific query. Eventually, your database would collapse from its own weight--not a pretty sight. An alternative design, based on a relational model, would have you put all related data into one table. Here's how your customer table would look:
SELECT * FROM CUSTOMER
NAME ADDRESS STATE ZIP PHONE REMARKS ========== ========== ====== ========== ========= ========== TRUE WHEEL 55O HUSKER NE 58702 555-4545 NONE BIKE SPEC CPT SHRIVE LA 45678 555-1234 NONE LE SHOPPE HOMETOWN KS 54678 555-1278 NONE AAA BIKE 10 OLDTOWN NE 56784 555-3421 JOHN-MGR JACKS BIKE 24 EGLIN FL 34567 555-2314 NONE
This table contains all the information you need to describe your customers. The items you sold would go into another table:
SELECT * FROM PART
PARTNUM DESCRIPTION PRICE =========== ==================== =========== 54 PEDALS 54.25 42 SEATS 24.50 46 TIRES 15.25 23 MOUNTAIN BIKE 350.45 76 ROAD BIKE 530.00 10 TANDEM 1200.00
And the orders you take would have their own table:
SELECT * FROM ORDERS
ORDEREDON NAME PARTNUM QUANTITY REMARKS =========== ========== =========== =========== ======= 15-MAY-1996 TRUE WHEEL 23 6 PAID 19-MAY-1996 TRUE WHEEL 76 3 PAID 2-SEP-1996 TRUE WHEEL 10 1 PAID 30-JUN-1996 TRUE WHEEL 42 8 PAID 30-JUN-1996 BIKE SPEC 54 10 PAID 30-MAY-1996 BIKE SPEC 10 2 PAID 30-MAY-1996 BIKE SPEC 23 8 PAID 17-JAN-1996 BIKE SPEC 76 11 PAID 17-JAN-1996 LE SHOPPE 76 5 PAID 1-JUN-1996 LE SHOPPE 10 3 PAID 1-JUN-1996 AAA BIKE 10 1 PAID 1-JUL-1996 AAA BIKE 76 4 PAID 1-JUL-1996 AAA BIKE 46 14 PAID 11-JUL-1996 JACKS BIKE 76 14 PAID
One advantage of this approach is that you can have three specialized people or departments responsible for maintaining their own data. You don't need a database administrator who is conversant with all aspects of your project to shepherd one gigantic, multidepartmental database. Another advantage is that in the age of networks, each table could reside on a different machine. People who understand the data could maintain it, and it could reside on an appropriate machine (rather than that nasty corporate mainframe protected by legions of system administrators).
Now join PARTS and ORDERS:
SELECT O.ORDEREDON, O.NAME, O.PARTNUM, P.PARTNUM, P.DESCRIPTION FROM ORDERS O, PART P ORDEREDON NAME PARTNUM PARTNUM DESCRIPTION =========== ========== =========== ========= ============ 15-MAY-1996 TRUE WHEEL 23 54 PEDALS 19-MAY-1996 TRUE WHEEL 76 54 PEDALS 2-SEP-1996 TRUE WHEEL 10 54 PEDALS 30-JUN-1996 TRUE WHEEL 42 54 PEDALS 30-JUN-1996 BIKE SPEC 54 54 PEDALS 30-MAY-1996 BIKE SPEC 10 54 PEDALS 30-MAY-1996 BIKE SPEC 23 54 PEDALS 17-JAN-1996 BIKE SPEC 76 54 PEDALS 17-JAN-1996 LE SHOPPE 76 54 PEDALS 1-JUN-1996 LE SHOPPE 10 54 PEDALS 1-JUN-1996 AAA BIKE 10 54 PEDALS 1-JUL-1996 AAA BIKE 76 54 PEDALS 1-JUL-1996 AAA BIKE 46 54 PEDALS 11-JUL-1996 JACKS BIKE 76 54 PEDALS ...
The preceding code is just a portion of the result set. The actual set is 14 (number of rows in ORDERS) x 6 (number of rows in PART), or 84 rows. It is similar to the result from joining TABLE1 and TABLE2 earlier today, and it is still one statement shy of being useful. Before we reveal that statement, we need to regress a little and talk about another use for the alias.
When you joined TABLE1 and TABLE2, you used SELECT *, which returned all the columns in both tables. In joining ORDERS to PART, the SELECT statement is a bit more complicated:
SELECT O.ORDEREDON, O.NAME, O.PARTNUM, P.PARTNUM, P.DESCRIPTION
SQL is smart enough to know that ORDEREDON and NAME exist only in ORDERS and that DESCRIPTION exists only in PART, but what about PARTNUM, which exists in both? If you have a column that has the same name in two tables, you must use an alias in your SELECT clause to specify which column you want to display. A common technique is to assign a single character to each table, as you did in the FROM clause:
FROM ORDERS O, PART P
You use that character with each column name, as you did in the preceding SELECT clause. The SELECT clause could also be written like this:
SELECT ORDEREDON, NAME, O.PARTNUM, P.PARTNUM, DESCRIPTION
But remember, someday you might have to come back and maintain this query. It doesn't hurt to make it more readable. Now back to the missing statement.
An extract from the PART/ORDERS join provides a clue as to what is missing:
30-JUN-1996 TRUE WHEEL 42 54 PEDALS 30-JUN-1996 BIKE SPEC 54 54 PEDALS 30-MAY-1996 BIKE SPEC 10 54 PEDALS
Notice the PARTNUM fields that are common to both tables. What if you wrote the following?
SELECT O.ORDEREDON, O.NAME, O.PARTNUM, P.PARTNUM, P.DESCRIPTION FROM ORDERS O, PART P WHERE O.PARTNUM = P.PARTNUM
ORDEREDON NAME PARTNUM PARTNUM DESCRIPTION =========== ========== =========== ========= ============== 1-JUN-1996 AAA BIKE 10 10 TANDEM 30-MAY-1996 BIKE SPEC 10 10 TANDEM 2-SEP-1996 TRUE WHEEL 10 10 TANDEM 1-JUN-1996 LE SHOPPE 10 10 TANDEM 30-MAY-1996 BIKE SPEC 23 23 MOUNTAIN BIKE 15-MAY-1996 TRUE WHEEL 23 23 MOUNTAIN BIKE 30-JUN-1996 TRUE WHEEL 42 42 SEATS 1-JUL-1996 AAA BIKE 46 46 TIRES 30-JUN-1996 BIKE SPEC 54 54 PEDALS 1-JUL-1996 AAA BIKE 76 76 ROAD BIKE 17-JAN-1996 BIKE SPEC 76 76 ROAD BIKE 19-MAY-1996 TRUE WHEEL 76 76 ROAD BIKE 11-JUL-1996 JACKS BIKE 76 76 ROAD BIKE 17-JAN-1996 LE SHOPPE 76 76 ROAD BIKE
Using the column PARTNUM that exists in both of the preceding tables, you have just combined the information you had stored in the ORDERS table with information from the PART table to show a description of the parts the bike shops have ordered from you. The join that was used is called an equi-join because the goal is to match the values of a column in one table to the corresponding values in the second table.
You can further qualify this query by adding more conditions in the WHERE clause. For example:
SELECT O.ORDEREDON, O.NAME, O.PARTNUM, P.PARTNUM, P.DESCRIPTION FROM ORDERS O, PART P WHERE O.PARTNUM = P.PARTNUM AND O.PARTNUM = 76 ORDEREDON NAME PARTNUM PARTNUM DESCRIPTION =========== ========== =========== ========== ============ 1-JUL-1996 AAA BIKE 76 76 ROAD BIKE 17-JAN-1996 BIKE SPEC 76 76 ROAD BIKE 19-MAY-1996 TRUE WHEEL 76 76 ROAD BIKE 11-JUL-1996 JACKS BIKE 76 76 ROAD BIKE 17-JAN-1996 LE SHOPPE 76 76 ROAD BIKE
The number 76 is not very descriptive, and you wouldn't want your sales people to have to memorize a part number. (We have had the misfortune to see many data information systems in the field that require the end user to know some obscure code for something that had a perfectly good name. Please don't write one of those!) Here's another way to write the query:
SELECT O.ORDEREDON, O.NAME, O.PARTNUM, P.PARTNUM, P.DESCRIPTION FROM ORDERS O, PART P WHERE O.PARTNUM = P.PARTNUM AND P.DESCRIPTION = 'ROAD BIKE' ORDEREDON NAME PARTNUM PARTNUM DESCRIPTION =========== ========== =========== ========== ============ 1-JUL-1996 AAA BIKE 76 76 ROAD BIKE 17-JAN-1996 BIKE SPEC 76 76 ROAD BIKE 19-MAY-1996 TRUE WHEEL 76 76 ROAD BIKE 11-JUL-1996 JACKS BIKE 76 76 ROAD BIKE 17-JAN-1996 LE SHOPPE 76 76 ROAD BIKE
Along the same line, take a look at two more tables to see how they can be joined. In this example the employee_id column should obviously be unique. You could have employees with the same name, they could work in the same department, and earn the same salary. However, each employee would have his or her own employee_id. To join these two tables, you would use the employee_id column.
EMPLOYEE_TABLE | EMPLOYEE_PAY_TABLE |
employee_id | employee_id |
last_name | salary |
first_name | department |
middle_name | supervisor |
marital_status |
SELECT E.EMPLOYEE_ID, E.LAST_NAME, EP.SALARY FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP WHERE E.EMPLOYEE_ID = EP.EMPLOYEE_ID AND E.LAST_NAME = 'SMITH';
E.EMPLOYEE_ID E.LAST_NAME EP.SALARY
============= =========== ========= 13245 SMITH 35000.00
TIP: When you join two tables without the use of a WHERE clause, you are performing a Cartesian join. This join combines all rows from all the tables in the FROM clause. If each table has 200 rows, then you will end up with 40,000 rows in your results (200 x 200). Always join your tables in the WHERE clause unless you have a real need to join all the rows of all the selected tables.
Back to the original tables. Now you are ready to use all this information about joins to do something really useful: finding out how much money you have made from selling road bikes:
SELECT SUM(O.QUANTITY * P.PRICE) TOTAL FROM ORDERS O, PART P WHERE O.PARTNUM = P.PARTNUM AND P.DESCRIPTION = 'ROAD BIKE' TOTAL =========== 19610.00
With this setup, the sales people can keep the ORDERS table updated, the production department can keep the PART table current, and you can find your bottom line without redesigning your database.
NOTE: Notice the consistent use of table and column aliases in the SQL statement examples. You will save many, many keystrokes by using aliases. They also help to make your statement more readable.
Can you join more than one table? For example, to generate information to send out an invoice, you could type this statement:
SELECT C.NAME, C.ADDRESS, (O.QUANTITY * P.PRICE) TOTAL FROM ORDER O, PART P, CUSTOMER C WHERE O.PARTNUM = P.PARTNUM AND O.NAME = C.NAME NAME ADDRESS TOTAL ========== ========== =========== TRUE WHEEL 55O HUSKER 1200.00 BIKE SPEC CPT SHRIVE 2400.00 LE SHOPPE HOMETOWN 3600.00 AAA BIKE 10 OLDTOWN 1200.00 TRUE WHEEL 55O HUSKER 2102.70 BIKE SPEC CPT SHRIVE 2803.60 TRUE WHEEL 55O HUSKER 196.00 AAA BIKE 10 OLDTOWN 213.50 BIKE SPEC CPT SHRIVE 542.50 TRUE WHEEL 55O HUSKER 1590.00 BIKE SPEC CPT SHRIVE 5830.00 JACKS BIKE 24 EGLIN 7420.00 LE SHOPPE HOMETOWN 2650.00 AAA BIKE 10 OLDTOWN 2120.00
You could make the output more readable by writing the statement like this:
SELECT C.NAME, C.ADDRESS, O.QUANTITY * P.PRICE TOTAL FROM ORDERS O, PART P, CUSTOMER C WHERE O.PARTNUM = P.PARTNUM AND O.NAME = C.NAME ORDER BY C.NAME NAME ADDRESS TOTAL ========== ========== =========== AAA BIKE 10 OLDTOWN 213.50 AAA BIKE 10 OLDTOWN 2120.00 AAA BIKE 10 OLDTOWN 1200.00 BIKE SPEC CPT SHRIVE 542.50 BIKE SPEC CPT SHRIVE 2803.60 BIKE SPEC CPT SHRIVE 5830.00 BIKE SPEC CPT SHRIVE 2400.00 JACKS BIKE 24 EGLIN 7420.00 LE SHOPPE HOMETOWN 2650.00 LE SHOPPE HOMETOWN 3600.00 TRUE WHEEL 55O HUSKER 196.00 TRUE WHEEL 55O HUSKER 2102.70 TRUE WHEEL 55O HUSKER 1590.00 TRUE WHEEL 55O HUSKER 1200.00
NOTE: Notice that when joining the three tables (ORDERS, PART, and CUSTOMER) that the ORDERS table was used in two joins and the other tables were used only once. Tables that will return the fewest rows with the given conditions are commonly referred to as driving tables, or base tables. Tables other than the base table in a query are usually joined to the base table for more efficient data retrieval. Consequently, the ORDERS table is the base table in this example. In most databases a few base tables join (either directly or indirectly) all the other tables. (See Day 15, "Streamlining SQL Statements for Improved Performance," for more on base tables.)
You can make the previous query more specific, thus more useful, by adding the DESCRIPTION column as in the following example:
SELECT C.NAME, C.ADDRESS, O.QUANTITY * P.PRICE TOTAL, P.DESCRIPTION FROM ORDERS O, PART P, CUSTOMER C WHERE O.PARTNUM = P.PARTNUM AND O.NAME = C.NAME ORDER BY C.NAME NAME ADDRESS TOTAL DESCRIPTION ========== ========== =========== ============== AAA BIKE 10 OLDTOWN 213.50 TIRES AAA BIKE 10 OLDTOWN 2120.00 ROAD BIKE AAA BIKE 10 OLDTOWN 1200.00 TANDEM BIKE SPEC CPT SHRIVE 542.50 PEDALS BIKE SPEC CPT SHRIVE 2803.60 MOUNTAIN BIKE BIKE SPEC CPT SHRIVE 5830.00 ROAD BIKE BIKE SPEC CPT SHRIVE 2400.00 TANDEM JACKS BIKE 24 EGLIN 7420.00 ROAD BIKE LE SHOPPE HOMETOWN 2650.00 ROAD BIKE LE SHOPPE HOMETOWN 3600.00 TANDEM TRUE WHEEL 55O HUSKER 196.00 SEATS TRUE WHEEL 55O HUSKER 2102.70 MOUNTAIN BIKE TRUE WHEEL 55O HUSKER 1590.00 ROAD BIKE TRUE WHEEL 55O HUSKER 1200.00 TANDEM
This information is a result of joining three tables. You can now use this information to create an invoice.
NOTE: In the example at the beginning of the day, SQL grouped TABLE1 and TABLE2 to create a new table with X (rows in TABLE1) x Y (rows in TABLE2) number of rows. A physical table is not created by the join, but rather in a virtual sense. The join between the two tables produces a new set that meets all conditions in the WHERE clause, including the join itself. The SELECT statement has reduced the number of rows displayed, but to evaluate the WHERE clause SQL still creates all the possible rows. The sample tables in today's examples have only a handful of rows. Your actual data may have thousands of rows. If you are working on a platform with lots of horsepower, using a multiple-table join might not visibly affect performance. However, if you are working in a slower environment, joins could cause a significant slowdown.We aren't telling you not to use joins, because you have seen the advantages to be gained from a relational design. Just be aware of the platform you are using and your customer's requirements for speed versus reliability.
Because SQL supports an equi-join, you might assume that SQL also has a non-equi-join. You would be right! Whereas the equi-join uses an = sign in the WHERE statement, the non-equi-join uses everything but an = sign. For example:
SELECT O.NAME, O.PARTNUM, P.PARTNUM, O.QUANTITY * P.PRICE TOTAL FROM ORDERS O, PART P WHERE O.PARTNUM > P.PARTNUM
NAME PARTNUM PARTNUM TOTAL ========== =========== =========== =========== TRUE WHEEL 76 54 162.75 BIKE SPEC 76 54 596.75 LE SHOPPE 76 54 271.25 AAA BIKE 76 54 217.00 JACKS BIKE 76 54 759.50 TRUE WHEEL 76 42 73.50 BIKE SPEC 54 42 245.00 BIKE SPEC 76 42 269.50 LE SHOPPE 76 42 122.50 AAA BIKE 76 42 98.00 AAA BIKE 46 42 343.00 JACKS BIKE 76 42 343.00 TRUE WHEEL 76 46 45.75 BIKE SPEC 54 46 152.50 BIKE SPEC 76 46 167.75 LE SHOPPE 76 46 76.25 AAA BIKE 76 46 61.00 JACKS BIKE 76 46 213.50 TRUE WHEEL 76 23 1051.35 TRUE WHEEL 42 23 2803.60 ...
This listing goes on to describe all the rows in the join WHERE O.PARTNUM > P.PARTNUM. In the context of your bicycle shop, this information doesn't have much meaning, and in the real world the equi-join is far more common than the non-equi-join. However, you may encounter an application in which a non-equi-join produces the perfect result.
Just as the non-equi-join balances the equi-join, an outer join complements the inner join. An inner join is where the rows of the tables are combined with each other, producing a number of new rows equal to the product of the number of rows in each table. Also, the inner join uses these rows to determine the result of the WHERE clause. An outer join groups the two tables in a slightly different way. Using the PART and ORDERS tables from the previous examples, perform the following inner join:
SELECT P.PARTNUM, P.DESCRIPTION,P.PRICE, O.NAME, O.PARTNUM FROM PART P JOIN ORDERS O ON ORDERS.PARTNUM = 54
PARTNUM DESCRIPTION PRICE NAME PARTNUM ======= ==================== =========== ========== =========== 54 PEDALS 54.25 BIKE SPEC 54 42 SEATS 24.50 BIKE SPEC 54 46 TIRES 15.25 BIKE SPEC 54 23 MOUNTAIN BIKE 350.45 BIKE SPEC 54 76 ROAD BIKE 530.00 BIKE SPEC 54 10 TANDEM 1200.00 BIKE SPEC 54
NOTE: The syntax you used to get this join--JOIN ON--is not ANSI standard. The implementation you used for this example has additional syntax. You are using it here to specify an inner and an outer join. Most implementations of SQL have similar extensions. Notice the absence of the WHERE clause in this type of join.
The result is that all the rows in PART are spliced on to specific rows in ORDERS where the column PARTNUM is 54. Here's a RIGHT OUTER JOIN statement:
SELECT P.PARTNUM, P.DESCRIPTION,P.PRICE, O.NAME, O.PARTNUM FROM PART P RIGHT OUTER JOIN ORDERS O ON ORDERS.PARTNUM = 54 PARTNUM DESCRIPTION PRICE NAME PARTNUM ======= ==================== ======= ============== ======= <null> <null> <null> TRUE WHEEL 23 <null> <null> <null> TRUE WHEEL 76 <null> <null> <null> TRUE WHEEL 10 <null> <null> <null> TRUE WHEEL 42 54 PEDALS 54.25 BIKE SPEC 54 42 SEATS 24.50 BIKE SPEC 54 46 TIRES 15.25 BIKE SPEC 54 23 MOUNTAIN BIKE 350.45 BIKE SPEC 54 76 ROAD BIKE 530.00 BIKE SPEC 54 10 TANDEM 1200.00 BIKE SPEC 54 <null> <null> <null> BIKE SPEC 10 <null> <null> <null> BIKE SPEC 23 <null> <null> <null> BIKE SPEC 76 <null> <null> <null> LE SHOPPE 76 <null> <null> <null> LE SHOPPE 10 <null> <null> <null> AAA BIKE 10 <null> <null> <null> AAA BIKE 76 <null> <null> <null> AAA BIKE 46 <null> <null> <null> JACKS BIKE 76
This type of query is new. First you specified a RIGHT OUTER JOIN, which caused SQL to return a full set of the right table, ORDERS, and to place nulls in the fields where ORDERS.PARTNUM <> 54. Following is a LEFT OUTER JOIN statement:
SELECT P.PARTNUM, P.DESCRIPTION,P.PRICE, O.NAME, O.PARTNUM FROM PART P LEFT OUTER JOIN ORDERS O ON ORDERS.PARTNUM = 54 PARTNUM DESCRIPTION PRICE NAME PARTNUM ======= ================== =========== ========== =========== 54 PEDALS 54.25 BIKE SPEC 54 42 SEATS 24.50 BIKE SPEC 54 46 TIRES 15.25 BIKE SPEC 54 23 MOUNTAIN BIKE 350.45 BIKE SPEC 54 76 ROAD BIKE 530.00 BIKE SPEC 54 10 TANDEM 1200.00 BIKE SPEC 54
You get the same six rows as the INNER JOIN. Because you specified LEFT (the LEFT table), PART determined the number of rows you would return. Because PART is smaller than ORDERS, SQL saw no need to pad those other fields with blanks.
Don't worry too much about inner and outer joins. Most SQL products determine the optimum JOIN for your query. In fact, if you are placing your query into a stored procedure (or using it inside a program (both stored procedures and Embedded SQL covered on Day 13, "Advanced SQL Topics"), you should not specify a join type even if your SQL implementation provides the proper syntax. If you do specify a join type, the optimizer chooses your way instead of the optimum way.
Some implementations of SQL use the + sign instead of an OUTER JOIN statement. The + simply means "Show me everything even if something is missing." Here's the syntax:
SQL> select e.name, e.employee_id, ep.salary, ep.marital_status from e,ployee_tbl e, employee_pay_tbl ep where e.employee_id = ep.employee_id(+) and e.name like '%MITH';
This statement is joining the two tables. The + sign on the ep.employee_id column will return all rows even if they are empty.
Today's final topic is the often-used technique of joining a table to itself. The syntax of this operation is similar to joining two tables. For example, to join table TABLE1 to itself, type this:
SELECT * FROM TABLE1, TABLE1
ROW REMARKS ROW REMARKS ========== ========== ========== ======== row 1 Table 1 row 1 Table 1 row 1 Table 1 row 2 Table 1 row 1 Table 1 row 3 Table 1 row 1 Table 1 row 4 Table 1 row 1 Table 1 row 5 Table 1 row 1 Table 1 row 6 Table 1 row 2 Table 1 row 1 Table 1 row 2 Table 1 row 2 Table 1 row 2 Table 1 row 3 Table 1 row 2 Table 1 row 4 Table 1 row 2 Table 1 row 5 Table 1 row 2 Table 1 row 6 Table 1 row 3 Table 1 row 1 Table 1 row 3 Table 1 row 2 Table 1 row 3 Table 1 row 3 Table 1 row 3 Table 1 row 4 Table 1 row 3 Table 1 row 5 Table 1 row 3 Table 1 row 6 Table 1 row 4 Table 1 row 1 Table 1 row 4 Table 1 row 2 Table 1 ...
In its complete form, this join produces the same number of combinations as joining two 6-row tables. This type of join could be useful to check the internal consistency of data. What would happen if someone fell asleep in the production department and entered a new part with a PARTNUM that already existed? That would be bad news for everybody: Invoices would be wrong; your application would probably blow up; and in general you would be in for a very bad time. And the cause of all your problems would be the duplicate PARTNUM in the following table:
SELECT * FROM PART PARTNUM DESCRIPTION PRICE =========== ==================== =========== 54 PEDALS 54.25 42 SEATS 24.50 46 TIRES 15.25 23 MOUNTAIN BIKE 350.45 76 ROAD BIKE 530.00 10 TANDEM 1200.00 76 CLIPPLESS SHOE 65.00 <-NOTE SAME #
You saved your company from this bad situation by checking PART before anyone used it:
SELECT F.PARTNUM, F.DESCRIPTION, S.PARTNUM,S.DESCRIPTION FROM PART F, PART S WHERE F.PARTNUM = S.PARTNUM AND F.DESCRIPTION <> S.DESCRIPTION PARTNUM DESCRIPTION PARTNUM DESCRIPTION ========== ======================== ======= ============ 76 ROAD BIKE 76 CLIPPLESS SHOE 76 CLIPPLESS SHOE 76 ROAD BIKE
Now you are a hero until someone asks why the table has only two entries. You, remembering what you have learned about JOINs, retain your hero status by explaining how the join produced two rows that satisfied the condition WHERE F.PARTNUM = S.PARTNUM AND F.DESCRIPTION <> S.DESCRIPTION. Of course, at some point, the row of data containing the duplicate PARTNUM would have to be corrected.
Today you learned that a join combines all possible combinations of rows present in the selected tables. These new rows are then available for selection based on the information that you want.
Congratulations--you have learned almost everything there is to know about the SELECT clause. The one remaining item, subqueries, is covered tomorrow (Day 7, "Subqueries: The Embedded SELECT Statement").
A A little knowledge is a dangerous thing, and no knowledge can be expensive. You now know enough to understand the basics of what your SQL engine might try while optimizing you queries.
Q How many tables can you join on?
A That depends on the implementation. Some implementations have a 25-table limit, whereas others have no limit. Just remember, the more tables you join on, the slower the response time will be. To be safe, check your implementation to find out the maximum number of tables allowed in a query.
Q Would it be fair to say that when tables are joined, they actually become one table?
A Very simply put, that is just about what happens. When you join the tables, you can select from any of the columns in either table.
The Workshop provides quiz questions to help solidify your understanding of the material covered, as well as exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix F, "Answers to Quizzes and Exercises."
2. What type of join appears in the following SELECT statement?
select e.name, e.employee_id, ep.salary from employee_tbl e, employee_pay_tbl ep where e.employee_id = ep.employee_id;
a. select name, employee_id, salary
from employee_tbl e,
employee_pay_tbl ep
where employee_id = employee_id
and name like '%MITH';
b. select e.name, e.employee_id, ep.salary
from employee_tbl e,
employee_pay_tbl ep
where name like '%MITH';
c. select e.name, e.employee_id, ep.salary
from employee_tbl e,
employee_pay_tbl ep
where e.employee_id = ep.employee_id
and e.name like '%MITH';
5. In joining tables are you limited to one-column joins, or can you join on more than one column?
2. Rewrite the following query to make it more readable and shorter.
select orders.orderedon, orders.name, part.partnum, part.price, part.description from orders, part where orders.partnum = part.partnum and orders.orderedon between '1-SEP-96' and '30-SEP-96' order by part.partnum;
ORDEREDON NAME PARTNUM QUANTITY ================== ================== ======= ======== 2-SEP-96 TRUE WHEEL 10 1
© Copyright, Macmillan Computer Publishing. All rights reserved.