Today we talk about functions. Functions in SQL enable you to perform feats such as determining the sum of a column or converting all the characters of a string to uppercase. By the end of the day, you will understand and be able to use all the following:
These functions greatly increase your ability to manipulate the information you retrieved using the basic functions of SQL that were described earlier this week. The first five aggregate functions, COUNT, SUM, AVG, MAX, and MIN, are defined in the ANSI standard. Most implementations of SQL have extensions to these aggregate functions, some of which are covered today. Some implementations may use different names for these functions.
These functions are also referred to as group functions. They return a value based on the values in a column. (After all, you wouldn't ask for the average of a single field.) The examples in this section use the table TEAMSTATS:
SQL> SELECT * FROM TEAMSTATS;
NAME POS AB HITS WALKS SINGLES DOUBLES TRIPLES HR SO --------- --- --- ---- ----- ------- ------- ------- -- -- JONES 1B 145 45 34 31 8 1 5 10 DONKNOW 3B 175 65 23 50 10 1 4 15 WORLEY LF 157 49 15 35 8 3 3 16 DAVID OF 187 70 24 48 4 0 17 42 HAMHOCKER 3B 50 12 10 10 2 0 0 13 CASEY DH 1 0 0 0 0 0 0 1 6 rows selected.
The function COUNT returns the number of rows that satisfy the condition in the WHERE clause. Say you wanted to know how many ball players were hitting under 350. You would type
SQL> SELECT COUNT(*) 2 FROM TEAMSTATS 3 WHERE HITS/AB < .35; COUNT(*) -------- 4
To make the code more readable, try an alias:
SQL> SELECT COUNT(*) NUM_BELOW_350 2 FROM TEAMSTATS 3 WHERE HITS/AB < .35; NUM_BELOW_350 ------------- 4
Would it make any difference if you tried a column name instead of the asterisk? (Notice the use of parentheses around the column names.) Try this:
SQL> SELECT COUNT(NAME) NUM_BELOW_350 2 FROM TEAMSTATS 3 WHERE HITS/AB < .35; NUM_BELOW_350 ------------- 4
The answer is no. The NAME column that you selected was not involved in the WHERE statement. If you use COUNT without a WHERE clause, it returns the number of records in the table.
SQL> SELECT COUNT(*) 2 FROM TEAMSTATS; COUNT(*) --------- 6
SUM does just that. It returns the sum of all values in a column. To find out how many singles have been hit, type
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES 2 FROM TEAMSTATS;
TOTAL_SINGLES ------------- 174
To get several sums, use
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES, SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR 2 FROM TEAMSTATS; TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR ------------- ------------- ------------- -------- 174 32 5 29
To collect similar information on all 300 or better players, type
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES, SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR 2 FROM TEAMSTATS 3 WHERE HITS/AB >= .300; TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR ------------- ------------- ------------- -------- 164 30 5 29
To compute a team batting average, type
SQL> SELECT SUM(HITS)/SUM(AB) TEAM_AVERAGE 2 FROM TEAMSTATS; TEAM_AVERAGE ------------ .33706294
SUM works only with numbers. If you try it on a nonnumerical field, you get
SQL> SELECT SUM(NAME) 2 FROM TEAMSTATS; ERROR: ORA-01722: invalid number no rows selected
This error message is logical because you cannot sum a group of names.
The AVG function computes the average of a column. To find the average number of strike outs, use this:
SQL> SELECT AVG(SO) AVE_STRIKE_OUTS 2 FROM TEAMSTATS;
AVE_STRIKE_OUTS --------------- 16.166667
The following example illustrates the difference between SUM and AVG:
SQL> SELECT AVG(HITS/AB) TEAM_AVERAGE 2 FROM TEAMSTATS; TEAM_AVERAGE ------------ .26803448
The team was batting over 300 in the previous example! What happened? AVG computed the average of the combined column hits divided by at bats, whereas the example with SUM divided the total number of hits by the number of at bats. For example, player A gets 50 hits in 100 at bats for a .500 average. Player B gets 0 hits in 1 at bat for a 0.0 average. The average of 0.0 and 0.5 is .250. If you compute the combined average of 50 hits in 101 at bats, the answer is a respectable .495. The following statement returns the correct batting average:
SQL> SELECT AVG(HITS)/AVG(AB) TEAM_AVERAGE 2 FROM TEAMSTATS;
TEAM_AVERAGE ------------ .33706294
Like the SUM function, AVG works only with numbers.
If you want to find the largest value in a column, use MAX. For example, what is the highest number of hits?
SQL> SELECT MAX(HITS) 2 FROM TEAMSTATS;
MAX(HITS) --------- 70
Can you find out who has the most hits?
SQL> SELECT NAME 2 FROM TEAMSTATS 3 WHERE HITS = MAX(HITS); ERROR at line 3: ORA-00934: group function is not allowed here
Unfortunately, you can't. The error message is a reminder that this group function (remember that aggregate functions are also called group functions) does not work in the WHERE clause. Don't despair, Day 7, "Subqueries: The Embedded SELECT Statement," covers the concept of subqueries and explains a way to find who has the MAX hits.
What happens if you try a nonnumerical column?
SQL> SELECT MAX(NAME) 2 FROM TEAMSTATS;
MAX(NAME) --------------- WORLEY
Here's something new. MAX returns the highest (closest to Z) string. Finally, a function that works with both characters and numbers.
MIN does the expected thing and works like MAX except it returns the lowest member of a column. To find out the fewest at bats, type
SQL> SELECT MIN(AB) 2 FROM TEAMSTATS;
MIN(AB) --------- 1
The following statement returns the name closest to the beginning of the alphabet:
SQL> SELECT MIN(NAME) 2 FROM TEAMSTATS; MIN(NAME) --------------- CASEY
You can combine MIN with MAX to give a range of values. For example:
SQL> SELECT MIN(AB), MAX(AB) 2 FROM TEAMSTATS; MIN(AB) MAX(AB) -------- -------- 1 187
This sort of information can be useful when using statistical functions.
NOTE: As we mentioned in the introduction, the first five aggregate functions are described in the ANSI standard. The remaining aggregate functions have become de facto standards, present in all important implementations of SQL. We use the Oracle7 names for these functions. Other implementations may use different names.
VARIANCE produces the square of the standard deviation, a number vital to many statistical calculations. It works like this:
SQL> SELECT VARIANCE(HITS) 2 FROM TEAMSTATS;
VARIANCE(HITS) -------------- 802.96667
If you try a string
SQL> SELECT VARIANCE(NAME) 2 FROM TEAMSTATS; ERROR: ORA-01722: invalid number no rows selected
you find that VARIANCE is another function that works exclusively with numbers.
The final group function, STDDEV, finds the standard deviation of a column of numbers, as demonstrated by this example:
SQL> SELECT STDDEV(HITS) 2 FROM TEAMSTATS;
STDDEV(HITS) ------------ 28.336666
It also returns an error when confronted by a string:
SQL> SELECT STDDEV(NAME) 2 FROM TEAMSTATS; ERROR: ORA-01722: invalid number no rows selected
These aggregate functions can also be used in various combinations:
SQL> SELECT COUNT(AB), 2 AVG(AB), 3 MIN(AB), 4 MAX(AB), 5 STDDEV(AB), 6 VARIANCE(AB), 7 SUM(AB) 8 FROM TEAMSTATS; COUNT(AB) AVG(AB) MIN(AB) MAX(AB) STDDEV(AB) VARIANCE(AB) SUM(AB) --------- ------- ------- ------- ---------- ------------ ------- 6 119.167 1 187 75.589 5712.97 715
The next time you hear a sportscaster use statistics to fill the time between plays, you will know that SQL is at work somewhere behind the scenes.
We live in a civilization governed by times and dates, and most major implementations of SQL have functions to cope with these concepts. This section uses the table PROJECT to demonstrate the time and date functions.
SQL> SELECT * FROM PROJECT;
TASK STARTDATE ENDDATE -------------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 TECH SURVEY 02-APR-95 01-MAY-95 USER MTGS 15-MAY-95 30-MAY-95 DESIGN WIDGET 01-JUN-95 30-JUN-95 CODE WIDGET 01-JUL-95 02-SEP-95 TESTING 03-SEP-95 17-JAN-96
6 rows selected.
NOTE: This table used the Date data type. Most implementations of SQL have a Date data type, but the exact syntax may vary.
This function adds a number of months to a specified date. For example, say something extraordinary happened, and the preceding project slipped to the right by two months. You could make a new schedule by typing
SQL> SELECT TASK, 2 STARTDATE, 3 ENDDATE ORIGINAL_END, 4 ADD_MONTHS(ENDDATE,2) 5 FROM PROJECT;
TASK STARTDATE ORIGINAL_ ADD_MONTH -------------- --------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 01-JUN-95 TECH SURVEY 02-APR-95 01-MAY-95 01-JUL-95 USER MTGS 15-MAY-95 30-MAY-95 30-JUL-95 DESIGN WIDGET 01-JUN-95 30-JUN-95 31-AUG-95 CODE WIDGET 01-JUL-95 02-SEP-95 02-NOV-95 TESTING 03-SEP-95 17-JAN-96 17-MAR-96 6 rows selected.
Not that a slip like this is possible, but it's nice to have a function that makes it so easy. ADD_MONTHS also works outside the SELECT clause. Typing
SQL> SELECT TASK TASKS_SHORTER_THAN_ONE_MONTH 2 FROM PROJECT 3 WHERE ADD_MONTHS(STARTDATE,1) > ENDDATE;
produces the following result:
TASKS_SHORTER_THAN_ONE_MONTH ---------------------------- KICKOFF MTG TECH SURVEY USER MTGS DESIGN WIDGET
You will find that all the functions in this section work in more than one place. However, ADD MONTHS does not work with other data types like character or number without the help of functions TO_CHAR and TO_DATE, which are discussed later today.
LAST_DAY returns the last day of a specified month. It is for those of us who haven't mastered the "Thirty days has September..." rhyme--or at least those of us who have not yet taught it to our computers. If, for example, you need to know what the last day of the month is in the column ENDDATE, you would type
SQL> SELECT ENDDATE, LAST_DAY(ENDDATE) 2 FROM PROJECT;
Here's the result:
ENDDATE LAST_DAY(ENDDATE) --------- ----------------- 01-APR-95 30-APR-95 01-MAY-95 31-MAY-95 30-MAY-95 31-MAY-95 30-JUN-95 30-JUN-95 02-SEP-95 30-SEP-95 17-JAN-96 31-JAN-96
6 rows selected.
How does LAST DAY handle leap years?
SQL> SELECT LAST_DAY('1-FEB-95') NON_LEAP, 2 LAST_DAY('1-FEB-96') LEAP 3 FROM PROJECT; NON_LEAP LEAP --------- --------- 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96 28-FEB-95 29-FEB-96
6 rows selected.
You got the right result, but why were so many rows returned? Because you didn't specify an existing column or any conditions, the SQL engine applied the date functions in the statement to each existing row. Let's get something less redundant by using the following:
SQL> SELECT DISTINCT LAST_DAY('1-FEB-95') NON_LEAP, 2 LAST_DAY('1-FEB-96') LEAP 3 FROM PROJECT;
This statement uses the word DISTINCT (see Day 2, "Introduction to the Query: The SELECT Statement") to produce the singular result
NON_LEAP LEAP --------- --------- 28-FEB-95 29-FEB-96
Unlike me, this function knows which years are leap years. But before you trust your own or your company's financial future to this or any other function, check your implementation!
If you need to know how many months fall between month x and month y, use MONTHS_BETWEEN like this:
SQL> SELECT TASK, STARTDATE, ENDDATE,MONTHS_BETWEEN(STARTDATE,ENDDATE) DURATION 2 FROM PROJECT;
TASK STARTDATE ENDDATE DURATION -------------- --------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 0 TECH SURVEY 02-APR-95 01-MAY-95 -.9677419 USER MTGS 15-MAY-95 30-MAY-95 -.483871 DESIGN WIDGET 01-JUN-95 30-JUN-95 -.9354839 CODE WIDGET 01-JUL-95 02-SEP-95 -2.032258 TESTING 03-SEP-95 17-JAN-96 -4.451613
6 rows selected.
Wait a minute--that doesn't look right. Try this:
SQL> SELECT TASK, STARTDATE, ENDDATE, 2 MONTHS_BETWEEN(ENDDATE,STARTDATE) DURATION 3 FROM PROJECT; TASK STARTDATE ENDDATE DURATION -------------- --------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 0 TECH SURVEY 02-APR-95 01-MAY-95 .96774194 USER MTGS 15-MAY-95 30-MAY-95 .48387097 DESIGN WIDGET 01-JUN-95 30-JUN-95 .93548387 CODE WIDGET 01-JUL-95 02-SEP-95 2.0322581 TESTING 03-SEP-95 17-JAN-96 4.4516129
6 rows selected.
That's better. You see that MONTHS_BETWEEN is sensitive to the way you order the months. Negative months might not be bad. For example, you could use a negative result to determine whether one date happened before another. For example, the following statement shows all the tasks that started before May 19, 1995:
SQL> SELECT * 2 FROM PROJECT 3 WHERE MONTHS_BETWEEN('19 MAY 95', STARTDATE) > 0;
TASK STARTDATE ENDDATE -------------- --------- --------- KICKOFF MTG 01-APR-95 01-APR-95 TECH SURVEY 02-APR-95 01-MAY-95 USER MTGS 15-MAY-95 30-MAY-95
If you need to adjust the time according to the time zone you are in, the New_TIME function is for you. Here are the time zones you can use with this function:
Abbreviation | Time Zone |
AST or ADT | Atlantic standard or daylight time |
BST or BDT | Bering standard or daylight time |
CST or CDT | Central standard or daylight time |
EST or EDT | Eastern standard or daylight time |
GMT | Greenwich mean time |
HST or HDT | Alaska-Hawaii standard or daylight time |
MST or MDT | Mountain standard or daylight time |
NST | Newfoundland standard time |
PST or PDT | Pacific standard or daylight time |
YST or YDT | Yukon standard or daylight time |
You can adjust your time like this:
SQL> SELECT ENDDATE EDT, 2 NEW_TIME(ENDDATE, 'EDT','PDT') 3 FROM PROJECT;
EDT NEW_TIME(ENDDATE ---------------- ---------------- 01-APR-95 1200AM 31-MAR-95 0900PM 01-MAY-95 1200AM 30-APR-95 0900PM 30-MAY-95 1200AM 29-MAY-95 0900PM 30-JUN-95 1200AM 29-JUN-95 0900PM 02-SEP-95 1200AM 01-SEP-95 0900PM 17-JAN-96 1200AM 16-JAN-96 0900PM
6 rows selected.
Like magic, all the times are in the new time zone and the dates are adjusted.
NEXT_DAY finds the name of the first day of the week that is equal to or later than another specified date. For example, to send a report on the Friday following the first day of each event, you would type
SQL> SELECT STARTDATE, 2 NEXT_DAY(STARTDATE, 'FRIDAY') 3 FROM PROJECT;
which would return
STARTDATE NEXT_DAY( --------- --------- 01-APR-95 07-APR-95 02-APR-95 07-APR-95 15-MAY-95 19-MAY-95 01-JUN-95 02-JUN-95 01-JUL-95 07-JUL-95 03-SEP-95 08-SEP-95
6 rows selected.
The output tells you the date of the first Friday that occurs after your STARTDATE.
SYSDATE returns the system time and date:
SQL> SELECT DISTINCT SYSDATE 2 FROM PROJECT;
SYSDATE ---------------- 18-JUN-95 1020PM
If you wanted to see where you stood today in a certain project, you could type
SQL> SELECT * 2 FROM PROJECT 3 WHERE STARTDATE > SYSDATE; TASK STARTDATE ENDDATE -------------- --------- --------- CODE WIDGET 01-JUL-95 02-SEP-95 TESTING 03-SEP-95 17-JAN-96
Now you can see what parts of the project start after today.
Many of the uses you have for the data you retrieve involve mathematics. Most implementations of SQL provide arithmetic functions similar to the functions covered here. The examples in this section use the NUMBERS table:
SQL> SELECT * 2 FROM NUMBERS;
A B --------- --------- 3.1415 4 -45 .707 5 9 -57.667 42 15 55 -7.2 5.3 6 rows selected.
The ABS function returns the absolute value of the number you point to. For example:
SQL> SELECT ABS(A) ABSOLUTE_VALUE 2 FROM NUMBERS;
ABSOLUTE_VALUE -------------- 3.1415 45 5 57.667 15 7.2 6 rows selected.
ABS changes all the negative numbers to positive and leaves positive numbers alone.
CEIL returns the smallest integer greater than or equal to its argument. FLOOR does just the reverse, returning the largest integer equal to or less than its argument. For example:
SQL> SELECT B, CEIL(B) CEILING 2 FROM NUMBERS;
B CEILING --------- --------- 4 4 .707 1 9 9 42 42 55 55 5.3 6
6 rows selected.
And
SQL> SELECT A, FLOOR(A) FLOOR 2 FROM NUMBERS; A FLOOR --------- --------- 3.1415 3 -45 -45 5 5 -57.667 -58 15 15 -7.2 -8
6 rows selected.
The COS, SIN, and TAN functions provide support for various trigonometric concepts. They all work on the assumption that n is in radians. The following statement returns some unexpected values if you don't realize COS expects A to be in radians.
SQL> SELECT A, COS(A) 2 FROM NUMBERS;
A COS(A) --------- --------- 3.1415 -1 -45 .52532199 5 .28366219 -57.667 .437183 15 -.7596879 -7.2 .60835131
You would expect the COS of 45 degrees to be in the neighborhood of .707, not .525. To make this function work the way you would expect it to in a degree-oriented world, you need to convert degrees to radians. (When was the last time you heard a news broadcast report that a politician had done a pi-radian turn? You hear about a 180-degree turn.) Because 360 degrees = 2 pi radians, you can write
SQL> SELECT A, COS(A* 0.01745329251994) 2 FROM NUMBERS;
A COS(A*0.01745329251994) --------- ----------------------- 3.1415 .99849724 -45 .70710678 5 .9961947 -57.667 .5348391 15 .96592583 -7.2 .9921147
Note that the number 0.01745329251994 is radians divided by degrees. The trigonometric functions work as follows:
SQL> SELECT A, COS(A*0.017453), COSH(A*0.017453) 2 FROM NUMBERS;
A COS(A*0.017453) COSH(A*0.017453) --------- --------------- ---------------- 3.1415 .99849729 1.0015035 -45 .70711609 1.3245977 5 .99619483 1.00381 -57.667 .53485335 1.5507072 15 .96592696 1.0344645 -7.2 .99211497 1.0079058
6 rows selected.
And
SQL> SELECT A, SIN(A*0.017453), SINH(A*0.017453) 2 FROM NUMBERS;
A SIN(A*0.017453) SINH(A*0.017453) --------- --------------- ---------------- 3.1415 .05480113 .05485607 -45 -.7070975 -.8686535 5 .08715429 .0873758 -57.667 -.8449449 -1.185197 15 .25881481 .26479569 -7.2 -.1253311 -.1259926
6 rows selected.
And
SQL> SELECT A, TAN(A*0.017453), TANH(A*0.017453) 2 FROM NUMBERS; A TAN(A*0.017453) TANH(A*0.017453) --------- --------------- ---------------- 3.1415 .05488361 .05477372 -45 -.9999737 -.6557867 5 .08748719 .08704416 -57.667 -1.579769 -.7642948 15 .26794449 .25597369 -7.2 -.1263272 -.1250043
6 rows selected.
EXP enables you to raise e (e is a mathematical constant used in various formulas) to a power. Here's how EXP raises e by the values in column A:
SQL> SELECT A, EXP(A) 2 FROM NUMBERS;
A EXP(A) --------- --------- 3.1415 23.138549 -45 2.863E-20 5 148.41316 -57.667 9.027E-26 15 3269017.4 -7.2 .00074659
6 rows selected.
These two functions center on logarithms. LN returns the natural logarithm of its argument. For example:
SQL> SELECT A, LN(A) 2 FROM NUMBERS;
ERROR: ORA-01428: argument '-45' is out of range
Did we neglect to mention that the argument had to be positive? Write
SQL> SELECT A, LN(ABS(A)) 2 FROM NUMBERS; A LN(ABS(A)) --------- ---------- 3.1415 1.1447004 -45 3.8066625 5 1.6094379 -57.667 4.0546851 15 2.7080502 -7.2 1.974081
6 rows selected.
Notice how you can embed the function ABS inside the LN call. The other logarith-mic function, LOG, takes two arguments, returning the logarithm of the first argument in the base of the second. The following query returns the logarithms of column B in base 10.
SQL> SELECT B, LOG(B, 10) 2 FROM NUMBERS; B LOG(B,10) ----------- --------- 4 1.660964 .707 -6.640962 9 1.0479516 42 .61604832 55 .57459287 5.3 1.3806894
6 rows selected.
You have encountered MOD before. On Day 3, "Expressions, Conditions, and Operators," you saw that the ANSI standard for the modulo operator % is sometimes implemented as the function MOD. Here's a query that returns a table showing the remainder of A divided by B:
SQL> SELECT A, B, MOD(A,B) 2 FROM NUMBERS;
A B MOD(A,B) --------- --------- --------- 3.1415 4 3.1415 -45 .707 -.459 5 9 5 -57.667 42 -15.667 15 55 15 -7.2 5.3 -1.9
6 rows selected.
To raise one number to the power of another, use POWER. In this function the first argument is raised to the power of the second:
SQL> SELECT A, B, POWER(A,B) 2 FROM NUMBERS;
ERROR: ORA-01428: argument '-45' is out of range
At first glance you are likely to think that the first argument can't be negative. But that impression can't be true, because a number like -4 can be raised to a power. Therefore, if the first number in the POWER function is negative, the second must be an integer. You can work around this problem by using CEIL (or FLOOR):
SQL> SELECT A, CEIL(B), POWER(A,CEIL(B)) 2 FROM NUMBERS;
A CEIL(B) POWER(A,CEIL(B)) --------- --------- ---------------- 3.1415 4 97.3976 -45 1 -45 5 9 1953125 -57.667 42 9.098E+73 15 55 4.842E+64 -7.2 6 139314.07
6 rows selected.
That's better!
SIGN returns -1 if its argument is less than 0, 0 if its argument is equal to 0, and 1 if its argument is greater than 0, as shown in the following example:
SQL> SELECT A, SIGN(A) 2 FROM NUMBERS;
A SIGN(A) --------- --------- 3.1415 1 -45 -1 5 1 -57.667 -1 15 1 -7.2 -1 0 0
7 rows selected.
You could also use SIGN in a SELECT WHERE clause like this:
SQL> SELECT A 2 FROM NUMBERS 3 WHERE SIGN(A) = 1;
A --------- 3.1415 5 15
The function SQRT returns the square root of an argument. Because the square root of a negative number is undefined, you cannot use SQRT on negative numbers.
SQL> SELECT A, SQRT(A) 2 FROM NUMBERS;
ERROR: ORA-01428: argument '-45' is out of range
However, you can fix this limitation with ABS:
SQL> SELECT ABS(A), SQRT(ABS(A)) 2 FROM NUMBERS; ABS(A) SQRT(ABS(A)) --------- ------------ 3.1415 1.7724277 45 6.7082039 5 2.236068 57.667 7.5938791 15 3.8729833 7.2 2.6832816 0 0
7 rows selected.
Many implementations of SQL provide functions to manipulate characters and strings of characters. This section covers the most common character functions. The examples in this section use the table CHARACTERS.
SQL> SELECT * FROM CHARACTERS;
LASTNAME FIRSTNAME M CODE --------------- --------------- - --------- PURVIS KELLY A 32 TAYLOR CHUCK J 67 CHRISTINE LAURA C 65 ADAMS FESTER M 87 COSTALES ARMANDO A 77 KONG MAJOR G 52
6 rows selected.
CHR returns the character equivalent of the number it uses as an argument. The character it returns depends on the character set of the database. For this example the database is set to ASCII. The column CODE includes numbers.
SQL> SELECT CODE, CHR(CODE) 2 FROM CHARACTERS;
CODE CH --------- -- 32 67 C 65 A 87 W 77 M 52 4
6 rows selected.
The space opposite the 32 shows that 32 is a space in the ASCII character set.
You used the equivalent of this function on Day 3, when you learned about operators. The || symbol splices two strings together, as does CONCAT. It works like this:
SQL> SELECT CONCAT(FIRSTNAME, LASTNAME) "FIRST AND LAST NAMES" 2 FROM CHARACTERS;
FIRST AND LAST NAMES ------------------------ KELLY PURVIS CHUCK TAYLOR LAURA CHRISTINE FESTER ADAMS ARMANDO COSTALES MAJOR KONG 6 rows selected.
Quotation marks surround the multiple-word alias FIRST AND LAST NAMES. Again, it is safest to check your implementation to see if it allows multiple-word aliases.
Also notice that even though the table looks like two separate columns, what you are seeing is one column. The first value you concatenated, FIRSTNAME, is 15 characters wide. This operation retained all the characters in the field.
INITCAP capitalizes the first letter of a word and makes all other characters lowercase.
SQL> SELECT FIRSTNAME BEFORE, INITCAP(FIRSTNAME) AFTER 2 FROM CHARACTERS;
BEFORE AFTER -------------- ---------- KELLY Kelly CHUCK Chuck LAURA Laura FESTER Fester ARMANDO Armando MAJOR Major
6 rows selected.
As you might expect, LOWER changes all the characters to lowercase; UPPER does just the reverse.
The following example starts by doing a little magic with the UPDATE function (you learn more about this next week) to change one of the values to lowercase:
SQL> UPDATE CHARACTERS 2 SET FIRSTNAME = 'kelly' 3 WHERE FIRSTNAME = 'KELLY';
1 row updated.
SQL> SELECT FIRSTNAME 2 FROM CHARACTERS;
FIRSTNAME --------------- kelly CHUCK LAURA FESTER ARMANDO MAJOR
6 rows selected.
Then you write
SQL> SELECT FIRSTNAME, UPPER(FIRSTNAME), LOWER(FIRSTNAME) 2 FROM CHARACTERS;
FIRSTNAME UPPER(FIRSTNAME LOWER(FIRSTNAME --------------- --------------- --------------- kelly KELLY kelly CHUCK CHUCK chuck LAURA LAURA laura FESTER FESTER fester ARMANDO ARMANDO armando MAJOR MAJOR major
6 rows selected.
Now you see the desired behavior.
LPAD and RPAD take a minimum of two and a maximum of three arguments. The first argument is the character string to be operated on. The second is the number of characters to pad it with, and the optional third argument is the character to pad it with. The third argument defaults to a blank, or it can be a single character or a character string. The following statement adds five pad characters, assuming that the field LASTNAME is defined as a 15-character field:
SQL> SELECT LASTNAME, LPAD(LASTNAME,20,'*') 2 FROM CHARACTERS;
LASTNAME LPAD(LASTNAME,20,'*' -------------- -------------------- PURVIS *****PURVIS TAYLOR *****TAYLOR CHRISTINE *****CHRISTINE ADAMS *****ADAMS COSTALES *****COSTALES KONG *****KONG
6 rows selected.
Why were only five pad characters added? Remember that the LASTNAME column is 15 characters wide and that LASTNAME includes the blanks to the right of the characters that make up the name. Some column data types eliminate padding characters if the width of the column value is less than the total width allocated for the column. Check your implementation. Now try the right side:
SQL> SELECT LASTNAME, RPAD(LASTNAME,20,'*') 2 FROM CHARACTERS;
LASTNAME RPAD(LASTNAME,20,'*' --------------- -------------------- PURVIS PURVIS ***** TAYLOR TAYLOR ***** CHRISTINE CHRISTINE ***** ADAMS ADAMS ***** COSTALES COSTALES ***** KONG KONG *****
6 rows selected.
Here you see that the blanks are considered part of the field name for these operations. The next two functions come in handy in this type of situation.
LTRIM and RTRIM take at least one and at most two arguments. The first argument, like LPAD and RPAD, is a character string. The optional second element is either a character or character string or defaults to a blank. If you use a second argument that is not a blank, these trim functions will trim that character the same way they trim the blanks in the following examples.
SQL> SELECT LASTNAME, RTRIM(LASTNAME) 2 FROM CHARACTERS;
LASTNAME RTRIM(LASTNAME) --------------- --------------- PURVIS PURVIS TAYLOR TAYLOR CHRISTINE CHRISTINE ADAMS ADAMS COSTALES COSTALES KONG KONG
6 rows selected.
You can make sure that the characters have been trimmed with the following statement:
SQL> SELECT LASTNAME, RPAD(RTRIM(LASTNAME),20,'*') 2 FROM CHARACTERS;
LASTNAME RPAD(RTRIM(LASTNAME) --------------- -------------------- PURVIS PURVIS************** TAYLOR TAYLOR************** CHRISTINE CHRISTINE*********** ADAMS ADAMS*************** COSTALES COSTALES************ KONG KONG****************
6 rows selected.
The output proves that trim is working. Now try LTRIM:
SQL> SELECT LASTNAME, LTRIM(LASTNAME, 'C') 2 FROM CHARACTERS;
LASTNAME LTRIM(LASTNAME, --------------- --------------- PURVIS PURVIS TAYLOR TAYLOR CHRISTINE HRISTINE ADAMS ADAMS COSTALES OSTALES KONG KONG
6 rows selected.
Note the missing Cs in the third and fifth rows.
REPLACE does just that. Of its three arguments, the first is the string to be searched. The second is the search key. The last is the optional replacement string. If the third argument is left out or NULL, each occurrence of the search key on the string to be searched is removed and is not replaced with anything.
SQL> SELECT LASTNAME, REPLACE(LASTNAME, 'ST') REPLACEMENT 2 FROM CHARACTERS;
LASTNAME REPLACEMENT --------------- --------------- PURVIS PURVIS TAYLOR TAYLOR CHRISTINE CHRIINE ADAMS ADAMS COSTALES COALES KONG KONG
6 rows selected.
If you have a third argument, it is substituted for each occurrence of the search key in the target string. For example:
SQL> SELECT LASTNAME, REPLACE(LASTNAME, 'ST','**') REPLACEMENT 2 FROM CHARACTERS;
LASTNAME REPLACEMENT --------------- ------------ PURVIS PURVIS TAYLOR TAYLOR CHRISTINE CHRI**INE ADAMS ADAMS COSTALES CO**ALES KONG KONG
6 rows selected.
If the second argument is NULL, the target string is returned with no changes.
SQL> SELECT LASTNAME, REPLACE(LASTNAME, NULL) REPLACEMENT 2 FROM CHARACTERS;
LASTNAME REPLACEMENT --------------- --------------- PURVIS PURVIS TAYLOR TAYLOR CHRISTINE CHRISTINE ADAMS ADAMS COSTALES COSTALES KONG KONG
6 rows selected.
This three-argument function enables you to take a piece out of a target string. The first argument is the target string. The second argument is the position of the first character to be output. The third argument is the number of characters to show.
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,2,3) 2 FROM CHARACTERS;
FIRSTNAME SUB --------------- --- kelly ell CHUCK HUC LAURA AUR FESTER EST ARMANDO RMA MAJOR AJO
6 rows selected.
If you use a negative number as the second argument, the starting point is determined by counting backwards from the end, like this:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,-13,2) 2 FROM CHARACTERS;
FIRSTNAME SU --------------- -- kelly ll CHUCK UC LAURA UR FESTER ST ARMANDO MA MAJOR JO
6 rows selected.
Remember the character field FIRSTNAME in this example is 15 characters long. That is why you used a -13 to start at the third character. Counting back from 15 puts you at the start of the third character, not at the start of the second. If you don't have a third argument, use the following statement instead:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,3) 2 FROM CHARACTERS;
FIRSTNAME SUBSTR(FIRSTN --------------- ------------- kelly lly CHUCK UCK LAURA URA FESTER STER ARMANDO MANDO MAJOR JOR
6 rows selected.
The rest of the target string is returned.
SQL> SELECT * FROM SSN_TABLE;
SSN__________ 300541117 301457111 459789998
3 rows selected.
Reading the results of the preceding output is difficult--Social Security numbers usually have dashes. Now try something fancy and see whether you like the results:
SQL> SELECT SUBSTR(SSN,1,3)||'-'||SUBSTR(SSN,4,2)||'-'||SUBSTR(SSN,6,4) SSN 2 FROM SSN_TABLE;
SSN_________ 300-54-1117 301-45-7111 459-78-9998
3 rows selected.
NOTE: This particular use of the substr function could come in very handy with large numbers using commas such as 1,343,178,128 and in area codes and phone numbers such as 317-787-2915 using dashes.
Here is another good use of the SUBSTR function. Suppose you are writing a report and a few columns are more than 50 characters wide. You can use the SUBSTR function to reduce the width of the columns to a more manageable size if you know the nature of the actual data. Consider the following two examples:
SQL> SELECT NAME, JOB, DEPARTMENT FROM JOB_TBL;
NAME______________________________________________________________ JOB_______________________________DEPARTMENT______________________ ALVIN SMITH VICEPRESIDENT MARKETING 1 ROW SELECTED.
Notice how the columns wrapped around, which makes reading the results a little too difficult. Now try this select:
SQL> SELECT SUBSTR(NAME, 1,15) NAME, SUBSTR(JOB,1,15) JOB, DEPARTMENT 2 FROM JOB_TBL;
NAME________________JOB_______________DEPARTMENT_____
ALVIN SMITH VICEPRESIDENT MARKETING
Much better!
The function TRANSLATE takes three arguments: the target string, the FROM string, and the TO string. Elements of the target string that occur in the FROM string are translated to the corresponding element in the TO string.
SQL> SELECT FIRSTNAME, TRANSLATE(FIRSTNAME 2 '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ 3 'NNNNNNNNNNAAAAAAAAAAAAAAAAAAAAAAAAAA) 4 FROM CHARACTERS;
FIRSTNAME TRANSLATE(FIRST --------------- --------------- kelly kelly CHUCK AAAAA LAURA AAAAA FESTER AAAAAA ARMANDO AAAAAAA MAJOR AAAAA
6 rows selected.
Notice that the function is case sensitive.
To find out where in a string a particular pattern occurs, use INSTR. Its first argument is the target string. The second argument is the pattern to match. The third and forth are numbers representing where to start looking and which match to report. This example returns a number representing the first occurrence of O starting with the second character:
SQL> SELECT LASTNAME, INSTR(LASTNAME, 'O', 2, 1) 2 FROM CHARACTERS;
LASTNAME INSTR(LASTNAME,'O',2,1) --------------- ----------------------- PURVIS 0 TAYLOR 5 CHRISTINE 0 ADAMS 0 COSTALES 2 KONG 2
6 rows selected.
The default for the third and fourth arguments is 1. If the third argument is negative, the search starts at a position determined from the end of the string, instead of from the beginning.
LENGTH returns the length of its lone character argument. For example:
SQL> SELECT FIRSTNAME, LENGTH(RTRIM(FIRSTNAME)) 2 FROM CHARACTERS;
FIRSTNAME LENGTH(RTRIM(FIRSTNAME)) --------------- ------------------------ kelly 5 CHUCK 5 LAURA 5 FESTER 6 ARMANDO 7 MAJOR 5
6 rows selected.
Note the use of the RTRIM function. Otherwise, LENGTH would return 15 for every value.
These three conversion functions provide a handy way of converting one type of data to another. These examples use the table CONVERSIONS.
SQL> SELECT * FROM CONVERSIONS;
NAME TESTNUM --------------- --------- 40 95 13 23 74 68
The NAME column is a character string 15 characters wide, and TESTNUM is a number.
The primary use of TO_CHAR is to convert a number into a character. Different implementations may also use it to convert other data types, like Date, into a character, or to include different formatting arguments. The next example illustrates the primary use of TO_CHAR:
SQL> SELECT TESTNUM, TO_CHAR(TESTNUM) 2 FROM CONVERT;
TESTNUM TO_CHAR(TESTNUM) --------- ---------------- 95 95 23 23 68 68
Not very exciting, or convincing. Here's how to verify that the function returned a character string:
SQL> SELECT TESTNUM, LENGTH(TO_CHAR(TESTNUM)) 2 FROM CONVERT;
TESTNUM LENGTH(TO_CHAR(TESTNUM)) --------- ------------------------ 95 2 23 2 68 2
LENGTH of a number would have returned an error. Notice the difference between TO CHAR and the CHR function discussed earlier. CHR would have turned this number into a character or a symbol, depending on the character set.
TO_NUMBER is the companion function to TO_CHAR, and of course, it converts a string into a number. For example:
SQL> SELECT NAME, TESTNUM, TESTNUM*TO_NUMBER(NAME) 2 FROM CONVERT;
NAME TESTNUM TESTNUM*TO_NUMBER(NAME) --------------- -------- ----------------------- 40 95 3800 13 23 299 74 68 5032
This test would have returned an error if TO_NUMBER had returned a character.
Here are three miscellaneous functions you may find useful.
These functions find the GREATEST or the LEAST member from a series of expressions. For example:
SQL> SELECT GREATEST('ALPHA', 'BRAVO','FOXTROT', 'DELTA') 2 FROM CONVERT;
GREATEST ------- FOXTROT FOXTROT FOXTROT
Notice GREATEST found the word closest to the end of the alphabet. Notice also a seemingly unnecessary FROM and three occurrences of FOXTROT. If FROM is missing, you will get an error. Every SELECT needs a FROM. The particular table used in the FROM has three rows, so the function in the SELECT clause is performed for each of them.
SQL> SELECT LEAST(34, 567, 3, 45, 1090) 2 FROM CONVERT;
LEAST(34,567,3,45,1090) ----------------------- 3 3 3
As you can see, GREATEST and LEAST also work with numbers.
USER returns the character name of the current user of the database.
SQL> SELECT USER FROM CONVERT;
USER ------------------------------ PERKINS PERKINS PERKINS
There really is only one of me. Again, the echo occurs because of the number of rows in the table. USER is similar to the date functions explained earlier today. Even though USER is not an actual column in the table, it is selected for each row that is contained in the table.
It has been a long day. We covered 47 functions--from aggregates to conversions. You don't have to remember every function--just knowing the general types (aggregate functions, date and time functions, arithmetic functions, character functions, conversion functions, and miscellaneous functions) is enough to point you in the right direction when you build a query that requires a function.
A ANSI standards are broad strokes and are not meant to drive companies into bankruptcy by forcing all implementations to have dozens of functions. On the other hand, when company X adds a statistical package to its SQL and it sells well, you can bet company Y and Z will follow suit.
Q I thought you said SQL was simple. Will I really use all of these functions?
A The answer to this question is similar to the way a trigonometry teacher might respond to the question, Will I ever need to know how to figure the area of an isosceles triangle in real life? The answer, of course, depends on your profession. The same concept applies with the functions and all the other options available with SQL. How you use functions in SQL depends mostly on you company's needs. As long as you understand how functions work as a whole, you can apply the same concepts to your own queries.
The Workshop provides quiz questions to help solidify your understanding of the material covered, as well as exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix F, "Answers to Quizzes and Exercises."
2. Which functions are also known by the name group functions?
3. Will this query work?
SQL> SELECT COUNT(LASTNAME) FROM CHARACTERS;
SQL> SELECT SUM(LASTNAME) FROM CHARACTERS;
6. What does the answer 6 mean from the following SELECT?
SQL> SELECT COUNT(*) FROM TEAMSTATS;OUTPUT:
COUNT(*)
SQL> SELECT SUBSTR LASTNAME,1,5 FROM NAME_TBL;
2. Using today's CHARACTERS table, write a query that will return the following:
INITIALS__________CODE K.A.P. 32 1 row selected.
© Copyright, Macmillan Computer Publishing. All rights reserved.