Teach Yourself SQL in 21 Days, Second Edition

Previous chapterNext chapterContents


- Day 4 -
Functions: Molding the Data You Retrieve

Objectives

Today we talk about functions. Functions in SQL enable you to perform feats such as determining the sum of a column or converting all the characters of a string to uppercase. By the end of the day, you will understand and be able to use all the following:

These functions greatly increase your ability to manipulate the information you retrieved using the basic functions of SQL that were described earlier this week. The first five aggregate functions, COUNT, SUM, AVG, MAX, and MIN, are defined in the ANSI standard. Most implementations of SQL have extensions to these aggregate functions, some of which are covered today. Some implementations may use different names for these functions.

Aggregate Functions

These functions are also referred to as group functions. They return a value based on the values in a column. (After all, you wouldn't ask for the average of a single field.) The examples in this section use the table TEAMSTATS:

INPUT:
SQL> SELECT * FROM TEAMSTATS;
OUTPUT:
NAME      POS  AB HITS WALKS SINGLES DOUBLES TRIPLES HR SO
--------- --- --- ---- ----- ------- ------- ------- -- --
JONES      1B 145   45 34    31       8       1       5 10
DONKNOW    3B 175   65 23    50      10       1       4 15
WORLEY     LF 157   49 15    35       8       3       3 16
DAVID      OF 187   70 24    48       4       0      17 42
HAMHOCKER  3B  50   12 10    10       2       0       0 13
CASEY      DH   1    0  0     0       0       0       0  1

6 rows selected.

COUNT

The function COUNT returns the number of rows that satisfy the condition in the WHERE clause. Say you wanted to know how many ball players were hitting under 350. You would type

INPUT/OUTPUT:
SQL> SELECT COUNT(*)
  2  FROM TEAMSTATS
  3  WHERE HITS/AB < .35;

COUNT(*)
--------
       4

To make the code more readable, try an alias:

INPUT/OUTPUT:
SQL> SELECT COUNT(*) NUM_BELOW_350
  2  FROM TEAMSTATS
  3  WHERE HITS/AB < .35;

NUM_BELOW_350
-------------
            4

Would it make any difference if you tried a column name instead of the asterisk? (Notice the use of parentheses around the column names.) Try this:

INPUT/OUTPUT:
SQL> SELECT COUNT(NAME) NUM_BELOW_350
  2  FROM TEAMSTATS
  3  WHERE HITS/AB < .35;

NUM_BELOW_350
-------------
            4

The answer is no. The NAME column that you selected was not involved in the WHERE statement. If you use COUNT without a WHERE clause, it returns the number of records in the table.

INPUT/OUTPUT:
SQL> SELECT COUNT(*)
  2  FROM TEAMSTATS;

 COUNT(*)
---------
        6

SUM

SUM does just that. It returns the sum of all values in a column. To find out how many singles have been hit, type

INPUT:
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES
  2  FROM TEAMSTATS;
OUTPUT:
TOTAL_SINGLES
-------------
          174

To get several sums, use

INPUT/OUTPUT:
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES,
SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR
  2  FROM TEAMSTATS;

TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR
------------- ------------- ------------- --------
          174            32             5       29

To collect similar information on all 300 or better players, type

INPUT/OUTPUT:
SQL> SELECT SUM(SINGLES) TOTAL_SINGLES, SUM(DOUBLES) TOTAL_DOUBLES,
SUM(TRIPLES) TOTAL_TRIPLES, SUM(HR) TOTAL_HR
  2  FROM TEAMSTATS
  3  WHERE HITS/AB >= .300;

TOTAL_SINGLES TOTAL_DOUBLES TOTAL_TRIPLES TOTAL_HR
------------- ------------- ------------- --------
          164            30             5       29

To compute a team batting average, type

INPUT/OUTPUT:
SQL> SELECT SUM(HITS)/SUM(AB) TEAM_AVERAGE
  2  FROM TEAMSTATS;

TEAM_AVERAGE
------------
   .33706294

SUM works only with numbers. If you try it on a nonnumerical field, you get

INPUT/OUTPUT:
SQL> SELECT SUM(NAME)
  2  FROM TEAMSTATS;

ERROR:
ORA-01722: invalid number
no rows selected

This error message is logical because you cannot sum a group of names.

AVG

The AVG function computes the average of a column. To find the average number of strike outs, use this:

INPUT:
SQL> SELECT AVG(SO) AVE_STRIKE_OUTS
  2  FROM TEAMSTATS;
OUTPUT:
AVE_STRIKE_OUTS
---------------
      16.166667

The following example illustrates the difference between SUM and AVG:

INPUT/OUTPUT:
SQL> SELECT AVG(HITS/AB) TEAM_AVERAGE
  2  FROM TEAMSTATS;

TEAM_AVERAGE
------------
   .26803448
ANALYSIS:

The team was batting over 300 in the previous example! What happened? AVG computed the average of the combined column hits divided by at bats, whereas the example with SUM divided the total number of hits by the number of at bats. For example, player A gets 50 hits in 100 at bats for a .500 average. Player B gets 0 hits in 1 at bat for a 0.0 average. The average of 0.0 and 0.5 is .250. If you compute the combined average of 50 hits in 101 at bats, the answer is a respectable .495. The following statement returns the correct batting average:

INPUT/OUTPUT:
SQL> SELECT AVG(HITS)/AVG(AB) TEAM_AVERAGE
  2  FROM TEAMSTATS;
TEAM_AVERAGE
------------
   .33706294

Like the SUM function, AVG works only with numbers.

MAX

If you want to find the largest value in a column, use MAX. For example, what is the highest number of hits?

INPUT:
SQL> SELECT MAX(HITS)
  2  FROM TEAMSTATS;
OUTPUT:
MAX(HITS)
---------
       70

Can you find out who has the most hits?

INPUT/OUTPUT:
SQL> SELECT NAME
  2  FROM TEAMSTATS
  3  WHERE HITS = MAX(HITS);

ERROR at line 3:
ORA-00934: group function is not allowed here

Unfortunately, you can't. The error message is a reminder that this group function (remember that aggregate functions are also called group functions) does not work in the WHERE clause. Don't despair, Day 7, "Subqueries: The Embedded SELECT Statement," covers the concept of subqueries and explains a way to find who has the MAX hits.

What happens if you try a nonnumerical column?

INPUT/OUTPUT:
SQL> SELECT MAX(NAME)
  2  FROM TEAMSTATS;
MAX(NAME)
---------------
WORLEY

Here's something new. MAX returns the highest (closest to Z) string. Finally, a function that works with both characters and numbers.

MIN

MIN does the expected thing and works like MAX except it returns the lowest member of a column. To find out the fewest at bats, type

INPUT:
SQL> SELECT MIN(AB)
  2  FROM TEAMSTATS;
OUTPUT:
MIN(AB)
---------
        1

The following statement returns the name closest to the beginning of the alphabet:

INPUT/OUTPUT:
SQL> SELECT MIN(NAME)
  2  FROM TEAMSTATS;

MIN(NAME)
---------------
CASEY

You can combine MIN with MAX to give a range of values. For example:

INPUT/OUTPUT:
SQL> SELECT MIN(AB), MAX(AB)
  2  FROM TEAMSTATS;

 MIN(AB)  MAX(AB)
-------- --------
       1      187

This sort of information can be useful when using statistical functions.


NOTE: As we mentioned in the introduction, the first five aggregate functions are described in the ANSI standard. The remaining aggregate functions have become de facto standards, present in all important implementations of SQL. We use the Oracle7 names for these functions. Other implementations may use different names.

VARIANCE

VARIANCE produces the square of the standard deviation, a number vital to many statistical calculations. It works like this:

INPUT:
SQL> SELECT VARIANCE(HITS)
  2  FROM TEAMSTATS;
OUTPUT:
VARIANCE(HITS)
--------------
     802.96667

If you try a string

INPUT/OUTPUT:
SQL> SELECT VARIANCE(NAME)
  2  FROM TEAMSTATS;

ERROR:
ORA-01722: invalid number
no rows selected

you find that VARIANCE is another function that works exclusively with numbers.

STDDEV

The final group function, STDDEV, finds the standard deviation of a column of numbers, as demonstrated by this example:

INPUT:
SQL> SELECT STDDEV(HITS)
  2  FROM TEAMSTATS;
OUTPUT:
STDDEV(HITS)
------------
   28.336666

It also returns an error when confronted by a string:

INPUT/OUTPUT:
SQL> SELECT STDDEV(NAME)
  2  FROM TEAMSTATS;

ERROR:
ORA-01722: invalid number
no rows selected

These aggregate functions can also be used in various combinations:

INPUT/OUTPUT:
SQL> SELECT COUNT(AB),
  2  AVG(AB),
  3  MIN(AB),
  4  MAX(AB),
  5  STDDEV(AB),
  6  VARIANCE(AB),
  7  SUM(AB)
  8  FROM TEAMSTATS;

COUNT(AB) AVG(AB) MIN(AB) MAX(AB) STDDEV(AB) VARIANCE(AB) SUM(AB)
--------- ------- ------- ------- ---------- ------------ -------
6         119.167       1     187     75.589      5712.97     715

The next time you hear a sportscaster use statistics to fill the time between plays, you will know that SQL is at work somewhere behind the scenes.

Date and Time Functions

We live in a civilization governed by times and dates, and most major implementations of SQL have functions to cope with these concepts. This section uses the table PROJECT to demonstrate the time and date functions.

INPUT:
SQL> SELECT * FROM PROJECT;
OUTPUT:
TASK           STARTDATE ENDDATE
-------------- --------- ---------
KICKOFF MTG    01-APR-95 01-APR-95
TECH SURVEY    02-APR-95 01-MAY-95
USER MTGS      15-MAY-95 30-MAY-95
DESIGN WIDGET  01-JUN-95 30-JUN-95
CODE WIDGET    01-JUL-95 02-SEP-95
TESTING        03-SEP-95 17-JAN-96
6 rows selected.


NOTE: This table used the Date data type. Most implementations of SQL have a Date data type, but the exact syntax may vary.

ADD_MONTHS

This function adds a number of months to a specified date. For example, say something extraordinary happened, and the preceding project slipped to the right by two months. You could make a new schedule by typing

INPUT:
SQL> SELECT TASK,
  2  STARTDATE,
  3  ENDDATE ORIGINAL_END,
  4  ADD_MONTHS(ENDDATE,2)
  5  FROM PROJECT;
OUTPUT:
TASK           STARTDATE ORIGINAL_ ADD_MONTH
-------------- --------- --------- ---------
KICKOFF MTG    01-APR-95 01-APR-95 01-JUN-95
TECH SURVEY    02-APR-95 01-MAY-95 01-JUL-95
USER MTGS      15-MAY-95 30-MAY-95 30-JUL-95
DESIGN WIDGET  01-JUN-95 30-JUN-95 31-AUG-95
CODE WIDGET    01-JUL-95 02-SEP-95 02-NOV-95
TESTING        03-SEP-95 17-JAN-96 17-MAR-96

6 rows selected.

Not that a slip like this is possible, but it's nice to have a function that makes it so easy. ADD_MONTHS also works outside the SELECT clause. Typing

INPUT:
SQL> SELECT TASK TASKS_SHORTER_THAN_ONE_MONTH
  2  FROM PROJECT
  3  WHERE ADD_MONTHS(STARTDATE,1) > ENDDATE;

produces the following result:

OUTPUT:
TASKS_SHORTER_THAN_ONE_MONTH
----------------------------
KICKOFF MTG
TECH SURVEY
USER MTGS
DESIGN WIDGET
ANALYSIS:

You will find that all the functions in this section work in more than one place. However, ADD MONTHS does not work with other data types like character or number without the help of functions TO_CHAR and TO_DATE, which are discussed later today.

LAST_DAY

LAST_DAY returns the last day of a specified month. It is for those of us who haven't mastered the "Thirty days has September..." rhyme--or at least those of us who have not yet taught it to our computers. If, for example, you need to know what the last day of the month is in the column ENDDATE, you would type

INPUT:
SQL> SELECT ENDDATE, LAST_DAY(ENDDATE)
  2  FROM PROJECT;

Here's the result:

OUTPUT:
ENDDATE   LAST_DAY(ENDDATE)
--------- -----------------
01-APR-95 30-APR-95
01-MAY-95 31-MAY-95
30-MAY-95 31-MAY-95
30-JUN-95 30-JUN-95
02-SEP-95 30-SEP-95
17-JAN-96 31-JAN-96
6 rows selected.

How does LAST DAY handle leap years?

INPUT/OUTPUT:
SQL> SELECT LAST_DAY('1-FEB-95') NON_LEAP,
  2  LAST_DAY('1-FEB-96') LEAP
  3  FROM PROJECT;

NON_LEAP  LEAP
--------- ---------
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
28-FEB-95 29-FEB-96
6 rows selected.
ANALYSIS:

You got the right result, but why were so many rows returned? Because you didn't specify an existing column or any conditions, the SQL engine applied the date functions in the statement to each existing row. Let's get something less redundant by using the following:

INPUT:
SQL> SELECT DISTINCT LAST_DAY('1-FEB-95') NON_LEAP,
  2  LAST_DAY('1-FEB-96') LEAP
  3  FROM PROJECT;

This statement uses the word DISTINCT (see Day 2, "Introduction to the Query: The SELECT Statement") to produce the singular result

OUTPUT:
NON_LEAP  LEAP
--------- ---------
28-FEB-95 29-FEB-96

Unlike me, this function knows which years are leap years. But before you trust your own or your company's financial future to this or any other function, check your implementation!

MONTHS_BETWEEN

If you need to know how many months fall between month x and month y, use MONTHS_BETWEEN like this:

INPUT:
SQL> SELECT TASK, STARTDATE, ENDDATE,MONTHS_BETWEEN(STARTDATE,ENDDATE)
     DURATION
  2  FROM PROJECT;
OUTPUT:
TASK           STARTDATE ENDDATE    DURATION
-------------- --------- --------- ---------
KICKOFF MTG    01-APR-95 01-APR-95         0
TECH SURVEY    02-APR-95 01-MAY-95 -.9677419
USER MTGS      15-MAY-95 30-MAY-95  -.483871
DESIGN WIDGET  01-JUN-95 30-JUN-95 -.9354839
CODE WIDGET    01-JUL-95 02-SEP-95 -2.032258
TESTING        03-SEP-95 17-JAN-96 -4.451613
6 rows selected.

Wait a minute--that doesn't look right. Try this:

INPUT/OUTPUT:
SQL> SELECT TASK, STARTDATE, ENDDATE,
  2  MONTHS_BETWEEN(ENDDATE,STARTDATE) DURATION
  3  FROM PROJECT;

TASK           STARTDATE ENDDATE    DURATION
-------------- --------- --------- ---------
KICKOFF MTG    01-APR-95 01-APR-95         0
TECH SURVEY    02-APR-95 01-MAY-95 .96774194
USER MTGS      15-MAY-95 30-MAY-95 .48387097
DESIGN WIDGET  01-JUN-95 30-JUN-95 .93548387
CODE WIDGET    01-JUL-95 02-SEP-95 2.0322581
TESTING        03-SEP-95 17-JAN-96 4.4516129
6 rows selected.
ANALYSIS:

That's better. You see that MONTHS_BETWEEN is sensitive to the way you order the months. Negative months might not be bad. For example, you could use a negative result to determine whether one date happened before another. For example, the following statement shows all the tasks that started before May 19, 1995:

INPUT:
SQL> SELECT *
  2  FROM PROJECT
  3  WHERE MONTHS_BETWEEN('19 MAY 95', STARTDATE) > 0;
OUTPUT:
TASK           STARTDATE ENDDATE
-------------- --------- ---------
KICKOFF MTG    01-APR-95 01-APR-95
TECH SURVEY    02-APR-95 01-MAY-95
USER MTGS      15-MAY-95 30-MAY-95

NEW_TIME

If you need to adjust the time according to the time zone you are in, the New_TIME function is for you. Here are the time zones you can use with this function:

Abbreviation Time Zone
AST or ADT Atlantic standard or daylight time
BST or BDT Bering standard or daylight time
CST or CDT Central standard or daylight time
EST or EDT Eastern standard or daylight time
GMT Greenwich mean time
HST or HDT Alaska-Hawaii standard or daylight time
MST or MDT Mountain standard or daylight time
NST Newfoundland standard time
PST or PDT Pacific standard or daylight time
YST or YDT Yukon standard or daylight time

You can adjust your time like this:

INPUT:
SQL> SELECT ENDDATE EDT,
  2  NEW_TIME(ENDDATE, 'EDT','PDT')
  3  FROM PROJECT;
OUTPUT:
EDT              NEW_TIME(ENDDATE
---------------- ----------------
01-APR-95 1200AM 31-MAR-95 0900PM
01-MAY-95 1200AM 30-APR-95 0900PM
30-MAY-95 1200AM 29-MAY-95 0900PM
30-JUN-95 1200AM 29-JUN-95 0900PM
02-SEP-95 1200AM 01-SEP-95 0900PM
17-JAN-96 1200AM 16-JAN-96 0900PM
6 rows selected.

Like magic, all the times are in the new time zone and the dates are adjusted.

NEXT_DAY

NEXT_DAY finds the name of the first day of the week that is equal to or later than another specified date. For example, to send a report on the Friday following the first day of each event, you would type

INPUT:
SQL> SELECT STARTDATE,
  2  NEXT_DAY(STARTDATE, 'FRIDAY')
  3  FROM PROJECT;

which would return

OUTPUT:
STARTDATE NEXT_DAY(
--------- ---------
01-APR-95 07-APR-95
02-APR-95 07-APR-95
15-MAY-95 19-MAY-95
01-JUN-95 02-JUN-95
01-JUL-95 07-JUL-95
03-SEP-95 08-SEP-95
6 rows selected.
ANALYSIS:

The output tells you the date of the first Friday that occurs after your STARTDATE.

SYSDATE

SYSDATE returns the system time and date:

INPUT:
SQL> SELECT DISTINCT SYSDATE
  2  FROM PROJECT;
OUTPUT:
SYSDATE
----------------
18-JUN-95 1020PM

If you wanted to see where you stood today in a certain project, you could type

INPUT/OUTPUT:
SQL> SELECT *
  2  FROM PROJECT
  3  WHERE STARTDATE > SYSDATE;

TASK           STARTDATE ENDDATE
-------------- --------- ---------
CODE WIDGET    01-JUL-95 02-SEP-95
TESTING        03-SEP-95 17-JAN-96

Now you can see what parts of the project start after today.

Arithmetic Functions

Many of the uses you have for the data you retrieve involve mathematics. Most implementations of SQL provide arithmetic functions similar to the functions covered here. The examples in this section use the NUMBERS table:

INPUT:
SQL> SELECT *
  2  FROM NUMBERS;
OUTPUT:
        A         B
--------- ---------
   3.1415         4
      -45      .707
        5         9
  -57.667        42
       15        55
     -7.2       5.3
6 rows selected.

ABS

The ABS function returns the absolute value of the number you point to. For example:

INPUT:
SQL> SELECT ABS(A) ABSOLUTE_VALUE
  2  FROM NUMBERS;
OUTPUT:
ABSOLUTE_VALUE
--------------
        3.1415
            45
             5
        57.667
            15
           7.2
6 rows selected.

ABS changes all the negative numbers to positive and leaves positive numbers alone.

CEIL and FLOOR

CEIL returns the smallest integer greater than or equal to its argument. FLOOR does just the reverse, returning the largest integer equal to or less than its argument. For example:

INPUT:
SQL> SELECT B, CEIL(B) CEILING
  2  FROM NUMBERS;
OUTPUT:
        B   CEILING
--------- ---------
        4         4
     .707         1
        9         9
       42        42
       55        55
      5.3         6
6 rows selected.

And

INPUT/OUTPUT:
SQL> SELECT A, FLOOR(A) FLOOR
  2  FROM NUMBERS;

        A     FLOOR
--------- ---------
   3.1415         3
      -45       -45
        5         5
  -57.667       -58
       15        15
     -7.2        -8
6 rows selected.

COS, COSH, SIN, SINH, TAN, and TANH

The COS, SIN, and TAN functions provide support for various trigonometric concepts. They all work on the assumption that n is in radians. The following statement returns some unexpected values if you don't realize COS expects A to be in radians.

INPUT:
SQL> SELECT A, COS(A)
  2  FROM NUMBERS;
OUTPUT:
    A        COS(A)
--------- ---------
   3.1415        -1
      -45 .52532199
        5 .28366219
  -57.667   .437183
       15 -.7596879
     -7.2 .60835131
ANALYSIS:

You would expect the COS of 45 degrees to be in the neighborhood of .707, not .525. To make this function work the way you would expect it to in a degree-oriented world, you need to convert degrees to radians. (When was the last time you heard a news broadcast report that a politician had done a pi-radian turn? You hear about a 180-degree turn.) Because 360 degrees = 2 pi radians, you can write

INPUT/OUTPUT:
SQL> SELECT A, COS(A* 0.01745329251994)
  2  FROM NUMBERS;
        A COS(A*0.01745329251994)
--------- -----------------------
   3.1415               .99849724
      -45               .70710678
        5                .9961947
  -57.667                .5348391
       15               .96592583
     -7.2                .9921147
ANALYSIS:

Note that the number 0.01745329251994 is radians divided by degrees. The trigonometric functions work as follows:

INPUT/OUTPUT:
SQL> SELECT A, COS(A*0.017453), COSH(A*0.017453)
  2  FROM NUMBERS;
        A COS(A*0.017453) COSH(A*0.017453)
--------- --------------- ----------------
   3.1415       .99849729        1.0015035
      -45       .70711609        1.3245977
        5       .99619483          1.00381
  -57.667       .53485335        1.5507072
       15       .96592696        1.0344645
     -7.2       .99211497        1.0079058
6 rows selected.

And

INPUT/OUTPUT:
SQL> SELECT A, SIN(A*0.017453), SINH(A*0.017453)
  2  FROM NUMBERS;
        A SIN(A*0.017453) SINH(A*0.017453)
--------- --------------- ----------------
   3.1415       .05480113        .05485607
      -45       -.7070975        -.8686535
        5       .08715429         .0873758
  -57.667       -.8449449        -1.185197
       15       .25881481        .26479569
     -7.2       -.1253311        -.1259926
6 rows selected.

And

INPUT/OUTPUT:
SQL> SELECT A, TAN(A*0.017453), TANH(A*0.017453)
  2  FROM NUMBERS;

        A TAN(A*0.017453) TANH(A*0.017453)
--------- --------------- ----------------
   3.1415       .05488361        .05477372
      -45       -.9999737        -.6557867
        5       .08748719        .08704416
  -57.667       -1.579769        -.7642948
       15       .26794449        .25597369
     -7.2       -.1263272        -.1250043
6 rows selected.

EXP

EXP enables you to raise e (e is a mathematical constant used in various formulas) to a power. Here's how EXP raises e by the values in column A:

INPUT:
SQL> SELECT A, EXP(A)
  2  FROM NUMBERS;
OUTPUT:
        A     EXP(A)
---------  ---------
   3.1415  23.138549
      -45  2.863E-20
        5  148.41316
  -57.667  9.027E-26
       15  3269017.4
     -7.2  .00074659
6 rows selected.

LN and LOG

These two functions center on logarithms. LN returns the natural logarithm of its argument. For example:

INPUT:
SQL> SELECT A, LN(A)
  2  FROM NUMBERS;
OUTPUT:
ERROR:
ORA-01428: argument '-45' is out of range

Did we neglect to mention that the argument had to be positive? Write

INPUT/OUTPUT:
SQL> SELECT A, LN(ABS(A))
  2  FROM NUMBERS;

        A LN(ABS(A))
--------- ----------
   3.1415  1.1447004
      -45  3.8066625
        5  1.6094379
  -57.667  4.0546851
       15  2.7080502
     -7.2   1.974081
6 rows selected.
ANALYSIS:

Notice how you can embed the function ABS inside the LN call. The other logarith-mic function, LOG, takes two arguments, returning the logarithm of the first argument in the base of the second. The following query returns the logarithms of column B in base 10.

INPUT/OUTPUT:
SQL> SELECT B, LOG(B, 10)
  2  FROM NUMBERS;

          B LOG(B,10) 
----------- ---------
          4  1.660964
       .707 -6.640962
          9 1.0479516
         42 .61604832
         55 .57459287
        5.3 1.3806894
6 rows selected.

MOD

You have encountered MOD before. On Day 3, "Expressions, Conditions, and Operators," you saw that the ANSI standard for the modulo operator % is sometimes implemented as the function MOD. Here's a query that returns a table showing the remainder of A divided by B:

INPUT:
SQL> SELECT A, B, MOD(A,B)
  2  FROM NUMBERS;
OUTPUT:
        A         B  MOD(A,B)
--------- --------- ---------
   3.1415         4    3.1415
      -45      .707     -.459
        5         9         5
  -57.667        42   -15.667
       15        55        15
     -7.2       5.3      -1.9
6 rows selected.

POWER

To raise one number to the power of another, use POWER. In this function the first argument is raised to the power of the second:

INPUT:
SQL> SELECT A, B, POWER(A,B)
  2  FROM NUMBERS;
OUTPUT:
ERROR:
ORA-01428: argument '-45' is out of range
ANALYSIS:

At first glance you are likely to think that the first argument can't be negative. But that impression can't be true, because a number like -4 can be raised to a power. Therefore, if the first number in the POWER function is negative, the second must be an integer. You can work around this problem by using CEIL (or FLOOR):

INPUT:
SQL> SELECT A, CEIL(B), POWER(A,CEIL(B))
  2  FROM NUMBERS;
OUTPUT:
       A   CEIL(B) POWER(A,CEIL(B))
--------- --------- ----------------
   3.1415         4          97.3976
      -45         1              -45
        5         9          1953125
  -57.667        42        9.098E+73
       15        55        4.842E+64
     -7.2         6        139314.07
6 rows selected.

That's better!

SIGN

SIGN returns -1 if its argument is less than 0, 0 if its argument is equal to 0, and 1 if its argument is greater than 0, as shown in the following example:

INPUT:
SQL> SELECT A, SIGN(A)
  2  FROM NUMBERS;
OUTPUT:
        A   SIGN(A)
--------- ---------
   3.1415         1
      -45        -1
        5         1
  -57.667        -1
       15         1
     -7.2        -1
        0         0
7 rows selected.

You could also use SIGN in a SELECT WHERE clause like this:

INPUT:
SQL> SELECT A
  2  FROM NUMBERS
  3  WHERE SIGN(A) = 1;
OUTPUT:
        A
---------
   3.1415
        5
       15

SQRT

The function SQRT returns the square root of an argument. Because the square root of a negative number is undefined, you cannot use SQRT on negative numbers.

INPUT/OUTPUT:
SQL> SELECT A, SQRT(A)
  2  FROM NUMBERS;
ERROR:
ORA-01428: argument '-45' is out of range

However, you can fix this limitation with ABS:

INPUT/OUTPUT:
SQL> SELECT ABS(A), SQRT(ABS(A))
  2  FROM NUMBERS;

   ABS(A) SQRT(ABS(A))
--------- ------------
   3.1415    1.7724277
       45    6.7082039
        5     2.236068
   57.667    7.5938791
       15    3.8729833
      7.2    2.6832816
        0            0
7 rows selected.

Character Functions

Many implementations of SQL provide functions to manipulate characters and strings of characters. This section covers the most common character functions. The examples in this section use the table CHARACTERS.

INPUT/OUTPUT:
SQL> SELECT * FROM CHARACTERS;
LASTNAME        FIRSTNAME       M      CODE
--------------- --------------- - ---------
PURVIS          KELLY           A        32
TAYLOR          CHUCK           J        67
CHRISTINE       LAURA           C        65
ADAMS           FESTER          M        87
COSTALES        ARMANDO         A        77
KONG            MAJOR           G        52
6 rows selected.

CHR

CHR returns the character equivalent of the number it uses as an argument. The character it returns depends on the character set of the database. For this example the database is set to ASCII. The column CODE includes numbers.

INPUT:
SQL> SELECT CODE, CHR(CODE)
  2  FROM CHARACTERS;
OUTPUT:
     CODE CH
--------- --
       32
       67 C
       65 A
       87 W
       77 M
       52 4
6 rows selected.

The space opposite the 32 shows that 32 is a space in the ASCII character set.

CONCAT

You used the equivalent of this function on Day 3, when you learned about operators. The || symbol splices two strings together, as does CONCAT. It works like this:

INPUT:
SQL> SELECT CONCAT(FIRSTNAME, LASTNAME) "FIRST AND LAST NAMES"
  2  FROM CHARACTERS;
OUTPUT:
FIRST AND LAST NAMES
------------------------
KELLY          PURVIS
CHUCK          TAYLOR
LAURA          CHRISTINE
FESTER         ADAMS
ARMANDO        COSTALES
MAJOR          KONG
6 rows selected.
ANALYSIS:

Quotation marks surround the multiple-word alias FIRST AND LAST NAMES. Again, it is safest to check your implementation to see if it allows multiple-word aliases.

Also notice that even though the table looks like two separate columns, what you are seeing is one column. The first value you concatenated, FIRSTNAME, is 15 characters wide. This operation retained all the characters in the field.

INITCAP

INITCAP capitalizes the first letter of a word and makes all other characters lowercase.

INPUT:
SQL> SELECT FIRSTNAME BEFORE, INITCAP(FIRSTNAME) AFTER
  2  FROM CHARACTERS;
OUTPUT:
BEFORE         AFTER
-------------- ----------
KELLY          Kelly
CHUCK          Chuck
LAURA          Laura
FESTER         Fester
ARMANDO        Armando
MAJOR          Major
6 rows selected.

LOWER and UPPER

As you might expect, LOWER changes all the characters to lowercase; UPPER does just the reverse.

The following example starts by doing a little magic with the UPDATE function (you learn more about this next week) to change one of the values to lowercase:

INPUT:
SQL> UPDATE CHARACTERS
  2  SET FIRSTNAME = 'kelly'
  3  WHERE FIRSTNAME = 'KELLY';
OUTPUT:
1 row updated.
INPUT:
SQL> SELECT FIRSTNAME
  2  FROM CHARACTERS;
OUTPUT:
FIRSTNAME
---------------
kelly
CHUCK
LAURA
FESTER
ARMANDO
MAJOR
6 rows selected.

Then you write

INPUT:
SQL> SELECT FIRSTNAME, UPPER(FIRSTNAME), LOWER(FIRSTNAME)
  2  FROM CHARACTERS;
OUTPUT:
FIRSTNAME       UPPER(FIRSTNAME LOWER(FIRSTNAME
--------------- --------------- ---------------
kelly           KELLY           kelly
CHUCK           CHUCK           chuck
LAURA           LAURA           laura
FESTER          FESTER          fester
ARMANDO         ARMANDO         armando
MAJOR           MAJOR           major
6 rows selected.

Now you see the desired behavior.

LPAD and RPAD

LPAD and RPAD take a minimum of two and a maximum of three arguments. The first argument is the character string to be operated on. The second is the number of characters to pad it with, and the optional third argument is the character to pad it with. The third argument defaults to a blank, or it can be a single character or a character string. The following statement adds five pad characters, assuming that the field LASTNAME is defined as a 15-character field:

INPUT:
SQL> SELECT LASTNAME, LPAD(LASTNAME,20,'*')
  2  FROM CHARACTERS;
OUTPUT:
LASTNAME       LPAD(LASTNAME,20,'*'
-------------- --------------------
PURVIS         *****PURVIS
TAYLOR         *****TAYLOR
CHRISTINE      *****CHRISTINE
ADAMS          *****ADAMS
COSTALES       *****COSTALES
KONG           *****KONG
6 rows selected.
ANALYSIS:

Why were only five pad characters added? Remember that the LASTNAME column is 15 characters wide and that LASTNAME includes the blanks to the right of the characters that make up the name. Some column data types eliminate padding characters if the width of the column value is less than the total width allocated for the column. Check your implementation. Now try the right side:

INPUT:
SQL> SELECT LASTNAME, RPAD(LASTNAME,20,'*')
  2  FROM CHARACTERS;
OUTPUT:
LASTNAME        RPAD(LASTNAME,20,'*'
--------------- --------------------
PURVIS          PURVIS         *****
TAYLOR          TAYLOR         *****
CHRISTINE       CHRISTINE      *****
ADAMS           ADAMS          *****
COSTALES        COSTALES       *****
KONG            KONG           *****
6 rows selected.
ANALYSIS:

Here you see that the blanks are considered part of the field name for these operations. The next two functions come in handy in this type of situation.

LTRIM and RTRIM

LTRIM and RTRIM take at least one and at most two arguments. The first argument, like LPAD and RPAD, is a character string. The optional second element is either a character or character string or defaults to a blank. If you use a second argument that is not a blank, these trim functions will trim that character the same way they trim the blanks in the following examples.

INPUT:
SQL> SELECT LASTNAME, RTRIM(LASTNAME)
  2  FROM CHARACTERS;
OUTPUT:
LASTNAME        RTRIM(LASTNAME)
--------------- ---------------
PURVIS          PURVIS
TAYLOR          TAYLOR
CHRISTINE       CHRISTINE
ADAMS           ADAMS
COSTALES        COSTALES
KONG            KONG
6 rows selected.

You can make sure that the characters have been trimmed with the following statement:

INPUT:
SQL> SELECT LASTNAME, RPAD(RTRIM(LASTNAME),20,'*')
  2  FROM CHARACTERS;
OUTPUT:
LASTNAME        RPAD(RTRIM(LASTNAME)
--------------- --------------------
PURVIS          PURVIS**************
TAYLOR          TAYLOR**************
CHRISTINE       CHRISTINE***********
ADAMS           ADAMS***************
COSTALES        COSTALES************
KONG            KONG****************
6 rows selected.

The output proves that trim is working. Now try LTRIM:

INPUT:
SQL> SELECT LASTNAME, LTRIM(LASTNAME, 'C')
  2  FROM CHARACTERS;
OUTPUT:
LASTNAME        LTRIM(LASTNAME,
--------------- ---------------
PURVIS          PURVIS
TAYLOR          TAYLOR
CHRISTINE       HRISTINE
ADAMS           ADAMS
COSTALES        OSTALES
KONG            KONG
6 rows selected.

Note the missing Cs in the third and fifth rows.

REPLACE

REPLACE does just that. Of its three arguments, the first is the string to be searched. The second is the search key. The last is the optional replacement string. If the third argument is left out or NULL, each occurrence of the search key on the string to be searched is removed and is not replaced with anything.

INPUT:
SQL> SELECT LASTNAME, REPLACE(LASTNAME, 'ST') REPLACEMENT
  2  FROM CHARACTERS;
OUTPUT:
LASTNAME        REPLACEMENT
--------------- ---------------
PURVIS          PURVIS
TAYLOR          TAYLOR
CHRISTINE       CHRIINE
ADAMS           ADAMS
COSTALES        COALES
KONG            KONG
6 rows selected.

If you have a third argument, it is substituted for each occurrence of the search key in the target string. For example:

INPUT:
SQL> SELECT LASTNAME, REPLACE(LASTNAME, 'ST','**') REPLACEMENT
  2  FROM CHARACTERS;
OUTPUT:
LASTNAME        REPLACEMENT
--------------- ------------
PURVIS          PURVIS
TAYLOR          TAYLOR
CHRISTINE       CHRI**INE
ADAMS           ADAMS
COSTALES        CO**ALES
KONG            KONG
6 rows selected.

If the second argument is NULL, the target string is returned with no changes.

INPUT:
SQL> SELECT LASTNAME, REPLACE(LASTNAME, NULL) REPLACEMENT
  2  FROM CHARACTERS;
OUTPUT:
LASTNAME        REPLACEMENT
--------------- ---------------
PURVIS          PURVIS
TAYLOR          TAYLOR
CHRISTINE       CHRISTINE
ADAMS           ADAMS
COSTALES        COSTALES
KONG            KONG
6 rows selected.

SUBSTR

This three-argument function enables you to take a piece out of a target string. The first argument is the target string. The second argument is the position of the first character to be output. The third argument is the number of characters to show.

INPUT:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,2,3)
  2  FROM CHARACTERS;
OUTPUT:
FIRSTNAME       SUB
--------------- ---
kelly           ell
CHUCK           HUC
LAURA           AUR
FESTER          EST
ARMANDO         RMA
MAJOR           AJO
6 rows selected.

If you use a negative number as the second argument, the starting point is determined by counting backwards from the end, like this:

INPUT:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,-13,2)
  2  FROM CHARACTERS;
OUTPUT:
FIRSTNAME       SU
--------------- --
kelly           ll
CHUCK           UC
LAURA           UR
FESTER          ST
ARMANDO         MA
MAJOR           JO
6 rows selected.
ANALYSIS:

Remember the character field FIRSTNAME in this example is 15 characters long. That is why you used a -13 to start at the third character. Counting back from 15 puts you at the start of the third character, not at the start of the second. If you don't have a third argument, use the following statement instead:

INPUT:
SQL> SELECT FIRSTNAME, SUBSTR(FIRSTNAME,3)
  2  FROM CHARACTERS;
OUTPUT:
FIRSTNAME       SUBSTR(FIRSTN
--------------- -------------
kelly           lly
CHUCK           UCK
LAURA           URA
FESTER          STER
ARMANDO         MANDO
MAJOR           JOR
6 rows selected.

The rest of the target string is returned.

INPUT:
SQL> SELECT * FROM SSN_TABLE;
OUTPUT:
SSN__________
300541117
301457111
459789998
3 rows selected.
ANALYSIS:

Reading the results of the preceding output is difficult--Social Security numbers usually have dashes. Now try something fancy and see whether you like the results:

INPUT:
SQL> SELECT SUBSTR(SSN,1,3)||'-'||SUBSTR(SSN,4,2)||'-'||SUBSTR(SSN,6,4) SSN
  2  FROM SSN_TABLE;
OUTPUT:
SSN_________
300-54-1117
301-45-7111
459-78-9998
3 rows selected.


NOTE: This particular use of the substr function could come in very handy with large numbers using commas such as 1,343,178,128 and in area codes and phone numbers such as 317-787-2915 using dashes.

Here is another good use of the SUBSTR function. Suppose you are writing a report and a few columns are more than 50 characters wide. You can use the SUBSTR function to reduce the width of the columns to a more manageable size if you know the nature of the actual data. Consider the following two examples:

INPUT:
SQL> SELECT NAME, JOB, DEPARTMENT FROM JOB_TBL;
OUTPUT:
NAME______________________________________________________________
JOB_______________________________DEPARTMENT______________________
ALVIN SMITH
VICEPRESIDENT                     MARKETING

1 ROW SELECTED.
ANALYSIS:

Notice how the columns wrapped around, which makes reading the results a little too difficult. Now try this select:

INPUT:
SQL> SELECT SUBSTR(NAME, 1,15) NAME, SUBSTR(JOB,1,15) JOB,
            DEPARTMENT
  2  FROM JOB_TBL;
OUTPUT:
NAME________________JOB_______________DEPARTMENT_____
ALVIN SMITH         VICEPRESIDENT     MARKETING

Much better!

TRANSLATE

The function TRANSLATE takes three arguments: the target string, the FROM string, and the TO string. Elements of the target string that occur in the FROM string are translated to the corresponding element in the TO string.

INPUT:
SQL> SELECT FIRSTNAME, TRANSLATE(FIRSTNAME
  2  '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ
  3  'NNNNNNNNNNAAAAAAAAAAAAAAAAAAAAAAAAAA)
  4  FROM CHARACTERS;
OUTPUT:
FIRSTNAME       TRANSLATE(FIRST
--------------- ---------------
kelly           kelly
CHUCK           AAAAA
LAURA           AAAAA
FESTER          AAAAAA
ARMANDO         AAAAAAA
MAJOR           AAAAA
6 rows selected.

Notice that the function is case sensitive.

INSTR

To find out where in a string a particular pattern occurs, use INSTR. Its first argument is the target string. The second argument is the pattern to match. The third and forth are numbers representing where to start looking and which match to report. This example returns a number representing the first occurrence of O starting with the second character:

INPUT:
SQL> SELECT LASTNAME, INSTR(LASTNAME, 'O', 2, 1)
  2  FROM CHARACTERS;
OUTPUT:
LASTNAME        INSTR(LASTNAME,'O',2,1)
--------------- -----------------------
PURVIS                                0
TAYLOR                                5
CHRISTINE                             0
ADAMS                                 0
COSTALES                              2
KONG                                  2
6 rows selected.
ANALYSIS:

The default for the third and fourth arguments is 1. If the third argument is negative, the search starts at a position determined from the end of the string, instead of from the beginning.

LENGTH

LENGTH returns the length of its lone character argument. For example:

INPUT:
SQL> SELECT FIRSTNAME, LENGTH(RTRIM(FIRSTNAME))
  2  FROM CHARACTERS;
OUTPUT:
FIRSTNAME       LENGTH(RTRIM(FIRSTNAME))
--------------- ------------------------
kelly                                  5
CHUCK                                  5
LAURA                                  5
FESTER                                 6
ARMANDO                                7
MAJOR                                  5
6 rows selected.
ANALYSIS:

Note the use of the RTRIM function. Otherwise, LENGTH would return 15 for every value.

Conversion Functions

These three conversion functions provide a handy way of converting one type of data to another. These examples use the table CONVERSIONS.

INPUT:
SQL> SELECT * FROM CONVERSIONS;
OUTPUT:
NAME              TESTNUM
--------------- ---------
40                     95
13                     23
74                     68

The NAME column is a character string 15 characters wide, and TESTNUM is a number.

TO_CHAR

The primary use of TO_CHAR is to convert a number into a character. Different implementations may also use it to convert other data types, like Date, into a character, or to include different formatting arguments. The next example illustrates the primary use of TO_CHAR:

INPUT:
SQL> SELECT TESTNUM, TO_CHAR(TESTNUM)
  2  FROM CONVERT;
OUTPUT:
  TESTNUM TO_CHAR(TESTNUM)
--------- ----------------
       95               95
       23               23
       68               68

Not very exciting, or convincing. Here's how to verify that the function returned a character string:

INPUT:
SQL> SELECT TESTNUM, LENGTH(TO_CHAR(TESTNUM))
  2  FROM CONVERT;
OUTPUT:
  TESTNUM LENGTH(TO_CHAR(TESTNUM))
--------- ------------------------
       95                        2
       23                        2
       68                        2
ANALYSIS:

LENGTH of a number would have returned an error. Notice the difference between TO CHAR and the CHR function discussed earlier. CHR would have turned this number into a character or a symbol, depending on the character set.

TO_NUMBER

TO_NUMBER is the companion function to TO_CHAR, and of course, it converts a string into a number. For example:

INPUT:
SQL> SELECT NAME, TESTNUM, TESTNUM*TO_NUMBER(NAME)
  2  FROM CONVERT;
OUTPUT:
NAME             TESTNUM TESTNUM*TO_NUMBER(NAME)
--------------- -------- -----------------------
40                    95                    3800
13                    23                     299
74                    68                    5032
ANALYSIS:

This test would have returned an error if TO_NUMBER had returned a character.

Miscellaneous Functions

Here are three miscellaneous functions you may find useful.

GREATEST and LEAST

These functions find the GREATEST or the LEAST member from a series of expressions. For example:

INPUT:
SQL> SELECT GREATEST('ALPHA', 'BRAVO','FOXTROT', 'DELTA')
  2  FROM CONVERT;
OUTPUT:
GREATEST
-------
FOXTROT
FOXTROT
FOXTROT
ANALYSIS:

Notice GREATEST found the word closest to the end of the alphabet. Notice also a seemingly unnecessary FROM and three occurrences of FOXTROT. If FROM is missing, you will get an error. Every SELECT needs a FROM. The particular table used in the FROM has three rows, so the function in the SELECT clause is performed for each of them.

INPUT:
SQL> SELECT LEAST(34, 567, 3, 45, 1090)
  2  FROM CONVERT;
OUTPUT:
LEAST(34,567,3,45,1090)
-----------------------
                      3
                      3
                      3

As you can see, GREATEST and LEAST also work with numbers.

USER

USER returns the character name of the current user of the database.

INPUT:
SQL> SELECT USER FROM CONVERT;
OUTPUT:
USER
------------------------------
PERKINS
PERKINS
PERKINS

There really is only one of me. Again, the echo occurs because of the number of rows in the table. USER is similar to the date functions explained earlier today. Even though USER is not an actual column in the table, it is selected for each row that is contained in the table.

Summary

It has been a long day. We covered 47 functions--from aggregates to conversions. You don't have to remember every function--just knowing the general types (aggregate functions, date and time functions, arithmetic functions, character functions, conversion functions, and miscellaneous functions) is enough to point you in the right direction when you build a query that requires a function.

Q&A

Q Why are so few functions defined in the ANSI standard and so many defined by the individual implementations?

A ANSI standards are broad strokes and are not meant to drive companies into bankruptcy by forcing all implementations to have dozens of functions. On the other hand, when company X adds a statistical package to its SQL and it sells well, you can bet company Y and Z will follow suit.

Q I thought you said SQL was simple. Will I really use all of these functions?

A The answer to this question is similar to the way a trigonometry teacher might respond to the question, Will I ever need to know how to figure the area of an isosceles triangle in real life? The answer, of course, depends on your profession. The same concept applies with the functions and all the other options available with SQL. How you use functions in SQL depends mostly on you company's needs. As long as you understand how functions work as a whole, you can apply the same concepts to your own queries.

Workshop

The Workshop provides quiz questions to help solidify your understanding of the material covered, as well as exercises to provide you with experience in using what you have learned. Try to answer the quiz and exercise questions before checking the answers in Appendix F, "Answers to Quizzes and Exercises."

Quiz

1. Which function capitalizes the first letter of a character string and makes the rest lowercase?

2. Which functions are also known by the name group functions?

3. Will this query work?

SQL> SELECT COUNT(LASTNAME) FROM CHARACTERS;
4. How about this one?
SQL> SELECT SUM(LASTNAME) FROM CHARACTERS;
5. Assuming that they are separate columns, which function(s) would splice together FIRSTNAME and LASTNAME?

6. What does the answer 6 mean from the following SELECT?

INPUT:
SQL> SELECT COUNT(*) FROM TEAMSTATS;
OUTPUT:
COUNT(*)
7. Will the following statement work?
SQL> SELECT SUBSTR LASTNAME,1,5 FROM NAME_TBL;

Exercises

1. Using today's TEAMSTATS table, write a query to determine who is batting under .25. (For the baseball-challenged reader, batting average is hits/ab.)

2. Using today's CHARACTERS table, write a query that will return the following:

INITIALS__________CODE
K.A.P.              32
1 row selected.


Previous chapterNext chapterContents


Macmillan Computer Publishing USA

© Copyright, Macmillan Computer Publishing. All rights reserved.