The Perl programs you have seen so far deal with scalar values, which are single units of data, and scalar variables, which can store one piece of information.
Perl also enables you to define an ordered collection of values, known as a list; this collection of values can be stored in variables known as array variables.
Today's lesson describes lists and array variables, and it shows you what you can do with them.
A list is a sequence of scalar values enclosed in parentheses. The following is a simple example of a list:
(1, 5.3, "hello", 2)
This list contains four elements, each of which is a scalar value: the numbers 1 and 5.3, the string hello, and the number 2.
Lists can be as long as needed, and they can contain any scalar value. A list can have no elements at all, as follows:
()
This list also is called an empty list.
NOTE |
A list with one element and a scalar value are different entities. For example, the list (43.2) and the scalar value 43.2 are not the same thing. This is not a severe limitation, because one can be converted to or assigned to the other. See the section titled "Assigning to Scalar Variables from Array Variables" later today. |
A scalar variable name can always be included as part of a list. In this case, the current value of the scalar variable becomes the list element value. For example:
(17, $var, "a string")
If $var has been assigned the value 26, the second element of the list becomes 26. (It remains 26 even if a different value is assigned to $var.)
Similarly, you can use the value of an expression as an element of a list. For example:
(17, 26 << 2)
This list contains two elements: 17 and 104 (which is 26 left-shifted two places). Expressions in lists, like other expressions, can contain scalar variables.
(17, $var1 + $var2)
Here, the expression $var1 + $var2 is evaluated and its value becomes the second element of the list.
Because character strings are scalar values, they can be used in lists, as follows:
("my string", 24.3, "another string")
You can substitute for scalar variable names in character strings in lists, as follows:
($value, "The answer is $value")
This list contains two elements: the value of the scalar variable $value, and a string in which the value of $value is substituted for its name. If the current value of $value is 26, the two elements of the list are 26 and The answer is 26.
Perl enables you to store lists in special variables designed for that purpose. These variables are called array variables (or arrays for short).
The following is an example of a list being assigned to an array variable:
@array = (1, 2, 3);
Here, the list (1, 2, 3) is assigned to the array variable @array.
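The following is a minimal sketch that pulls together the kinds of list elements described so far: a plain number, a scalar variable, two expressions, and a string that substitutes a variable. The variable names and values are chosen only for illustration.

```perl
use strict;
use warnings;

my $var  = 26;
my $var1 = 3;
my $var2 = 4;

# Every element is evaluated at the moment the list is built.
my @list = (17, $var, 26 << 2, $var1 + $var2, "The answer is $var");

# Assigning to $var afterward does not change the stored element.
$var = 99;

print "@list\n";    # prints: 17 26 104 7 The answer is 26
```

The second element stays 26 because the list holds a copy of the value, not a link to $var.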
Note that the name of the array variable starts with the character @. This enables Perl to distinguish array variables from other kinds of variables, such as scalar variables, which start with the character $. As with scalar variables, the second character of the variable name must be a letter or an underscore, while subsequent characters of the name can be letters, numbers, or underscores. Array variable names can be as long as you want.
The following are legal array-variable names:
@my_array
@list2
@_array
@a_very_long_array_name_with_lots_of_underscores
The following are not legal array-variable names:
@1array # can't start with a number
@a.new.array # . is not a legal variable-name character
When an array variable is first created (that is, seen for the
first time), it is assumed to contain the empty list ()
unless it is assigned to.
NOTE |
Because Perl uses @ and $ to distinguish array variables from scalar variables, the same name can be used in an array variable and in a scalar variable. For example:
$var = 1;
@var = (11, 27.1, "a string");
Here, the name var is used in both the scalar variable $var and the array variable @var. These are two completely separate variables. Normally, you won't want to use the same name in both an array and a scalar variable, because this is confusing. |
After you have assigned a list to an array variable, you can refer to any element of the array variable as if it is a scalar variable.
For example, to assign the first element of the array variable @array to the scalar variable $scalar, use the following statement:
$scalar = $array[0];
The character sequence [0] is an example of a subscript. A subscript indicates a particular element of an array. In this case, 0 refers to the first element of the array. Similarly, the subscript 1 refers to the second element of the array, as follows:
$scalar = $array[1];
Here, the second element of the array @array is assigned to $scalar. The general rule is this:
An array subscript n, where n is any non-negative integer, always refers to array element n+1.
This notation is employed to ensure compatibility with the C programming language, which also starts its array subscripting with 0.
You can assign a scalar value to an individual array element in the same way:
@array = (1, 2, 3, 4);
$array[3] = 5;
After the second assignment, the value of @array becomes
(1, 2, 3, 5)
This is because the fourth element of the array has been replaced.
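As a quick check of the subscript rules, the sketch below reads the first two elements of an array and then replaces the fourth. The names $first and $second are made up for this example.

```perl
use strict;
use warnings;

my @array = (1, 2, 3, 4);

my $first  = $array[0];    # subscript 0 is the first element
my $second = $array[1];    # subscript 1 is the second element

$array[3] = 5;             # replace the fourth element

print "first: $first, second: $second\n";    # prints: first: 1, second: 2
print "array is now: @array\n";              # prints: array is now: 1 2 3 5
```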
NOTE |
If you try to access an array element that does not exist, the Perl interpreter uses the null string (which is equivalent to zero). For example:
@array = (1, 2, 3, 4);
$scalar = $array[4];
Here, $array[4] refers to the fifth element of @array, which does not exist. In this case, $scalar is assigned the null string. |
NOTE |
A negative subscript does not refer to a nonexistent element. In Perl 5, a negative subscript counts backward from the end of the array, so the subscript -1 refers to the last element:
$scalar = $array[-1];
If @array contains (1, 2, 3, 4), the value 4 is assigned to $scalar. Note also that arrays automatically grow when a previously unreferenced element is assigned to for the first time:
@array = (1, 2, 3, 4);
$array[6] = 17;
Because the seventh element of @array is assigned 17, the value of @array is now
(1, 2, 3, 4, "", "", 17)
The missing fifth and sixth elements now contain the null string. |
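The sketch below shows reading past the end of an array and then growing the array by assignment. Perl 5 actually stores the undefined value in missing elements; it behaves like the null string, and the defined test (used here only to avoid a warning when printing) tells the two apart. The names $missing and $length are illustrative.

```perl
use strict;
use warnings;

my @array = (1, 2, 3, 4);

# Reading past the end yields the undefined value; substitute "" here
# so that the result can be printed without a warning.
my $missing = defined $array[4] ? $array[4] : "";
print "fifth element: '$missing'\n";      # prints: fifth element: ''

# Assigning to the seventh element makes the array grow to seven elements.
$array[6] = 17;
my $length = @array;
print "length after growth: $length\n";   # prints: length after growth: 7
```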
You can use the value of a scalar variable as a subscript, as follows:
$index = 1;
$scalar = $array[$index];
Here, the value of $index, 1, becomes the subscript.
This means that the second element of @array is assigned
to $scalar.
When you use a scalar variable as a subscript, make sure that the value stored in the scalar variable corresponds to an array element that exists. For example:
@array = (1, 2, 3, 4);
$index = 4;
$scalar = $array[$index];
Here, the third statement tries to access the fifth element of @array, which does not exist. In this case, $scalar is assigned the null string, and the Perl interpreter doesn't tell you that anything went wrong. |
Note that the first character of an array-element variable name is the $ character, not the @ character. For example, to refer to the first element of the array @potato, use
$potato[0]
and not
@potato[0]
The basic rule is as follows:
Things that reference one value, such as scalar variables and array elements, must start with a $.
NOTE |
Even though references to elements of array variables start with a $, the Perl interpreter still has no trouble distinguishing scalar variables from array-variable elements. For example, if you have defined a scalar variable $potato and an array variable @potato, the Perl interpreter uses the subscript to distinguish between the scalar variable and the array-variable element.
$result = $potato; # the scalar variable $potato
$result = $potato[0]; # the first element of @potato |
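To see that the two variables really are separate, you can run a small sketch like the following. The values "russet", "red", and "gold" are invented for the example.

```perl
use strict;
use warnings;

my $potato = "russet";            # a scalar variable
my @potato = ("red", "gold");     # an unrelated array with the same name

my $from_scalar  = $potato;       # no subscript: the scalar variable
my $from_element = $potato[0];    # subscript: the first element of @potato

print "$from_scalar $from_element\n";    # prints: russet red
```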
Now that you have seen how lists and array variables work, it's time to take a look at a simple program that uses them. Listing 5.1 is a simple program that prints the elements of a list.
Listing 5.1. A program that prints the elements of a list.
1: #!/usr/local/bin/perl
2:
3: @array = (1, "chicken", 1.23, "\"Having fun?\"", 9.33e+23);
4: $count = 1;
5: while ($count <= 5) {
6:     print ("element $count is $array[$count-1]\n");
7:     $count++;
8: }
$ program5_1
element 1 is 1
element 2 is chicken
element 3 is 1.23
element 4 is "Having fun?"
element 5 is 9.3300000000000005e+23
$
Line 3 assigns a list containing five elements to the array variable @array.
Line 5 tests whether $count is less than or equal to 5. This conditional expression ensures that the while statement loops five times.
Line 6 prints the current value of $count and the corresponding element of @array. Note that the expression used in the subscript is $count-1, not $count, because subscripting starts from 0. For example, when $count is 3, the subscript is 2, which means that the third element of @array is printed.
When you examine line 6, you see that Perl lets you substitute for array elements in character strings. When the Perl interpreter sees $array[$count-1] in the character string, it replaces this array element name with its corresponding value.
Listing 5.2 is another example of a program that uses arrays. This one is a little more interesting; it uses the built-in functions rand and int to generate random integers between 1 and 10.
Listing 5.2. A program that generates random integers between 1 and 10.
1: #!/usr/local/bin/perl
2:
3: # collect the random numbers
4: $count = 1;
5: while ($count <= 100) {
6:     $randnum = int( rand(10) ) + 1;
7:     $randtotal[$randnum] += 1;
8:     $count++;
9: }
10:
11: # print the total of each number
12: $count = 1;
13: print ("Total for each number:\n");
14: while ($count <= 10) {
15:     print ("\tnumber $count: $randtotal[$count]\n");
16:     $count++;
17: }
$ program5_2
Total for each number:
        number 1: 11
        number 2: 8
        number 3: 13
        number 4: 6
        number 5: 10
        number 6: 9
        number 7: 12
        number 8: 11
        number 9: 11
        number 10: 9
$
This program is divided into two parts: the first part collects the random numbers, and the second part prints them.
Line 5 ensures that the loop iterates (is performed) 100 times. You can have the program generate any other quantity of random numbers just by changing the value in this conditional expression.
Line 6 generates a random number between 1 and 10 and assigns it to the scalar variable $randnum. To see how it does this, first note that the code fragment
int ( rand (10) )
actually is two function calls, one inside another. When the Perl interpreter sees this, it first calls the inner one, which is rand. The value returned by rand becomes the argument to the library function int.
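Here's how line 6 generates a random number: rand(10) returns a random floating-point number that is at least 0 and less than 10; int discards the fractional part, which leaves an integer from 0 to 9; adding 1 then produces an integer from 1 to 10.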
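If you want to convince yourself that line 6 never produces a number outside the range 1 to 10, a sketch such as the following splits the expression into its parts and tracks the smallest and largest results. The names $fraction, $whole, $lowest, and $highest are introduced here for illustration, and the count of 1000 is arbitrary.

```perl
use strict;
use warnings;

my $lowest  = 10;
my $highest = 1;
my $count   = 1;
while ($count <= 1000) {
    my $fraction = rand(10);         # floating-point value, 0 <= value < 10
    my $whole    = int($fraction);   # fractional part discarded: 0 through 9
    my $randnum  = $whole + 1;       # shifted up by one: 1 through 10
    if ($randnum < $lowest) {
        $lowest = $randnum;
    }
    if ($randnum > $highest) {
        $highest = $randnum;
    }
    $count++;
}
print "lowest seen: $lowest, highest seen: $highest\n";
```

Because the numbers are random, the exact values printed vary from run to run, but they always fall between 1 and 10.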
Line 7 now adds 1 to the element of the array @randtotal
corresponding to the number generated. For example, if the random
number is 7, the array element $randtotal[7] has 1 added
to it.
NOTE |
As you can see, line 7 works even though @randtotal is not initialized. When the program refers to an array element for the first time, the Perl interpreter assumes that the element has an initial value of the null string "". This null string is converted to 0, which means that adding 1 for the first time produces the result 1, which is what you want. |
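The behavior described in this note can be checked in isolation. The following sketch adds 1 twice to an element that has never been assigned; it reuses the array name from Listing 5.2, and $length is an illustrative name.

```perl
use strict;
use warnings;

my @randtotal;          # starts out as the empty list

$randtotal[7] += 1;     # element 7 does not exist yet; it is treated as 0
$randtotal[7] += 1;

my $length = @randtotal;
print "total for 7: $randtotal[7]\n";    # prints: total for 7: 2
print "array length: $length\n";         # prints: array length: 8
```

The length is 8 because assigning to subscript 7 makes the array grow to cover subscripts 0 through 7.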
The second part of the program, which prints the total of each random number, starts with lines 12 and 13. These lines get things started by resetting the counter variable $count to 1 and printing an introductory message.
The conditional expression in line 14 ensures that the loop iterates 10 times, once for each possible random number.
Line 15 prints the total for a particular random number.
As you have just seen, Perl lets you substitute for array-element variable names in strings, as follows:
print ("element $count is $array[$count-1]\n");
This might lead to problems if you want to include the characters [ and ] in character strings. For example, suppose that you have defined the scalar variable $var and the array variable @var. The character string
"$var[0]"
substitutes the value of the first element of @var in the string. To substitute the value of $var and keep the [0] as it is, you must use one of the following:
"${var}[0]"
"$var\[0]"
"$var" . "[0]"
The character string
"${var}[0]"
uses the brace characters { and } to keep var
and [ separate; this tells the Perl interpreter to substitute
for the variable $var, not $var[0]. After the
substitution, the brace characters are not included in the string.
NOTE |
To include a brace character after a $, use a backslash, as follows: "$\{var}" This character string contains the text ${var}. |
The character string
"$var\[0]"
uses \ to indicate that the [ character is to be given a different meaning than normal; in this case, this means that [ is to be treated as a printable character and not as part of the variable name to be substituted.
The expression
"$var" . "[0]"
consists of two character strings joined together by the . operator. Here, the Perl interpreter replaces the first character string with the current value of $var.
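The four forms discussed in this section can be compared side by side. In the sketch below, the values "scalar" and "element" are placeholders chosen so that you can tell which variable was substituted.

```perl
use strict;
use warnings;

my $var = "scalar";
my @var = ("element");

my $element   = "$var[0]";         # first element of @var
my $braces    = "${var}[0]";       # value of $var followed by [0]
my $backslash = "$var\[0]";        # same result, using a backslash
my $joined    = "$var" . "[0]";    # same result, using the . operator

print "$element\n";      # prints: element
print "$braces\n";       # prints: scalar[0]
print "$backslash\n";    # prints: scalar[0]
print "$joined\n";       # prints: scalar[0]
```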
Suppose that you want to define a list consisting of the numbers 1 through 10, inclusive. You can do this by typing each of the numbers in turn.
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
However, there is a simpler way to do it: Use the list-range operator, which is .. (two consecutive period characters). The following is an example of a list created using the list-range operator:
(1..10)
This tells Perl to define a list that has a first value of 1, a second value of 2, and so on up to 10.
The list-range operator can be used to define part of a list.
(2, 5..7, 11)
This list consists of five elements: the numbers 2, 5, 6, 7, and 11.
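Here is a short sketch that builds both of the ranges just described and prints them. The array names are made up for the example.

```perl
use strict;
use warnings;

my @one_to_ten = (1..10);          # same as (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
my @mixed      = (2, 5..7, 11);    # a range used as part of a longer list

print "@one_to_ten\n";    # prints: 1 2 3 4 5 6 7 8 9 10
print "@mixed\n";         # prints: 2 5 6 7 11
```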
List-range operators can be written with floating-point values, but Perl 5 discards the fractional part of each endpoint before it builds the list. For example:
(2.1..5.3)
This list consists of four elements: 2, 3, 4, and 5. Each element of the list is one greater than the previous element. The endpoints 2.1 and 5.3 are truncated to the integers 2 and 5, so this list is the same as (2..5).
NOTE |
If the value to the left of the .. operator is greater than the value to the right, an empty list is created.
(4.5..1.6)
Because 4.5 is greater than 1.6, this list is empty. If the two values are equal, a one-element list is created.
(3..3)
This is equivalent to the list (3). |
List-range operators can specify ranges of strings. For example, the list ("aaa", "aab", "aac", "aad") can be expressed as ("aaa".."aad"). Similarly, the list ("BCY", "BCZ", "BDA", "BDB") is equivalent to ("BCY".."BDB"), and the statement @alphabet = ("a".."z"); creates a list consisting of the 26 lowercase letters of the alphabet and assigns this list to the array variable @alphabet.
List ranges also enable you to use strings to specify numbers that contain leading zeros.
@day_of_month = ("01".."31");
This statement creates a list consisting of the strings 01, 02, 03 and so on, up to 31, and then assigns this list to @day_of_month. Because each string contains two characters, this array is suitable for use when you are printing a date in a format such as 08-June-1960.
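The string ranges described above can be tried out with a sketch like this one. The names $letter_count and $day_count are illustrative; the ranges themselves come from the text.

```perl
use strict;
use warnings;

my @letters      = ("aaa".."aad");
my @alphabet     = ("a".."z");
my @day_of_month = ("01".."31");

my $letter_count = @alphabet;        # number of elements in @alphabet
my $day_count    = @day_of_month;    # number of elements in @day_of_month

print "@letters\n";                      # prints: aaa aab aac aad
print "$letter_count letters\n";         # prints: 26 letters
print "$day_count days, starting with $day_of_month[0]\n";
                                         # prints: 31 days, starting with 01
```

The leading zero is kept because the starting value is a string that begins with 0, so Perl increments it as a string rather than as a number.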
The values that define the range of a list-range operator can be expressions, and these expressions can contain scalar variables. For example:
($var1..$var2+5)
This list consists of all values between the current value of $var1 and the current value of the expression $var2+5.
Listing 5.3 is an example of a program that uses list ranges. This program asks for a start number and an end number, and it prints all the numbers between them.
Listing 5.3. A program that uses list ranges to print a list of numbers.
1: #!/usr/local/bin/perl
2:
3: print ("Enter the start number:\n");
4: $start = <STDIN>;
5: chop ($start);
6: print ("Enter the end number:\n");
7: $end = <STDIN>;
8: chop ($end);
9: @list = ($start..$end);
10: $count = 0;
11: print ("Here is the list:\n");
12: while ($list[$count] != 0 || $list[$count-1] == -1 ||
13:        $list[$count+1] == 1) {
14:     print ("$list[$count]\n");
15:     $count++;
16: }
$ program5_3
Enter the start number:
-2
Enter the end number:
2
Here is the list:
-2
-1
0
1
2
$
Lines 3 through 5 retrieve the start of the range to be printed. Line 3 prints a prompt asking for the number. Line 4 reads a line from the standard input file and assigns it to the scalar variable $start. Line 5 chops the trailing newline character.
Lines 6 through 8 repeat the same process for the end of the range, assigning the end of the range to the scalar variable $end.
Line 9 creates a list that consists of the numbers between $start and $end, and stores the list in the array variable @list.
Line 10 initializes the counter variable $count to 0.
Line 11 is a print statement that indicates that the list is about to be printed.
Lines 12 and 13 are the start of the loop that prints the range. The conditional expression to be evaluated consists of three subexpressions that are operands for the logical or operator ||. If any of these subexpressions are true, the loop continues.
The first subexpression tests for the end of the range. To do this, it takes advantage of the fact that an undefined list element is equal to the null string and that the null string is equivalent to 0. When the list element $list[$count] is undefined, the following subexpression is false:
$list[$count] != 0
The second and third subexpressions cover the cases in which 0 is actually a part of the list. If the list to be printed contains 0 along with other numbers, one or both of the following conditions must be true: the element before 0 is -1, or the element after 0 is 1. This means that when $list[$count] is 0, at least one of the following subexpressions also must be true:
$list[$count-1] == -1
$list[$count+1] == 1
This ensures that the loop continues. Of course, this doesn't cover the case in which the list consists of just 0; however, that's not a meaningful case. (If you want to be finicky, you can add a special chunk of code that prints 0 if $start and $end are both 0, but that's not really worth bothering with.)
After this, the rest of the program is straightforward. Line 14
prints a number in the range, line 15 adds one to the counter
variable $count, and line 16 ends the while
statement.
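The three-part test in Listing 5.3 is needed only because the loop looks for an undefined element to find the end of the list. As an alternative sketch (not the approach the listing takes), you can compare the counter with the number of elements in the array, which works even when the list contains 0. Fixed values stand in for the numbers typed at the keyboard, and $output is a name introduced here so that the result can be collected before printing.

```perl
use strict;
use warnings;

# Fixed values stand in for the numbers read from STDIN in the listing.
my $start = -1;
my $end   = 1;

my @list   = ($start..$end);
my $output = "";
my $count  = 0;
# @list in a numeric comparison yields the number of elements, so the
# loop stops after the last element even when the list contains 0.
while ($count < @list) {
    $output .= "$list[$count]\n";
    $count++;
}
print $output;    # prints -1, 0, and 1, each on its own line
```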
TIP |
One of the problems with Perl is that it is sometimes difficult to distinguish the following scalar variable or array-element values: There are several ways of dealing with this confusion: |
So far, you've seen that you can assign lists to array variables.
@array = (1, 2, 3, 4, 5);
You've also seen that you can assign an element of an array to a scalar variable.
$scalar = $array[3];
The following sections describe the other ways you can use assignment with lists and array variables.
You also can assign one array variable to another.
@result = @original;
Here, the list currently stored in the array variable @original is copied to the array variable @result. Each element of the new array @result is the same as the corresponding element of the array @original. Listing 5.4 shows that this is true.
Listing 5.4. A program that copies an array and compares the elements of the two arrays.
1: #!/usr/local/bin/perl
2:
3: @array1 = (14, "cheeseburger", 1.23, -7, "toad");
4: @array2 = @array1;
5: $count = 1;
6: while ($count <= 5) {
7:     print("element $count: $array1[$count-1] ");
8:     print("$array2[$count-1]\n");
9:     $count++;
10: }
$ program5_4
element 1: 14 14
element 2: cheeseburger cheeseburger
element 3: 1.23 1.23
element 4: -7 -7
element 5: toad toad
$
Line 3 assigns the list
(14, "cheeseburger", 1.23, -7, "toad")
to the array variable @array1. Line 4 then copies this array into a second array variable, @array2.
The rest of the program loops through both arrays and prints each pair of corresponding elements on one line, which shows that the two arrays contain the same values.
NOTE |
You can assign to multiple arrays in one statement. For example: @array1 = @array2 = (1, 2, 3); This assigns a copy of the list (1, 2, 3) to both @array1 and @array2. |
As you've already seen, lists can contain scalar variables. For example:
@list = (1, $scalar, 3);
Here, the value of the scalar variable $scalar becomes the second element of the list assigned to @list.
You also can specify that the value of an array variable is to appear in a list, as follows:
@list1 = (2, 3, 4);
@list2 = (1, @list1, 5);
Here, the value of the array variable @list1, the list (2, 3, 4), is substituted for the name @list1, and the resulting list (1, 2, 3, 4, 5) is assigned to @list2.
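A short sketch makes the result easy to verify: the inner array does not become a single element of the outer list; its elements are spliced in one by one. The name $length is illustrative.

```perl
use strict;
use warnings;

my @list1 = (2, 3, 4);
my @list2 = (1, @list1, 5);    # elements of @list1 are inserted in place

my $length = @list2;           # number of elements in @list2

print "@list2\n";              # prints: 1 2 3 4 5
print "length: $length\n";     # prints: length: 5
```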
Listing 5.5 shows an example of a list being contained in another list.
Listing 5.5. A program that assigns a list as part of another list.
1: #!/usr/local/bin/perl
2:
3: @innerlist = " never ";
4: @outerlist = ("I", @innerlist, "fail!\n");
5: print @outerlist;
$ program5_5
I never fail!
$
Although this program is quite simple, it contains a couple of new tricks. The first of these is in line 3. Here, a scalar value, " never " (note the surrounding spaces), is assigned to the array variable @innerlist. This works because the Perl interpreter automatically converts the scalar value into a one-element list before assigning it to the array variable.
Line 4 assigns a list to the array variable @outerlist. This list is assembled by taking the following list:
("I", @innerlist, "fail!\n")
and substituting in the current value of the array variable @innerlist. As a result, the list assigned to @outerlist is
("I", " never ", "fail!\n")
Line 5 prints the list. To do this, it calls the library function print and passes it the array variable @outerlist. When print is given an array variable or a list to print, it prints each element in turn. This means that the following is written to the standard output file:
I never fail!
Note that print doesn't leave any spaces between the elements of the list when it prints them. The only reason the output is readable is that the character string contains spaces around never. This means that print isn't usually used to print a list of numbers in this way:
@list = (1, 2, 3);
print @list;
This prints the following, which isn't quite what you want:
123
TIP |
In Listing 5.5, the argument passed to the print function is not enclosed in parentheses. This is perfectly acceptable. In Perl, the parentheses enclosing arguments to functions are optional. For example, when you call the library function chop, instead of writing
chop ($number);
you can write
chop $number;
Although this saves a few keystrokes, it makes things a little less readable (in this author's opinion). Besides, eliminating the parentheses can lead to problems. Consider the following example:
$fred = "Fred";
print (("Hello, " . $fred . "!\n") x 2);
This code prints
Hello, Fred!
Hello, Fred!
In this case, the parentheses enclosing the arguments to print are absolutely necessary. Without them, you have
print ("Hello, " . $fred . "!\n") x 2;
When the Perl interpreter sees this statement, it assumes that print is being called with the following argument, which is not what you want:
"Hello, " . $fred . "!\n"
As always in programming, the basic rule to follow is this: Do whatever makes your program easier to work with, and use your best judgment. |
As you have seen, Perl does not leave spaces if you pass an array variable to print:
@array = (1, 2, 3);
print (@array, "\n");
This prints the following on your screen:
123
To get around this problem, put the array you want to print into a string:
print ("@array\n");
When the Perl interpreter sees the array variable inside the string, it substitutes the values of the list assigned to the array variable, and leaves a space between each pair of elements. For example:
@array = (1, 2, 3);
print ("@array\n");
This prints the following on your screen:
1 2 3
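The two ways of printing an array can be compared directly. In this sketch, $spaced is an illustrative name for the string produced by substituting the array.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

my $spaced = "@array";    # substitution inside a string adds spaces

print @array, "\n";       # prints: 123
print "$spaced\n";        # prints: 1 2 3
```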
Consider the following assignment, which you've already seen:
@array = ($var1, $var2);
Here, the values of the scalar variables $var1 and $var2 are used to form a two-element list that is assigned to the array variable @array.
Perl also enables you to take the current value of an array variable and assign its components to a group of scalar variables. For example:
@array = (5, 7);
($var1, $var2) = @array;
Here, the first element of the list currently stored in @array, 5, is assigned to $var1. The second element, 7, is assigned to $var2.
Additional elements in an array, if they exist, are ignored. For example:
@array = (5, 7, 11);
($var1, $var2) = @array;
Here, 5 is assigned to $var1, 7 is assigned to $var2, and 11 is not assigned to anything.
If there are more scalar variables than elements in an array variable, the excess scalar variables are assigned the null string, as follows:
@array = (5, 7);
($var1, $var2, $var3) = @array;
This assigns 5 to $var1 and 7 to $var2.
Because there are not enough elements in @array to assign
anything to $var3, $var3 is assigned the null
string "".
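The following sketch covers both mismatches: an array with too few elements and one with too many. Strictly speaking, Perl 5 gives the leftover variable the undefined value, which behaves like the null string; the defined test is used here to make that visible. The names @short, @long, $first, $second, and $third are made up for the example.

```perl
use strict;
use warnings;

my @short = (5, 7);
my @long  = (5, 7, 11);

# Too few elements: $var3 receives the undefined value.
my ($var1, $var2, $var3) = @short;
my $third = defined $var3 ? "defined" : "not defined";

# Too many elements: 11 is not assigned to anything.
my ($first, $second) = @long;

print "$var1 $var2 ($third)\n";    # prints: 5 7 (not defined)
print "$first $second\n";          # prints: 5 7
```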
NOTE |
You also can assign to several scalar variables using a list. For example:
($var1, $var2, $var3) = ("one", "two", "three");
This assigns one to $var1, two to $var2, and three to $var3. As with array variables, extra values in the list are ignored and extra scalar variables are assigned the null string, as follows:
($var1, $var2) = (1, 2, 3); # 3 is ignored
($var1, $var2, $var3) = (1, 2); # $var3 is assigned the null string |
As you've seen, lists and array variables can be any length you want. As a consequence, Perl provides a way of determining the length of the list assigned to an array variable.
Here's how it works: If an array variable appears anywhere that a scalar value is expected, the Perl interpreter obtains a scalar value by calculating the length of the list assigned to the array variable.
Consider the following example:
@array = (1, 2, 3);
$scalar = @array;
In the assignment to $scalar, the Perl interpreter replaces
@array with the length of the list currently assigned
to @array, which is 3. $scalar, therefore,
is assigned the value 3.
NOTE |
Note that the following two statements are not equivalent:
$scalar = @array;
($scalar) = @array;
In the first statement, the length of the list in @array is assigned to $scalar. In the second statement, the first element of @array is assigned to $scalar. It is always important to remember that $scalar and ($scalar) are not the same thing. $scalar is a scalar variable, and ($scalar) is a one-element list containing $scalar. |
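The difference between assigning an array to a scalar variable and assigning it to a one-element list is easy to demonstrate. In this sketch, $length and $first are illustrative names.

```perl
use strict;
use warnings;

my @array = (14, "cheeseburger", 1.23);

my $length  = @array;    # scalar on the left: the length of the list
my ($first) = @array;    # one-element list on the left: the first element

print "length: $length\n";    # prints: length: 3
print "first: $first\n";      # prints: first: 14
```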
Being able to access the length of an array is useful if you want to write a loop that performs an operation on every element of an array. Listing 5.6 is an example of a program that does just that.
Listing 5.6. A program that prints every element of an array.
1: #!/usr/local/bin/perl
2:
3: @array = (14, "cheeseburger", 1.23, -7, "toad");
4: $count = 1;
5: while ($count <= @array) {
6:     print("element $count: $array[$count-1]\n");
7:     $count++;
8: }
$ program5_6
element 1: 14
element 2: cheeseburger
element 3: 1.23
element 4: -7
element 5: toad
$
The only new feature of this program is line 5, which compares the counter variable $count to the length of the array @array. Because the list assigned to @array contains five elements, the conditional expression
$count <= @array
ensures that the loop iterates five times.
Once again, note that the subscript in line 6 is $count-1, not $count. This caution bears repeating: It is very easy to forget to subtract 1 when you use a value as a subscript.
If you like, you can write your loop in a different way and use $count as a subscript. For example:
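$count = 0;
while ($count < @array) {
    print ("element ", $count + 1, ": $array[$count]\n");
    $count++;
}
As you can see, this isn't any easier to follow because you now have to remember these two things: the conditional expression must use < instead of <=, and you must add 1 to $count whenever you want to display the position of an element.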
As you can see, there is no intuitive or obvious way of writing
programs that loop through arrays. Generally, it's best to pick
the way that is easiest for you to remember.
You cannot retrieve the length of a list without first assigning the list to an array variable. For example:
@array = (10, 20, 30);
$scalar = @array;
This assigns 3 to $scalar. Compare this with the following statement:
$scalar = (10, 20, 30);
This statement actually assigns 30 to $scalar, not 3. In this statement, the subexpression (10, 20, 30) is treated as three scalar values separated by comma operators. For more details on the comma operator, refer to "The Comma Operator" in Day 4. |
As you've seen, array subscripting enables you to change or access one element of an array. For example:
$var = $array[2];
$array[2] = $var;
Perl enables you to access more than one element of an array at a time in much the same way. Following is a simple example:
@subarray = @array[0,1];
Here, the code fragment
@array[0,1]
refers to the first two elements of the list stored in the array variable. This portion of the array is known as an array slice. An array slice is treated just like any other list. In the statement
@subarray = @array[0,1];
the list consisting of the first two elements of @array is assigned to the array variable @subarray.
Here is another example:
@slice = @array[1,2,3];
This statement assigns the array slice consisting of the second,
third, and fourth elements of @array to the array variable
@slice.
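Although single elements of an array are referenced using the $ character, array slices are referenced using @:
$var = $array[0];
@subarray = @array[0,1];
The basic rules are as follows: things that reference one value, such as scalar variables and array elements, must start with a $; things that reference more than one value, such as array variables and array slices, must start with an @.
|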
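Here is a small sketch of the slices described above, next to a single-element reference for comparison. The strings stored in @array are chosen so that each value names its own subscript.

```perl
use strict;
use warnings;

my @array = ("zero", "one", "two", "three", "four");

my @subarray = @array[0,1];      # first two elements
my @slice    = @array[1,2,3];    # second, third, and fourth elements
my $var      = $array[0];        # a single element uses $, not @

print "@subarray\n";    # prints: zero one
print "@slice\n";       # prints: one two three
print "$var\n";         # prints: zero
```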
Listing 5.7 shows a simple example of an array slice.
Listing 5.7. A program that demonstrates the use of an array slice.
1: #!/usr/local/bin/perl
2:
3: @array = (1, 2, 3, 4);
4: @subarray = @array[1,2];
5: print ("The first element of subarray is $subarray[0]\n");
6: print ("The second element of subarray is $subarray[1]\n");
$ program5_7
The first element of subarray is 2
The second element of subarray is 3
$
Line 3 of this program assigns the following list to the array variable @array:
(1, 2, 3, 4)
Line 4 assigns a slice of the array variable @array to the array variable @subarray. The array slice
@array[1,2]
specifies that the second and third elements of the array are
to be treated as a list and assigned to @subarray.
NOTE |
In array slices, as in references to single elements of an array, subscripts start from zero. For example, the array slice @array[1,2] refers to the second and third elements of an array. |
The final two lines of the program print the two elements of the array variable @subarray. As you can see, these elements are identical to the second and third elements of @array.
Perl provides a convenient way to refer to large array slices. Instead of writing
@array[0,1,2,3,4]
to refer to the first five elements of array @array, you can use the list range operator, as follows:
@array[0..4]
This enables you to assign large array slices easily:
@subarray = @array[0..19];
This assigns the first 20 elements of @array to @subarray.
You can use the value of a scalar variable in a list range in an array slice subscript. The following is an example:
$endrange = 19;
@subarray = @array[0..$endrange];
Here, the scalar variable $endrange contains the upper limit of the array slice, which in this case is 19. This means that the array slice to assign is
@array[0..19]
which assigns the first 20 elements of @array to @subarray.
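The sketch below applies list ranges to slices. It assumes a 30-element array holding the numbers 1 through 30, which is invented here so that the slices have something to select; @first_five and $length are also illustrative names.

```perl
use strict;
use warnings;

my @array = (1..30);                       # sample data: the numbers 1 to 30

my @first_five = @array[0..4];             # same as @array[0,1,2,3,4]

my $endrange = 19;
my @subarray = @array[0..$endrange];       # first 20 elements

my $length = @subarray;
print "@first_five\n";                     # prints: 1 2 3 4 5
print "subarray length: $length\n";        # prints: subarray length: 20
```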
You can also use the list stored in an array variable to define an array slice. Listing 5.8 shows how this works.
Listing 5.8. A program that uses an array variable as an array-slice subscript.
1: #!/usr/local/bin/perl
2:
3: @array = ("one", "two", "three", "four", "five");
4: @range = (1, 2, 3);
5: @subarray = @array[@range];
6: print ("The array slice is: @subarray\n");
$ program5_8
The array slice is: two three four
$
Line 3 of this program assigns the following list to the array variable @array:
("one", "two", "three", "four", "five")
Line 4 assigns the list (1, 2, 3) to the array variable @range, which is to serve as the list range.
Line 5 uses the value of @range as the array subscript for an array slice. Because @range contains (1, 2, 3), the slice of @array that is selected consists of the second, third, and fourth elements. These elements are then assigned to the array variable @subarray.
Line 6 prints the selected array slice. When the Perl interpreter sees the variable name @subarray in the character string to be printed, it substitutes the value of @subarray for its name. Because @subarray is inside a character string, the Perl interpreter leaves a space between each pair of elements when printing.
Compare line 6 with the following:
print (@subarray, "\n");
Here, print leaves no spaces between the elements of @subarray, which means that it prints
twothreefour
Which outcome you want depends, of course, on what you want your program to do.
You can assign to array slices using the notation you have just seen. The following is an example:
@array[0,1] = ("string", 46);
Here, the first two elements of the array @array become string and 46, respectively.
You can use list-range operators and variables when you assign to array slices as well. The following is an example:
@array[0..3] = (1, 2, 3, 4);
@array[0..$endrange] = (1, 2, 3, 4);
If there are more items in the array slice than in the list, the extra items in the array slice are assigned the null string, as follows:
@array[0..2] = ("string1", "string2");
The third element of @array now holds the null string.
If there are fewer items in the array slice than in the list, the extra items in the list are ignored, as in the following:
@array[0..2] = (1, 2, 3, 4);
In this assignment, the fourth element in the list, 4, is not assigned to anything.
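Both mismatched cases can be tried in one small sketch. The starting lists and the names @other and $third are hypothetical. Note that current versions of Perl leave the unmatched slice element undefined rather than storing a literal null string; an undefined value behaves like the null string when it is used as a string, so the sketch converts it explicitly before printing.

```perl
# More slots in the slice than values in the list.
@array = ("a", "b", "c", "d");
@array[0..2] = ("string1", "string2");

# The third element no longer holds "c"; treat an undefined value as "".
$third = defined($array[2]) ? $array[2] : "";

# Fewer slots in the slice than values in the list.
@other = ("w", "x", "y", "z");
@other[0..2] = (1, 2, 3, 4);

# The extra value 4 is discarded, and the fourth element is untouched.
print ("third element: [", $third, "]\n");
print ("@other\n");    # 1 2 3 z
```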
When an array slice is assigned to, the remainder of the array is not changed. Listing 5.9 shows how this works.
Listing 5.9. A program that assigns to an array slice.
1:  #!/usr/local/bin/perl
2:
3:  @array = ("old1", "old2", "old3", "old4");
4:  @array[1,2] = ("new2", "new3");
5:  print ("@array\n");

$ program5_9
old1 new2 new3 old4
$
In the preceding program, the only statement that did not appear in previous programs is line 4, which assigns the list ("new2", "new3") to the array slice of @array consisting of the second and third elements. This assignment changes the value of @array from
("old1", "old2", "old3", "old4")
to
("old1", "new2", "new3", "old4")
Line 5 then prints the changed array.
As you've seen, Perl enables you to use array slices on either side of an assignment statement. The following is an example:
@newarray = @array[2,3,4];
@array[2,3,4] = @newarray;
This means that you can assign from one array slice to another, even if the two slices overlap, as in the following:
@array[1,2,3] = @array[2,3,4];
The Perl interpreter has no problem with this statement because it copies the list stored in @array[2,3,4] into a temporary location (invisible to you) before assigning it to @array[1,2,3].
Listing 5.10 provides an example of overlapping array slices in use.
Listing 5.10. A program containing overlapping array slices.
1:  #!/usr/local/bin/perl
2:
3:  @array = ("one", "two", "three", "four", "five");
4:  @array[1,2,3] = @array[2,3,4];
5:  print ("@array\n");

$ program5_10
one three four five five
$
Line 4 is an example of an assignment with overlapping array slices. At the time of assignment, the array slice @array[2,3,4] contains the list
("three", "four", "five")
This list consists of the last three elements of @array. Assigning this list to @array[1,2,3] means that the list stored in @array changes from
("one", "two", "three", "four", "five")
to
("one", "three", "four", "five", "five")
NOTE |
Overlapping array slices of varying lengths are dealt with in the same way as other array slice assignments of non-matching lengths. For example:
@array = (1, 2, 3, 4, 5);
@array[0..2] = @array[3,4];
The second statement assigns the array slice @array[3,4], which is the list (4, 5), to the array slice @array[0..2]. After this assignment, the value of @array is the list
(4, 5, "", 4, 5)
The third element of @array is now the null string because there are only two elements in the array slice being assigned. |
So far, I've been using the following array-slice notation to refer to consecutive elements of an array:
@array[0,1]
In Perl, however, there is no real difference between an array slice and a list containing consecutive elements of the same array. For example, the following statements are equivalent:
@subarray = @array[0,1];
@subarray = ($array[0], $array[1]);
Because of this, you can use the array-slice notation to refer to any elements of an array, regardless of whether they are in order. For example, the following two statements are equivalent:
@subarray = ($array[4], $array[1], $array[3]);
@subarray = @array[4,1,3];
In both cases, the array variable @subarray is assigned a list consisting of three elements: the fifth, second, and fourth elements of @array.
You can use this array-slice notation in a variety of ways. For example, you can select the same element of an array more than once:
@subarray = @array[0,0,0];
This creates a list consisting of three copies of the first element of @array, and then assigns this list to @subarray.
The array-slice notation provides an easy way to swap elements in a list. The following is an example:
@array[1,2] = @array[2,1];
This statement swaps the second and third elements of @array. As with the overlapping array slices you saw earlier, the Perl interpreter copies @array[2,1] into a temporary location before assigning it, which ensures that the assignment takes place properly.
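The three uses of the notation described above (elements out of order, a repeated element, and a swap) can be tried together. This is only a sketch; the five-letter list and the names @picked and @copies are made up for the example.

```perl
@array = ("a", "b", "c", "d", "e");

# Fifth, second, and fourth elements, in that order.
@picked = @array[4, 1, 3];

# Three copies of the first element.
@copies = @array[0, 0, 0];

# Swap the second and third elements. The right-hand list is
# built in full before anything is assigned.
@array[1, 2] = @array[2, 1];

print ("@picked\n");    # e b d
print ("@copies\n");    # a a a
print ("@array\n");     # a c b d e
```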
For an example of a program that swaps array elements, look at Listing 5.11, which sorts the elements in an array using a simple sort algorithm.
Listing 5.11. A program that sorts an array.
1:  #!/usr/local/bin/perl
2:
3:  # read the array from standard input one item at a time
4:  print ("Enter the array to sort, one item at a time.\n");
5:  print ("Enter an empty line to quit.\n");
6:  $count = 1;
7:  $inputline = <STDIN>;
8:  chop ($inputline);
9:  while ($inputline ne "") {
10:     @array[$count-1] = $inputline;
11:     $count++;
12:     $inputline = <STDIN>;
13:     chop ($inputline);
14: }
15:
16: # now sort the array
17: $count = 1;
18: while ($count < @array) {
19:     $x = 1;
20:     while ($x < @array) {
21:         if ($array[$x - 1] gt $array[$x]) {
22:             @array[$x-1,$x] = @array[$x,$x-1];
23:         }
24:         $x++;
25:     }
26:     $count++;
27: }
28:
29: # finally, print the sorted array
30: print ("@array\n");

$ program5_11
Enter the array to sort, one item at a time.
Enter an empty line to quit.
foo
baz
dip
bar

bar baz dip foo
$
This program is divided into three parts:
Lines 3-14 read the array into the variable @array. The conditional expression in line 9, $inputline ne "", is true as long as the line is not empty. (Recall that an empty line consists of just the newline character, which the library function chop removes.) In this example, the list foo baz dip bar is read into the array variable @array.
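Lines 17-27 perform the sort. The sort consists of two loops, one inside the other. The inner loop, in lines 19-25, works like this: it steps through the list from left to right, comparing each element with its right-hand neighbor (line 21) and swapping the two if they are out of order (line 22), until it reaches the end of the list.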
At this point, the largest element in the list is at the far end of the list:
baz dip bar foo
The largest value in the list, foo, has been moved to the far right end of the list, where it belongs. The other elements have been displaced to make room.
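To watch foo travel to the right, you can run a single pass of the inner loop by itself. The sketch below reuses lines 19-25 of Listing 5.11 on the sample list; the array @trace is a hypothetical addition that records the list after each comparison.

```perl
@array = ("foo", "baz", "dip", "bar");
@trace = ();

$x = 1;
while ($x < @array) {
    # Compare an element with its right-hand neighbor; swap if out of order.
    if ($array[$x - 1] gt $array[$x]) {
        @array[$x-1,$x] = @array[$x,$x-1];
    }
    # Remember what the list looks like after this comparison.
    $trace[$x - 1] = "@array";
    $x++;
}

print (join("\n", @trace), "\n");
# baz foo dip bar
# baz dip foo bar
# baz dip bar foo
```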
Lines 17-19 and 26-27 contain the outer loop. This outer loop just makes sure that the inner loop is repeated n-1 times, where n is the number of elements in the list. When the inner loop is repeated a second time, the second-largest element moves up to the second position from the right:
baz bar dip foo
The final pass through the inner loop sorts the final two elements:
bar baz dip foo
Line 30 then prints the sorted list.
NOTE |
You'll never need to write a program that sorts values in a list because Perl has a library function, sort, that does it for you. See the section "Array Library Functions" later today for more details. |
In the programs you have seen so far, single lines of input are read from the standard input file and stored in scalar variables. For example:
$var = <STDIN>;
In this case, every appearance of <STDIN> means that another line of input is obtained from the standard input file.
Perl also provides a quicker approach: If you assign <STDIN> to an array variable instead of a scalar variable, the Perl interpreter reads in all of the data from the standard input file at once and assigns it. For example, the statement
@array = <STDIN>;
reads everything typed in and assigns it all to the array variable @array. The variable @array now contains a list; each element of the list is a line of input.
Listing 5.12 is an example of a simple program that reads its input data into an array.
Listing 5.12. A program that reads data into an array and writes the array.
1:  #!/usr/local/bin/perl
2:
3:  @array = <STDIN>;
4:  print (@array);

$ program5_12
Here is my first line of data.
Here is another line.
Here is the last line.
^D
Here is my first line of data.
Here is another line.
Here is the last line.
$
As you can see, this program is very short. Line 3 reads the input from the standard input file. In this example, the input that is entered consists of the three lines
Here is my first line of data.
Here is another line.
Here is the last line.
followed by the Ctrl+D key combination. Ctrl+D produces a special character that indicates end of file; when the Perl interpreter sees this, it knows that there is no more input.
NOTE |
A blank line is perfectly acceptable input and does not terminate the reading of input from the standard input file. Only the Ctrl+D character can do that. Also note that the Ctrl+D character is a non-printing character. When you type it, nothing appears on the screen. In the examples in this book, control characters that are part of the input, such as Ctrl+D, are represented by the ^ character followed by the letter typed. For example, Ctrl+D is represented as ^D This representation is the standard one used in the computing world. |
After line 3 is executed, the array variable @array contains a list comprising three elements: the three lines of input you just entered. The last character of each input line is the newline character (because you didn't call chop to get rid of it).
Line 4 prints the lines of input you just read. Note that you do not need to separate the lines with spaces or newline characters because each line in @array is terminated by a newline character.
When you use the following statement: @array = <STDIN>; every line of input you enter is stored in @array all at once. If you enter a lot of input, @array can get very large. Use this statement only when you really need to work with the entire input file at once. |
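Perl provides a number of library functions that work on lists and array variables. You can use them to sort a list, reverse a list, remove the last character from each element of a list, join the elements of a list into a single string, and split a string into a list.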
The following sections describe these array library functions.
The library function sort sorts the elements of an array in alphabetical order and returns the sorted list.
The syntax for the sort library function is
retlist = sort (array);
In this syntax, array is the list to sort, and retlist is the sorted list.
Here are some examples:
@array = ("this", "is", "a", "test");
@array2 = sort (@array);
After sort is called, the value of @array2 is the list
("a", "is", "test", "this")
Note that sort does not modify the original list. The statement
@array2 = sort (@array);
does not change the value of @array. To replace the contents of an array variable with the sorted list, put the array variable on both sides of the assignment, as follows:
@array = sort (@array);
Here, the sorted list is put back in @array.
The sorted list must be assigned to an array variable in order to be used. The statement sort (@array); doesn't do anything useful because the sorted list is not assigned to anything. |
Note that sort treats its items as strings, not integers; items are sorted in alphabetical, not numeric, order. For example:
@array = (70, 100, 8);
@array = sort (@array);
In this case, sort produces
(100, 70, 8)
not
(8, 70, 100)
Because sort is treating the elements of the list as strings, the strings to be sorted are 70, 100, and 8. When sorting characters that are not alphabetic, sort looks at the internal representation of the characters to be sorted. If you are not familiar with ASCII (which will be described shortly), this might sound complicated, but it's not too difficult to understand.
Here's how it works: When Perl (or any other programming language) stores a character such as r or 1, what it actually does is store a unique eight-bit number that corresponds to this character. For example, the letter r is represented by the number 114, and 1 is represented by the number 49. Every possible character has its own unique number.
The sort function uses these unique numbers to determine how to sort character strings. When sorting 70, 100, and 8, sort looks at the unique numbers corresponding to 7, 1, and 8, which are the first characters in each of the strings. As it happens, the unique number for 1 is less than that for 7, which is less than that for 8 (which makes sense when you think of it). This means that 100 is "less than" 70, and 70 is "less than" 8.
Of course, if two strings have identical first characters, sort then compares the second characters. For example, when sort sorts 72 and 7$, the first characters are identical; sort then compares the unique number representing 2 with the number representing $. As it happens, the number for $ is smaller, so 7$ is "less than" 72.
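If you want to see these numbers for yourself, the following sketch may help. It assumes a machine that uses the ASCII character set, and it relies on ord, a Perl library function not covered in today's lesson, which returns the number representing the first character of a string. The names @codes and @sorted are made up for the example.

```perl
# The numbers that represent r, 1, 7, 8, $, and 2.
@codes = (ord("r"), ord("1"), ord("7"), ord("8"), ord("\$"), ord("2"));
print ("@codes\n");      # 114 49 55 56 36 50

# sort compares the items as strings, one character at a time.
@sorted = sort (70, 100, 8, 72, "7\$");
print (join(" ", @sorted), "\n");    # 100 7$ 70 72 8
```

The string 7$ lands ahead of both 70 and 72 because the number for $ (36) is smaller than the numbers for 0 (48) and 2 (50).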
NOTE |
The set of unique numbers that correspond to the characters understood by the computer is known as the ASCII character set. Most computers today use the ASCII character set, with a couple of exceptions. You don't really need to worry about what character set your machine uses, except to take note of the sorting order. A complete listing of the ASCII characters can be found in Appendix B, "ASCII Character Set." |
Normally, sort sorts in alphabetical order. You can tell the Perl interpreter to sort using any criterion you like. To learn more about sort keys, refer to Day 9, "Using Subroutines."
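As a brief preview, and only as a sketch, here is one common way to ask sort for numeric order. The block in braces and the <=> comparison operator are not explained in today's lesson, and the names @alpha and @numeric are made up for the example.

```perl
@array = (70, 100, 8);

# Default behavior: the items are compared as strings.
@alpha = sort (@array);

# The block tells sort to compare each pair of items as numbers.
@numeric = sort { $a <=> $b } @array;

print ("@alpha\n");      # 100 70 8
print ("@numeric\n");    # 8 70 100
```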
The library function reverse reverses the order of the elements of a list or array variable, and returns the reversed list.
The syntax for the reverse library function is
retlist = reverse (array);
array is the list to reverse, and retlist is the reversed list.
Here is an example:
@array = ("backwards", "is", "array", "this");
@array2 = reverse(@array);
The value assigned to @array2 is the list
("this", "array", "is", "backwards")
As with sort, reverse does not change the original array.
If you like, you can sort and reverse the same list by passing the list returned by sort to reverse. Listing 5.13 shows an example of this. It reads lines of data from the standard input file and sorts them in reverse order.
Listing 5.13. A program that sorts input lines in reverse order.
1:  #!/usr/local/bin/perl
2:
3:  @input = <STDIN>;
4:  @input = reverse (sort (@input));
5:  print (@input);

$ program5_13
foo
bar
dip
baz
^D
foo
dip
baz
bar
$
Line 3 reads all the input lines from the standard input file into the array variable @input. Each element of @input consists of a single line of input terminated with a newline character.
Line 4 sorts and reverses the input lines. First, sort is called to sort the input lines in alphabetical order. (Recall that when one library function appears inside another, the innermost one is called first.) The list returned by sort is then passed to reverse, which reverses the order of the elements of the list. The result is a list sorted in reverse order, which is then assigned to @input.
Line 5 prints the sorted lines. Because each line is terminated by a newline character, no extra spaces or newline characters need to be added to make the output readable.
NOTE |
If you like, you can omit the parentheses in the call to reverse. This gives you the following statement:
@input = reverse sort (@input);
Here is a case where eliminating a set of parentheses actually makes the code more readable; it is obvious that the statement sorts @input in reverse order. |
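The two spellings can be compared without typing any input by starting from a fixed list. In this sketch the names @with_parens and @without_parens are made up, and the list stands in for the lines that Listing 5.13 reads from the standard input file.

```perl
@input = ("foo\n", "bar\n", "dip\n", "baz\n");

# Both statements sort first and then reverse the sorted list.
@with_parens = reverse (sort (@input));
@without_parens = reverse sort (@input);

# Neither sort nor reverse changes @input itself.
print (@with_parens);
```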
As you've seen, the chop library function removes the last character from a character string. The following is an example:
$var = "bathe";
chop ($var);   # $var now contains "bath"
The chop function also can work on lists in array variables. If you pass an array variable to chop, it removes the last character from every element in the list stored in the array variable. For example:
@list = ("rabbit", "12345", "quartz");
chop (@list);
After chop is called, the list stored in @list is
("rabbi", "1234", "quart")
The chop function often is used on arrays read from the standard input file, as shown in the following:
@array = <STDIN>;
chop (@array);
This call to chop removes the newline character from each input line. In the following section, you will see programs in which this is helpful.
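One point worth keeping in mind is that chop removes the last character of each element, whatever that character is. The sketch below uses a hypothetical list in which the final element has no trailing newline, as can happen when the last line of input is not terminated.

```perl
@lines = ("first line\n", "second line\n", "last line");
chop (@lines);

# The first two elements lose their newline characters.
# The third element had no newline, so it loses its final letter instead.
print (join("|", @lines), "\n");    # first line|second line|last lin
```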
The library function join creates a single string from a list of strings, which then can be assigned to a scalar variable.
The syntax for the join library function is
string = join (array);
array is the list to join together, and string is the resulting character string.
The following is an example using join:
$string = join(" ", "this", "is", "a", "string");
The first element of the list supplied to join contains the characters that are to be used to join the parts of the created string together. In this example, $string becomes this is a string.
You can specify join strings other than " ". For example, the following statement uses a pair of colons to join the strings:
$string = join("::", "words", "and", "colons");
In this statement, $string becomes words::and::colons.
You can use any list or array variable as part or all of the argument to join. For example:
@list = ("here", "is", "a");
$string = join(" ", @list, "string");
This assigns here is a string to $string.
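A few more calls show how join treats short lists. This is a sketch only; the names $single and $empty are made up for the example.

```perl
@list = ("here", "is", "a");
$string = join(" ", @list, "string");

# With only one item to join, the join string never appears.
$single = join("::", "alone");

# With nothing to join, the result is the null string.
@nothing = ();
$empty = join("::", @nothing);

print ("$string\n");    # here is a string
print ("$single\n");    # alone
```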
Listing 5.14 is a simple program that uses join. It joins together all the input lines from the standard input file.
Listing 5.14. A program that takes its input and joins it into a single string.
1:  #!/usr/local/bin/perl
2:
3:  @input = <STDIN>;
4:  chop (@input);
5:  $string = join(" ", @input);
6:  print ("$string\n");

$ program5_14
This
is
my
input
^D
This is my input
$
Line 3 reads all of the input lines into the array variable @input. Each element of @input is a single line of input terminated by a newline character.
Line 4 passes the array variable @input to the library function chop, which removes the last character from each element of the list stored in @input. This removes all of the trailing newline characters.
Line 5 calls join, which joins all the input lines into a single string. The first argument passed to join is " ", which tells join to put one space between each pair of lines. This turns the list
("This", "is", "my", "input")
into the string
This is my input
Line 6 prints the string produced by join. Note that the call to print has to specify a newline character because all the newline characters in the input lines have been removed by the call to chop.
As you've seen, the library function join creates a character string from a list. To undo the effects of join (that is, to split a character string into separate items), call the function split.
The syntax for the library function split is
array = split (string);
string is the character string to split, and array is the resulting array.
The following is a simple example of the use of split:
$string = "words::separated::by::colons";
@array = split(/::/, $string);
The first argument passed to split tells it where to break the string into separate parts. In this example, the first argument is :: (two colons); because there are three pairs of colons in the string, split breaks the string into four separate parts. The result is the list
("words", "separated", "by", "colons")
which is assigned to the array variable @array.
NOTE |
The / characters surrounding the :: in the call to split indicate that the :: is a pattern to be matched. Perl supports a wide variety of special pattern-matching sequences, which you will learn about on Day 7, "Pattern Matching." |
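Because split undoes join, a string that is split on :: and then joined with :: comes back unchanged. The following sketch checks this; the names $count and $rejoined are made up for the example.

```perl
$string = "words::separated::by::colons";
@array = split(/::/, $string);

# Three pairs of colons yield four parts.
$count = @array;

# Joining the parts with the same separator rebuilds the original string.
$rejoined = join("::", @array);

print ("$count parts: @array\n");    # 4 parts: words separated by colons
print ("$rejoined\n");               # words::separated::by::colons
```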
The split function is used in a variety of applications. Listing 5.15 uses split to count the number of words in the standard input file.
Listing 5.15. A simple word-count program.
1:  #!/usr/local/bin/perl
2:
3:  $wordcount = 0;
4:  $line = <STDIN>;
5:  while ($line ne "") {
6:      chop ($line);
7:      @array = split(/ /, $line);
8:      $wordcount += @array;
9:      $line = <STDIN>;
10: }
11: print ("Total number of words: $wordcount\n");

$ program5_15
Here is some input.
Here are some more words.
Here is my last line.
^D
Total number of words: 14
$
When you enter a Ctrl+D (End-of-File) character and read it using <STDIN>, the resulting line is the null string. Line 5 of this program tests for this null string.
Note that line 5 has no problem distinguishing the end of file from a blank input line because a blank input line contains the newline character, and chop has not yet been called. Once the Perl interpreter knows that the program is not at the end of file, line 6 can be called; it chops the newline character off the end of the input line.
Line 7 splits the input line into words. The first argument to split, / /, indicates that the line is to be broken whenever the Perl interpreter sees a space. The resulting list is stored in @array.
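Because each element of the list in @array is one word in the input line, the total number of words in the line is equivalent to the number of elements in the array. Line 8 takes advantage of this to count the number of words in the input line. Here's how line 8 works: because @array appears where a scalar value is expected, the Perl interpreter uses the number of elements in the list stored in @array, and the += operator adds that number to $wordcount.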
NOTE |
Listing 5.15 does not work properly if an input line contains more than one space between words. The following is an example:
This  is a line
Because there are two spaces between This and is, the split function breaks This  is into three words: This, an empty word "", and is. Because of this, the line
This  is a line
appears to contain five words when it really contains only four. To get around this problem, what you need is a pattern that matches one or more spaces. To learn about special patterns such as this, see Day 7. |
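The miscount is easy to reproduce. The sketch below also tries the pattern / +/, in which the + character means "one or more of the preceding character"; treat it as a preview of Day 7 rather than a full explanation. The names @naive, @fixed, $naive_count, and $fixed_count are made up for the example.

```perl
$line = "This  is a line";       # two spaces between This and is

# Splitting on a single space produces an empty word.
@naive = split(/ /, $line);
$naive_count = @naive;

# Splitting on one or more spaces does not.
@fixed = split(/ +/, $line);
$fixed_count = @fixed;

print ("single space: $naive_count words\n");          # 5
print ("one or more spaces: $fixed_count words\n");    # 4
```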
Listing 5.16 is an example of a program that uses split, join, and reverse to reverse the word order of the input read from the standard input file.
Listing 5.16. A program that reverses the word order of the input file.
1:  #!/usr/local/bin/perl
2:
3:  @input = <STDIN>;
4:  chop (@input);
5:
6:  # first, reverse the order of the words in each line
7:  $currline = 1;
8:  while ($currline <= @input) {
9:      @words = split(/ /, $input[$currline-1]);
10:     @words = reverse(@words);
11:     $input[$currline-1] = join(" ", @words, "\n");
12:     $currline++;
13: }
14:
15: # now, reverse the order of the input lines and print them
16: @input = reverse(@input);
17: print (@input);

$ program5_16
This sentence is in reverse order.
^D
order. reverse in is sentence This
$
Line 3 reads all of the standard input file into the array @input. Line 4 then removes the trailing newline characters from the input lines.
Lines 7-13 reverse each individual line. Line 8 compares the current line number, stored in $currline, with the number of lines of input. (Recall that the number of elements in the list is used whenever an array variable appears where a scalar value is expected.)
Line 9 splits a line of input into words. The first argument to split, / /, indicates that a split is to occur every time a space is seen. The list of words is stored in the array variable @words.
Line 10 reverses the order of the list of words stored in @words. After the list has been reversed, line 11 joins the input line back together again. Note that line 11 appends a newline character to the input line.
Now that the words in each individual line have been reversed, all that the program needs to do is reverse the order of the lines themselves. Line 16 accomplishes this.
Line 17 prints the reversed input file. Note that the period character (.) appears at the end of the first word; this is because the reversing program isn't smart enough to detect and get rid of it. (You can use split to get rid of this, too, if you want.)
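A small detail of line 11 is worth noticing: because "\n" is passed to join as one more list element, join places a space in front of it, so each output line ends with a space before the newline. The sketch below shows this along with one possible alternative that appends the newline with the . (string concatenation) operator instead. The names $as_listed and $alternative are made up for the example.

```perl
@words = ("order.", "reverse", "in", "is", "sentence", "This");

# As in line 11 of Listing 5.16: the newline is joined like any other item.
$as_listed = join(" ", @words, "\n");

# Alternative: join the words first, then append the newline.
$alternative = join(" ", @words) . "\n";

print ($as_listed);      # ends with "This" followed by a space
print ($alternative);    # ends with "This"
```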
Perl provides several other list-manipulation functions. To learn about these, refer to Day 14, "Scalar-Conversion and List-Manipulation Functions."
In today's lesson, you learned about lists and array variables. A list is an ordered collection of scalar values. A list can consist of any number of scalar values.
Lists can be stored in array variables, which are variables whose names begin with the character @.
Individual elements of array variables can be accessed using subscripts. The subscript 0 refers to the first element of the list stored in the array variable, the subscript 1 refers to the second element, and so on. If an array element is not defined, it is assumed to hold the null string "". If a previously undefined array element is assigned to, the array grows appropriately.
The list-range operator provides a convenient way to create a list containing consecutive numbers.
You can copy lists from one array variable to another. In addition, you can include an array variable in a list, which means that the list stored in the array variable is copied into the list containing the array-variable name.
Array-variable names can appear in character strings; in this case, the elements of the list are included in place of the variable name, with a space separating each pair of elements.
You can assign values to scalar variables from array variables, and vice versa.
If an array variable appears in a place where a scalar variable is expected, the length of the list stored in the array variable is used.
You can access any part of a list stored in an array variable by using the array-slice notation. You can assign values to array slices, and they can be used anywhere a list is expected.
The entire contents of the standard input file can be stored in a single array variable.
The library functions sort and reverse sort and reverse lists, respectively. The function chop removes the last character from each element of a list. The function split breaks a single string into a collection of list elements. The function join takes a collection of list elements and joins them into a single string.
Q: | How can I tell whether a reference to an array variable such as @array refers to the stored list or to the length of the list? |
A: | It's usually pretty easy to tell. In a lot of places, using a list makes no sense. For example: $result = $number + @array; It makes no sense here to add a list to $number, so the length of the list stored in @array is used. |
Q: | Why do array elements use $ for the first character of the element name, and not @? Wouldn't it make more sense to refer to an array element as @array[2] because we all know that the @ indicates an array variable? |
A: | This relates to the first question. The Perl interpreter needs to know as soon as possible whether a variable reference is a scalar value or a list. The $ indicates right away that the upcoming item is a scalar value. Eventually, you'll get used to this notation. |
Q: | Is there a difference between an undefined array variable and an array variable containing the empty list? |
A: | No. By default, all array variables contain the empty list. Note, however, that the empty list is not the same as a list containing the null string: @array = (""); This list contains one element, which happens to be a null string. |
Q: | How large an input file can I read in using the following statement? @array = <STDIN>; |
A: | Perl imposes no limit on the size of arrays. Your computer, however, has a finite amount of memory, which limits how large your arrays can be. |
Q: | Why does Perl add spaces when you substitute for an array variable in a string? |
A: | The most common use of string substitution is in the print statement. Normally, when you print a list you don't want to have the elements of the list running together, because you want to see where one element stops and the next one starts. To print the elements of a list without spaces between them, pass the list to print without enclosing it in a string, as follows: print ("Here is my list", @list, "\n"); |
Q: | Why does $ appear before 1 in the ASCII character set? |
A: | The short answer is: Just because. (This reasoning occurs more often in computing than you might think.)
Here's a more detailed explanation: On early machines that used the ASCII character set, performance was more efficient if there was a relationship between, for instance, the location of the uppercase alphabetic characters and the lowercase alphabetic characters. (In fact, if you add 0x20, or 20 hexadecimal, to the ASCII representation of an uppercase letter, you get the corresponding lowercase letter.) Establishing relationships such as these meant that gaps existed between, for example, the representation of Z (which is 90) and the representation of a (which is 97). These gaps are filled by printable non-alphanumeric characters; for example, the representation of [ is 91. As for why $ appears before 1, as opposed to ?, which appears after 1, the explanation is: Just because. |
The Workshop provides quiz questions to help you solidify your understanding of the material covered and exercises to give you experience in using what you've learned. Try to understand the quiz and exercise answers before you go on to tomorrow's lesson.