    foreach my $color (@rainbow) {
        print "$color\n";
    }

This is the kind of loop we've seen so far. There's a different way to do the same thing:
    for (my $j = 0; $j < @rainbow; $j = $j+1) {
        print "$rainbow[$j]\n";
    }

What's going on here? If @rainbow holds, for instance,
    ("red", "orange", "green", "blue", "indigo", "violet")

then we can use numbers to pick out any item in the @rainbow array, counting from 0: $rainbow[0] is "red", $rainbow[1] is "orange", and $rainbow[5] is "violet". The number in brackets is called an index, or a subscript.
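To make the counting-from-0 rule concrete, here is a minimal sketch that picks out the first and last items by index. The variable names $size, $first and $last are made up for this illustration; the only extra idea it relies on is that an array assigned to a plain variable gives its number of items.

```perl
#!/usr/bin/perl -w
use strict;

my @rainbow = ("red", "orange", "green", "blue", "indigo", "violet");

my $size  = @rainbow;               # an array used as a number gives its item count
my $first = $rainbow[0];            # indexes start at 0 ...
my $last  = $rainbow[$size - 1];    # ... so the last index is one less than the count

print "items: $size\n";
print "first: $first\n";
print "last:  $last\n";
```

Because the first index is 0, an array with six items has 5 as its largest index, which is why $rainbow[5] is "violet".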
Using indexes, we could print the whole rainbow one item at a time:

    print "$rainbow[0]\n";
    print "$rainbow[1]\n";
    print "$rainbow[2]\n";
    print "$rainbow[3]\n";
    print "$rainbow[4]\n";
    print "$rainbow[5]\n";

That's a lot of typing, and we can see a simple pattern: on each line we increase the index by 1. We could have written the above as
    my $j = 0;
    print "$rainbow[$j]\n";
    $j = $j+1;
    print "$rainbow[$j]\n";
    $j = $j+1;
    print "$rainbow[$j]\n";
    $j = $j+1;
    print "$rainbow[$j]\n";
    $j = $j+1;
    print "$rainbow[$j]\n";
    $j = $j+1;
    print "$rainbow[$j]\n";

Well, that's even more typing; but now the pattern is even simpler. A for loop packages up the identical parts so we don't have to type them all out. When Perl sees a for loop like this:
    for (my $j = 0; ... ; $j = $j + 1) {
        print "$rainbow[$j]\n";
    }

it does the first section of the first line (my $j = 0) exactly once, before the rest of the loop. It does the third section of the first line ($j = $j + 1) between every execution of the main body of the loop (print "$rainbow[$j]\n").
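The second section is the test that decides whether to go around again. In the complete loop

    for (my $j = 0; $j < @rainbow; $j = $j+1) {
        print "$rainbow[$j]\n";
    }

we keep going as long as $j < @rainbow is true. Used as a number, @rainbow gives the number of items in the array, and in our example @rainbow has six elements. Since $j is 0 the first time through the loop and we increase it by one each time through, $j takes on the values 0, 1, 2, 3, 4, and 5. When $j reaches 6 the test fails and the loop stops.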
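One way to picture how the three sections of the for line fit together is to spell the same loop out as a while loop. This is only a sketch for illustration: the @seen array is not part of the original loop and is there just to record which indexes were visited.

```perl
#!/usr/bin/perl -w
use strict;

my @rainbow = ("red", "orange", "green", "blue", "indigo", "violet");

my @seen;                        # illustration only: the indexes we visit
my $j = 0;                       # first section: done exactly once
while ($j < @rainbow) {          # second section: tested before each pass
    print "$rainbow[$j]\n";      # the body of the loop
    push @seen, $j;
    $j = $j + 1;                 # third section: done after each pass
}
print "stopped with j = $j\n";
```

One difference worth noting: in the for version, my $j exists only inside the loop, while in this sketch $j is still around afterwards, which is what lets the last line print it.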
So the for loop prints exactly what our original foreach loop prints:

    foreach my $color (@rainbow) {
        print "$color\n";
    }

Why bother? One reason is that sometimes we want to look at only part of an array. For instance, we could print all but the endpoints of the rainbow with:
    for (my $j = 1; $j < @rainbow-1; $j = $j+1) {
        print "$rainbow[$j]\n";
    }

SQ1: What are the endpoints of the rainbow? How does the loop avoid printing them?
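Another reason is that an index lets us compare an item with its neighbor. Suppose we want to count duplicate pairs: items that are the same as the item right next to them. The array

    @a = ("red", "red", "green", "blue", "blue", "red", "orange")

has two: $a[0] and $a[1] are both "red", and $a[3] and $a[4] are both "blue". (We don't count $a[5] as a duplicate because it's not next to another "red" item.)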
Here is a subroutine that counts duplicate pairs:

    sub duplicate_pairs {
        my (@subject) = @_;
        my $count = 0;
        for (my $j = 1; $j < @subject; $j = $j + 1) {
            if ($subject[$j] eq $subject[$j-1]) {
                $count = $count + 1;
            }
        }
        return $count;
    }

SQ2: Why does this loop start from 1 (my $j = 1)? What would happen if that were $j = 0 instead?
    #!/usr/bin/perl -w
    use strict;
    my @rainbow = ("red", "orange", "green", "blue", "indigo", "violet");
    my @a = ("red", "red", "green", "blue", "blue", "red", "orange");
    print "The rainbow has ", duplicate_pairs(@rainbow), " duplicate pairs\n";
    print "The array \@a has ", duplicate_pairs(@a), " duplicate pairs\n";
    # (the duplicate_pairs subroutine shown above goes here)

It prints:
    The rainbow has 0 duplicate pairs
    The array @a has 2 duplicate pairs

Now suppose we want to answer a slightly different question: how many duplicates do we have that are separated by a distance of 2? Or 3? Or some other distance? Our duplicate_pairs subroutine answers the question for a distance of 1.
We can write a more general subroutine, distant_pairs, that takes the distance as its first argument:

    sub distant_pairs {
        my ($distance, @subject) = @_;
        ...
    }

We would call this as distant_pairs(4, @a), for instance, to calculate the number of duplicate pairs in @a with a distance of 4; and distant_pairs(1, @a) should give the same answer as duplicate_pairs(@a).
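The first line of distant_pairs splits the argument list in two: the first argument lands in $distance and everything after it lands in @subject. A small sketch of just that step, using a made-up subroutine named describe_args that unpacks its arguments the same way but does no counting:

```perl
#!/usr/bin/perl -w
use strict;

# describe_args is a hypothetical name used only for this illustration.
sub describe_args {
    my ($distance, @subject) = @_;   # first argument, then all the rest
    my $size = @subject;             # how many items ended up in @subject
    return "distance $distance, $size items";
}

my @a = ("red", "red", "green", "blue", "blue", "red", "orange");
print describe_args(4, @a), "\n";    # prints "distance 4, 7 items"
```

The order matters here: because an array in the list on the left soaks up all remaining arguments, the single value has to come first.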
If we look back at the body of duplicate_pairs,

    my $count = 0;
    for (my $j = 1; $j < @subject; $j = $j + 1) {
        if ($subject[$j] eq $subject[$j-1]) {
            $count = $count + 1;
        }
    }
    return $count;

we see that, not counting the one that increases $count, there is a 1 in three places: in my $j = 1, in $j = $j + 1, and in $subject[$j - 1]. Since this loop finds duplicate pairs with a distance of 1 between them, and the distant_pairs routine is to find pairs that are $distance apart, where $distance may or may not be 1, it seems like changing 1 to $distance in some or all of those three places might do the trick.
    #!/usr/bin/perl -w
    use strict;
    my @a = ("red", "red", "green", "blue", "blue", "red", "orange");
    print "Distance: 1; pairs: ", distant_pairs(1, @a), "\n";
    print "Distance: 2; pairs: ", distant_pairs(2, @a), "\n";
    print "Distance: 3; pairs: ", distant_pairs(3, @a), "\n";
    print "Distance: 4; pairs: ", distant_pairs(4, @a), "\n";
    print "Distance: 5; pairs: ", distant_pairs(5, @a), "\n";
    print "Distance: 6; pairs: ", distant_pairs(6, @a), "\n";
    print "Distance: 7; pairs: ", distant_pairs(7, @a), "\n";
    print "Distance: 8; pairs: ", distant_pairs(8, @a), "\n";
    sub distant_pairs {
        my ($distance, @subject) = @_;
        my $count = 0;
        for (my $j = 1; $j < @subject; $j = $j + 1) {
            if ($subject[$j] eq $subject[$j-1]) {
                $count = $count + 1;
            }
        }
        return $count;
    }

Replace some or all of the 1's in distant_pairs with $distance. The correct program will print
    Distance: 1; pairs: 2
    Distance: 2; pairs: 0
    Distance: 3; pairs: 0
    Distance: 4; pairs: 1
    Distance: 5; pairs: 1
    Distance: 6; pairs: 0
    Distance: 7; pairs: 0
    Distance: 8; pairs: 0

Which places should you replace? (Test by running the changed program. Your humble author got it wrong on his first try.) Why are those the right ones to change?
There is much repetition in the Distance... lines. Perhaps we can help that with a for loop also.
SQ5: Replace the eight lines print "Distance:..., etc. with a for loop that does the same thing.
Here's a possible use for such a program: the equivalent of duplicate pairs in a list of colors is repeated sequences in DNA. Suppose that you had looked at the beginning of a large number of genes, each, of course, beginning with a start codon:
    atpA: ATGAGCATTTCAATTAGACCTGACGAAATCAGCAGTATTATTCAGCAGCA . . .
    atpC: ATGCCTAATCTCAAATCAATACGCGATCGCATTCAGTCGGTCAAAAACAC . . .
    atpD: ATGACAAGTAAAGTAGCAAACACTGAGGTAGCTCAACCTTACGCTCAGGC . . .
    . . .
    zam: ATGGAATTTTCAATCGCTACACTCCTTGCCAATTTCACCGATGATAAATT . . .

You're curious whether there's a pattern of nucleotides within genes. Do A's tend to come in clusters? Or are they spaced in a patterned fashion? You might write a program that counts nucleotides at each position:
    Position    1    2    3    4    5    6    7    8    9  . . .
    A          74    0    0   27   25   34   26   30   39
    C           0    0    0   23   22   18   20   19   15
    G          18    0    0   22   24   13   24   22   14
    T           8  100  100   38   29   35   29   28   32

You also might wonder if A's tend to follow each other directly or with a spacing of 1, or 2 or 3, or 27, or ...?
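A position-by-position count like the one in the table could be produced along the following lines. This is a rough sketch rather than the program behind the table: it uses only the four gene beginnings shown above, cut to their first nine letters, so its numbers will not match the table. It also leans on two things assumed here rather than taken from the text: substr to pull out one letter, and a hash at each position to hold the per-letter counts.

```perl
#!/usr/bin/perl -w
use strict;

# The first nine letters of the four gene beginnings listed above.
my @genes = ("ATGAGCATT", "ATGCCTAAT", "ATGACAAGT", "ATGGAATTT");

# $count[$pos]{$letter} is how many genes have $letter at index $pos.
# Indexes count from 0, so index 0 is position 1 in the table.
my @count;
foreach my $gene (@genes) {
    for (my $pos = 0; $pos < length($gene); $pos = $pos + 1) {
        my $letter = substr($gene, $pos, 1);    # the one letter at $pos
        $count[$pos]{$letter} = ($count[$pos]{$letter} || 0) + 1;
    }
}

# Print one row per nucleotide, one tab-separated column per position.
foreach my $letter ("A", "C", "G", "T") {
    print $letter;
    for (my $pos = 0; $pos < @count; $pos = $pos + 1) {
        print "\t", ($count[$pos]{$letter} || 0);
    }
    print "\n";
}
```

The outer foreach visits each gene in turn, while the inner for loop needs the index itself, since the index is what tells us which position's count to increase.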