Avoiding Small ORFs


We want to ignore ORFs that are shorter than $threshold. Since $orf_length is calculated in orfs_in_direction, that is the sensible place to add the check:
    sub orfs_in_direction {
        my ($direction, $sequence) = @_;
        my @orfs_found;

        # The non-greedy (...)*? stops each match at the first in-frame stop codon.
        while ($sequence =~ /((ATG|GTG|TTG)(...)*?(TGA|TAG|TAA))/g) {

            my $orf_length = length($1);
            my $orf_end    = pos($sequence) - 1;             # index of the last base matched
            my $orf_start  = pos($sequence) - $orf_length;   # index of the first base matched

            push @orfs_found, [$orf_start, $orf_end, $direction];

        }

        return @orfs_found;
    }
The routine as written pushes every ORF onto @orfs_found. To change that we can use an if statement:

    if ($orf_length >= $threshold) {
        push @orfs_found, [$orf_start, $orf_end, $direction];
    }

which tells Perl to execute the push only when $orf_length is large enough.
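
As a minimal sketch of the whole routine with this check in place, the version below can be run on its own. It makes a few assumptions that are not part of the original code: the routine is renamed orfs_in_direction_min, the minimum length is passed in as a third argument rather than read from a variable defined elsewhere, and the test sequence is made up. The pattern uses the non-greedy (...)*? so that each match stops at the first in-frame stop codon.

```perl
use strict;
use warnings;

# Hypothetical variant of orfs_in_direction: the minimum ORF length is
# passed as a third argument (an assumption made for this sketch).
sub orfs_in_direction_min {
    my ($direction, $sequence, $threshold) = @_;
    my @orfs_found;

    while ($sequence =~ /((ATG|GTG|TTG)(...)*?(TGA|TAG|TAA))/g) {
        my $orf_length = length($1);
        my $orf_end    = pos($sequence) - 1;
        my $orf_start  = pos($sequence) - $orf_length;

        # Keep the ORF only when it meets the minimum length.
        if ($orf_length >= $threshold) {
            push @orfs_found, [$orf_start, $orf_end, $direction];
        }
    }

    return @orfs_found;
}

# Made-up sequence: a 9-base ORF, a 3-base spacer, then a 15-base ORF.
my $sequence = 'ATGAAATGA' . 'CCC' . 'ATGAAACCCGGGTAA';
my @orfs     = orfs_in_direction_min('+', $sequence, 12);

# Print start, end and direction of each ORF that was kept.
print join(' ', @$_), "\n" for @orfs;
```

With a minimum length of 12, only the second ORF (positions 12 to 26, counting from 0) is kept; the 9-base ORF at the start of the sequence is matched but never pushed.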

We could also go on to the next iteration of the loop whenever $orf_length is too short:
    while ($sequence =~ /((ATG|GTG|TTG)(...)*?(TGA|TAG|TAA))/g) {

        my $orf_length = length($1);
        my $orf_end    = pos($sequence) - 1;
        my $orf_start  = pos($sequence) - $orf_length;

        next if $orf_length < $threshold;

        push @orfs_found, [$orf_start, $orf_end, $direction];

    }
The next statement tells Perl to skip the remaining statements in the loop body and jump straight back to the while condition, which looks for the next match of the pattern.
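
To check that next only skips the rest of the loop body and does not restart the search, here is a small sketch that counts every match as well as the ones that are kept. The value of $threshold, the $candidates counter, the fixed '+' direction and the test sequence are all assumptions made for the illustration.

```perl
use strict;
use warnings;

my $threshold = 12;                              # assumed minimum ORF length, in bases
my $sequence  = 'ATGAAATGACCCATGAAACCCGGGTAA';   # made-up example sequence

my @orfs_found;
my $candidates = 0;

while ($sequence =~ /((ATG|GTG|TTG)(...)*?(TGA|TAG|TAA))/g) {
    $candidates++;                      # runs for every match, short or long

    my $orf_length = length($1);
    my $orf_end    = pos($sequence) - 1;
    my $orf_start  = pos($sequence) - $orf_length;

    next if $orf_length < $threshold;   # short ORF: the push below is skipped

    push @orfs_found, [$orf_start, $orf_end, '+'];
}

printf "%d candidates, %d kept\n", $candidates, scalar @orfs_found;
```

The loop body runs twice on this sequence, once per match, but only the 15-base ORF reaches the push. Because next leaves pos($sequence) untouched, the second match is searched for from where the first one ended.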