I've recently taken more direct ownership of a big Perl application at work, and in the process of adding a new feature noticed a lot of this type of looping code:
# loop 1
my %foo_types = ();
foreach my $entry_data_ref (@entry_data) {
    $foo_types{ $entry_data_ref->[19] } = $entry_data_ref->[20];
}
my $foo_types_regex = '(' . join('|', keys %foo_types) . ')';
# loop 2
foreach my $data_ref (@data) {
    my $id = $data_ref->[0];
    my $type_cd = $data_ref->[21];
    push(@ids_to_disable, $id) if $type_cd =~ m/^$foo_types_regex$/;
}

I managed to simplify it down to this:
# loop 1
my %foo_types = map { $_->[19] => $_->[20] } @entry_data;
# loop 2
my @ids_to_disable = map $_->[0], grep exists $foo_types{ $_->[21] }, @data;
See the difference? Both loops are concerned with generating one list from another -- the first transforms a list into a hash, the second uses the keys of that hash to determine the contents of another list. This is what map (and its cousin, grep) is built for. The foreach loop in the original hides the intent of the code, and it's likely slower as well, since map knows how many elements it needs to process and can preallocate space for the list it returns.

The original loop 1 also populated a string with hash keys to use as a regular expression later, in order to find out which entries to pull out of @data in loop 2. But again, we have a much more appropriate function for this: exists. Checking for a hash hit is near-instantaneous compared to working through a long regular expression, and again it makes the code smaller and easier to understand and debug.

map and grep come from the more functional-programming side of Perl, and they seem to be underused and under-appreciated by people used to more traditional procedural languages like C and Java. That's unfortunate, because it leads to more bloated code that takes longer to debug and doesn't use the language's full capabilities.
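To check that the two versions really select the same rows, here is a small self-contained sketch that runs both against the same input. The sample rows and the make_row helper are made up for illustration (the real @entry_data and @data come from the application), and only columns 0, 19, 20 and 21 are filled in.

```perl
use strict;
use warnings;

# Hypothetical helper: build a 22-column row with only the given columns set.
sub make_row {
    my (%cols) = @_;
    my @row = ('') x 22;
    $row[$_] = $cols{$_} for keys %cols;
    return \@row;
}

# Made-up sample data, standing in for the application's real arrays.
my @entry_data = (
    make_row(19 => 'AA', 20 => 'first type'),
    make_row(19 => 'BB', 20 => 'second type'),
);
my @data = (
    make_row(0 => 101, 21 => 'AA'),
    make_row(0 => 102, 21 => 'ZZ'),
    make_row(0 => 103, 21 => 'BB'),
    make_row(0 => 104, 21 => 'AAB'),
);

# foreach version: build the hash, join its keys into a regex, then match.
my %foo_types_loop;
foreach my $entry_data_ref (@entry_data) {
    $foo_types_loop{ $entry_data_ref->[19] } = $entry_data_ref->[20];
}
my $foo_types_regex = '(' . join('|', keys %foo_types_loop) . ')';

my @ids_loop;
foreach my $data_ref (@data) {
    my $id      = $data_ref->[0];
    my $type_cd = $data_ref->[21];
    push(@ids_loop, $id) if $type_cd =~ m/^$foo_types_regex$/;
}

# map/grep version: build the hash with map, filter with exists.
my %foo_types      = map { $_->[19] => $_->[20] } @entry_data;
my @ids_to_disable = map $_->[0], grep exists $foo_types{ $_->[21] }, @data;

print "foreach:  @ids_loop\n";
print "map/grep: @ids_to_disable\n";
```

With this sample data both lines list ids 101 and 103. One caveat: the two approaches only agree for plain type codes like these. A key containing regex metacharacters would be treated as a pattern in the foreach version, whereas exists always does an exact string comparison.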

