Day the Third: Badger::Constants

Badger::Constants

Top Close Open

Badger::Constants defines a number of constant values that often crop up in Perl programs. These constants are used throughout the Badger code base. You can also import them into your own programs to help make your code more robust, easier to read and comprehend, or simply to save yourself from typing a few extra characters.

Importing constants from Badger::Constants is easy. Just specify the names of the constants when you use Badger::Constants.

use Badger::Constants 'TRUE', 'FALSE';

Perl provides the qw( ) operator which will automatically quote the words for you.

use Badger::Constants qw( TRUE FALSE );

Badger::Constants goes on step further in allowing you to specify multiple items as single string. It's easier to type and easier to read.

use Badger::Constants 'TRUE FALSE';

Some of the constants are grouped into "tag sets". This allows you to import all of the constants in a group in one go. Tag sets are prefixed with a : colon character. For example, the :values tag set defines TRUE, FALSE and various other values.

use Badger::Constants ':values';

Types and Typos

Top Close Open

The :types tag set defines constants for each of Perl's core data types: SCALAR, ARRAY, HASH, CODE, GLOB, REGEX. All but the last are direct representations of the values they represent. e.g. SCALAR defines the string 'SCALAR', ARRAY is 'ARRAY' and so on. The only exception is REGEX which defines the string Regexp, which is the type name Perl uses for references to regular expression.

So if you've ever written code that looks something like this:

if (ref $data eq 'ARRAY') {
     # do something with the ARRAY reference
}
elsif (ref $data eq 'HASH') {
     # do something with the ARRAY reference
}
...etc..

Then you can instead write it like this:

use Badger::Constants ':types';

if (ref $data eq ARRAY) {
     # do something with the ARRAY reference
}
elsif (ref $data eq HASH) {
     # do something with the ARRAY reference
}
...etc..

Why bother? Well apart from the fact that you save yourself from writing (and subsequently, reading) the additional quote characters around 'ARRAY', 'HASH' and so on, it also offers you protection against mis-spelling a word. If you accidentally type 'ARRRAY' instead of 'ARRAY', or 'Regex' instead of 'Regexp' then you'll be none the wiser until you notice that your program isn't behaving as expected. However, if you write ARRRAY or REGEXP when you really meant ARRAY or REGEX then you'll get a compile time error telling you that you're using an undefined value.

Incidentally, I chose to standardise my code to use REGEX in preference to REGEXP because it's easier to type and say. Most of the Perl programmers I know call them "regexes" (plural of "regex") rather than "regexps". It's certainly what I call them when I'm reading code through in my head, so that's what I write.

Say What You Mean

Top Close Open

Constants can be used to make your code more self-documenting. For example, the FIRST and LAST constants define the values 0 and -1 respectively. They can be used to access the first and last items in an array. For example:

if (ref $array[LAST] eq HASH) {
    # do something
}

In this case we're actually giving ourselves more work by using a constant like LAST instead of just typing -1. The benefit here is that we're being much more explicit about the intent of the operation, rather than the specific implementation. Although most Perl programmers will be familiar with the use of negative array indexing to count backwards from the end of the array, it doesn't hurt to spell it out in plain english words for any non-Perl programmers who might be visiting your code for whatever reason.

I call this semantic code in homage to the semantic web (that's "semantic" with a small 's' - not to be confused with the Semantic Web pipe dream). Whereas -1 is just a number that happens to have a certain meaning when used as the index to an array, the word LAST has a semantic meaning that all English speaking people can agree on. OK, that's not strictly true as "last" can mean "previous" (as in "our last drummer exploded on stage"), "ultimate" (as in "the last train to Clarksville"), and "endure" (as in "the money in the kitty isn't going to last all night with a pub full of thirsty Perl programmers"). But in the context of an accessing an array element, the correct meaning should be obvious.

Constants can also be used to hide complexity that has little or no value in being exposed. For example, the DELIMITER constant defines a regular expression which splits a single string into separate words. It's used in a number of places in Badger including the Badger::Exporter module which Badger::Constants uses to export its constants. We saw earlier how you can write something like this:

use Badger::Constants 'TRUE FALSE';

Behind the scenes this is handled by a bit of code like this:

use Badger::Constants 'DELIMITER';

my @items = split(DELIMITER, $text);

The DELIMITER constant is defined to contain a regular expression that looks like this:

qr/(?:,\s*)|\s+/        # match a comma or whitespace

In addition to whitespace, we also allow commas to be used as delimiters, either with or without trailing whitespace. It's not a particularly complex regular expression, but without looking at it and mentally parsing it there's no indication of what it actually does. Using a regex like this would usually warrant a simple comment (like that above) to help anyone skimming through your code. But why bother with a comment when you can replace the whole regex with a single word constant which describes what it does? This is the essence of self-documenting code.

Symbol Table Manipulation

Top Close Open

The final constant I want to show you is PKG. This one fits more into the "neat hack" category than the others, but I've found it invaluable. It relates to the rather advanced topic of manipulating Perl's symbol tables so don't worry about skipping this section if it means nothing to you.

Let's say we've got an $object of a particular class and we want to lookup a package variable in the correct package for that object. Perl's ref gives us the type of the object which equates to the package name (aka symbol table).

my $pkg = ref $object;

If we want to look up the $DEBUG variable in that package, then we can do it like this:

no strict 'refs';
my $val = ${"$pkg\::DEBUG"};

Or if you prefer, like this:

no strict 'refs';
my $var = ${$pkg.'::DEBUG'}

This is an example of a symbolic reference. We're delving a little further into the guts of Perl than is usual so we need to disable the strict 'refs' safety catch before we start.

The second example above is slightly more efficient than the first. Concatenating several values is faster than interpolating variables into a string, so we'll go with that. We might also want to define the name of the variable in a constant, like so:

use constant DEBUG_VAR => 'DEBUG';
no strict 'refs';
my $var = ${$pkg.'::'.DEBUG_VAR}

And now, the only thing remaining is (drum roll), to use the PKG constant in place of '::'.

my $var = ${$pkg.PKG.DEBUG_VAR}

We end up with something looking more like a dotted variable than a symbolic reference. I find it easier to both read and write.

Tomorrow we'll look at how you can define your own constants library using Badger::Exporter.

Fork Me on Github