2012-03-04

GetSet Methods vs. Public Properties

I was recently having a debate with a coworker over the utility of writing getter and setter methods for protected properties of classes. On the one hand, having getters and setters seems like additional boilerplate and programming overhead for very little gain. On the other hand, exposing the value properties of a class seems like bad encapsulation and will overall lead to code that is more difficult to maintain.

I come down firmly on the get/set method side of the fence. It takes very little extra code, defines the interface to the class explicitly and makes the class easier to extend and maintain in the long run. Let's try an example (in PHP, though the premise holds for any language with public/private properties.)
class PublicValues {
    public $foo;
    public $bar;
}

class GetSet {
    protected $foo;
    protected $bar;

    public function getFoo() {
        return $this->foo;
    }

    public function setFoo($value) {
        $this->foo = $value;
    }

    public function getBar() {
        return $this->bar;
    }

    public function setBar($value) {
        $this->bar = $value;
    }
}

$public = new PublicValues();
$public->foo = array('some'=>'value', 'data'=>'structure');
$public->bar = 123;

$getset = new GetSet();
$getset->setFoo(array('some'=>'value', 'data'=>'structure'));
$getset->setBar(123);

echo "Public is: ".$public->bar." and ".print_r($public->foo, true)."\n";
echo "GetSet is: ".$getset->getBar()." and ".print_r($public->getFoo(), true)."\n";
Obviously, class `PublicValues` is much shorter than class `GetSet`. Shorter is better, right? Less code to maintain, fewer places for bugs to creep in. And it didn't take me nearly as long to type (though, truthfully, `GetSet` didn't take that long to write, either, as it was mostly cut-and-paste.)

Now let's say that now we have decided to wrap the array data structure in a new `Widget` class and we need to ensure that `foo` can only be of type `Widget`. Modify the classes to make this so:
class PublicValues {
    // ... add a whole new method
    public function setFoo(Widget $value) {
        $this->foo = $value;
    }
}

class GetSet {
    // ... only change here is type-hinting
    public function setFoo(Widget $value) {
        $this->foo = $value;
    }
    // ...
}

// ...
$public->setFoo(new Widget(array('some'=>'value', 'data'=>'structure')));
// ...
$getset->setFoo(new Widget(array('some'=>'value', 'data'=>'structure')));
// ...
`PublicVars` had an entirely new method added to it. `GetSet` only had a small modification to an existing method.

Actually, this still isn't perfect; application code can still say `$public->foo = 'whateverIWant';` without going through the `PublicVars::setFoo` method. In fact, while writing this example, I forgot until my third revision to go back and change the sample code to use the set method. Let's make sure that doesn't happen in the future:
class PublicValues {
    // ... change foo so that client code can't set it to the wrong thing
    protected $foo;

    // ... still need to be able to read foo, so now we add a getter
    public function getFoo() {
        return $this->foo;
    }
    // ...
}

// GetSet remains the same ...

// ...
$public->setFoo(new Widget(array('some'=>'value', 'data'=>'structure')));
// ...
echo "Public is: ".$public->bar." and ".print_r($public->getFoo(), true)."\n";
Notice that `GetSet` didn't have to change at all in this case, and the only client code that had to change for it was searching for anywhere calling `GetSet::setFoo` and making sure it provided a `Widget`.

As for `PublicValues`, we'll have to go through all the code making sure that everywhere that was setting `foo` with assignment now calls the set method (with a proper `Widget`), and anywhere that was reading it directly calls the get method.

My future self is screaming at my current self. Now we have two ways of accessing the PublicValues class: one way through direct value reading and writing (bar), and the other through getset methods (foo). As a consumer of this class, I have no way of knowing which properties are openly accessible and which are guarded through getset methods, other than going back and looking through the code.

To solve this problem, I am going to use a method that I see in almost every PHP project of non-trivial size that I have ever worked with or seen the source code of: use magic __get and __set to determine if the property is directly accessible or not:
class PublicValues {
    // ...

    public __get($name) {
        $method = 'get'.$name;
        if (method_exists($this,$method)) {
            return $this->$method();
        } else {
            return $this->$name;
        }
    }

    public __set($name, $value) {
        $method = 'set'.$name;
        if (method_exists($this,$method)) {
            return $this->$method($value);
        } else {
            $this->$name = $value;
        }
    }
}
As before, `GetSet` doesn't change one bit. For `PublicValues`, the rest of the code can continue to use what looks like straight assignment, and if I want to know if the assignment is direct modification or a magic set, I only need to look in only two places to figure it out: look for a named getset method, or assume I am using the magic __get and __set. Perfect...

...until we have to add something to this class that absolutely must not be modified by client code.
class PublicValues {
    // ...
    protected $modificationToThisCrashesTheApp;
    // ...
}

class GetSet {
    // ...
    protected $modificationToThisCrashesTheApp;
    // ...
}
Now we have a big problem. Clients of `GetSet` are protected, because nothing in `GetSet` can be modified except through a class method. We don't provide a method for accessing the dangerous property, so clients can't hurt themselves.

But clients of `PublicValues` can modify any property they want. The magic __set method makes public any protected value that does not have an explicit set method. We could fix this by adding a set method that throws an exception. Or we could solve it by blacklisting that property in the magic __set. Or we could make sure everyone (including new hires six months from now) knows not to touch this property. Or we could solve it by...

Truthfully, it doesn't matter how you solve it. Encapsulation of `PublicValues` has been broken from the beginning. We haven't even tried adding validation to either class, or making the setting of one value affect another, or adding read- or write-only methods. And over the course of just a few changes, we've had to add get/set methods anyway. We've just done it later in the application lifecycle after client code is already using the class, the time when it is most costly to make those sorts of changes.

In a blog post, it's easy to see these changes coming in a few paragraphs and account for them mentally. But imagine this in a large system, one that is constantly having features added, bugs fixed and maintenance performed over the course of months or years. Every change in this example could have come months apart. I don't expect myself or my fellow programmers to keep track of the intricacies of a single class over that timeframe, especially if that class is modified rarely, is one class in hundreds and we've been working in thousands of lines of other code in the meantime.

So do yourself a favor and take the 20 seconds to wrap your properties in boilerplate getset methods from the start. Not because it's a best practice, but because future you will be grateful.

Here is the meme-pic that started the debate: