2012-03-04

GetSet Methods vs. Public Properties

I was recently having a debate with a coworker over the utility of writing getter and setter methods for protected properties of classes. On the one hand, having getters and setters seems like additional boilerplate and programming overhead for very little gain. On the other hand, exposing the value properties of a class seems like bad encapsulation and will overall lead to code that is more difficult to maintain.

I come down firmly on the get/set method side of the fence. It takes very little extra code, defines the interface to the class explicitly and makes the class easier to extend and maintain in the long run. Let's try an example (in PHP, though the premise holds for any language with public/private properties.)
class PublicValues {
    public $foo;
    public $bar;
}

class GetSet {
    protected $foo;
    protected $bar;

    public function getFoo() {
        return $this->foo;
    }

    public function setFoo($value) {
        $this->foo = $value;
    }

    public function getBar() {
        return $this->bar;
    }

    public function setBar($value) {
        $this->bar = $value;
    }
}

$public = new PublicValues();
$public->foo = array('some'=>'value', 'data'=>'structure');
$public->bar = 123;

$getset = new GetSet();
$getset->setFoo(array('some'=>'value', 'data'=>'structure'));
$getset->setBar(123);

echo "Public is: ".$public->bar." and ".print_r($public->foo, true)."\n";
echo "GetSet is: ".$getset->getBar()." and ".print_r($public->getFoo(), true)."\n";
Obviously, class `PublicValues` is much shorter than class `GetSet`. Shorter is better, right? Less code to maintain, fewer places for bugs to creep in. And it didn't take me nearly as long to type (though, truthfully, `GetSet` didn't take that long to write, either, as it was mostly cut-and-paste.)

Now let's say that now we have decided to wrap the array data structure in a new `Widget` class and we need to ensure that `foo` can only be of type `Widget`. Modify the classes to make this so:
class PublicValues {
    // ... add a whole new method
    public function setFoo(Widget $value) {
        $this->foo = $value;
    }
}

class GetSet {
    // ... only change here is type-hinting
    public function setFoo(Widget $value) {
        $this->foo = $value;
    }
    // ...
}

// ...
$public->setFoo(new Widget(array('some'=>'value', 'data'=>'structure')));
// ...
$getset->setFoo(new Widget(array('some'=>'value', 'data'=>'structure')));
// ...
`PublicVars` had an entirely new method added to it. `GetSet` only had a small modification to an existing method.

Actually, this still isn't perfect; application code can still say `$public->foo = 'whateverIWant';` without going through the `PublicVars::setFoo` method. In fact, while writing this example, I forgot until my third revision to go back and change the sample code to use the set method. Let's make sure that doesn't happen in the future:
class PublicValues {
    // ... change foo so that client code can't set it to the wrong thing
    protected $foo;

    // ... still need to be able to read foo, so now we add a getter
    public function getFoo() {
        return $this->foo;
    }
    // ...
}

// GetSet remains the same ...

// ...
$public->setFoo(new Widget(array('some'=>'value', 'data'=>'structure')));
// ...
echo "Public is: ".$public->bar." and ".print_r($public->getFoo(), true)."\n";
Notice that `GetSet` didn't have to change at all in this case, and the only client code that had to change for it was searching for anywhere calling `GetSet::setFoo` and making sure it provided a `Widget`.

As for `PublicValues`, we'll have to go through all the code making sure that everywhere that was setting `foo` with assignment now calls the set method (with a proper `Widget`), and anywhere that was reading it directly calls the get method.

My future self is screaming at my current self. Now we have two ways of accessing the PublicValues class: one way through direct value reading and writing (bar), and the other through getset methods (foo). As a consumer of this class, I have no way of knowing which properties are openly accessible and which are guarded through getset methods, other than going back and looking through the code.

To solve this problem, I am going to use a method that I see in almost every PHP project of non-trivial size that I have ever worked with or seen the source code of: use magic __get and __set to determine if the property is directly accessible or not:
class PublicValues {
    // ...

    public __get($name) {
        $method = 'get'.$name;
        if (method_exists($this,$method)) {
            return $this->$method();
        } else {
            return $this->$name;
        }
    }

    public __set($name, $value) {
        $method = 'set'.$name;
        if (method_exists($this,$method)) {
            return $this->$method($value);
        } else {
            $this->$name = $value;
        }
    }
}
As before, `GetSet` doesn't change one bit. For `PublicValues`, the rest of the code can continue to use what looks like straight assignment, and if I want to know if the assignment is direct modification or a magic set, I only need to look in only two places to figure it out: look for a named getset method, or assume I am using the magic __get and __set. Perfect...

...until we have to add something to this class that absolutely must not be modified by client code.
class PublicValues {
    // ...
    protected $modificationToThisCrashesTheApp;
    // ...
}

class GetSet {
    // ...
    protected $modificationToThisCrashesTheApp;
    // ...
}
Now we have a big problem. Clients of `GetSet` are protected, because nothing in `GetSet` can be modified except through a class method. We don't provide a method for accessing the dangerous property, so clients can't hurt themselves.

But clients of `PublicValues` can modify any property they want. The magic __set method makes public any protected value that does not have an explicit set method. We could fix this by adding a set method that throws an exception. Or we could solve it by blacklisting that property in the magic __set. Or we could make sure everyone (including new hires six months from now) knows not to touch this property. Or we could solve it by...

Truthfully, it doesn't matter how you solve it. Encapsulation of `PublicValues` has been broken from the beginning. We haven't even tried adding validation to either class, or making the setting of one value affect another, or adding read- or write-only methods. And over the course of just a few changes, we've had to add get/set methods anyway. We've just done it later in the application lifecycle after client code is already using the class, the time when it is most costly to make those sorts of changes.

In a blog post, it's easy to see these changes coming in a few paragraphs and account for them mentally. But imagine this in a large system, one that is constantly having features added, bugs fixed and maintenance performed over the course of months or years. Every change in this example could have come months apart. I don't expect myself or my fellow programmers to keep track of the intricacies of a single class over that timeframe, especially if that class is modified rarely, is one class in hundreds and we've been working in thousands of lines of other code in the meantime.

So do yourself a favor and take the 20 seconds to wrap your properties in boilerplate getset methods from the start. Not because it's a best practice, but because future you will be grateful.

Here is the meme-pic that started the debate:



7 comments:

  1. Your example only shows thatloosely typed languages have limits. It has nothing to do with the intend of using accessors.

    This accessor debate occured in the Java community over 10 years ago. Remember that the accessors were only invented to fight a problem in some distributed object frameworks where the "internal state" of an object could be changed during "transportation".

    Sorry to say it, but accessors have no meaning in PHP and a well designed project do not need them.

    ReplyDelete
  2. No need to apologize :) You shouldn't be sorry to say something you are prepared to stand behind.

    Public/private properties on objects have been around in OOP languages since before Java. The claim that accessors and mutators were a Java invention seems dubious to me, but I admit I have no historical reference on the subject. Do you have any sources for that?

    Let's talk about PHP. You say that a well-designed PHP project does not need accessors. I came up with an example above that shows a project evolving over time. In one case, a class uses accessor methods to control its state, and the modifications to support each changed requirement are minimal. In the other case, we start with a class that uses public properties to manage state, and for each change, the class is heavily modified as is any client code that uses it. In the beginning, the public properties seemed like a good design decision, but that didn't hold up over time. And we ended up with a class where some of the state was exposed via accessors/mutators and some of it through direct property access. That doesn't seem like a very good interface to me. 

    What would you have done to keep modifications of the public properties class to a minimum and still managed to meet the new requirements of the project? Where do you draw the line between what is read/written via an accessor and what can be accessed directly? And does that line move as the project evolves?

    My hypothesis is that as soon as you have one public property on a class that requires being wrapped in a getset, you might as well wrap the remaining public properties as well. I have seen enough projects do it that way and benefit, and do it the other way and cost themselves more in the long run.

    ReplyDelete
  3. These are 2 different religions (Accessors vs ???). It is impossible to make a clear winner.

    I have been an accessor fan during the first 10 years of my dev career. However, after a wise discussion with an "famous expert", I finally realized that it was wrong.

    The problem was that I was not properly understanding what an object is. Therefore I was using properties, methods and its visibility (for both) not for what they was created for. I was simply trying to solve inexisting problems using the wrong method.

    At that time, everybody wanted to go into j2ee. J2ee developpers were kings... and without understanding why, people started to copy what we were reading on the web. But it was a trend, a real solution to a non-problem (outsied j2ee).

    Actually your article is good, but to demonstrate the weaknesses of a loosely typed language and that refactoring is a pain in the a** if you work with these languages. If php would allow the typing of object properties, the problem would be solved. :)

    Here is an interesting link: http://www.martinhunter.co.nz/articles/OnTheUseAndMisuseOfAccessorMethodsInOO.pdf

    ReplyDelete
  4. One option in favor of "you don't need accessors" is that instead of asking object's data from it, you'll ask it to do the your work. But this is quite different from the approach that PHP developers have been used to. A popular thing going on right now is to favor light models and use service layers to do the work for you. Model should not contain any logic, but only the logical separation of its data. But now, from where should to working unit get it's data if not from the light model itself so we'll favor setters and getters.

    I'm in favor of accessors and strong encapsulation. I'll even define all attributes for class private and not protected and public accessors for it. If it's acquired, i'll changes the visibility from private to protected - such as you would changed visibility from public to protected. Quite inverted ;)

    ReplyDelete
  5. I completely agree with having an object to do work for you, "tell, don't ask." But at some point, you need to get data into and out of the object so you can present it to users or other consuming clients. Data is pretty worthless if no one can read it. I do the same thing as you mentioned when I design a class: make everything protected/private and don't bother writing a public accessor for it unless absolutely necessary.

    Thanks for the response!

    ReplyDelete
  6. Based on your own linked PDF, you shouldn't be using public properties either because it breaks OOP encapsulation. (I also don't totally agree with all of their reasons against mutator methods.) Imagine if you need to change the type of the public property to something else internally. Anything using that public properties are now broken. Using mutator methods can help keep this from happening by encapsulating how the object internally represents the values.

    As a silly example of this, imagine if the variables ($foo and $bar) in this post were  "options" to the object. Imagine now you have several other "options" all in different variables. If you were to refactor and put all of these "options" into an options array, anything accessing these as public properties are now broken and must be updated in every single place that uses them. If you were using mutator methods, you only have to change the original object itself and everything using this object and the mutator methods still work. No major refactoring needed except for the object itself.

    ReplyDelete