2011-10-04

A Humbling Reference

Sometimes I have an experience that reminds me of how near-sighted I can get when I'm head-down in a problem.

What do you think the output of the following code will be (no cheating by running it first!):
<?php
$a = array(array("result" => "a"));
for ($i=0; $i < 2; $i++) {
    foreach ($a as $j => $v) {
        $v["result"] .= "a";
        echo $v["result"] ."\n";
    }
}
foreach ($a as $k => $v) echo $v["result"] ."\n";
If the goal is to add the character "a" twice onto each result element of each sub-array, at first glance this seems like a perfectly reasonable way to go about it.

But hold up! The inner loop prints "aa" on both passes instead of "aa" and then "aaa", so each time through we only seem to have added one "a" onto the result. And the final foreach loop prints plain "a", as if we never modified the value at all! What gives?

Well, it might be obvious to most of you, but when it cost me an hour and a half of my work day, it was buried in a nest of process forking and stream handling, part of which looked something like this:
while ($currentExecution) {
    foreach ($currentExecution as $i => $handle) {
        $handle['output'] .= stream_get_contents($handle['outstream']);
        $status = proc_get_status($handle['process']);
        if (!$status['running']) {
            fclose($handle['outstream']);
            proc_close($handle['process']);
            echo $handle['output']."\n";
            unset($currentExecution[$i]);
        }
    }
}
Same concept as before: I have an array of process handles (opened with proc_open) and I'm looping through, gathering the output of each process as it is generated, until the process is finished running, at which point I close the process and display the output. The symptom was that, sometimes, I would only see the end of the output and not the beginning.

This happened whenever a process handle went through the outer while loop more than once, just as in the contrived example above. When you work with a value inside a foreach loop, you are working with a copy of the array element (technically PHP copies lazily, on the first write). Modifications to that copy never reach the array, so each pass of the outer while loop starts again from the original, unmodified element.
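To see the copy in isolation, here is a minimal sketch that does by hand what foreach does on each iteration. It reuses the $a array from the contrived example; nothing else is assumed.

```php
<?php
$a = array(array("result" => "a"));

// Plain assignment copies the inner array (lazily, on the first write).
$v = $a[0];
$v["result"] .= "a";

echo $v["result"] . "\n";      // aa  (the copy changed)
echo $a[0]["result"] . "\n";   // a   (the original did not)

// foreach by value performs the same assignment on every iteration,
// so each pass starts again from the unmodified element.
foreach ($a as $j => $v) {
    $v["result"] .= "a";
}
echo $a[0]["result"] . "\n";   // a
```

The array only ever sees writes that go through $a itself or through a reference to one of its elements.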

As it turns out, the solution here is quite simple. Actually, there are several solutions. Here's the easiest one, applied to the original example:
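<?php
$a = array(array("result" => "a"));
for ($i=0; $i < 2; $i++) {
    foreach ($a as $j => &$v) {
        $v["result"] .= "a";
        echo $v["result"] ."\n";
    }
    // $v still references the last element; break the link before reusing it.
    unset($v);
}
foreach ($a as $k => $v) echo $v["result"] ."\n";
The difference is subtle: it's the "&" prepended to the $v in the foreach loop. This tells the loop to bind $v to the array element itself instead of a copy. The unset($v) after the loop matters because $v stays bound to the last element once the loop ends, and reusing the name later (as the final foreach does) would otherwise write into the array. Here's another possible solution: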
<?php
$a = array(array("result" => "a"));
for ($i=0; $i < 2; $i++) {
    foreach ($a as $j => $v) {
        $a[$j]["result"] .= "a";
        echo $a[$j]["result"] ."\n";
    }
}
foreach ($a as $k => $v) echo $v["result"] ."\n";
In this case, we're not even using the variable $v. We're using the array index to refer to the value inside of the array directly. Because we're modifying the actual array value and not the copy, the changes persist through the loop.
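One caveat on the by-reference version shown earlier is worth a quick illustration, since it only bites with more than one element. The two-element arrays and the names $b and $w below are made up for the demonstration.

```php
<?php
// Without unset(): $v stays bound to the last element after the loop.
$a = array(array("result" => "a"), array("result" => "b"));
foreach ($a as $j => &$v) {
    $v["result"] .= "!";
}
foreach ($a as $k => $v) {
    // Every assignment to $v here writes into $a[1].
}
echo $a[1]["result"] . "\n";   // a!  (the last element was overwritten)

// With unset(): the reference is broken, so reusing the name is safe.
$b = array(array("result" => "a"), array("result" => "b"));
foreach ($b as $j => &$w) {
    $w["result"] .= "!";
}
unset($w);
foreach ($b as $k => $w) {
    // $w is an ordinary variable again.
}
echo $b[1]["result"] . "\n";   // b!
```

Indexing into the array, as in the second solution, sidesteps this entirely because no reference is ever created.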

The final solution (and the one I ended up going with) is to make the value an object instead of an array:
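<?php
// Create the object explicitly; assigning a property on an undefined
// variable raises a warning in PHP 7 and an Error in PHP 8.
$obj = new stdClass();
$obj->result = "a";
$a = array($obj);
for ($i=0; $i < 2; $i++) {
    foreach ($a as $j => $v) {
        $v->result .= "a";
        echo $v->result ."\n";
    }
}
foreach ($a as $k => $v) echo $v->result ."\n";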
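A PHP variable never holds an object directly; it holds a handle that points to the object. The foreach loop still copies the value, but the copy is just another handle to the same object. So modifying the object's property inside the loop modifies the actual object, not a copy.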
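For completeness, here is a rough sketch of how the object approach might look in the process loop from earlier. It is not the original code: the two echo commands, the stdClass handles, the non-blocking reads, the final drain before closing, the short sleep, and the $finished array are all assumptions made so the example can run on its own.

```php
<?php
// Hypothetical commands; any short-lived commands would do.
$commands = array('echo first', 'echo second');

$currentExecution = array();
foreach ($commands as $command) {
    $pipes = array();
    $process = proc_open($command, array(1 => array('pipe', 'w')), $pipes);
    if ($process === false) {
        continue; // could not start this command
    }
    // Return immediately from reads when no output is available yet.
    stream_set_blocking($pipes[1], false);

    $handle = new stdClass();
    $handle->process = $process;
    $handle->outstream = $pipes[1];
    $handle->output = '';
    $currentExecution[] = $handle;
}

$finished = array();
while ($currentExecution) {
    foreach ($currentExecution as $i => $handle) {
        // $handle points at the same object on every pass,
        // so this append is not lost between passes.
        $handle->output .= stream_get_contents($handle->outstream);
        $status = proc_get_status($handle->process);
        if (!$status['running']) {
            // Drain anything written between the read above and the exit.
            stream_set_blocking($handle->outstream, true);
            $handle->output .= stream_get_contents($handle->outstream);
            fclose($handle->outstream);
            proc_close($handle->process);
            echo $handle->output;
            $finished[] = $handle->output;
            unset($currentExecution[$i]);
        }
    }
    usleep(10000); // avoid a tight busy loop while waiting
}
```

The by-reference or index-based fixes would work here too; objects just make it harder to reintroduce the bug, since there is no "&" to forget.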

The most annoying thing about this whole experience? I had solved this problem for someone else a few weeks ago, telling them what the issue was just from hearing them describe the situation and without even seeing their code run. So the deeper lesson here is this: when you've been staring at code for so long that it's making you cross-eyed and frustrated, walk away. And most importantly, get someone else who hasn't been working on it to look it over.

...something something forest...something something trees...

Sample code available at http://gist.github.com/1263522