2012-08-31

Plumbing PHP

When building a project with Neo4j or most other graph databases, it is hard to avoid learning about TinkerPop's excellent Gremlin graph processing language. The processing layer of Gremlin is built on top of Pipes, a dataflow programming library.

I was inspired by the syntax and ease-of-use of Gremlin to build a simple processing pipeline library in PHP. The result is Plumber, a library for easily building extensible deferred-processing pipelines.

Plumber is built on top of PHP's native Iterator library. The idea is simple: instantiate a new processing pipeline, attach processing pipes to it, then send an iterator through the pipeline and iterate over the results in a foreach. Each element that comes out the other end of the pipeline has been passed through and processed by each pipe. And because Plumber uses iterators, it natively supports lazy-loading and Just-In-Time evaluation.
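To make the lazy-evaluation point concrete, here is a minimal sketch that uses only native PHP iterators and no Plumber code at all. It assumes PHP 7.1 or later; generators arrived in PHP 5.5, after this post was written. The `doubled()` function and the `$log` array are hypothetical names invented for the illustration.

```php
<?php
// Records which values have actually been processed, to make laziness visible.
$log = [];

// A generator is a native Iterator; its body only runs as values are requested.
function doubled(iterable $source, array &$log): Generator
{
    foreach ($source as $value) {
        $log[] = $value;   // runs only when the consumer asks for this element
        yield $value * 2;
    }
}

// Creating the generator does not process anything yet; $log is still empty.
$results = doubled(new ArrayIterator([1, 2, 3]), $log);

$firstTwo = [];
foreach ($results as $result) {
    $firstTwo[] = $result;
    if (count($firstTwo) === 2) {
        break;             // stop early: the third element is never processed
    }
}
// $firstTwo is now [2, 4] and $log is [1, 2]
```

Each element is pulled through the processing step only when the foreach asks for it, which is the behavior a pipeline built on iterators inherits.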

A simple example would be reading a set of records from a database, formatting them in some manner, then echoing them to the screen:
$users = fetchUsers(); // hypothetical helper: retrieve user records from a database as an array or Iterator

$names = array();
foreach ($users as $user) {
    if (!$user['first_name'] || !$user['last_name']) {
        continue;
    }

    $name = $user['first_name'] . ' ' . $user['last_name'];
    $name = ucwords($name);
    $name = htmlentities($name);
    $names[] = $name;
}

// later on, display the names
foreach ($names as $name) {
    echo "$name<br>";
}
There are a few obvious downsides to doing things this way: the entire set of records is looped through more than once; all the records must be in memory at the same time (twice even, once for $users and once for $names); and the processing steps in the foreach are executed immediately on every record, whether or not the names are ever displayed. These may not seem like a big deal if the record set is small and the processing steps are trivial, but they become real problems as the record set grows or the processing gets expensive.

Here is the same code using Plumber:
$users = fetchUsers(); // hypothetical helper: retrieve user records from a database as an array or Iterator

$names = new Everyman\Plumber\Pipeline();
$names->filter(function ($user) {
        return $user['first_name'] && $user['last_name'];
    })
    ->transform(function ($user) {
        return $user['first_name'] . ' ' . $user['last_name'];
    })
    ->transform('ucwords')
    ->transform('htmlentities');

// later on, display the names
foreach ($names($users) as $name) {
    echo "$name<br>";
}
The list of $users is only looped through one time, and there is no need to keep a separate list of $names in sync with the $users list. Each $user is transformed into a $name on-demand, keeping resources free.
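For readers curious how a fluent, deferred pipeline like this can work, here is a rough sketch. It is not Plumber's source code: `MiniPipeline` is a hypothetical class written for this illustration, it assumes PHP 7.1 or later, and it uses a generator where Plumber itself builds on Iterator classes. The sample records are made up.

```php
<?php
// Hypothetical illustration only; this is not how Plumber is implemented.
final class MiniPipeline
{
    /** @var array ordered list of [type, callback] steps */
    private $steps = [];

    public function filter(callable $keep): self
    {
        $this->steps[] = ['filter', $keep];
        return $this;   // returning $this is what allows fluent chaining
    }

    public function transform(callable $map): self
    {
        $this->steps[] = ['transform', $map];
        return $this;
    }

    // Makes the object callable, as in $names($users).
    public function __invoke(iterable $source): Generator
    {
        foreach ($source as $item) {
            foreach ($this->steps as [$type, $callback]) {
                if ($type === 'filter') {
                    if (!$callback($item)) {
                        continue 2;   // drop this item, move on to the next one
                    }
                } else {
                    $item = $callback($item);
                }
            }
            yield $item;   // handed to the consumer one item at a time
        }
    }
}

// Made-up sample data standing in for database records.
$users = [
    ['first_name' => 'ada', 'last_name' => 'lovelace'],
    ['first_name' => '', 'last_name' => 'nobody'],
    ['first_name' => 'tom', 'last_name' => 'smith & sons'],
];

$names = (new MiniPipeline())
    ->filter(function ($user) {
        return $user['first_name'] && $user['last_name'];
    })
    ->transform(function ($user) {
        return $user['first_name'] . ' ' . $user['last_name'];
    })
    ->transform('ucwords')
    ->transform('htmlentities');

foreach ($names($users) as $name) {
    echo "$name<br>";
}
```

Attaching a step only records a callback. Nothing runs until the foreach pulls the first item, and each record passes through every step before the next record is read.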

This can all be accomplished using Iterators, but there is quite a bit of boilerplate code involved. Plumber is meant to remove most of the boilerplate and let the developer concentrate on writing their business logic.
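As a rough idea of what that boilerplate looks like, here is a sketch of the same name-formatting steps using only SPL. `CallbackFilterIterator` and `IteratorIterator` are standard SPL classes; `MapIterator` is a hypothetical helper written for this example, since as far as I know SPL has no built-in mapping iterator. The sample records are made up.

```php
<?php
// Hand-written mapping iterator: applies a callback to each element on access.
final class MapIterator extends IteratorIterator
{
    private $callback;

    public function __construct(Traversable $inner, callable $callback)
    {
        parent::__construct($inner);
        $this->callback = $callback;
    }

    #[\ReturnTypeWillChange]
    public function current()
    {
        return call_user_func($this->callback, parent::current());
    }
}

// Made-up sample data standing in for database records.
$users = new ArrayIterator([
    ['first_name' => 'ada', 'last_name' => 'lovelace'],
    ['first_name' => '', 'last_name' => 'nobody'],
    ['first_name' => 'tom', 'last_name' => 'smith & sons'],
]);

// Each step wraps the previous iterator; nothing is evaluated yet.
$names = new CallbackFilterIterator($users, function ($user) {
    return $user['first_name'] && $user['last_name'];
});
$names = new MapIterator($names, function ($user) {
    return $user['first_name'] . ' ' . $user['last_name'];
});
$names = new MapIterator($names, 'ucwords');
$names = new MapIterator($names, 'htmlentities');

foreach ($names as $name) {
    echo "$name<br>";
}
```

The behavior is the same as the pipeline version, but every step needs an explicit wrapper object, and the helper class has to be written and maintained by hand.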

There is more to Plumber, including several built-in pipe types and the ability to extend the library with your own custom pipes. It is also not necessary to use the fluent interface if that is not your style. More usage information can be found in the README file in the Plumber GitHub repo. Constructive feedback is always welcome!

Comments:

  1. José Antonio García Díaz, October 13, 2012 at 9:56 AM

    I think it's a great idea and good software. Congratulations.

  5. I guess how this entire plumber setup really works is that it forms a chain which is made up of a series of processes that could be something continuous or a few steps that vary from one another depending on the requirement of the overall setup. I am not very familiar with this type of programming language but I can say one thing for sure is that this set of explanation works well and is quite clear in clarifying doubts for those who are new in this field just like me. The supporting texts also add on to the clarification well.
