2013-05-01

Serializing Data Like a PHP Session

PHP has several built-in ways of serializing/unserializing data. The most cross-platform is json_encode; pretty much every programming stack can JSON decode data that has been encoded by any other stack. There's also PHP's native serialize function, which is not as cross-platform, but has the added benefit of being able to store and restore PHP objects to their original class.
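To make the trade-off concrete, here is a small sketch comparing the two. The Point class is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical class used only for illustration.
class Point
{
    public $x;
    public $y;

    public function __construct($x, $y)
    {
        $this->x = $x;
        $this->y = $y;
    }
}

$point = new Point(1, 2);

// JSON is portable, but the class information is lost on the way back:
// json_decode() hands back a generic stdClass object.
$json = json_encode($point);        // {"x":1,"y":2}
$fromJson = json_decode($json);

// serialize() is PHP-specific, but unserialize() restores a real Point.
$native = serialize($point);        // O:5:"Point":2:{s:1:"x";i:1;s:1:"y";i:2;}
$fromNative = unserialize($native);
```

After running this, $fromJson is a stdClass instance while $fromNative is a Point again.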

There's a third, much lesser-known PHP serialization format: the one PHP uses to store session data. If you have ever popped open a PHP session file, or stored session data in a database, you may have noticed that this serialization looks very similar to the serialize function's output, but it is not the same.
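A quick way to see the difference side by side is sketched below. It assumes the default session.serialize_handler setting (php), and it overwrites $_SESSION, so treat it as a throwaway demo only.

```php
<?php
// Start a session before producing any output; session_encode() needs one.
if (session_status() !== PHP_SESSION_ACTIVE) {
    session_start();
}

$data = array('name' => 'alice', 'visits' => 3);

// serialize() wraps everything in one array structure.
$asSerialized = serialize($data);
// a:2:{s:4:"name";s:5:"alice";s:6:"visits";i:3;}

// The session format writes each top-level key as "key|" followed by the
// serialize()d value, with no outer array wrapper.
$_SESSION = $data;               // demo only: this replaces the real session data
$asSession = session_encode();
// name|s:5:"alice";visits|i:3;
```

The values themselves are encoded the same way in both outputs; only the top-level framing differs, which is why the two are so easy to confuse.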

Recently, I needed to serialize data so that it looked like PHP session data (don't ask why; I highly suggest not doing this if it can be avoided). It turns out PHP has a function that encodes data in this format: session_encode. Great! I'll just pass my array of data to it and...

Oh wait. session_encode doesn't accept any arguments. You can't pass data to it. It just takes whatever is in the $_SESSION superglobal and serializes it. There is no built-in function in all of PHP that will serialize arbitrary data for you the same way that it would be serialized into a session.

There are a few userland implementations of PHP's built-in session serialization, mainly built around string splitting and regexes. All of them handle scalar values; some handle single-level arrays. None of them handle nested arrays and objects, and some have trouble if your data contains certain characters that are used in the encoding.
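As an illustration of the kind of trouble string splitting runs into, consider a hand-written session string whose first value contains a pipe, the same character the format uses to separate keys from values. This is a sketch of the failure mode, not a copy of any particular userland implementation.

```php
<?php
// A hand-written session string with two keys: "note" and "count".
// The string value of "note" happens to contain a pipe.
$encoded = 'note|s:3:"a|b";count|i:2;';

// Naive approach: treat every "|" as a key/value separator.
$parts = explode('|', $encoded);

// Three pipes produce four fragments instead of two clean key/value pairs:
//   'note', 's:3:"a', 'b";count', 'i:2;'
// The string value has been cut in half, and "count" is glued to its tail.
```

A correct parser has to honor the length prefixes (the 3 in s:3) rather than scan for delimiters, and that gets complicated quickly once arrays and objects are nested.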

So I came up with my own functions for reading and writing arbitrary session data without overwriting the existing session. (Edit: I later noticed that a few people suggest a similar method in the comments on the PHP manual pages for session_encode and session_decode.)
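A minimal sketch of such functions follows. The names session_serialize_data and session_unserialize_data are hypothetical and not taken from the original post, and the sketch assumes the default session.serialize_handler.

```php
<?php
/**
 * Serialize an array into PHP's session format.
 * Hypothetical name; requires an active session.
 */
function session_serialize_data(array $data)
{
    $backup = $_SESSION;          // keep the real session contents
    $_SESSION = $data;            // temporarily swap in our data
    $encoded = session_encode();  // let PHP do the encoding
    $_SESSION = $backup;          // put the real session back
    return $encoded;
}

/**
 * Decode a session-format string into an array.
 * Hypothetical name; requires an active session.
 */
function session_unserialize_data($encoded)
{
    $backup = $_SESSION;
    $_SESSION = array();
    session_decode($encoded);     // populates $_SESSION
    $data = $_SESSION;
    $_SESSION = $backup;
    return $data;
}

// Usage: start a session first, before any output is sent.
if (session_status() !== PHP_SESSION_ACTIVE) {
    session_start();
}
$_SESSION = array('user' => 'existing');

$payload = array('cart' => array('items' => array(1, 2), 'total' => 9));
$encoded = session_serialize_data($payload);
$decoded = session_unserialize_data($encoded);
```

After the round trip, $decoded matches $payload and $_SESSION still holds only the original user entry.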


Using these functions requires an active session (session_start must have already been called). Edit: Thanks to Rasmus Schultz for also pointing out that session_encode might be disabled on some systems due to security concerns.
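Both preconditions can be checked up front. The helper below is a hypothetical sketch (the name can_use_session_encoding is mine); it looks at the disable_functions ini setting as well as function_exists, since I would not rely on either check alone across PHP versions.

```php
<?php
/**
 * Hypothetical helper: report whether the session-format functions
 * can be used right now.
 */
function can_use_session_encoding()
{
    // disable_functions is a comma-separated php.ini list.
    $disabled = array_map('trim', explode(',', (string) ini_get('disable_functions')));

    foreach (array('session_encode', 'session_decode') as $fn) {
        if (!function_exists($fn) || in_array($fn, $disabled, true)) {
            return false;
        }
    }

    // session_status() is available from PHP 5.4 onwards.
    return session_status() === PHP_SESSION_ACTIVE;
}
```

Calling this before attempting any encoding lets the caller fall back to another format, or fail with a clear message, instead of tripping over a warning halfway through.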

PHP already has a built-in way to serialize and unserialize session data. The problem is that it only serializes from and unserializes into the $_SESSION superglobal, and we probably don't want to overwrite the current $_SESSION. So we hold a copy of whatever data is already in $_SESSION, swap in our own data to perform the serialization, then restore the original afterwards. And because we're using PHP's built-in session serialization, we get nested array and object serialization for free, without having to write our own parser.

Comments:

  1. Note that for security reasons, session_encode() may be turned off on some systems.

    Another interesting serialization technique is to use var_export() and write to a flat .php file - then include/require that file to "unserialize". With a bytecode cache configured, this also happens to perform very well. Not a good idea for objects necessarily, but for data (arrays/strings/numbers) it may be an option.

    Also consider this library if you need to serialize entire object graphs to JSON:

    https://github.com/mindplay-dk/jsonfreeze

    100 ways to skin a cat :-)

  2. There are definitely better ways to serialize data than PHP session format. And I recommend exhausting each and every one of them before falling back to this method. Unfortunately, for my specific use-case, there was no other option. The jsonfreeze library looks pretty neat; I'll put it on my list of libraries to play with.
