2011-05-16

Logging User Sessions Across Requests

When tracking down a bug, few things aid the process more than good application logs. And this is especially true when the bug has already escaped to your production systems. In these cases, you may not be able to inject test data into your database, cowboy-code in some `var_dump`s or perform the same action over and over again. Logs can save a lot of time and effort when tracking down issues in a live application.

But logs also have a few shortcomings. In an active application with multiple users, all the log messages for every user are piled together in the same log file. If the application is multi-user (like most web applications), log messages from different requests can be interwoven. This makes it a nightmare to try and track the log messages for a single user request, let alone messages across several requests by the same user.

One way to handle this is to put a request-specific identifier in every log message. But I shouldn't have to remember to append or prepend the identifier to their log messages. I'd rather have it happen automatically, without me or my teammates having to think about it.

Here's a method we've been using to try and untangle the mess and retain the usefulness of our logs. The code uses Zend's logging component, but can easily be adapted to other log systems.

First, we start up a log and tell it how to format each log line.
$format = "%timestamp% [%logId%]: %message%" . PHP_EOL;
$formatter = new Zend_Log_Formatter_Simple($format);

$writer = new Zend_Log_Writer_Stream('/path/to/application.log');
$writer->setFormatter($formatter);

$log = new Zend_Log();
$log->addWriter($writer);
In the format string, names wrapped in "%" signs are log variables that will be filled in when a message is logged. "timestamp" and "message" are provided by Zend and will be filled with the appropriate values when the logging call is made. "logId" is a custom symbol, which means we need to tell the log object what value to give it.

In order to track all the log messages across a single user request, we need to pick a unique value for the log id:
// The last 8 characters of a uniqid are unique enough for our purposes
$logId = substr(uniqid(), -8);
$log->setEventItem('logId', $logId);
Now we wait for a few users to make requests at the same time:
// A web request
$log->info('The user made some request');
$log->info('Here is some output of that request: '.$output);

// A different, simultaneous request from another user
$log->info('The user made some request');
$log->info('Here is some output of that request: '.$output);

// Another request from the first user
$log->info('Another user request');
Here's the log output:
2011-05-16 19:04 [a73fe12a]: The user made some request
2011-05-16 19:04 [24f90ae1]: The user made some request
2011-05-16 19:04 [24f90ae1]: Here is some output of that request: Hello, World
2011-05-16 19:04 [a73fe12a]: Here is some output of that request: Hello, World
2011-05-16 19:04 [feb40a52]: Another user request
Now we can easily track a single request's messages in the log, even if multiple requests happen at the same time.

But we can do better. Often times, bugs are not the result of a single user request, but of a sequence of requests. If we only track logs on a per request basis, we haven't helped ourselves track request-spanning bugs. Luckily, we already have a value that uniquely identifies a user across multiple requests: their PHP session id. We can use it, in addition to a request identifier:
// The first 8 characters of the session id are unique enough
session_start();
$sessId = substr(session_id(), 0, 8);
$requestId = substr(uniqid(), -8);
$logId = $sessId.'-'.$requestId;
$log->setEventItem('logId', $logId);
And here's what our logs look like now:
2011-05-16 19:04 [b74be12f-a73fe12a]: The user made some request
2011-05-16 19:04 [075aef5e-24f90ae1]: The user made some request
2011-05-16 19:04 [075aef5e-24f90ae1]: Here is some output of that request: Hello, World
2011-05-16 19:04 [b74be12f-a73fe12a]: Here is some output of that request: Hello, World
2011-05-16 19:04 [b74be12f-feb40a52]: Another user request
And there it is. We can use the log id in useful ways, such as displaying it to the user with a "contact tech support" message. When a user calls or emails and says their "support id" is "b74be12f-a73fe12a" we not only have the log messages for the request that gave them the error, but all the messages for their entire session:
# All the request log messages
> grep 'b74be12f-a73fe12a' /path/to/application.log
# All the session log messages
> grep 'b74be12f-' /path/to/application.log

Here's all the code together:
$format = "%timestamp% [%logId%]: %message%" . PHP_EOL;
$formatter = new Zend_Log_Formatter_Simple($format);

$writer = new Zend_Log_Writer_Stream('/path/to/application.log');
$writer->setFormatter($formatter);

$log = new Zend_Log();
$log->addWriter($writer);

session_start();
$sessId = substr(session_id(), 0, 8);
$requestId = substr(uniqid(), -8);
$logId = $sessId.'-'.$requestId;
$log->setEventItem('logId', $logId);

$log->info('The user made some request');
$log->info('Here is some output of that request: '.$output);

2011-05-03

Interview with Bulat Shakirzyanov

Today on Cal Evans' "Voices of the ElePHPant" there was an interview with Bulat Shakirzyanov. His comments about mocking PHP's built-in functions reminded me of this PHP built-in mocking library I had made a little while ago, then forgotten about. Bulat's example talks about `mkdir`, which the built-in mock library would handle, but for filesystem specific functionality, I also highly recommend vfStream. It allows you to create a virtual filesystem, comeplete with directory hierarchy and files with contents, all from within a unit test, and all without needing an actual filesystem.