2011-04-22

Tropo-Turing Collision: Voice Chat Bots with Tropo

On the first evening of PHPComCon this year, Tropo sponsored a beer-and-pizza-fueled hackathon. Having never heard of Tropo before, I decided to check it out.

Tropo provides a platform for bridging voice, SMS, IM and Twitter data to create interactive applications in JavaScript, PHP, Python and a couple of other languages. Typical examples are the call-center navigation menus that most people have encountered, or a voicemail system that sends a text message when a new message arrives. But those are only the tip of the iceberg.

To explore a little deeper, I thought it would be neat to create a voice chat-bot. The idea would be that a caller could talk to an automated voice in a natural way, and the voice would respond in a relevant way and push the conversation along. Since I wasn't aiming for Turing test worthiness, a good starting point was ELIZA, one of the first automated chat-bots. I thought it was a pretty ambitious project, but it turns out that Tropo's system handles all of the functionality right out of the gate.

The docs do a great job of explaining setting up an account and creating an application, so I'm going to jump right into the code (in PHP).

I like to keep my code relatively clean and organized, with functions and classes in their own files. So the first thing I did was create a hosted file called "Eliza.php" with the following contents:
class Eliza
{
  public function respondsTo($statement="")
  {
    $responses = array(
      "0" => "one",
      "1" => "two",
      "2" => "three",
      "3" => "four",
      "4" => "five",
      "5" => "six",
      "6" => "seven",
      "7" => "eight",
      "8" => "nine",
      "9" => "zero",
    );

    $response = "Please pick a number from 0 to 9";
    if (isset($responses[$statement])) {
      $response = $responses[$statement];
    }

    return $response;
  }

  public function hears($prompt)
  {
    $result = ask($prompt, array(
      "choices" => "[1 DIGIT]"
    ));
    $statement = strtolower($result->value);
    return $statement;
  }
}
It's important to note that the opening <?php and closing ?> should be left out of this file, or the next bit will fail.

The main functionality of the application is in the `Eliza` class. Eliza will translate a user input string into a response, which it will then use to prompt the user. The `ask()` function in the `hears()` method is functionality provided by Tropo's system that takes care of the text-to-speech and speech-to-text aspect of prompting the caller, and then waits for the caller to respond. The `choices => [1 DIGIT]` option to `ask()` hints that we expect the user to respond to our prompt with a single 0-9 character.
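To try the class away from Tropo, one option is to stub out `ask()`. The following is a minimal sketch, not part of the hosted application: the `ask()` defined here is a hypothetical stand-in that returns canned digits and models only the `value` property that `hears()` reads, and the `Eliza` class is a condensed copy of the one above with the same digit-to-word mapping.

```php
<?php
// Hypothetical stand-in for Tropo's ask(), for local experiments only.
// It models nothing but the ->value property that hears() reads.
function ask($prompt, $options = array())
{
    static $scripted = array("3", "9");
    $result = new stdClass();
    $result->value = (string) array_shift($scripted);
    return $result;
}

class Eliza
{
    public function respondsTo($statement = "")
    {
        // Same mapping as Eliza.php: each digit is answered with the next
        // number, and 9 wraps around to zero.
        $words = array("one", "two", "three", "four", "five",
                       "six", "seven", "eight", "nine", "zero");
        $statement = (string) $statement;
        if (strlen($statement) === 1 && ctype_digit($statement)) {
            return $words[(int) $statement];
        }
        return "Please pick a number from 0 to 9";
    }

    public function hears($prompt)
    {
        $result = ask($prompt, array("choices" => "[1 DIGIT]"));
        return strtolower($result->value);
    }
}

$eliza = new Eliza();
$prompt = $eliza->respondsTo("");     // no input yet, so the default prompt
$statement = $eliza->hears($prompt);  // the stub "hears" the digit 3
echo $prompt, "\n", $eliza->respondsTo($statement), "\n";
```

Run locally, this prints the default prompt followed by "four". The stub must not be uploaded to Tropo, where `ask()` is already defined by the platform.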

Next, I created a file called "chatbot.php" with the following contents:
<?php
$url = "http://hosting.tropo.com/00001/www/Eliza.php";
$ElizaFile = file_get_contents($url);
eval($ElizaFile);

$eliza = new Eliza();
$statement = "";
do {
  $prompt = $eliza->respondsTo($statement);
  $statement = $eliza->hears($prompt);
  _log("They said ".$statement);
} while (true);
?>
This is the entry script for the application, which can be set on the "Application Settings" page. Unfortunately, it does not look like Tropo supports `require` and `include` in their system. Fortunately, they do provide URLs from which the contents of any hosted file can be downloaded with a simple `file_get_contents`. So we "include" our Eliza.php file by downloading its contents, then `eval`ing them into the running script's scope. Note: Yes, I could have just written the contents of Eliza.php into the chatbot.php file, but a) it was more fun to try to find a way around that limitation :-) and b) many developers separate their code this way to keep it clean, encapsulated and reusable, and this demonstrates a way to accomplish that.
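The reason the PHP tags have to be left out of the hosted file is that `eval()` expects code that is already in PHP mode, so a leading open tag is a parse error. As a rough sketch of one way around that, the tags could be stripped before evaluating. The `stripPhpTags()` helper and the `Greeter` class are hypothetical names used only for illustration, and a string stands in for the result of `file_get_contents($url)` so the example runs without a network connection.

```php
<?php
// Hypothetical helper: remove a leading open tag and a trailing close tag
// so that the remaining source can be handed to eval().
function stripPhpTags($source)
{
    $source = preg_replace('/^\s*<\?php\s*/', '', $source);
    $source = preg_replace('/\s*\?>\s*$/', '', $source);
    return $source;
}

// Stand-in for file_get_contents($url): a hosted file that kept its tags.
$source = <<<'SRC'
<?php
class Greeter
{
    public function greet()
    {
        return "hello";
    }
}
?>
SRC;

// Define the class in the running script, as chatbot.php does for Eliza.
eval(stripPhpTags($source));

$greeter = new Greeter();
echo $greeter->greet(), "\n";
```

Evaluating downloaded source runs whatever the URL returns, so this only makes sense for files you host and control yourself.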

The code is fairly self-explanatory: "include" and instantiate Eliza, then enter a prompt-respond loop which will last until the caller hangs up. `_log()` outputs to Tropo's built-in application debugger.

I have to say congratulations to Tropo for creating a platform that makes all this easy. I had this up and running (except the file include portion) in about 15 minutes. You can try this out for yourself by calling (919) 500-7747.

So now that prompt-response was working, I could get started on making a real Eliza chat-bot that would parse the caller's speech and provide an Eliza-like response:

Caller: I'm building a voice chat application.
Eliza: Tell me more about voice chat application.
Caller: You're it!
Eliza: How do you feel about that?
Caller: I think it's pretty neat.

There are probably a hundred Eliza implementations on the web, and at least a dozen are written in PHP. I grabbed the first one I found, shoved it into the `Eliza::respondsTo()` method, and called up my application.
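For a sense of what such an implementation does, here is a much-reduced sketch of pattern-based responses. It is not the implementation I grabbed: the `elizaReply()` function and its two rules are hypothetical, chosen only so that the sample conversation above comes out the same way. A real ELIZA script has many more patterns and also swaps pronouns ("my" to "your", and so on).

```php
<?php
// Hypothetical, much-reduced rule set: each regular expression captures a
// fragment of the caller's statement and echoes it back inside a template.
function elizaReply($statement)
{
    $rules = array(
        '/\bi\'?m (?:building|making|writing) an? (.+?)[.!?]*$/i' => 'Tell me more about %s.',
        '/\bi think (.+?)[.!?]*$/i' => 'Why do you think %s?',
    );

    foreach ($rules as $pattern => $template) {
        if (preg_match($pattern, trim($statement), $matches)) {
            return sprintf($template, $matches[1]);
        }
    }

    // Fallback when no rule matches, to keep the conversation moving.
    return 'How do you feel about that?';
}

echo elizaReply("I'm building a voice chat application."), "\n";
echo elizaReply("You're it!"), "\n";
```

The first rule that matches wins, so more specific patterns would go earlier in the array.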

And this is where I hit the iceberg. In order to accomplish what I wanted, I needed the `ask()` function to be able to accept and parse any spoken words into a string. This meant getting rid of the `choices => [1 DIGIT]` line. As soon as I called, Eliza started prompting me over and over for input, until Tropo's system killed the loop and hung up.

Luckily, a Tropo guy (who was great to talk to, but whose name I have unfortunately forgotten) informed me that `ask()` works by using the `choices` option as training for the speech-to-text parser. There is a way to do generalized speech-to-text, but it is incredibly processor-intensive and would have to make use of an asynchronous call. Later that evening, @akalsey confirmed that the current state of the technology (everyone's, not just Tropo's) makes generalized real-time speech-to-text processing impossible. So the dream of speaking with Eliza instead of just IMing with her dies unfulfilled, or is at least put on the shelf until technology catches up with ideas.

I did learn something about Tropo's service, though, so I count the hackathon a success. Thanks again, Tropo, for the great new tool!
