Abe Shultz
December 12, 2011

My project is a chatbot which attempts to impersonate me on instant messenger (IM). It attempts to determine the important words in messages that people send to my IM account and respond appropriately. To do this, it uses a Bayesian model trained on two years of my chat logs.

Concepts Demonstrated

  • Bayesian Classification is used to determine which word of an incoming message is likely to be the most important word in the message.
  • TF-IDF, or Term Frequency-Inverse Document Frequency, is used to generate training data for the Bayesian classifier from my IM logs.
  • A Markov Chain text generator was implemented, but not integrated with the final project. It was intended to be used to generate the response messages, but was too incoherent to be useful.


The main innovation in the project is the tools for tagging chat conversations. TF-IDF is usually used for longer, more coherent texts, such as books, rather than conversations. Despite this, the TF-IDF tagger does a reasonable job of detecting important words and tagging sentences accordingly.
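To make the tagging step concrete, here is a minimal sketch of TF-IDF keyword tagging. It is not the project's actual code: it assumes each chat message is treated as its own "document", and the tokenizer, the function names, and the threshold parameter are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into words (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_tag(messages, threshold=0.0):
    """Pair each message with its highest-scoring TF-IDF word.

    Assumption: every message counts as one document. A message whose best
    score does not exceed `threshold` is tagged with None.
    """
    docs = [tokenize(m) for m in messages]
    n_docs = len(docs)

    # Document frequency: in how many messages does each word appear?
    doc_freq = Counter()
    for words in docs:
        doc_freq.update(set(words))

    tagged = []
    for message, words in zip(messages, docs):
        best_word, best_score = None, threshold
        for word, count in Counter(words).items():
            tf = count / len(words)                  # term frequency
            idf = math.log(n_docs / doc_freq[word])  # inverse document frequency
            if tf * idf > best_score:
                best_word, best_score = word, tf * idf
        tagged.append((message, best_word))
    return tagged
```

A word that appears in every message gets an inverse document frequency of zero, so a message made up only of such words is left untagged. With very short messages the term frequency of any single word is high, which is part of why conversations are a harder target for TF-IDF than books.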

The Markov chain text generator is based heavily on a Perl version of the same algorithm that I wrote 8-10 years ago. It is useful for producing amusing output, such as "Arctic white owl, has less value than cowdung. Its power is the gun!", but it rarely generates anything that could be easily passed off as the product of a sane, sober human writer.
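For readers unfamiliar with the technique, a Markov chain text generator can be sketched in a few lines. This is a generic word-level version written for illustration, not a port of the Perl original; the order of 2 and the function names are assumptions.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the list of words that followed it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain


def generate(chain, length=20, rng=random):
    """Random-walk the chain, stopping at `length` words or a dead end."""
    if not chain:
        return ""
    state = rng.choice(list(chain.keys()))
    output = list(state)
    while len(output) < length:
        followers = chain.get(state)
        if not followers:
            break  # no word ever followed this state in the source text
        output.append(rng.choice(followers))
        state = tuple(output[-len(state):])
    return " ".join(output)
```

Because each word depends only on the previous `order` words, the output is locally plausible but has no overall topic, which matches the incoherence described above.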

The Bayesian classifier is a simple implementation of Bayes' Rule, calculated from the word frequencies of other people's chat messages. Two heuristics, one run before the classifier and one after it, try to make the chatbot more convincing. First, if someone sends a message that exactly matches one that someone has sent before, the chatbot sends my logged reply to that message. This heuristic is based on the assumption that whatever I said immediately after receiving the incoming message was an appropriate response, which holds in cases such as greetings and other social call-response pairs. It also saves the computational time of calculating the keyword of the message.
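One way such a classifier could work is sketched below as a naive Bayes model over (message words, keyword) pairs produced by the TF-IDF tagger. The details are assumptions rather than a description of the project's code: the class name, the add-one smoothing, and the restriction of candidate keywords to words present in the incoming message are all hypothetical choices.

```python
import math
from collections import Counter, defaultdict


class KeywordClassifier:
    """Naive Bayes sketch: P(keyword | words) is proportional to
    P(keyword) times the product of P(word | keyword)."""

    def __init__(self):
        self.keyword_counts = Counter()          # times each keyword was the tag
        self.word_counts = defaultdict(Counter)  # word counts seen per keyword
        self.vocab = set()

    def train(self, tagged_messages):
        """tagged_messages: iterable of (list_of_words, keyword) pairs."""
        for words, keyword in tagged_messages:
            self.keyword_counts[keyword] += 1
            self.word_counts[keyword].update(words)
            self.vocab.update(words)

    def log_posterior(self, keyword, words):
        """Unnormalized log posterior with add-one (Laplace) smoothing."""
        total = sum(self.keyword_counts.values())
        score = math.log(self.keyword_counts[keyword] / total)
        denom = sum(self.word_counts[keyword].values()) + len(self.vocab)
        for word in words:
            score += math.log((self.word_counts[keyword][word] + 1) / denom)
        return score

    def classify(self, words):
        """Return the most probable keyword among the message's own words,
        or None if none of them was ever seen as a keyword."""
        candidates = [w for w in sorted(set(words)) if w in self.keyword_counts]
        if not candidates:
            return None
        return max(candidates, key=lambda k: self.log_posterior(k, words))
```

Working in log space avoids underflow when many small probabilities are multiplied, and returning None gives the caller a clear signal that a fallback response is needed.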

Second, if calculating the keyword of the message fails, that is, the keyword does not match any word for which there is a response available, the chatbot selects a random response from the messages in which the TF-IDF scoring algorithm found no high-scoring word. These messages are typically short utterances that work in many contexts, such as "Hmm." or "Um, yeah." Unfortunately, "*hug*" and "*nuzzle*" also appear in this list, so the chatbot is randomly affectionate. Prior to the addition of this heuristic, the chatbot had exactly one response if it did not find a good keyword:
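The order of the two heuristics around the keyword step can be summarized in a short sketch. The function and parameter names are hypothetical, the lookup tables are assumed to have been built from the logs beforehand, and picking a random reply among those stored for a keyword is an assumption about a detail not described here.

```python
import random


def choose_reply(message, exact_replies, keyword_replies, fallback_replies,
                 find_keyword, rng=random):
    """Pick a reply using exact match, then keyword, then a generic fallback.

    exact_replies:    dict mapping a previously seen message to my logged reply
    keyword_replies:  dict mapping a keyword to a list of candidate replies
    fallback_replies: list of short replies that had no high-scoring word
    find_keyword:     callable returning the message's keyword, or None
    """
    # Heuristic 1: an exact repeat of an earlier message gets the logged reply,
    # which also skips the cost of computing a keyword.
    if message in exact_replies:
        return exact_replies[message]

    # Main path: find the most important word and look up a reply for it.
    keyword = find_keyword(message)
    if keyword in keyword_replies:
        return rng.choice(keyword_replies[keyword])

    # Heuristic 2: no usable keyword, so fall back to a generic utterance.
    return rng.choice(fallback_replies)
```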

(11:37:05 AM) Girlfriend: Hi. *curls up in lap, nuzzles*

(11:37:11 AM) Me: Puny mortal, I have no good response!

(11:37:22 AM) Girlfriend: . . . ? XD

Evaluation of Results

The project succeeded in creating a chatbot that can frequently recognize the most important word of an incoming IM message. However, the project does not include word sense detection, and so it cannot tell the difference between "apple", the fruit; "apple", the record company; and "apple", the computer manufacturer. Further, because it uses my old chat messages, it tends to send responses that are outdated or inaccurate. Because my IM chatting is mostly with my romantic partners, the chatbot's responses may be inappropriately familiar or affectionate.

Additional Remarks

This is my writeup:


This is the code for the chatbot, log processing stuff, and Markov chain text generation: